Zhuoran Yang
Title
Cited by
Year
Fully decentralized multi-agent reinforcement learning with networked agents
K Zhang, Z Yang, H Liu, T Zhang, T Başar
arXiv preprint arXiv:1802.08757, 2018
Cited by: 157; Year: 2018
A theoretical analysis of deep Q-learning
J Fan, Z Wang, Y Xie, Z Yang
Learning for Dynamics and Control, 486-489, 2020
Cited by: 68; Year: 2020
Multi-agent reinforcement learning via double averaging primal-dual optimization
HT Wai, Z Yang, Z Wang, M Hong
Advances in Neural Information Processing Systems, 9649-9660, 2018
Cited by: 66; Year: 2018
Multi-agent reinforcement learning: A selective overview of theories and algorithms
K Zhang, Z Yang, T Başar
arXiv preprint arXiv:1911.10635, 2019
Cited by: 64; Year: 2019
Sparse nonlinear regression: Parameter estimation and asymptotic inference
Z Yang, Z Wang, H Liu, YC Eldar, T Zhang
arXiv preprint arXiv:1511.04514, 2015
Cited by: 57*; Year: 2015
Provably efficient reinforcement learning with linear function approximation
C Jin, Z Yang, Z Wang, MI Jordan
Conference on Learning Theory, 2137-2143, 2020
Cited by: 52; Year: 2020
Networked multi-agent reinforcement learning in continuous spaces
K Zhang, Z Yang, T Başar
2018 IEEE Conference on Decision and Control (CDC), 2771-2776, 2018
Cited by: 31; Year: 2018
On semiparametric exponential family graphical models
Z Yang, Y Ning, H Liu
arXiv preprint arXiv:1412.8697, 2014
Cited by: 30*; Year: 2014
Provably efficient exploration in policy optimization
Q Cai, Z Yang, C Jin, Z Wang
arXiv preprint arXiv:1912.05830, 2019
Cited by: 26; Year: 2019
Misspecified nonconvex statistical optimization for phase retrieval
Z Yang, LF Yang, EX Fang, T Zhao, Z Wang, M Neykov
arXiv preprint arXiv:1712.06245, 2017
Cited by: 25*; Year: 2017
High-dimensional non-Gaussian single index models via thresholded score function estimation
Z Yang, K Balasubramanian, H Liu
International Conference on Machine Learning, 3851-3860, 2017
Cited by: 24; Year: 2017
Neural policy gradient methods: Global optimality and rates of convergence
L Wang, Q Cai, Z Yang, Z Wang
arXiv preprint arXiv:1909.01150, 2019
Cited by: 23; Year: 2019
Neural proximal/trust region policy optimization attains globally optimal policy
B Liu, Q Cai, Z Yang, Z Wang
arXiv preprint arXiv:1906.10306, 2019
Cited by: 22; Year: 2019
Neural temporal-difference learning converges to global optima
Q Cai, Z Yang, JD Lee, Z Wang
Advances in Neural Information Processing Systems, 11315-11326, 2019
Cited by: 22; Year: 2019
Policy optimization provably converges to Nash equilibria in zero-sum linear quadratic games
K Zhang, Z Yang, T Başar
Advances in Neural Information Processing Systems, 11602-11614, 2019
Cited by: 22; Year: 2019
Finite-sample analyses for fully decentralized multi-agent reinforcement learning
K Zhang, Z Yang, H Liu, T Zhang, T Başar
arXiv preprint arXiv:1812.02783, 2018
Cited by: 21; Year: 2018
Learning non-Gaussian multi-index model via second-order Stein's method
Z Yang, K Balasubramanian, Z Wang, H Liu
Advances in Neural Information Processing Systems 30, 6097-6106, 2017
Cited by: 20*; Year: 2017
On the global convergence of actor-critic: A case for linear quadratic regulator with ergodic cost
Z Yang, Y Chen, M Hong, Z Wang
arXiv preprint arXiv:1907.06246, 2019
Cited by: 14; Year: 2019
Actor-critic provably finds Nash equilibria of linear-quadratic mean-field games
Z Fu, Z Yang, Y Chen, Z Wang
arXiv preprint arXiv:1910.07498, 2019
Cited by: 13; Year: 2019
A multi-agent off-policy actor-critic algorithm for distributed reinforcement learning
W Suttle, Z Yang, K Zhang, Z Wang, T Başar, J Liu
arXiv preprint arXiv:1903.06372, 2019
Cited by: 13; Year: 2019