Minimax-optimal off-policy evaluation with linear function approximation | Y Duan, M Wang | International Conference on Machine Learning, 2701-2709, 2020 | 97 | 2020 |
State aggregation learning from Markov transition data | Y Duan, T Ke, M Wang | Advances in Neural Information Processing Systems, 4486-4495, 2019 | 46 | 2019 |
Risk bounds and Rademacher complexity in batch reinforcement learning | Y Duan, C Jin, Z Li | International Conference on Machine Learning, 2892-2902, 2021 | 32 | 2021 |
Near-optimal offline reinforcement learning with linear representation: Leveraging variance information with pessimism | M Yin, Y Duan, M Wang, YX Wang | International Conference on Learning Representations, 2022 | 23 | 2022 |
Sparse feature selection makes batch reinforcement learning more sample efficient | B Hao, Y Duan, T Lattimore, C Szepesvári, M Wang | International Conference on Machine Learning, 4063-4073, 2021 | 23 | 2021 |
Optimal policy evaluation using kernel-based temporal difference methods | Y Duan, M Wang, MJ Wainwright | arXiv preprint arXiv:2109.12002, 2021 | 19 | 2021 |
Bootstrapping statistical inference for off-policy evaluation | B Hao, X Ji, Y Duan, H Lu, C Szepesvári, M Wang | arXiv preprint arXiv:2102.03607, 2021 | 13 | 2021 |
Learning low-dimensional state embeddings and metastable clusters from time series data | Y Sun, Y Duan, H Gong, M Wang | Advances in Neural Information Processing Systems, 4561-4570, 2019 | 10 | 2019 |
Adaptive and robust multi-task learning | Y Duan, K Wang | arXiv preprint arXiv:2202.05250, 2022 | 9 | 2022 |
Adaptive low-nonnegative-rank approximation for state aggregation of Markov chains | Y Duan, M Wang, Z Wen, Y Yuan | SIAM Journal on Matrix Analysis and Applications 41 (1), 244-278, 2020 | 8 | 2020 |
Learning good state and action representations via tensor decomposition | C Ni, A Zhang, Y Duan, M Wang | 2021 IEEE International Symposium on Information Theory (ISIT), 1682-1687, 2021 | 7 | 2021 |
Bootstrapping fitted Q-evaluation for off-policy inference | B Hao, X Ji, Y Duan, H Lu, C Szepesvári, M Wang | International Conference on Machine Learning, 4074-4084, 2021 | 7 | 2021 |
Policy evaluation from a single path: Multi-step methods, mixing and mis-specification | Y Duan, MJ Wainwright | arXiv preprint arXiv:2211.03899, 2022 | | 2022 |