Yi Su
Google DeepMind
Verified email at google.com
Title
Cited by
Year
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
M Reid, N Savinov, D Teplyashin, D Lepikhin, T Lillicrap, J Alayrac, ...
arXiv preprint arXiv:2403.05530, 2024
Cited by 411, 2024
Doubly robust off-policy evaluation with shrinkage
Y Su, M Dimakopoulou, A Krishnamurthy, M Dudík
International Conference on Machine Learning, 2020, 2019
Cited by 101, 2019
CAB: Continuous adaptive blending for policy evaluation and learning
Y Su, L Wang, M Santacatterina, T Joachims
International Conference on Machine Learning, 6005-6014, 2019
Cited by 80, 2019
Offline RL for natural language generation with implicit language Q learning
C Snell, I Kostrikov, Y Su, M Yang, S Levine
arXiv preprint arXiv:2206.11871, 2022
Cited by 77, 2022
Off-policy bandits with deficient support
N Sachdeva, Y Su, T Joachims
Proceedings of the 26th ACM SIGKDD International Conference on Knowledge …, 2020
Cited by 76, 2020
Online adaptation to label distribution shift
R Wu, C Guo, Y Su, KQ Weinberger
Advances in Neural Information Processing Systems 34, 11340-11351, 2021
Cited by 54, 2021
Adaptive Estimator Selection for Off-Policy Evaluation
Y Su, P Srinath, A Krishnamurthy
International Conference on Machine Learning, 2020
Cited by 41, 2020
Optimizing Rankings for Recommendation in Matching Markets
Y Su, M Bayoumi, T Joachims
Proceedings of the ACM Web Conference 2022, 328-338, 2022
Cited by 27, 2022
Context-Aware Language Modeling for Goal-Oriented Dialogue Systems
C Snell, S Yang, J Fu, Y Su, S Levine
NAACL, 2022
Cited by 23, 2022
Recommendations as treatments
T Joachims, B London, Y Su, A Swaminathan, L Wang
AI Magazine 42 (3), 19-30, 2021
Cited by 19, 2021
Data-driven offline decision-making via invariant representation learning
H Qi, Y Su, A Kumar, S Levine
Advances in Neural Information Processing Systems 35, 13226-13237, 2022
Cited by 14, 2022
Training language models to self-correct via reinforcement learning
A Kumar, V Zhuang, R Agarwal, Y Su, JD Co-Reyes, A Singh, K Baumli, ...
arXiv preprint arXiv:2409.12917, 2024
Cited by 8, 2024
Data-driven model-based optimization via invariant representation learning
H Qi, Y Su, A Kumar, S Levine
Proc. Adv. Neur. Inf. Proc. Syst (NeurIPS), 2022
Cited by 5, 2022
Learning from logged bandit feedback of multiple loggers
Y Su, A Agarwal, T Joachims
ICML Workshop on Machine Learning for Causal Inference, Counterfactual …, 2018
Cited by 3, 2018
Unified off-policy learning to rank: a reinforcement learning perspective
Z Zhang, Y Su, H Yuan, Y Wu, R Balasubramanian, Q Wu, H Wang, ...
Advances in Neural Information Processing Systems 36, 2024
Cited by 2, 2024
International Conference on Machine Learning
Y Su, L Wang, M Santacatterina, T Joachims
Cited by 2, 2019
Long-Term Value of Exploration: Measurements, Findings and Algorithms
Y Su, X Wang, EY Le, L Liu, Y Li, H Lu, B Lipshitz, S Badam, L Heldt, S Bi, ...
Proceedings of the 17th ACM International Conference on Web Search and Data …, 2024
Cited by 1, 2024
Value of exploration: Measurements, findings and algorithms
Y Su, X Wang, EY Le, L Liu, Y Li, H Lu, B Lipshitz, S Badam, L Heldt, S Bi, ...
arXiv preprint arXiv:2305.07764, 2023
Cited by 1, 2023
EVOLvE: Evaluating and Optimizing LLMs For Exploration
A Nie, Y Su, B Chang, JN Lee, EH Chi, QV Le, M Chen
arXiv preprint arXiv:2410.06238, 2024
2024
Multi-Task Neural Linear Bandit for Exploration in Recommender Systems
Y Su, H Lu, Y Li, L Liu, S Bi, EH Chi, M Chen
Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and …, 2024
2024