John Aslanides
DeepMind
Verified email at google.com
Title
Cited by
Year
Scaling Language Models: Methods, Analysis & Insights from Training Gopher
JW Rae, S Borgeaud, T Cai, K Millican, J Hoffmann, F Song, J Aslanides, ...
arXiv preprint arXiv:2112.11446, 2021
Cited by: 723; Year: 2021
Randomized Prior Functions for Deep Reinforcement Learning
I Osband, J Aslanides, A Cassirer
Neural Information Processing Systems 32, 2018
Cited by: 385; Year: 2018
Gemini: a family of highly capable multimodal models
G Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ...
arXiv preprint arXiv:2312.11805, 2023
Cited by: 348; Year: 2023
Red Teaming Language Models with Language Models
E Perez, S Huang, F Song, T Cai, R Ring, J Aslanides, A Glaese, ...
arXiv preprint arXiv:2202.03286, 2022
Cited by: 283; Year: 2022
Improving alignment of dialogue agents via targeted human judgements
A Glaese, N McAleese, M Trębacz, J Aslanides, V Firoiu, T Ewalds, ...
arXiv preprint arXiv:2209.14375, 2022
Cited by: 276; Year: 2022
Acme: A Research Framework for Distributed Reinforcement Learning
M Hoffman, B Shahriari, J Aslanides, G Barth-Maron, F Behbahani, ...
arXiv preprint arXiv:2006.00979, 2020
Cited by: 227; Year: 2020
When to use parametric models in reinforcement learning?
H van Hasselt, M Hessel, J Aslanides
Neural Information Processing Systems 33, 2019
Cited by: 194; Year: 2019
Behaviour Suite for Reinforcement Learning
I Osband, Y Doron, M Hessel, J Aslanides, E Sezener, A Saraiva, ...
International Conference on Learning Representations 8, 2020
Cited by: 171; Year: 2020
Teaching language models to support answers with verified quotes
J Menick, M Trebacz, V Mikulik, J Aslanides, F Song, M Chadwick, ...
arXiv preprint arXiv:2203.11147, 2022
Cited by: 131; Year: 2022
Fine-tuning language models to find agreement among humans with diverse preferences
M Bakker, M Chadwick, H Sheahan, M Tessler, L Campbell-Gillingham, ...
Advances in Neural Information Processing Systems 35, 38176-38189, 2022
Cited by: 87; Year: 2022
Relativity concept inventory: Development, analysis, and results
JS Aslanides, CM Savage
Physical Review Special Topics-Physics Education Research 9 (1), 010118, 2013
Cited by: 76; Year: 2013
A general approach to fairness with optimal transport
S Chiappa, R Jiang, T Stepleton, A Pacchiano, H Jiang, J Aslanides
AAAI, 2020
Cited by: 67*; Year: 2020
Divide-and-Conquer Monte Carlo Tree Search For Goal-Directed Planning
G Parascandolo, L Buesing, J Merel, L Hasenclever, J Aslanides, ...
arXiv preprint arXiv:2004.11410, 2020
Cited by: 30; Year: 2020
TF-Replicator: Distributed Machine Learning for Researchers
P Buchlovsky, D Budden, D Grewe, C Jones, J Aslanides, F Besse, ...
arXiv preprint arXiv:1902.00465, 2019
Cited by: 24; Year: 2019
Universal Reinforcement Learning Algorithms: Survey and Experiments
J Aslanides, J Leike, M Hutter
International Joint Conference on Artificial Intelligence 26, 1403-1410, 2017
Cited by: 23; Year: 2017
Fine-Tuning Language Models via Epistemic Neural Networks
I Osband, SM Asghari, B Van Roy, N McAleese, J Aslanides, G Irving
arXiv preprint arXiv:2211.01568, 2022
Cited by: 7; Year: 2022
AIXIjs: A software demo for general reinforcement learning
J Aslanides
arXiv preprint arXiv:1705.07615, 2017
Cited by: 5; Year: 2017
Generalised discount functions applied to a Monte-Carlo AImu implementation
S Lamont, J Aslanides, J Leike, M Hutter
Autonomous Agents and Multiagent Systems, 2017
Cited by: 4; Year: 2017
Articles 1–18