Jan Leike

Cited by

	All	Since 2019
Citations	13734	13287
h-index	26	22
i10-index	31	26

Citations per year
Year	2015	2016	2017	2018	2019	2020	2021	2022	2023	2024
Citations	46	60	87	190	295	365	499	1125	6943	4014

Public access

10 articles available
0 articles not available
Based on funding mandates

Co-authors

Jeffrey Wu	OpenAI	Verified email at openai.com
Paul Christiano	Alignment Research Center	Verified email at alignment.org
John Schulman	Research Scientist, OpenAI	Verified email at openai.com
Ryan Lowe	OpenAI	Verified email at openai.com
Marcus Hutter	Researcher@DeepMind & Professor at ANU	Verified email at anu.edu.au
Dario Amodei	CEO and Co-Founder at Anthropic	Verified email at anthropic.com
David Scott Krueger	University Assistant Professor, University of Cambridge	Verified email at cam.ac.uk
Matthias Heizmann	University of Freiburg, Germany	Verified email at heizmann.name
Tom Everitt	Staff Research Scientist at Google DeepMind	Verified email at google.com
Ilya Sutskever	Co-Founder and Chief Scientist of OpenAI	Verified email at openai.com
Pushmeet Kohli	DeepMind	Verified email at google.com
Andreas Podelski	Professor of Computer Science, Freiburg University	Verified email at informatik.uni-freiburg.de
Geoffrey Irving	UK AI Safety Institute (AISI)	Verified email at naml.us
Tegan Maharaj	Assistant Professor at University of Toronto	Verified email at polymtl.ca
William Saunders	OpenAI	Verified email at cs.toronto.edu
Adam Gleave	CEO at FAR AI	Verified email at far.ai
Collin Burns	Researcher, OpenAI	Verified email at openai.com
Andrew Trask	University of Oxford and OpenMined	Verified email at openmined.org

Jan Leike

OpenAI

Verified email at openai.com

reinforcement learning, deep learning, agent alignment


Title	Authors	Venue	Cited by	Year
Training language models to follow instructions with human feedback	L Ouyang, J Wu, X Jiang, D Almeida, C Wainwright, P Mishkin, C Zhang, ...	Advances in Neural Information Processing Systems 35, 27730-27744, 2022	5982	2022
Deep reinforcement learning from human preferences	PF Christiano, J Leike, T Brown, M Martic, S Legg, D Amodei	Advances in Neural Information Processing Systems 30, 4299-4307, 2017	2058	2017
Evaluating large language models trained on code	M Chen, J Tworek, H Jun, Q Yuan, HPO Pinto, J Kaplan, H Edwards, ...	arXiv preprint arXiv:2107.03374, 2021	1935	2021
GPT-4 technical report	OpenAI	arXiv, 2023	1251*	2023
Reward learning from human preferences and demonstrations in Atari	B Ibarz, J Leike, T Pohlen, G Irving, S Legg, D Amodei	Advances in Neural Information Processing Systems, 8011-8023, 2018	328	2018
AI Safety Gridworlds	J Leike, M Martic, V Krakovna, PA Ortega, T Everitt, A Lefrancq, L Orseau, ...	arXiv preprint arXiv:1711.09883, 2017	320	2017
Scalable agent alignment via reward modeling: a research direction	J Leike, D Krueger, T Everitt, M Martic, V Maini, S Legg	arXiv preprint arXiv:1811.07871, 2018	240	2018
Learning to Understand Goal Specifications by Modelling Reward	D Bahdanau, F Hill, J Leike, E Hughes, P Kohli, E Grefenstette	arXiv preprint arXiv:1806.01946, 2018	191*	2018
Let's Verify Step by Step	H Lightman, V Kosaraju, Y Burda, H Edwards, B Baker, T Lee, J Leike, ...	arXiv preprint arXiv:2305.20050, 2023	186	2023
Recursively summarizing books with human feedback	J Wu, L Ouyang, DM Ziegler, N Stiennon, R Lowe, J Leike, P Christiano	arXiv preprint arXiv:2109.10862, 2021	180	2021
Self-critiquing models for assisting human evaluators	W Saunders, C Yeh, J Wu, S Bills, L Ouyang, J Ward, J Leike	arXiv preprint arXiv:2206.05802, 2022	116	2022
Language models can explain neurons in language models	S Bills, N Cammarata, D Mossing, H Tillman, L Gao, G Goh, I Sutskever, ...	URL https://openaipublic.blob.core.windows.net/neuron-explainer/paper …, 2023	98	2023
Ranking Templates for Linear Loops	J Leike, M Heizmann	Logical Methods in Computer Science, 2015	95	2015
Learning human objectives by evaluating hypothetical behavior	S Reddy, A Dragan, S Levine, S Legg, J Leike	International Conference on Machine Learning, 8020-8029, 2020	78	2020
Linear ranking for linear lasso programs	M Heizmann, J Hoenicke, J Leike, A Podelski	Automated Technology for Verification and Analysis, 365-380, 2013	60	2013
Institutionalizing ethics in AI through broader impact requirements	CEA Prunkl, C Ashurst, M Anderljung, H Webb, J Leike, A Dafoe	Nature Machine Intelligence 3 (2), 104-110, 2021	53	2021
Geometric nontermination arguments	J Leike, M Heizmann	International Conference on Tools and Algorithms for the Construction and …, 2018	53*	2018
Quantifying Differences in Reward Functions	A Gleave, M Dennis, S Legg, S Russell, J Leike	arXiv preprint arXiv:2006.13900, 2020	52	2020
Hidden Incentives for Auto-Induced Distributional Shift	D Krueger, T Maharaj, J Leike	arXiv preprint arXiv:2009.09153, 2020	51*	2020
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision	C Burns, P Izmailov, JH Kirchner, B Baker, L Gao, L Aschenbrenner, ...	arXiv preprint arXiv:2312.09390, 2023	43	2023
