Jan Leike

Cited by

	All	Since 2019
Citations	14106	13658
h-index	26	22
i10-index	31	26

Citations per year

Year	Citations
2015	46
2016	60
2017	87
2018	190
2019	295
2020	365
2021	500
2022	1130
2023	6932
2024	4389

Public access (based on funding mandates)

10 articles available
0 articles not available

Co-authors

Jeffrey Wu, OpenAI (verified email at openai.com)
Paul Christiano, National Institute of Standards and Technology (verified email at nist.gov)
John Schulman, Research Scientist, OpenAI (verified email at openai.com)
Ryan Lowe, OpenAI (verified email at openai.com)
Marcus Hutter, Researcher@DeepMind & Professor at ANU (verified email at anu.edu.au)
Dario Amodei, CEO and Co-Founder at Anthropic (verified email at anthropic.com)
David Scott Krueger, University Assistant Professor, University of Cambridge (verified email at cam.ac.uk)
Matthias Heizmann, University of Freiburg, Germany (verified email at heizmann.name)
Tom Everitt, Staff Research Scientist at Google DeepMind (verified email at google.com)
Ilya Sutskever, Co-Founder and Chief Scientist of OpenAI (verified email at openai.com)
Pushmeet Kohli, DeepMind (verified email at google.com)
Andreas Podelski, Professor of Computer Science, Freiburg University (verified email at informatik.uni-freiburg.de)
Geoffrey Irving, UK AI Safety Institute (AISI) (verified email at naml.us)
Tegan Maharaj, Assistant Professor at University of Toronto (verified email at polymtl.ca)
William Saunders, OpenAI (verified email at cs.toronto.edu)
Adam Gleave, CEO at FAR AI (verified email at far.ai)
Collin Burns, Researcher, OpenAI (verified email at openai.com)
Andrew Trask, University of Oxford and OpenMined (verified email at openmined.org)


Jan Leike

OpenAI

Verified email at openai.com

reinforcement learning, deep learning, agent alignment


Title	Authors	Publication	Cited by	Year
Training language models to follow instructions with human feedback	L Ouyang, J Wu, X Jiang, D Almeida, C Wainwright, P Mishkin, C Zhang, ...	Advances in Neural Information Processing Systems 35, 27730-27744, 2022	6122	2022
Deep reinforcement learning from human preferences	PF Christiano, J Leike, T Brown, M Martic, S Legg, D Amodei	Advances in Neural Information Processing Systems 30, 4299-4307, 2017	2089	2017
Evaluating large language models trained on code	M Chen, J Tworek, H Jun, Q Yuan, HPO Pinto, J Kaplan, H Edwards, ...	arXiv preprint arXiv:2107.03374, 2021	1976	2021
GPT-4 technical report	OpenAI	arXiv, 2023	1372*	2023
Reward learning from human preferences and demonstrations in Atari	B Ibarz, J Leike, T Pohlen, G Irving, S Legg, D Amodei	Advances in Neural Information Processing Systems, 8011-8023, 2018	331	2018
AI Safety Gridworlds	J Leike, M Martic, V Krakovna, PA Ortega, T Everitt, A Lefrancq, L Orseau, ...	arXiv preprint arXiv:1711.09883, 2017	321	2017
Scalable agent alignment via reward modeling: a research direction	J Leike, D Krueger, T Everitt, M Martic, V Maini, S Legg	arXiv preprint arXiv:1811.07871, 2018	243	2018
Learning to Understand Goal Specifications by Modelling Reward	D Bahdanau, F Hill, J Leike, E Hughes, P Kohli, E Grefenstette	arXiv preprint arXiv:1806.01946, 2018	192*	2018
Let's Verify Step by Step	H Lightman, V Kosaraju, Y Burda, H Edwards, B Baker, T Lee, J Leike, ...	arXiv preprint arXiv:2305.20050, 2023	190	2023
Recursively summarizing books with human feedback	J Wu, L Ouyang, DM Ziegler, N Stiennon, R Lowe, J Leike, P Christiano	arXiv preprint arXiv:2109.10862, 2021	186	2021
Self-critiquing models for assisting human evaluators	W Saunders, C Yeh, J Wu, S Bills, L Ouyang, J Ward, J Leike	arXiv preprint arXiv:2206.05802, 2022	118	2022
Language models can explain neurons in language models	S Bills, N Cammarata, D Mossing, H Tillman, L Gao, G Goh, I Sutskever, ...	URL https://openaipublic.blob.core.windows.net/neuron-explainer/paper …, 2023	103	2023
Ranking Templates for Linear Loops	J Leike, M Heizmann	Logical Methods in Computer Science, 2015	95	2015
Learning human objectives by evaluating hypothetical behavior	S Reddy, A Dragan, S Levine, S Legg, J Leike	International Conference on Machine Learning, 8020-8029, 2020	78	2020
Linear ranking for linear lasso programs	M Heizmann, J Hoenicke, J Leike, A Podelski	Automated Technology for Verification and Analysis, 365-380, 2013	60	2013
Institutionalizing ethics in AI through broader impact requirements	CEA Prunkl, C Ashurst, M Anderljung, H Webb, J Leike, A Dafoe	Nature Machine Intelligence 3 (2), 104-110, 2021	54	2021
Geometric nontermination arguments	J Leike, M Heizmann	International Conference on Tools and Algorithms for the Construction and …, 2018	54*	2018
Hidden Incentives for Auto-Induced Distributional Shift	D Krueger, T Maharaj, J Leike	arXiv preprint arXiv:2009.09153, 2020	52*	2020
Quantifying Differences in Reward Functions	A Gleave, M Dennis, S Legg, S Russell, J Leike	arXiv preprint arXiv:2006.13900, 2020	52	2020
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision	C Burns, P Izmailov, JH Kirchner, B Baker, L Gao, L Aschenbrenner, ...	arXiv preprint arXiv:2312.09390, 2023	46	2023
