Miljan Martic
DeepMind
Verified email at google.com
Title / Cited by / Year
Deep reinforcement learning from human preferences
PF Christiano, J Leike, T Brown, M Martic, S Legg, D Amodei
Advances in Neural Information Processing Systems, 4299-4307, 2017
Cited by: 211, Year: 2017
AI safety gridworlds
J Leike, M Martic, V Krakovna, PA Ortega, T Everitt, A Lefrancq, L Orseau, ...
arXiv preprint arXiv:1711.09883, 2017
Cited by: 73, Year: 2017
Scalable agent alignment via reward modeling: a research direction
J Leike, D Krueger, T Everitt, M Martic, V Maini, S Legg
arXiv preprint arXiv:1811.07871, 2018
Cited by: 15, Year: 2018
Measuring and avoiding side effects using relative reachability
V Krakovna, L Orseau, M Martic, S Legg
arXiv preprint arXiv:1806.01186, 2018
Cited by: 8, Year: 2018
Deep reinforcement learning from human preferences, 2017
P Christiano, J Leike, TB Brown, M Martic, S Legg, D Amodei
URL https://arxiv.org/abs/1706.03741
Cited by: 5
Scaling shared model governance via model splitting
M Martic, J Leike, A Trask, M Hessel, S Legg, P Kohli
arXiv preprint arXiv:1812.05979, 2018
Cited by: 1, Year: 2018
Penalizing side effects using stepwise relative reachability
V Krakovna, L Orseau, R Kumar, M Martic, S Legg
arXiv preprint arXiv:1806.01186, 2018
Cited by: 1, Year: 2018