Tor Lattimore
DeepMind
Verified email at google.com - Homepage
Title
Cited by
Year
Bandit algorithms
T Lattimore, C Szepesvári
Cambridge University Press, 2020
Cited by: 459, Year: 2020
Unifying PAC and regret: Uniform PAC bounds for episodic reinforcement learning
C Dann, T Lattimore, E Brunskill
Advances in Neural Information Processing Systems, 5713-5723, 2017
Cited by: 106, Year: 2017
Optimal cluster recovery in the labeled stochastic block model
SY Yun, A Proutiere
Advances in Neural Information Processing Systems, 965-973, 2016
Cited by: 100*, Year: 2016
PAC bounds for discounted MDPs
T Lattimore, M Hutter
International Conference on Algorithmic Learning Theory, 320-334, 2012
Cited by: 72, Year: 2012
The end of optimism? An asymptotic analysis of finite-armed linear bandits
T Lattimore, C Szepesvári
Artificial Intelligence and Statistics, 728-737, 2017
Cited by: 65, Year: 2017
Optimal cluster recovery in the labeled stochastic block model
SY Yun, A Proutiere
Advances in Neural Information Processing Systems, 965-973, 2016
Cited by: 53, Year: 2016
On explore-then-commit strategies
A Garivier, T Lattimore, E Kaufmann
Advances in Neural Information Processing Systems 29, 784-792, 2016
Cited by: 48, Year: 2016
Conservative bandits
Y Wu, R Shariff, T Lattimore, C Szepesvári
International Conference on Machine Learning, 1254-1262, 2016
Cited by: 46, Year: 2016
Behaviour suite for reinforcement learning
I Osband, Y Doron, M Hessel, J Aslanides, E Sezener, A Saraiva, ...
arXiv preprint arXiv:1908.03568, 2019
Cited by: 41, Year: 2019
Near-optimal PAC bounds for discounted MDPs
T Lattimore, M Hutter
Theoretical Computer Science 558, 125-143, 2014
Cited by: 38, Year: 2014
Universal knowledge-seeking agents for stochastic environments
L Orseau, T Lattimore, M Hutter
International Conference on Algorithmic Learning Theory, 158-172, 2013
Cited by: 37, Year: 2013
The sample-complexity of general reinforcement learning
T Lattimore, M Hutter, P Sunehag
Proceedings of the 30th International Conference on Machine Learning, 2013
Cited by: 34, Year: 2013
No free lunch versus Occam’s razor in supervised learning
T Lattimore, M Hutter
Algorithmic Probability and Friends. Bayesian Prediction and Artificial …, 2013
Cited by: 34, Year: 2013
Degenerate feedback loops in recommender systems
R Jiang, S Chiappa, T Lattimore, A György, P Kohli
Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, 383-390, 2019
Cited by: 32, Year: 2019
Optimally confident UCB: Improved regret for finite-armed bandits
T Lattimore
arXiv preprint arXiv:1507.07880, 2015
Cited by: 32, Year: 2015
Bounded Regret for Finite-Armed Structured Bandits
T Lattimore, R Munos
Cited by: 32, Year: 2014
Learning with good feature representations in bandits and in RL with a generative model
T Lattimore, C Szepesvári, G Weisz
International Conference on Machine Learning, 5662-5670, 2020
Cited by: 31, Year: 2020
A geometric perspective on optimal representations for reinforcement learning
M Bellemare, W Dabney, R Dadashi, AA Taiga, PS Castro, N Le Roux, ...
Advances in Neural Information Processing Systems, 4358-4369, 2019
Cited by: 28, Year: 2019
Refined lower bounds for adversarial bandits
S Gerchinovitz, T Lattimore
Advances in Neural Information Processing Systems 29, 1198-1206, 2016
Cited by: 28, Year: 2016
Asymptotically optimal agents
T Lattimore, M Hutter
International Conference on Algorithmic Learning Theory, 368-382, 2011
Cited by: 28, Year: 2011