Alessandro Lazaric
Research Scientist, Facebook Artificial Intelligence Research
Verified email at inria.fr
Title
Cited by
Year
Best arm identification: A unified approach to fixed budget and fixed confidence
V Gabillon, M Ghavamzadeh, A Lazaric
Advances in Neural Information Processing Systems, 3212-3220, 2012
Cited by: 181 | Year: 2012
Transfer in reinforcement learning: a framework and a survey
A Lazaric
Reinforcement Learning, 143-173, 2012
Cited by: 156 | Year: 2012
Transfer of samples in batch reinforcement learning
A Lazaric, M Restelli, A Bonarini
Proceedings of the 25th International Conference on Machine Learning, 544-551, 2008
Cited by: 132 | Year: 2008
Reinforcement learning in continuous action spaces through sequential monte carlo methods
A Lazaric, M Restelli, A Bonarini
Advances in Neural Information Processing Systems, 833-840, 2008
Cited by: 114 | Year: 2008
Bayesian multi-task reinforcement learning
A Lazaric, M Ghavamzadeh
Cited by: 89 | Year: 2010
Risk-aversion in multi-armed bandits
A Sani, A Lazaric, R Munos
Advances in Neural Information Processing Systems, 3275-3283, 2012
Cited by: 88 | Year: 2012
Finite-sample analysis of least-squares policy iteration
A Lazaric, M Ghavamzadeh, R Munos
Journal of Machine Learning Research 13 (1), 3041-3074, 2012
Cited by: 82 | Year: 2012
Multi-bandit best arm identification
V Gabillon, M Ghavamzadeh, A Lazaric, S Bubeck
Advances in Neural Information Processing Systems, 2222-2230, 2011
Cited by: 82 | Year: 2011
Analysis of a classification-based policy iteration algorithm
A Lazaric, M Ghavamzadeh, R Munos
Cited by: 79 | Year: 2010
Linear Thompson sampling revisited
M Abeille, A Lazaric
arXiv preprint arXiv:1611.06534, 2016
Cited by: 75 | Year: 2016
Finite-sample analysis of LSTD
A Lazaric, M Ghavamzadeh, R Munos
Cited by: 72 | Year: 2010
Best-arm identification in linear bandits
M Soare, A Lazaric, R Munos
Advances in Neural Information Processing Systems, 828-836, 2014
Cited by: 61 | Year: 2014
LSTD with random projections
M Ghavamzadeh, A Lazaric, O Maillard, R Munos
Advances in Neural Information Processing Systems, 721-729, 2010
Cited by: 60 | Year: 2010
Reinforcement learning of POMDPs using spectral methods
K Azizzadenesheli, A Lazaric, A Anandkumar
arXiv preprint arXiv:1602.07764, 2016
Cited by: 56 | Year: 2016
Upper-confidence-bound algorithms for active learning in multi-armed bandits
A Carpentier, A Lazaric, M Ghavamzadeh, R Munos, P Auer
International Conference on Algorithmic Learning Theory, 189-203, 2011
Cited by: 55 | Year: 2011
Reinforcement distribution in fuzzy Q-learning
A Bonarini, A Lazaric, F Montrone, M Restelli
Fuzzy Sets and Systems 160 (10), 1420-1443, 2009
Cited by: 55 | Year: 2009
A truthful learning mechanism for contextual multi-slot sponsored search auctions with externalities
N Gatti, A Lazaric, F Trovò
Proceedings of the 13th ACM Conference on Electronic Commerce, 605-622, 2012
Cited by: 53 | Year: 2012
Sequential transfer in multi-armed bandit with finite set of models
A Lazaric, E Brunskill
Advances in Neural Information Processing Systems, 2220-2228, 2013
Cited by: 51 | Year: 2013
Finite-sample analysis of Lasso-TD
M Ghavamzadeh, A Lazaric, R Munos, M Hoffman
Cited by: 50 | Year: 2011
Knowledge transfer in reinforcement learning
A Lazaric
PhD thesis, Politecnico di Milano, 2008
Cited by: 50 | Year: 2008