Csaba Szepesvari

Cited by

	All	Since 2019
Citations	33306	21688
h-index	78	70
i10-index	245	192

Citations per year
Year	Citations
2003	114
2004	96
2005	129
2006	95
2007	216
2008	319
2009	383
2010	521
2011	769
2012	843
2013	927
2014	1102
2015	1146
2016	1361
2017	1315
2018	1742
2019	2413
2020	3393
2021	4270
2022	4705
2023	4884
2024	1902

Public access

68 articles available
0 articles not available

Based on funding mandates

Co-authors

Tor Lattimore	DeepMind	Verified email at google.com
Rémi Munos	DeepMind	Verified email at inria.fr
Yasin Abbasi Yadkori	DeepMind	Verified email at google.com
Branislav Kveton	Amazon	Verified email at amazon.com
Dale Schuurmans	University of Alberta, Google DeepMind	Verified email at cs.ualberta.ca
Kocsis Levente	MTA SZTAKI	Verified email at sztaki.hu
Richard S. Sutton	Keen, Amii, and University of Alberta	Verified email at richsutton.com
Dávid Pál	Staff Machine Learning Engineer, Instacart	Verified email at instacart.com
Mohammad Ghavamzadeh	Amazon	Verified email at amazon.com
András Antos	Budapest University of Technology and Economics	Verified email at cs.bme.hu
Amir-massoud Farahmand	University of Toronto	Verified email at cs.toronto.edu
Zheng Wen	Google DeepMind	Verified email at google.com
Shalabh Bhatnagar	Professor in the Department of Computer Science and Automation, Indian Institute of Science	Verified email at iisc.ac.in
Lorincz, Andras	Eotvos Lorand University	Verified email at inf.elte.hu
Hamid Maei	Netflix	Verified email at netflix.com
Mengdi Wang	Center for Statistics & Machine Learning, ECE, Princeton University	Verified email at princeton.edu
Nevena Lazic	DeepMind	Verified email at google.com
Michael Littman	Brown University	Verified email at brown.edu
Jincheng Mei	Research Scientist, Google Brain	Verified email at google.com
Doina Precup	DeepMind and McGill University	Verified email at cs.mcgill.ca

Csaba Szepesvari

DeepMind & University of Alberta

Verified email at cs.ualberta.ca

machine learning, learning theory, online learning, reinforcement learning, Markov Decision Processes


Title	Cited by	Year
Bandit based monte-carlo planning. L Kocsis, C Szepesvári. European conference on machine learning, 282-293, 2006	4169	2006
Bandit algorithms. T Lattimore, C Szepesvári. Cambridge University Press, 2020	2608	2020
Algorithms for Reinforcement Learning. C Szepesvari. Morgan and Claypool, 2010	2096*	2010
Improved algorithms for linear stochastic bandits. Y Abbasi-Yadkori, C Szepesvári, D Pál. Advances in Neural Information Processing Systems, 2312-2320, 2011	1879	2011
Convergence results for single-step on-policy reinforcement-learning algorithms. S Singh, T Jaakkola, ML Littman, C Szepesvári. Machine learning 38, 287-308, 2000	987	2000
Exploration–exploitation tradeoff using variance estimates in multi-armed bandits. JY Audibert, R Munos, C Szepesvári. Theoretical Computer Science 410 (19), 1876-1902, 2009	761	2009
Fast gradient-descent methods for temporal-difference learning with linear function approximation. RS Sutton, HR Maei, D Precup, S Bhatnagar, D Silver, C Szepesvári, ... Proceedings of the 26th annual international conference on machine learning …, 2009	698	2009
Finite-Time Bounds for Fitted Value Iteration. R Munos, C Szepesvári. Journal of Machine Learning Research 9 (5), 2008	612	2008
Parametric bandits: The generalized linear case. S Filippi, O Cappe, A Garivier, C Szepesvári. Advances in neural information processing systems 23, 2010	522	2010
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. A Antos, C Szepesvári, R Munos. Machine Learning 71, 89-129, 2008	490	2008
X-Armed Bandits. S Bubeck, R Munos, G Stoltz, C Szepesvári. Journal of Machine Learning Research 12 (5), 2011	489	2011
Learning with a strong adversary. R Huang, B Xu, D Schuurmans, C Szepesvári. arXiv preprint arXiv:1511.03034, 2015	430	2015
Regret bounds for the adaptive control of linear quadratic systems. Y Abbasi-Yadkori, C Szepesvári. Proceedings of the 24th Annual Conference on Learning Theory, 1-26, 2011	410	2011
A generalized reinforcement-learning model: Convergence and applications. ML Littman, C Szepesvári. ICML 96, 310-318, 1996	344	1996
Toward off-policy learning control with function approximation. HR Maei, C Szepesvári, S Bhatnagar, RS Sutton. ICML 10, 719-726, 2010	332	2010
Convergent temporal-difference learning with arbitrary smooth function approximation. H Maei, C Szepesvari, S Bhatnagar, D Precup, D Silver, RS Sutton. Advances in neural information processing systems 22, 2009	329	2009
Apprenticeship learning using inverse reinforcement learning and gradient methods. G Neu, C Szepesvári. arXiv preprint arXiv:1206.5264, 2012	317	2012
The grand challenge of computer Go: Monte Carlo tree search and extensions. S Gelly, L Kocsis, M Schoenauer, M Sebag, D Silver, C Szepesvári, ... Communications of the ACM 55 (3), 106-113, 2012	315	2012
Multi-criteria reinforcement learning. Z Gábor, Z Kalmár, C Szepesvári. ICML 98, 197-205, 1998	309	1998
Cascading bandits: Learning to rank in the cascade model. B Kveton, C Szepesvari, Z Wen, A Ashkan. International conference on machine learning, 767-776, 2015	307	2015
