Jonas Kohler
Sub-sampled cubic regularization for non-convex optimization
JM Kohler, A Lucchi
International Conference on Machine Learning (ICML) 2017, 2017
Escaping Saddles with Stochastic Gradients
H Daneshmand, J Kohler, A Lucchi, T Hofmann
International Conference on Machine Learning (ICML) 2018, 2018
Exponential convergence rates for Batch Normalization: The power of length-direction decoupling in non-convex optimization
J Kohler, H Daneshmand, A Lucchi, M Zhou, K Neymeyr, T Hofmann
International Conference on Artificial Intelligence and Statistics (AISTATS …, 2019
A Stochastic Tensor Method for Non-convex Optimization
A Lucchi, J Kohler
arXiv preprint arXiv:1911.10367, 2019
The Role of Memory in Stochastic Optimization
A Orvieto, J Kohler, A Lucchi
Conference on Uncertainty in Artificial Intelligence (UAI) 2019, 2019
Adaptive norms for deep learning with regularised Newton methods
J Kohler, L Adolphs, A Lucchi
NeurIPS 2019 Workshop: Beyond First-Order Optimization Methods in Machine …, 2019
Batch normalization provably avoids ranks collapse for randomly initialised deep networks
H Daneshmand, J Kohler, F Bach, T Hofmann, A Lucchi
Neural Information Processing Systems (NeurIPS 2020), 2020
Two-Level K-FAC Preconditioning for Deep Learning
N Tselepidis, J Kohler, A Orvieto
NeurIPS 2020 Workshop on Optimization for Machine Learning (OPT2020), 2020