Dheevatsa Mudigere
Distinguished Engineer, NVIDIA
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
NS Keskar, D Mudigere, J Nocedal, M Smelyanskiy, PTP Tang
International Conference on Learning Representations (ICLR), 2017, 2016
Deep learning recommendation model for personalization and recommendation systems
M Naumov, D Mudigere, HJM Shi, J Huang, N Sundaraman, J Park, ...
arXiv preprint arXiv:1906.00091, 2019
A study of BFLOAT16 for deep learning training
D Kalamkar, D Mudigere, N Mellempudi, D Das, K Banerjee, S Avancha, ...
arXiv preprint arXiv:1905.12322, 2019
The architectural implications of Facebook's DNN-based personalized recommendation
U Gupta, CJ Wu, X Wang, M Naumov, B Reagen, D Brooks, B Cottel, ...
2020 IEEE International Symposium on High Performance Computer Architecture …, 2020
Distributed deep learning using synchronous stochastic gradient descent
D Das, S Avancha, D Mudigere, K Vaidynathan, S Sridharan, D Kalamkar, ...
arXiv preprint arXiv:1602.06709, 2016
Mixed Precision Training of Convolutional Neural Networks using Integer Operations
D Das, N Mellempudi, D Mudigere, D Kalamkar, S Avancha, K Banerjee, ...
International Conference on Learning Representations (ICLR), 2018, 2018
RecNMP: Accelerating personalized recommendation with near-memory processing
L Ke, U Gupta, BY Cho, D Brooks, V Chandra, U Diril, A Firoozshahian, ...
2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture …, 2020
Vector evaluated particle swarm optimization (VEPSO) for multi-objective design optimization of composite structures
SN Omkar, D Mudigere, GN Naik, S Gopalakrishnan
Computers & structures 86 (1-2), 1-14, 2008
A progressive batching L-BFGS method for machine learning
R Bollapragada, J Nocedal, D Mudigere, HJ Shi, PTP Tang
International Conference on Machine Learning, 620-629, 2018
Ternary neural networks with fine-grained quantization
N Mellempudi, A Kundu, D Mudigere, D Das, B Kaul, P Dubey
arXiv preprint arXiv:1705.01462, 2017
Compositional embeddings using complementary partitions for memory-efficient recommendation systems
HJM Shi, D Mudigere, M Naumov, J Yang
Proceedings of the 26th ACM SIGKDD International Conference on Knowledge …, 2020
Software-hardware co-design for fast and scalable training of deep learning recommendation models
D Mudigere, Y Hao, J Huang, Z Jia, A Tulloch, S Sridharan, X Liu, ...
Proceedings of the 49th Annual International Symposium on Computer …, 2022
Mixed dimension embeddings with application to memory-efficient recommendation systems
AA Ginart, M Naumov, D Mudigere, J Yang, J Zou
2021 IEEE International Symposium on Information Theory (ISIT), 2786-2791, 2021
Machine learning accelerator mechanism
A Bleiweiss, A Ramesh, A Mishra, D Marr, J Cook, S Sridharan, ...
US Patent 11,373,088, 2022
Deep learning training in facebook data centers: Design of scale-up and scale-out systems
M Naumov, J Kim, D Mudigere, S Sridharan, X Wang, W Zhao, S Yilmaz, ...
arXiv preprint arXiv:2003.09518, 2020
Crop classification using biologically-inspired techniques with high resolution satellite image
Journal of the Indian Society of Remote Sensing 36 (2), 175-182, 2008
Fine-grain compute communication execution for deep learning frameworks
S Sridharan, D Mudigere
US Patent App. 15/869,502, 2018
Unity: Accelerating DNN training through joint optimization of algebraic transformations and parallelization
C Unger, Z Jia, W Wu, S Lin, M Baines, CEQ Narvaez, V Ramakrishnaiah, ...
16th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2022
Dynamic precision management for integer deep learning primitives
N Mellempudi, D Mudigere, D Das, S Sridharan
US Patent 10,643,297, 2020
Performance optimizations for scalable implicit RANS calculations with SU2
TD Economon, D Mudigere, G Bansal, A Heinecke, F Palacios, J Park, ...
Computers & Fluids 129, 146-158, 2016