Jianyu Huang
Meta Platforms, Inc.
Verified email at meta.com
Title
Cited by
Year
Deep Learning Recommendation Model for Personalization and Recommendation Systems
M Naumov, D Mudigere, HJM Shi, J Huang, N Sundaraman, J Park, ...
arXiv preprint arXiv:1906.00091, 2019
Cited by: 604 (2019)
A Study of BFLOAT16 for Deep Learning Training
D Kalamkar, D Mudigere, N Mellempudi, D Das, K Banerjee, S Avancha, ...
arXiv preprint arXiv:1905.12322, 2019
Cited by: 290 (2019)
Strassen's algorithm reloaded
J Huang, TM Smith, GM Henry, RA van de Geijn
High Performance Computing, Networking, Storage and Analysis, SC16 …, 2016
Cited by: 81 (2016)
Software-hardware co-design for fast and scalable training of deep learning recommendation models
D Mudigere, Y Hao, J Huang, Z Jia, A Tulloch, S Sridharan, X Liu, ...
Proceedings of the 49th Annual International Symposium on Computer …, 2022
Cited by: 69 (2022)
Performance optimization for the k-nearest neighbors kernel on x86 architectures
CD Yu, J Huang, W Austin, B Xiao, G Biros
Proceedings of the International Conference for High Performance Computing …, 2015
Cited by: 42 (2015)
Mahmoud Khorashadi, Pallab Bhattacharya, Petr Lapukhov, Maxim Naumov, Ajit Mathews, Lin Qiao, Mikhail Smelyanskiy, Bill Jia, and Vijay Rao. 2021. Software-Hardware Co-design …
D Mudigere, Y Hao, J Huang, Z Jia, A Tulloch, S Sridharan, X Liu, ...
arXiv preprint arXiv:2104.05158, 2022
Cited by: 36* (2022)
FBGEMM: Enabling High-Performance Low-Precision Deep Learning Inference
D Khudia, J Huang, P Basu, S Deng, H Liu, J Park, M Smelyanskiy
arXiv preprint arXiv:2101.05615, 2021
Cited by: 36
Generating families of practical fast matrix multiplication algorithms
J Huang, L Rice, DA Matthews, RA van de Geijn
2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2017
Cited by: 34 (2017)
Deep Learning Recommendation Model for Personalization and Recommendation Systems. CoRR abs/1906.00091 (2019)
M Naumov, D Mudigere, HJM Shi, J Huang, N Sundaraman, J Park, ...
arXiv preprint arXiv:1906.00091, 2019
Cited by: 33* (2019)
High-performance, Distributed Training of Large-scale Deep Learning Recommendation Models
D Mudigere, Y Hao, J Huang, A Tulloch, S Sridharan, X Liu, M Ozdal, ...
arXiv preprint arXiv:2104.05158, 2021
Cited by: 30 (2021)
Mixed-Precision Embedding Using a Cache
JA Yang, J Huang, J Park, PTP Tang, A Tulloch
arXiv preprint arXiv:2010.11305, 2020
Cited by: 22 (2020)
Strassen's Algorithm for Tensor Contraction
J Huang, DA Matthews, RA van de Geijn
SIAM Journal on Scientific Computing 40 (3), C305-C326, 2018
Cited by: 21 (2018)
Implementing Strassen's Algorithm with CUTLASS on NVIDIA Volta GPUs
J Huang, CD Yu, RA van de Geijn
arXiv preprint arXiv:1808.07984, 2018
Cited by: 20 (2018)
Strassen’s Algorithm Reloaded on GPUs
J Huang, CD Yu, RA van de Geijn
ACM Transactions on Mathematical Software (TOMS) 46 (1), 1-22, 2020
Cited by: 19 (2020)
BLISlab: A Sandbox for Optimizing GEMM
J Huang, RA van de Geijn
arXiv preprint arXiv:1609.00076, 2016
Cited by: 15 (2016)
Efficient soft-error detection for low-precision deep learning recommendation models
S Li, J Huang, PTP Tang, D Khudia, J Park, HD Dixit, Z Chen
2022 IEEE International Conference on Big Data (Big Data), 1556-1563, 2022
Cited by: 12 (2022)
A study of BFLOAT16 for deep learning training (2019)
D Kalamkar, D Mudigere, N Mellempudi, D Das, K Banerjee, S Avancha, ...
arXiv preprint arXiv:1905.12322, 2019
Cited by: 12 (2019)
Low-precision hardware architectures meet recommendation model inference at scale
Z Deng, J Park, PTP Tang, H Liu, J Yang, H Yuen, J Huang, D Khudia, ...
IEEE Micro 41 (5), 93-100, 2021
Cited by: 10 (2021)
Implementing Strassen’s Algorithm with BLIS
J Huang, TM Smith, GM Henry, RA van de Geijn
arXiv preprint arXiv:1605.01078, 2016
Cited by: 10* (2016)
Practical fast matrix multiplication algorithms
J Huang
The University of Texas at Austin, 2018
Cited by: 7 (2018)
Articles 1–20