Leyuan Wang
University of California, Davis
Verified email at amazon.com
Title
Cited by
Year
TVM: An automated end-to-end optimizing compiler for deep learning
T Chen, T Moreau, Z Jiang, L Zheng, E Yan, H Shen, M Cowan, L Wang, ...
13th USENIX Symposium on Operating Systems Design and Implementation …, 2018
Cited by: 503; Year: 2018
TVM: end-to-end optimization stack for deep learning
T Chen, T Moreau, Z Jiang, H Shen, EQ Yan, L Wang, Y Hu, L Ceze, ...
arXiv preprint arXiv:1802.04799 11, 20, 2018
Cited by: 151; Year: 2018
Gunrock: GPU graph analytics
Y Wang, Y Pan, A Davidson, Y Wu, C Yang, L Wang, M Osama, C Yuan, ...
ACM Transactions on Parallel Computing (TOPC) 4 (1), 1-49, 2017
Cited by: 75; Year: 2017
A comparative study on exact triangle counting algorithms on the GPU
L Wang, Y Wang, C Yang, JD Owens
Proceedings of the ACM Workshop on High Performance Graph Processing, 1-8, 2016
Cited by: 43; Year: 2016
A unified optimization approach for CNN model inference on integrated GPUs
L Wang, Z Chen, Y Liu, Y Wang, L Zheng, M Li, Y Wang
Proceedings of the 48th International Conference on Parallel Processing, 1-10, 2019
Cited by: 22; Year: 2019
Fast parallel suffix array on the GPU
L Wang, S Baxter, JD Owens
European Conference on Parallel Processing, 573-587, 2015
Cited by: 18; Year: 2015
Fast parallel skew and prefix‐doubling suffix array construction on the GPU
L Wang, S Baxter, JD Owens
Concurrency and Computation: Practice and Experience 28 (12), 3466-3484, 2016
Cited by: 13; Year: 2016
Plink: Discovering and exploiting locality for accelerated distributed training on the public cloud
L Luo, P West, J Nelson, A Krishnamurthy, L Ceze
Proceedings of Machine Learning and Systems 2, 82-97, 2020
Cited by: 7; Year: 2020
HAWQ-V3: Dyadic Neural Network Quantization
Z Yao, Z Dong, Z Zheng, A Gholami, J Yu, E Tan, L Wang, Q Huang, ...
International Conference on Machine Learning, 11875-11886, 2021
Cited by: 6; Year: 2021
Optimal message scheduling for aggregation
L Wang, M Li, E Liberty, AJ Smola
NETWORKS 2 (3), 2-3, 2018
Cited by: 5; Year: 2018
Fast BFS-based triangle counting on GPUs
L Wang, JD Owens
2019 IEEE High Performance Extreme Computing Conference (HPEC), 1-6, 2019
Cited by: 4; Year: 2019
Fast parallel subgraph matching on the GPU
L Wang, Y Wang, JD Owens
HPDC, 2016
Cited by: 4; Year: 2016
UNIT: Unifying Tensorized Instruction Compilation
J Weng, A Jain, J Wang, L Wang, Y Wang, T Nowatzki
2021 IEEE/ACM International Symposium on Code Generation and Optimization …, 2021
Year: 2021
Articles 1–13