Shigang Li
Postdoc, ETH Zurich, Department of Computer Science, SPCL
Verified email at inf.ethz.ch
Title
Cited by
Year
NUMA-aware shared-memory collective communication for MPI
S Li, T Hoefler, M Snir
Proceedings of the 22nd international symposium on High-performance parallel …, 2013
Cited by: 84, Year: 2013
Automatic tuning of sparse matrix-vector multiplication on multicore clusters
SG Li, CJ Hu, JC Zhang, YQ Zhang
Science China Information Sciences 58 (9), 1-14, 2015
Cited by: 79, Year: 2015
Parallel processing systems for big data: a survey
Y Zhang, T Cao, S Li, X Tian, L Yuan, H Jia, AV Vasilakos
Proceedings of the IEEE 104 (11), 2114-2136, 2016
Cited by: 63, Year: 2016
Improved MPI collectives for MPI processes in shared address spaces
S Li, T Hoefler, C Hu, M Snir
Cluster Computing 17 (4), 1139-1155, 2014
Cited by: 16, Year: 2014
Cache-oblivious MPI all-to-all communications based on Morton order
S Li, Y Zhang, T Hoefler
IEEE Transactions on Parallel and Distributed Systems, 2017
Cited by: 12, Year: 2017
Kernel optimization for short-range molecular dynamics
C Hu, X Wang, J Li, X He, S Li, Y Feng, S Yang, H Bai
Computer Physics Communications, 2016
Cited by: 12, Year: 2016
Asynchronous work stealing on distributed memory systems
S Li, J Hu, X Cheng, C Zhao
2013 21st Euromicro International Conference on Parallel, Distributed, and …, 2013
Cited by: 12, Year: 2013
A Cross-Platform SpMV Framework on Many-Core Architectures
Y Zhang, S Li, S Yan, H Zhou
ACM Transactions on Architecture and Code Optimization (TACO) 13 (4), 33, 2016
Cited by: 10, Year: 2016
Hybrid-optimization strategy for the communication of large-scale Kinetic Monte Carlo simulation
B Wu, S Li, Y Zhang, N Nie
Computer Physics Communications, 2016
Cited by: 10, Year: 2016
Efficient parallel optimizations of a high-performance SIFT on GPUs
Z Li, H Jia, Y Zhang, S Liu, S Li, X Wang, H Zhang
Journal of Parallel and Distributed Computing, 2018
Cited by: 8, Year: 2018
Massively Scaling the Metal Microscopic Damage Simulation on Sunway TaihuLight Supercomputer
S Li, B Wu, Y Zhang, X Wang, J Li, C Hu, J Wang, Y Feng, N Nie
Proceedings of the 47th International Conference on Parallel Processing, 47, 2018
Cited by: 8, Year: 2018
Fast Convolution Operations on Many-Core Architectures
S Li, Y Zhang, C Xiang, L Shi
High Performance Computing and Communications (HPCC), 2015 IEEE 7th …, 2015
Cited by: 8, Year: 2015
Analyzing MPI-3.0 Process-Level Shared Memory: A Case Study with Stencil Computations
X Zhu, J Zhang, K Yoshii, S Li, Y Zhang, P Balaji
Cluster, Cloud and Grid Computing (CCGrid), 2015 15th IEEE/ACM International …, 2015
Cited by: 8, Year: 2015
POSTER: Cache-Oblivious MPI All-to-All Communications on Many-Core Architectures
S Li, Y Zhang, T Hoefler
Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of …, 2017
Cited by: 5, Year: 2017
Support for OpenMP tasks on cell architecture
Q Cao, C Hu, H He, X Huang, S Li
International Conference on Algorithms and Architectures for Parallel …, 2010
Cited by: 5, Year: 2010
Taming unbalanced training workloads in deep learning with partial collective operations
S Li, T Ben-Nun, SD Girolamo, D Alistarh, T Hoefler
Proceedings of the 25th ACM SIGPLAN Symposium on Principles and Practice of …, 2020
Cited by: 4, Year: 2020
Extending synchronization constructs in openMP to exploit pipeline parallelism on heterogeneous multi-core
S Li, S Yao, H He, L Sun, Y Chen, Y Peng
International Conference on Algorithms and Architectures for Parallel …, 2011
Cited by: 4, Year: 2011
Communication-Avoiding for Dynamical Core of Atmospheric General Circulation Model
J Xiao, S Li, B Wu, H Zhang, K Li, E Yao, Y Zhang, G Tan
Proceedings of the 47th International Conference on Parallel Processing, 12, 2018
Cited by: 3, Year: 2018
FastNBL: fast neighbor lists establishment for molecular dynamics simulation based on bitwise operations
K Li, S Li, S Huang, Y Chen, Y Zhang
The Journal of Supercomputing, 1-20, 2019
Cited by: 2, Year: 2019
Optimizing Parallel Kinetic Monte Carlo Simulation by Communication Aggregation and Scheduling
B Wu, S Li, Y Zhang
National Conference on Big Data Technology and Applications, 282-297, 2015
Cited by: 2, Year: 2015