Ching-Hsiang Chu
Research Scientist, Facebook
Verified email at fb.com
Title
Cited by
Year
Optimized broadcast for deep learning workloads on dense-GPU InfiniBand clusters: MPI or NCCL?
AA Awan, CH Chu, H Subramoni, DK Panda
Proceedings of the 25th European MPI Users' Group Meeting, 1-9, 2018
Cited by: 28; Year: 2018
Improving SCTP performance by jitter-based congestion control over wired-wireless networks
JM Chen, CH Chu, EHK Wu, MF Tsai, JR Wang
EURASIP Journal on Wireless Communications and Networking 2011, 1-13, 2011
Cited by: 20; Year: 2011
Scalable distributed DNN training using TensorFlow and CUDA-aware MPI: Characterization, designs, and performance evaluation
AA Awan, J Bédorf, CH Chu, H Subramoni, DK Panda
2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid …, 2019
Cited by: 17; Year: 2019
A collision-aware backoff mechanism for IEEE 802.11 WLANs
YC Chan, MC Liao, CH Chu
2009 IEEE International Conference on Intelligent Computing and Intelligent …, 2009
Cited by: 17; Year: 2009
OC-DNN: Exploiting advanced unified memory capabilities in CUDA 9 and Volta GPUs for out-of-core DNN training
AA Awan, CH Chu, H Subramoni, X Lu, DK Panda
2018 IEEE 25th International Conference on High Performance Computing (HiPC …, 2018
Cited by: 16; Year: 2018
CUDA kernel based collective reduction operations on large-scale GPU clusters
CH Chu, K Hamidouche, A Venkatesh, AA Awan, DK Panda
2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid …, 2016
Cited by: 16; Year: 2016
Exploiting GPUDirect RDMA in designing high performance OpenSHMEM for NVIDIA GPU clusters
K Hamidouche, A Venkatesh, AA Awan, H Subramoni, CH Chu, ...
2015 IEEE International Conference on Cluster Computing, 78-87, 2015
Cited by: 13; Year: 2015
IVC: Imperceptible video communication
R Carvalho, CH Chu, LJ Chen
Proc. of HotMobile (poster), 2014
Cited by: 13; Year: 2014
Efficient and scalable multi-source streaming broadcast on GPU clusters for deep learning
CH Chu, X Lu, AA Awan, H Subramoni, J Hashmi, B Elton, DK Panda
2017 46th International Conference on Parallel Processing (ICPP), 161-170, 2017
Cited by: 12; Year: 2017
Distributed topology control for energy-efficient and reliable wireless communications
MT Sun, CH Chu, EHK Wu, CS Hsiao, AAK Jeng
IEEE Systems Journal 12 (3), 2152-2161, 2017
Cited by: 12; Year: 2017
A novel congestion control mechanism on TFRC for streaming applications over wired-wireless networks
YC Huang, CH Chu, EHK Wu
Proceedings of the 6th ACM workshop on Wireless multimedia networking and …, 2011
Cited by: 9; Year: 2011
Characterizing CUDA unified memory (UM)-aware MPI designs on modern GPU architectures
KV Manian, AA Ammar, A Ruhela, CH Chu, H Subramoni, DK Panda
Proceedings of the 12th Workshop on General Purpose Processing Using GPUs, 43-52, 2019
Cited by: 7; Year: 2019
Exploiting hardware multicast and GPUDirect RDMA for efficient broadcast
CH Chu, X Lu, AA Awan, H Subramoni, B Elton, DK Panda
IEEE Transactions on Parallel and Distributed Systems 30 (3), 575-588, 2018
Cited by: 7; Year: 2018
CUDA-Aware OpenSHMEM: Extensions and Designs for High Performance OpenSHMEM on GPU Clusters
K Hamidouche, A Venkatesh, AA Awan, H Subramoni, CH Chu, ...
Parallel Computing 58, 27-36, 2016
Cited by: 7; Year: 2016
Exploiting maximal overlap for non-contiguous data movement processing on modern GPU-enabled systems
CH Chu, K Hamidouche, A Venkatesh, DS Banerjee, H Subramoni, ...
2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2016
Cited by: 7; Year: 2016
Performance evaluation of MPI libraries on GPU-enabled OpenPOWER architectures: Early experiences
KS Khorassani, CH Chu, H Subramoni, DK Panda
International Conference on High Performance Computing, 361-378, 2019
Cited by: 6; Year: 2019
Designing high performance heterogeneous broadcast for streaming applications on GPU clusters
CH Chu, K Hamidouche, H Subramoni, A Venkatesh, B Elton, DK Panda
2016 28th International Symposium on Computer Architecture and High …, 2016
Cited by: 5; Year: 2016
Efficient articulation point collaborative exploration for reliable communications in wireless sensor networks
MT Sun, CH Chu, EHK Wu, CS Hsiao
IEEE Sensors Journal 16 (23), 8578-8588, 2016
Cited by: 5; Year: 2016
Designing a Profiling and Visualization Tool for Scalable and In-Depth Analysis of High-Performance GPU Clusters
P Kousha, B Ramesh, KK Suresh, CH Chu, A Jain, N Sarkauskas, ...
2019 IEEE 26th International Conference on High Performance Computing, Data …, 2019
Cited by: 4; Year: 2019
Optimized large-message broadcast for deep learning workloads: MPI, MPI+ NCCL, or NCCL2?
AA Awan, KV Manian, CH Chu, H Subramoni, DK Panda
Parallel Computing 85, 141-152, 2019
Cited by: 4; Year: 2019