Demystifying parallel and distributed deep learning: An in-depth concurrency analysis
T Ben-Nun, T Hoefler
ACM Computing Surveys (CSUR) 52 (4), 1-43, 2019
A package for OpenCL based heterogeneous computing on clusters with many GPU devices
A Barak, T Ben-Nun, E Levy, A Shiloh
2010 IEEE International Conference on Cluster Computing Workshops and …, 2010
Groute: An asynchronous multi-GPU programming model for irregular computations
T Ben-Nun, M Sutton, S Pai, K Pingali
ACM SIGPLAN Notices 52 (8), 235-248, 2017
Solution X-ray scattering form factors of supramolecular self-assembled structures
P Székely, A Ginsburg, T Ben-Nun, U Raviv
Langmuir 26 (16), 13110-13129, 2010
X+: a comprehensive computationally accelerated structure analysis tool for solution X-ray scattering from supramolecular self-assemblies
T Ben-Nun, A Ginsburg, P Székely, U Raviv
Journal of Applied Crystallography 43 (6), 1522-1531, 2010
Neural Code Comprehension: A Learnable Representation of Code Semantics
T Ben-Nun, AS Jakobovits, T Hoefler
Advances in Neural Information Processing Systems 31, 2018
Memory access patterns: the missing piece of the multi-GPU puzzle
T Ben-Nun, E Levy, A Barak, E Rubin
SC'15: Proceedings of the International Conference for High Performance …, 2015
A Modular Benchmarking Infrastructure for High-Performance and Reproducible Deep Learning
T Ben-Nun, M Besta, S Huber, AN Ziogas, D Peter, T Hoefler
The 33rd IEEE International Parallel & Distributed Processing Symposium …, 2019
MAPS: Optimizing Massively Parallel Applications using Device-Level Memory Abstraction
E Rubin, E Levy, A Barak, T Ben-Nun
ACM Transactions on Architecture and Code Optimization (TACO) 11 (4), 44, 2015
Augment Your Batch: Improving Generalization Through Instance Repetition
E Hoffer, T Ben-Nun, I Hubara, N Giladi, T Hoefler, D Soudry
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2020
Graph processing on FPGAs: Taxonomy, survey, challenges
M Besta, D Stanojevic, JDF Licht, T Ben-Nun, T Hoefler
arXiv preprint arXiv:1903.06697, 2019
Reciprocal grids: a hierarchical algorithm for computing solution X-ray scattering curves from supramolecular complexes at high resolution
A Ginsburg, T Ben-Nun, R Asor, A Shemesh, I Ringel, U Raviv
Journal of Chemical Information and Modeling 56 (8), 1518-1527, 2016
Optimizing Parallel Graph Connectivity Computation via Subgraph Sampling
M Sutton, T Ben-Nun, A Barak
IEEE International Parallel and Distributed Processing Symposium, 2018
A global scheduling framework for virtualization environments
Y Etsion, T Ben-Nun, DG Feitelson
2009 IEEE International Symposium on Parallel & Distributed Processing, 1-8, 2009
Stateful Dataflow Multigraphs: A data-centric model for performance portability on heterogeneous architectures
T Ben-Nun, J de Fine Licht, AN Ziogas, T Schneider, T Hoefler
Proceedings of the International Conference for High Performance Computing …, 2019
Accelerating Deep Learning Frameworks with Micro-batches
Y Oyama, T Ben-Nun, T Hoefler, S Matsuoka
Substream-Centric Maximum Matchings on FPGA
M Besta, M Fischer, T Ben-Nun, J de Fine Licht, T Hoefler
27th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays …, 2018
Solution X-ray Scattering Form-Factors with Arbitrary Electron Density Profiles and Polydispersity Distributions
T Ben-Nun, R Asor, A Ginsburg, U Raviv
Israel Journal of Chemistry 56 (8), 622-628, 2016
Big data causing big (TLB) problems: taming random memory accesses on the GPU
T Karnagel, T Ben-Nun, M Werner, D Habich, W Lehner
Proceedings of the 13th International Workshop on Data Management on New …, 2017
FFMK: A fast and fault-tolerant microkernel-based system for exascale computing
C Weinhold, A Lackorzynski, J Bierbaum, M Küttler, M Planeta, H Härtig, ...
Software for Exascale Computing-SPPEXA 2013-2015, 405-426, 2016
