PARTANS: An autotuning framework for stencil computation on multi-GPU systems T Lutz, C Fensch, M Cole ACM Transactions on Architecture and Code Optimization (TACO) 9 (4), 1-24, 2013 | 102 | 2013 |
Performance portable GPU code generation for matrix multiplication T Remmelg, T Lutz, M Steuwer, C Dubach Proceedings of the 9th Annual Workshop on General Purpose Processing using …, 2016 | 41 | 2016 |
Dynamic compiler parallelism techniques V Grover, T Lutz US Patent 10,152,312, 2018 | 18 | 2018 |
LambdaJIT: a dynamic compiler for heterogeneous optimizations of STL algorithms T Lutz, V Grover Proceedings of the 3rd ACM SIGPLAN Workshop on Functional High-performance …, 2014 | 18 | 2014 |
Helium: a transparent inter-kernel optimizer for OpenCL T Lutz, C Fensch, M Cole Proceedings of the 8th Workshop on General Purpose Processing using GPUs, 70-80, 2015 | 16 | 2015 |
Partial program specialization at runtime V Grover, T Lutz US Patent 9,952,843, 2018 | 7 | 2018 |
Just in time compilation using link time optimization M Murphy, SG Dsouza, S Nagori, T Lutz US Patent App. 18/637,355, 2024 | | 2024 |
Just in time compilation using link time optimization M Murphy, SG Dsouza, S Nagori, T Lutz US Patent 11,972,281, 2024 | | 2024 |
Dynamic compiler parallelism techniques V Grover, T Lutz US Patent App. 16/215,508, 2019 | | 2019 |
Enhancing productivity and performance portability of OpenCL applications on heterogeneous systems using runtime optimizations T Lutz The University of Edinburgh, 2015 | | 2015 |