Qing Yi
Title / Cited by / Year
AUGEM: automatically generate high performance dense linear algebra kernels on x86 CPUs
Q Wang, X Zhang, Y Zhang, Q Yi
SC'13: Proceedings of the International Conference on High Performance …, 2013
Cited by: 153; Year: 2013
POET: Parameterized optimizations for empirical tuning
Q Yi, K Seymour, H You, R Vuduc, D Quinlan
2007 IEEE International Parallel and Distributed Processing Symposium, 1-8, 2007
Cited by: 127; Year: 2007
Transforming loops to recursion for multi-level memory hierarchies
Q Yi, V Adve, K Kennedy
Proceedings of the ACM SIGPLAN 2000 conference on Programming language …, 2000
Cited by: 105; Year: 2000
High performance Fortran compilation techniques for parallelizing scientific codes
V Adve, G Jin, J Mellor-Crummey, Q Yi
SC'98: Proceedings of the 1998 ACM/IEEE Conference on Supercomputing, 11-11, 1998
Cited by: 102; Year: 1998
POET: a scripting language for applying parameterized source‐to‐source program transformations
Q Yi
Software: Practice and Experience 42 (6), 675-706, 2012
Cited by: 67; Year: 2012
Transforming complex loop nests for locality
Q Yi, K Kennedy, V Adve
The Journal of Supercomputing 27 (3), 219-264, 2004
Cited by: 67; Year: 2004
Understanding stencil code performance on multicore architectures
SMF Rahman, Q Yi, A Qasem
Proceedings of the 8th ACM International Conference on Computing Frontiers, 1-10, 2011
Cited by: 54; Year: 2011
Improving memory hierarchy performance through combined loop interchange and multi-level fusion
Q Yi, K Kennedy
The International Journal of High Performance Computing Applications 18 (2 …, 2004
Cited by: 54; Year: 2004
Automated empirical tuning of scientific codes for performance and power consumption
SF Rahman, J Guo, Q Yi
Proceedings of the 6th International Conference on High Performance and …, 2011
Cited by: 51; Year: 2011
Advanced optimization strategies in the Rice dHPF compiler
J Mellor‐Crummey, V Adve, B Broom, D Chavarría‐Miranda, R Fowler, ...
Concurrency and Computation: Practice and Experience 14 (8‐9), 741-767, 2002
Cited by: 43; Year: 2002
Semantic-driven parallelization of loops operating on user-defined containers
D Quinlan, M Schordan, Q Yi, BR De Supinski
International Workshop on Languages and Compilers for Parallel Computing …, 2003
Cited by: 29; Year: 2003
Applying loop optimizations to object-oriented abstractions through general classification of array semantics
Q Yi, D Quinlan
International Workshop on Languages and Compilers for Parallel Computing …, 2004
Cited by: 28; Year: 2004
A highly parallel reuse distance analysis algorithm on GPUs
H Cui, Q Yi, J Xue, L Wang, Y Yang, X Feng
2012 IEEE 26th International Parallel and Distributed Processing Symposium …, 2012
Cited by: 27; Year: 2012
Exploring the optimization space of dense linear algebra kernels
Q Yi, A Qasem
International Workshop on Languages and Compilers for Parallel Computing …, 2008
Cited by: 23; Year: 2008
Classification and utilization of abstractions for optimization
D Quinlan, M Schordan, Q Yi, A Saebjornsen
International Symposium on Leveraging Applications of Formal Methods …, 2004
Cited by: 23; Year: 2004
Automatic blocking of QR and LU factorizations for locality
Q Yi, K Kennedy, H You, K Seymour, J Dongarra
Proceedings of the 2004 workshop on Memory system performance, 12-22, 2004
Cited by: 22; Year: 2004
Effective use of non-blocking data structures in a deduplication application
SD Feldman, A Bhat, P LaBorde, Q Yi, D Dechev
Proceedings of the 2013 companion publication for conference on Systems …, 2013
Cited by: 19; Year: 2013
Studying the impact of application-level optimizations on the power consumption of multi-core architectures
SMF Rahman, J Guo, A Bhat, C Garcia, MH Sujon, Q Yi, C Liao, ...
Proceedings of the 9th conference on Computing Frontiers, 123-132, 2012
Cited by: 19; Year: 2012
Automated transformation for performance-critical kernels
Q Yi, RC Whaley
Department of Computer Science, University of Texas at San Antonio, 2007
Cited by: 19; Year: 2007
Parameterizing loop fusion for automated empirical tuning
Y Zhao, Q Yi, K Kennedy, D Quinlan, R Vuduc
Lawrence Livermore National Lab.(LLNL), Livermore, CA (United States), 2005
Cited by: 18; Year: 2005