Keith D Underwood
HPC Interconnect Architect, HPE
FPGAs vs. CPUs: trends in peak floating-point performance
K Underwood
Proceedings of the 2004 ACM/SIGDA 12th international symposium on Field …, 2004
Closing the gap: CPU and FPGA trends in sustainable floating-point BLAS performance
KD Underwood, KS Hemmert
12th Annual IEEE Symposium on Field-Programmable Custom Computing Machines …, 2004
Intel® Omni-path Architecture: Enabling Scalable, High Performance Fabrics
MS Birrittella, M Debbage, R Huggahalli, J Kunz, T Lovett, T Rimmer, ...
2015 IEEE 23rd Annual Symposium on High-Performance Interconnects, 1-9, 2015
A re-evaluation of the practicality of floating-point operations on FPGAs
WB Ligon III, S McMillan, G Monn, K Schoonover, F Stivers, ...
FPGAs for Custom Computing Machines, 1998. Proceedings. IEEE Symposium on …, 1998
SeaStar interconnect: Balanced bandwidth for scalable performance
R Brightwell, KT Pedretti, KD Underwood, T Hudson
IEEE Micro 26 (3), 41-57, 2006
Remote memory access programming in MPI-3
T Hoefler, J Dinan, R Thakur, B Barrett, P Balaji, W Gropp, K Underwood
ACM Transactions on Parallel Computing (TOPC) 2 (2), 1-26, 2015
Embedded floating-point units in FPGAs
MJ Beauchamp, S Hauck, KD Underwood, KS Hemmert
Proceedings of the 2006 ACM/SIGDA 14th international symposium on Field …, 2006
A hardware acceleration unit for MPI queue processing
KD Underwood, KS Hemmert, A Rodrigues, R Murphy, R Brightwell
Parallel and Distributed Processing Symposium, 2005. Proceedings. 19th IEEE …, 2005
An analysis of the double-precision floating-point FFT on FPGAs
KS Hemmert, KD Underwood
13th Annual IEEE Symposium on Field-Programmable Custom Computing Machines …, 2005
RC-BLAST: Towards a portable, cost-effective open source hardware implementation
K Muriki, KD Underwood, R Sass
19th IEEE International Parallel and Distributed Processing Symposium, 8 pp., 2005
The impact of MPI queue usage on message latency
KD Underwood, R Brightwell
Parallel Processing, 2004. ICPP 2004. International Conference on, 152-160, 2004
A comparison of floating point and logarithmic number systems for FPGAs
M Haselman, M Beauchamp, A Wood, S Hauck, K Underwood, ...
13th Annual IEEE Symposium on Field-Programmable Custom Computing Machines …, 2005
An analysis of NIC resource usage for offloading MPI
R Brightwell, KD Underwood
Parallel and Distributed Processing Symposium, 2004. Proceedings. 18th …, 2004
The Portals 4.0 network programming interface
BW Barrett, R Brightwell, S Hemmert, K Pedretti, K Wheeler, K Underwood, ...
Sandia National Laboratories, 2012
Architectural modifications to enhance the floating-point performance of FPGAs
MJ Beauchamp, S Hauck, KD Underwood, KS Hemmert
IEEE Transactions on Very Large Scale Integration (VLSI) Systems 16 (2), 177-187, 2008
An analysis of the impact of MPI overlap and independent progress
R Brightwell, KD Underwood
Proceedings of the 18th annual international conference on Supercomputing …, 2004
Initial performance evaluation of the Cray SeaStar interconnect
R Brightwell, K Pedretti, KD Underwood
13th Symposium on High Performance Interconnects (HOTI'05), 51-57, 2005
Analyzing the impact of overlap, offload, and independent progress for message passing interface applications
R Brightwell, R Riesen, KD Underwood
The International Journal of High Performance Computing Applications 19 (2 …, 2005
Mitigating MPI message matching misery
M Flajslik, J Dinan, KD Underwood
High Performance Computing: 31st International Conference, ISC High …, 2016
Simulating Red Storm: Challenges and Successes in Building a System Simulation
KD Underwood, M Levenhagen, A Rodrigues
2007 IEEE International Parallel and Distributed Processing Symposium, 45-45, 2007