Data partitioning on multicore and multi-GPU platforms using functional performance models

Z Zhong, V Rychkov… - IEEE Transactions on …, 2014 - ieeexplore.ieee.org
Heterogeneous multiprocessor systems, which are composed of a mix of processing
elements, such as commodity multicore processors, graphics processing units (GPUs), and
others, have been widely used in the scientific computing community. Software applications …

Data partitioning on heterogeneous multicore and multi-GPU systems using functional performance models of data-parallel applications

Z Zhong, V Rychkov… - 2012 IEEE International …, 2012 - ieeexplore.ieee.org
Transition to hybrid CPU/GPU platforms in high performance computing is challenging in the
aspect of efficient utilisation of the heterogeneous hardware and existing optimised software.
During recent years, scientific software has been ported to multicore and GPU architectures …

New model-based methods and algorithms for performance and energy optimization of data parallel applications on homogeneous multicore clusters

A Lastovetsky, RR Manumachu - IEEE Transactions on Parallel …, 2016 - ieeexplore.ieee.org
Modern homogeneous parallel platforms are composed of tightly integrated multicore CPUs.
This tight integration has resulted in the cores contending for various shared on-chip
resources such as Last Level Cache (LLC) and interconnect, leading to resource contention …

Model-based optimization of EULAG kernel on Intel Xeon Phi through load imbalancing

A Lastovetsky, L Szustak… - IEEE Transactions on …, 2016 - ieeexplore.ieee.org
Load balancing is a widely accepted technique for performance optimization of scientific
applications on parallel architectures. Indeed, balanced applications do not waste processor
cycles on waiting at points of synchronization and data exchange, maximizing this way the …

Bi-objective optimization of data-parallel applications on homogeneous multicore clusters for performance and energy

RR Manumachu, A Lastovetsky - IEEE Transactions on …, 2017 - ieeexplore.ieee.org
Performance and energy are now the most dominant objectives for optimization on modern
parallel platforms composed of multicore CPU nodes. The existing intra-node and inter-node
optimization methods employ a large set of decision variables but do not consider problem …

A novel data-partitioning algorithm for performance optimization of data-parallel applications on heterogeneous HPC platforms

H Khaleghzadeh, RR Manumachu… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org
Modern HPC platforms have become highly heterogeneous owing to tight integration of
multicore CPUs and accelerators (such as Graphics Processing Units, Intel Xeon Phis, or
Field-Programmable Gate Arrays) empowering them to maximize the dominant objectives of …

Hierarchical partitioning algorithm for scientific computing on highly heterogeneous CPU+GPU clusters

D Clarke, A Ilic, A Lastovetsky, L Sousa - European Conference on …, 2012 - Springer
Hierarchical level of heterogeneity exists in many modern high performance clusters in the
form of heterogeneity between computing nodes, and within a node with the addition of
specialized accelerators, such as GPUs. To achieve high performance of scientific …

Rate-efficiency and straggler-robustness through partition in distributed two-sided secure matrix computation

J Kakar, S Ebadifar, A Sezgin - arXiv preprint arXiv:1810.13006, 2018 - arxiv.org
Computationally efficient matrix multiplication is a fundamental requirement in various fields,
including and particularly in data analytics. To do so, the computation task of a large-scale
matrix multiplication is typically outsourced to multiple servers. However, due to data …

FuPerMod: A framework for optimal data partitioning for parallel scientific applications on dedicated heterogeneous HPC platforms

D Clarke, Z Zhong, V Rychkov… - … Conference on Parallel …, 2013 - Springer
Optimisation of data-parallel scientific applications for modern HPC platforms is challenging
in terms of efficient use of heterogeneous hardware and software. It requires partitioning the
computations in proportion to the speeds of computing devices. Implementation of data …

FuPerMod: a software tool for the optimization of data-parallel applications on heterogeneous platforms

D Clarke, Z Zhong, V Rychkov… - The Journal of …, 2014 - Springer
Optimization of data-parallel applications for modern HPC platforms requires partitioning the
computations between the heterogeneous computing devices in proportion to their speed.
Heterogeneous data partitioning algorithms are based on computation performance models …