Showing 1–2 of 2 results for author: von Kirchbach, K
-
Efficient Process-to-Node Mapping Algorithms for Stencil Computations
Authors:
Sascha Hunold,
Konrad von Kirchbach,
Markus Lehr,
Christian Schulz,
Jesper Larsson Träff
Abstract:
Good process-to-compute-node mappings can be decisive for well-performing HPC applications. A special, important class of process-to-node mapping problems is the problem of mapping processes that communicate in a sparse stencil pattern to Cartesian grids. By thoroughly exploiting the inherently present structure in this type of problem, we devise three novel distributed algorithms that are able to handle arbitrary stencil communication patterns effectively. We analyze the expected performance of our algorithms based on an abstract model of inter- and intra-node communication. An extensive experimental evaluation on several HPC machines shows that our algorithms are up to two orders of magnitude faster in running time than a (sequential) high-quality general graph mapping tool, while obtaining similar results in communication performance. Furthermore, our algorithms also achieve significantly better mapping quality compared to previous state-of-the-art Cartesian grid mapping algorithms. This results in up to a threefold performance improvement of an MPI_Neighbor_alltoall exchange operation. Our new algorithms can be used to implement the MPI_Cart_create functionality.
Submitted 20 May, 2020; v1 submitted 19 May, 2020;
originally announced May 2020.
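As a rough illustration of what "mapping quality" means in the abstract above, the following sketch counts how many stencil messages cross compute-node boundaries under two different process-to-node assignments. It is not one of the three algorithms from the paper; the grid size, the number of processes per node, and the helper names (`inter_node_edges`, `row_major`, `blocked`) are all assumptions chosen for the example.

```python
from itertools import product

def inter_node_edges(dims, stencil, node_of, periodic=False):
    """Count directed stencil edges whose two endpoints lie on different nodes."""
    count = 0
    for coord in product(*(range(d) for d in dims)):
        for offset in stencil:
            nbr = tuple(c + o for c, o in zip(coord, offset))
            if periodic:
                nbr = tuple(n % d for n, d in zip(nbr, dims))
            elif any(n < 0 or n >= d for n, d in zip(nbr, dims)):
                continue  # neighbor falls outside a non-periodic grid
            if node_of(coord) != node_of(nbr):
                count += 1
    return count

DIMS = (8, 8)       # hypothetical 8x8 Cartesian process grid
PER_NODE = 16       # hypothetical: 16 processes per compute node
STENCIL = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # 2D nearest-neighbor offsets

def row_major(coord):
    # Consecutive ranks fill a node: each node holds two full grid rows.
    rank = coord[0] * DIMS[1] + coord[1]
    return rank // PER_NODE

def blocked(coord):
    # Each node holds a compact 4x4 tile of the grid.
    return (coord[0] // 4) * (DIMS[1] // 4) + coord[1] // 4

row_major_cost = inter_node_edges(DIMS, STENCIL, row_major)
blocked_cost = inter_node_edges(DIMS, STENCIL, blocked)
print(row_major_cost, blocked_cost)
```

In this toy setting the compact tiles send fewer messages between nodes than the row-major layout, which is the kind of difference a Cartesian mapping algorithm tries to exploit.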
-
Better Process Mapping and Sparse Quadratic Assignment
Authors:
Christian Schulz,
Jesper Larsson Träff,
Konrad von Kirchbach
Abstract:
Communication- and topology-aware process mapping is a powerful approach to reduce communication time in parallel applications with known communication patterns on large distributed-memory systems. We address the problem as a quadratic assignment problem (QAP), and present algorithms to construct initial mappings of processes to processors, and fast local search algorithms to further improve the mappings. By exploiting assumptions that typically hold for applications and modern supercomputer systems such as sparse communication patterns and hierarchically organized communication systems, we obtain significantly more powerful algorithms for these special QAPs. Our multilevel construction algorithms employ perfectly balanced graph partitioning techniques and exploit the given communication system hierarchy in significant ways. We present improvements to a local search algorithm of Brandfass et al. (2013), and further decrease the running time by reducing the time needed to perform swaps in the assignment as well as by carefully constraining local search neighborhoods. We also investigate different algorithms to create the communication graph that is mapped onto the processor network. Experiments indicate that our algorithms not only dramatically speed up local search, but due to the multilevel approach also find much better solutions in practice.
Submitted 22 July, 2019; v1 submitted 14 February, 2017;
originally announced February 2017.
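The QAP view of process mapping in the abstract above can be sketched as follows: the cost of a mapping is the sum, over communicating process pairs, of communication volume times the distance between the assigned processors. With a sparse communication graph, the effect of swapping two processes can be evaluated by looking only at their neighbors. This is a minimal sketch under assumed inputs, not the authors' implementation; the function names, the two-level distance values, and the example data are hypothetical, and the distance matrix is assumed to be symmetric.

```python
def hierarchical_dist(n_procs, per_node, intra=1, inter=10):
    """Assumed two-level hierarchy: cheap within a node, expensive across nodes."""
    return [[0 if a == b else intra if a // per_node == b // per_node else inter
             for b in range(n_procs)] for a in range(n_procs)]

def mapping_cost(comm, dist, perm):
    """QAP objective; comm maps undirected edges (u, v) to communication volume."""
    return sum(w * dist[perm[u]][perm[v]] for (u, v), w in comm.items())

def build_adjacency(comm, n):
    """Sparse adjacency lists so that a swap only touches actual neighbors."""
    adj = [dict() for _ in range(n)]
    for (u, v), w in comm.items():
        adj[u][v] = w
        adj[v][u] = w
    return adj

def swap_delta(adj, dist, perm, u, v):
    """Cost change if processes u and v exchange processors (symmetric dist)."""
    pu, pv = perm[u], perm[v]
    delta = 0
    for x, w in adj[u].items():
        if x != v:  # the (u, v) edge itself keeps its cost under a swap
            delta += w * (dist[pv][perm[x]] - dist[pu][perm[x]])
    for x, w in adj[v].items():
        if x != u:
            delta += w * (dist[pu][perm[x]] - dist[pv][perm[x]])
    return delta

# Hypothetical instance: 4 processes, 2 nodes with 2 processors each.
comm = {(0, 1): 5, (2, 3): 5, (0, 2): 1}
dist = hierarchical_dist(4, per_node=2)
adj = build_adjacency(comm, 4)

perm = [0, 2, 1, 3]                 # heavy pairs are split across nodes
before = mapping_cost(comm, dist, perm)
delta = swap_delta(adj, dist, perm, 1, 2)
perm[1], perm[2] = perm[2], perm[1]
after = mapping_cost(comm, dist, perm)
print(before, delta, after)
```

Evaluating the swap through the adjacency lists costs time proportional to the degrees of the two processes rather than to the total number of processes, which is one plausible reading of why sparsity helps local search here.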