doi 10.1007/978-3-031-62245-8_4

Enhanced OpenMP Algorithm to Compute All-Pairs Shortest Path on x86 Architectures

Authors: Sergio Calderón, Enzo Rucci, Franco Chichizola

Abstract: Graphs have become a key tool when modeling and solving problems in different areas. The Floyd-Warshall (FW) algorithm computes the shortest path between all pairs of vertices in a graph and is employed in areas such as communication networking, traffic routing, and bioinformatics, among others. However, FW is computationally and spatially expensive since it requires O(n^3) operations and O(n^2) memory space. As the graph gets larger, parallel computing becomes necessary to provide a solution in an acceptable time range. In this paper, we studied an FW code developed for Xeon Phi KNL processors and adapted it to run on any Intel x86 processor, losing the specificity of the former. To do so, we verified one by one the optimizations proposed by the original code, making adjustments to the base code where necessary, and analyzed its performance on two Intel servers under different test scenarios. In addition, a new optimization was proposed to increase the concurrency degree of the parallel algorithm, which was implemented using two different synchronization mechanisms. The experimental results show that all optimizations were beneficial on the two x86 platforms selected. Finally, the new optimization improved performance by up to 23%.

Submitted 28 June, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

Comments: Computer Science - CACIC 2023 (Revised selected papers)

arXiv:2105.07298

doi 10.1007/978-3-030-75836-3_3

Comparison of HPC Architectures for Computing All-Pairs Shortest Paths. Intel Xeon Phi KNL vs NVIDIA Pascal

Authors: Manuel Costanzo, Enzo Rucci, Ulises Costi, Franco Chichizola, Marcelo Naiouf

Abstract: Today, one of the main challenges for high-performance computing systems is to improve their performance while keeping energy consumption at acceptable levels. In this context, a consolidated strategy consists of using accelerators such as GPUs or many-core Intel Xeon Phi processors. In this work, devices of the NVIDIA Pascal and Intel Xeon Phi Knights Landing architectures are described and compared. Selecting the Floyd-Warshall algorithm as a representative case of graph and memory-bound applications, optimized implementations were developed to analyze and compare performance and energy efficiency on both devices. As expected, Xeon Phi proved superior for double-precision data. However, contrary to what was considered in our preliminary analysis, the performance and energy efficiency of both devices were found to be comparable when using the single-precision datatype.

Submitted 15 May, 2021; originally announced May 2021.

Comments: Computer Science - CACIC 2020. Communications in Computer and Information Science, vol 1409. Springer, Cham

arXiv:1004.3254

Automatic Mapping Tasks to Cores - Evaluating AMTHA Algorithm in Multicore Architectures

Authors: Laura De Giusti, Franco Chichizola, Marcelo Naiouf, Armando De Giusti, Emilio Luque

Abstract: The AMTHA (Automatic Mapping Task on Heterogeneous Architectures) algorithm for task-to-processors assignment and the MPAHA (Model of Parallel Algorithms on Heterogeneous Architectures) model are presented. The use of AMTHA is analyzed for multicore processor-based architectures, considering the communication model among processes in use. The results obtained in the tests carried out are presented, comparing the real execution times on multicores of a set of synthetic applications with the predictions obtained with AMTHA. Finally, current lines of research are presented, focusing on clusters of multicores and hybrid programming paradigms.

Submitted 19 April, 2010; originally announced April 2010.

Comments: http://ijcsi.org/articles/Automatic-Mapping-Tasks-to-Cores-Evaluating-AMTHA-Algorithm-in-Multicore-Architectures.php

Journal ref: IJCSI, Volume 7, Issue 2, March 2010

Showing 1–3 of 3 results for author: Chichizola, F