
Showing 1–2 of 2 results for author: Halawa, M S

  1. arXiv:2312.06546

    cs.DC cs.AI

    Unsupervised KPIs-Based Clustering of Jobs in HPC Data Centers

    Authors: Mohamed S. Halawa, Rebeca P. Díaz-Redondo, Ana Fernández-Vilas

    Abstract: Performance analysis is an essential task in High-Performance Computing (HPC) systems and it is applied for different purposes such as anomaly detection, optimal resource allocation, and budget planning. HPC monitoring tasks generate a huge number of Key Performance Indicators (KPIs) to supervise the status of the jobs running in these systems. KPIs give data about CPU usage, memory usage, network…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: 22 pages, 6 figures, journal

    Journal ref: Sensors, 2020, vol. 20, no 15, p. 4111

  2. KPIs-Based Clustering and Visualization of HPC jobs: a Feature Reduction Approach

    Authors: Mohamed Soliman Halawa, Rebeca P. Díaz-Redondo, Ana Fernández-Vilas

    Abstract: High-Performance Computing (HPC) systems need to be constantly monitored to ensure their stability. The monitoring systems collect a tremendous amount of data about different parameters or Key Performance Indicators (KPIs), such as resource usage, IO waiting time, etc. A proper analysis of this data, usually stored as time series, can provide insight in choosing the right management strategies as…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: 23 pages, 11 figures

    Journal ref: IEEE Access, 2021, vol. 9, p. 25522-25543