Showing 1–10 of 10 results for author: Firoz, J

  1. arXiv:2504.10700  [pdf, other]

    cs.DC cs.AI

    Optimizing Data Distribution and Kernel Performance for Efficient Training of Chemistry Foundation Models: A Case Study with MACE

    Authors: Jesun Firoz, Franco Pellegrini, Mario Geiger, Darren Hsu, Jenna A. Bilbrey, Han-Yi Chou, Maximilian Stadler, Markus Hoehnerbach, Tingyu Wang, Dejun Lin, Emine Kucukbenli, Henry W. Sprueill, Ilyes Batatia, Sotiris S. Xantheas, MalSoon Lee, Chris Mundy, Gabor Csanyi, Justin S. Smith, Ponnuswamy Sadayappan, Sutanay Choudhury

    Abstract: Chemistry Foundation Models (CFMs) that leverage Graph Neural Networks (GNNs) operating on 3D molecular graph structures are becoming indispensable tools for computational chemists and materials scientists. These models facilitate the understanding of matter and the discovery of new molecules and materials. In contrast to GNNs operating on large homogeneous graphs, GNNs used by CFMs process a la…

    Submitted 14 April, 2025; originally announced April 2025.

    Comments: Accepted at The 34th ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC 2025)

  2. arXiv:2410.16093  [pdf, other]

    cs.DC cs.CV cs.PF eess.SY

    Final Report for CHESS: Cloud, High-Performance Computing, and Edge for Science and Security

    Authors: Nathan Tallent, Jan Strube, Luanzheng Guo, Hyungro Lee, Jesun Firoz, Sayan Ghosh, Bo Fang, Oceane Bel, Steven Spurgeon, Sarah Akers, Christina Doty, Erol Cromwell

    Abstract: Automating the theory-experiment cycle requires effective distributed workflows that utilize a computing continuum spanning lab instruments, edge sensors, computing resources at multiple facilities, data sets distributed across multiple information sources, and potentially cloud. Unfortunately, the obvious methods for constructing continuum platforms, orchestrating workflow tasks, and curating dat…

    Submitted 21 October, 2024; originally announced October 2024.

    Report number: Pacific Northwest National Laboratory, PNNL-36859

    ACM Class: C.2.4; C.4; D.1.3; J.2; K.6.4

  3. arXiv:2211.13853  [pdf, other]

    cs.LG cs.AR physics.chem-ph

    Extreme Acceleration of Graph Neural Network-based Prediction Models for Quantum Chemistry

    Authors: Hatem Helal, Jesun Firoz, Jenna Bilbrey, Mario Michael Krell, Tom Murray, Ang Li, Sotiris Xantheas, Sutanay Choudhury

    Abstract: Molecular property calculations are the bedrock of chemical physics. High-fidelity ab initio modeling techniques for computing molecular properties can be prohibitively expensive, which motivates the development of machine-learning models that make the same predictions more efficiently. Training graph neural networks over large molecular databases introduces unique computational challeng…

    Submitted 24 November, 2022; originally announced November 2022.

  4. arXiv:2209.07552  [pdf, other]

    cs.DC

    MSREP: A Fast yet Light Sparse Matrix Framework for Multi-GPU Systems

    Authors: Jieyang Chen, Chenhao Xie, Jesun S Firoz, Jiajia Li, Shuaiwen Leon Song, Kevin Barker, Mark Raugas, Ang Li

    Abstract: Sparse linear algebra kernels play a critical role in numerous applications, ranging from exascale scientific simulation to large-scale data analytics. Offloading linear algebra kernels to one GPU will no longer be viable in these applications, simply because the rapidly growing data volume may exceed the memory capacity and computing power of a single GPU. Multi-GPU systems nowadays being ubiqui…

    Submitted 15 September, 2022; originally announced September 2022.

  5. arXiv:2201.11326  [pdf, other]

    cs.DC cs.DS

    High-order Line Graphs of Non-uniform Hypergraphs: Algorithms, Applications, and Experimental Analysis

    Authors: Xu T. Liu, Jesun Firoz, Sinan Aksoy, Ilya Amburg, Andrew Lumsdaine, Cliff Joslyn, Assefaw H. Gebremedhin, Brenda Praggastis

    Abstract: Hypergraphs offer flexible and robust data representations for many applications, but methods that work directly on hypergraphs are not readily available and tend to be prohibitively expensive. Much of the current analysis of hypergraphs relies on first performing a graph expansion -- either based on the nodes (clique expansion), or on the edges (line graph) -- and then running standard graph anal…

    Submitted 27 January, 2022; originally announced January 2022.

    Comments: Accepted at "36th IEEE International Parallel & Distributed Processing Symposium (IPDPS '22)"

    Report number: PNNL-SA-167812

    MSC Class: 05C65 (Primary); 05C85 (Secondary); 68W10 (Secondary)

    ACM Class: G.2.2

  6. arXiv:2104.11725  [pdf, other]

    cs.NI cs.AR cs.DC cs.DM

    SpectralFly: Ramanujan Graphs as Flexible and Efficient Interconnection Networks

    Authors: Stephen Young, Sinan Aksoy, Jesun Firoz, Roberto Gioiosa, Tobias Hagge, Mark Kempton, Juan Escobedo, Mark Raugas

    Abstract: In recent years, graph-theoretic considerations have become increasingly important in the design of HPC interconnection topologies. One approach is to seek optimal or near-optimal families of graphs with respect to a particular graph-theoretic property, such as diameter. In this work, we consider topologies that optimize the spectral gap. We study a novel HPC topology, SpectralFly, designed aroun…

    Submitted 14 February, 2022; v1 submitted 23 April, 2021; originally announced April 2021.

  7. arXiv:2012.06959  [pdf, other]

    cs.DC cs.AR

    Fast and Scalable Sparse Triangular Solver for Multi-GPU Based HPC Architectures

    Authors: Chenhao Xie, Jieyang Chen, Jesun S Firoz, Jiajia Li, Shuaiwen Leon Song, Kevin Barker, Mark Raugas, Ang Li

    Abstract: Designing efficient and scalable sparse linear algebra kernels on modern multi-GPU based HPC systems is a daunting task due to significant irregular memory references and workload imbalance across the GPUs. This is particularly the case for the Sparse Triangular Solver (SpTRSV), which introduces additional two-dimensional computation dependencies among subsequent computation steps. Dependency informati…

    Submitted 12 December, 2020; originally announced December 2020.

  8. arXiv:2010.11448  [pdf, other]

    cs.DM

    Parallel Algorithms and Heuristics for Efficient Computation of High-Order Line Graphs of Hypergraphs

    Authors: Xu T. Liu, Jesun Firoz, Andrew Lumsdaine, Cliff Joslyn, Sinan Aksoy, Brenda Praggastis, Assefaw Gebremedhin

    Abstract: This paper considers structures of systems beyond dyadic (pairwise) interactions and investigates mathematical modeling of multi-way interactions and connections as hypergraphs, where captured relationships among system entities are set-valued. To date, in most situations, entities in a hypergraph are considered connected as long as there is at least one common "neighbor". However, minimal commona…

    Submitted 15 July, 2021; v1 submitted 22 October, 2020; originally announced October 2020.

    Comments: 11 pages

    Report number: PNNL-SA-164086

    MSC Class: 05C65

    ACM Class: G.2.2

  9. arXiv:1507.06702  [pdf, other]

    cs.DC cs.DS cs.PF cs.SE

    The Anatomy of Large-Scale Distributed Graph Algorithms

    Authors: Jesun Sahariar Firoz, Thejaka Amila Kanewala, Marcin Zalewski, Martina Barnas, Andrew Lumsdaine

    Abstract: The increasing complexity of the software/hardware stack of modern supercomputers results in an explosion of parameters. Performance analysis becomes a truly experimental science, even more challenging in the presence of massive irregularity and data dependency. We analyze how the existing body of research handles the experimental aspect in the context of distributed graph algorithms (DGAs). We d…

    Submitted 23 July, 2015; originally announced July 2015.

    ACM Class: D.1.3

  10. arXiv:0910.3292  [pdf, ps, other]

    cs.DS cs.DM

    The 1.375 Approximation Algorithm for Sorting by Transpositions Can Run in $O(n\log n)$ Time

    Authors: Jesun Sahariar Firoz, Masud Hasan, Ashik Zinnat Khan, M. Sohel Rahman

    Abstract: Sorting a Permutation by Transpositions (SPbT) is an important problem in Bioinformatics. In this paper, we improve the running time of the best known approximation algorithm for SPbT. We use the permutation tree data structure of Feng and Zhu and improve the running time of the 1.375 Approximation Algorithm for SPbT of Elias and Hartman to $O(n\log n)$. The previous running time of the EH algorithm…

    Submitted 17 October, 2009; originally announced October 2009.

    Comments: 5 pages