-
Fine-tuning machine-learned particle-flow reconstruction for new detector geometries in future colliders
Authors:
Farouk Mokhtar,
Joosep Pata,
Dolores Garcia,
Eric Wulff,
Mengke Zhang,
Michael Kagan,
Javier Duarte
Abstract:
We demonstrate transfer learning capabilities in a machine-learned algorithm trained for particle-flow reconstruction in high energy particle colliders. This paper presents a cross-detector fine-tuning study, where we initially pretrain the model on a large full simulation dataset from one detector design, and subsequently fine-tune the model on a sample with a different collider and detector desi…
▽ More
We demonstrate transfer learning capabilities in a machine-learned algorithm trained for particle-flow reconstruction in high energy particle colliders. This paper presents a cross-detector fine-tuning study, where we initially pretrain the model on a large full simulation dataset from one detector design, and subsequently fine-tune the model on a sample with a different collider and detector design. Specifically, we use the Compact Linear Collider detector (CLICdet) model for the initial training set and demonstrate successful knowledge transfer to the CLIC-like detector (CLD) proposed for the Future Circular Collider in electron-positron mode. We show that with an order of magnitude less samples from the second dataset, we can achieve the same performance as a costly training from scratch, across particle-level and event-level performance metrics, including jet and missing transverse momentum resolution. Furthermore, we find that the fine-tuned model achieves comparable performance to the traditional rule-based particle-flow approach on event-level metrics after training on 100,000 CLD events, whereas a model trained from scratch requires at least 1 million CLD events to achieve similar reconstruction performance. To our knowledge, this represents the first full-simulation cross-detector transfer learning study for particle-flow reconstruction. These findings offer valuable insights towards building large foundation models that can be fine-tuned across different detector designs and geometries, helping to accelerate the development cycle for new detectors and opening the door to rapid detector design and optimization using machine learning.
△ Less
Submitted 25 June, 2025; v1 submitted 28 February, 2025;
originally announced March 2025.
-
Learning Symmetry-Independent Jet Representations via Jet-Based Joint Embedding Predictive Architecture
Authors:
Subash Katel,
Haoyang Li,
Zihan Zhao,
Raghav Kansal,
Farouk Mokhtar,
Javier Duarte
Abstract:
In high energy physics, self-supervised learning (SSL) methods have the potential to aid in the creation of machine learning models without the need for labeled datasets for a variety of tasks, including those related to jets -- narrow sprays of particles produced by quarks and gluons in high energy particle collisions. This study introduces an approach to learning jet representations without hand…
▽ More
In high energy physics, self-supervised learning (SSL) methods have the potential to aid in the creation of machine learning models without the need for labeled datasets for a variety of tasks, including those related to jets -- narrow sprays of particles produced by quarks and gluons in high energy particle collisions. This study introduces an approach to learning jet representations without hand-crafted augmentations using a jet-based joint embedding predictive architecture (J-JEPA), which aims to predict various physical targets from an informative context. As our method does not require hand-crafted augmentation like other common SSL techniques, J-JEPA avoids introducing biases that could harm downstream tasks. Since different tasks generally require invariance under different augmentations, this training without hand-crafted augmentation enables versatile applications, offering a pathway toward a cross-task foundation model. We finetune the representations learned by J-JEPA for jet tagging and benchmark them against task-specific representations.
△ Less
Submitted 5 December, 2024;
originally announced December 2024.
-
Large-Scale Pretraining and Finetuning for Efficient Jet Classification in Particle Physics
Authors:
Zihan Zhao,
Farouk Mokhtar,
Raghav Kansal,
Haoyang Li,
Javier Duarte
Abstract:
This study introduces an innovative approach to analyzing unlabeled data in high-energy physics (HEP) through the application of self-supervised learning (SSL). Faced with the increasing computational cost of producing high-quality labeled simulation samples at the CERN LHC, we propose leveraging large volumes of unlabeled data to overcome the limitations of supervised learning methods, which heav…
▽ More
This study introduces an innovative approach to analyzing unlabeled data in high-energy physics (HEP) through the application of self-supervised learning (SSL). Faced with the increasing computational cost of producing high-quality labeled simulation samples at the CERN LHC, we propose leveraging large volumes of unlabeled data to overcome the limitations of supervised learning methods, which heavily rely on detailed labeled simulations. By pretraining models on these vast, mostly untapped datasets, we aim to learn generic representations that can be finetuned with smaller quantities of labeled data. Our methodology employs contrastive learning with augmentations on jet datasets to teach the model to recognize common representations of jets, addressing the unique challenges of LHC physics. Building on the groundwork laid by previous studies, our work demonstrates the critical ability of SSL to utilize large-scale unlabeled data effectively. We showcase the scalability and effectiveness of our models by gradually increasing the size of the pretraining dataset and assessing the resultant performance enhancements. Our results, obtained from experiments on two datasets -- JetClass, representing unlabeled data, and Top Tagging, serving as labeled simulation data -- show significant improvements in data efficiency, computational efficiency, and overall performance. These findings suggest that SSL can greatly enhance the adaptability of ML models to the HEP domain. This work opens new avenues for the use of unlabeled data in HEP and contributes to a better understanding the potential of SSL for scientific discovery.
△ Less
Submitted 17 August, 2024;
originally announced August 2024.
-
Improved particle-flow event reconstruction with scalable neural networks for current and future particle detectors
Authors:
Joosep Pata,
Eric Wulff,
Farouk Mokhtar,
David Southwick,
Mengke Zhang,
Maria Girone,
Javier Duarte
Abstract:
Efficient and accurate algorithms are necessary to reconstruct particles in the highly granular detectors anticipated at the High-Luminosity Large Hadron Collider and the Future Circular Collider. We study scalable machine learning models for event reconstruction in electron-positron collisions based on a full detector simulation. Particle-flow reconstruction can be formulated as a supervised lear…
▽ More
Efficient and accurate algorithms are necessary to reconstruct particles in the highly granular detectors anticipated at the High-Luminosity Large Hadron Collider and the Future Circular Collider. We study scalable machine learning models for event reconstruction in electron-positron collisions based on a full detector simulation. Particle-flow reconstruction can be formulated as a supervised learning task using tracks and calorimeter clusters. We compare a graph neural network and kernel-based transformer and demonstrate that we can avoid quadratic operations while achieving realistic reconstruction. We show that hyperparameter tuning significantly improves the performance of the models. The best graph neural network model shows improvement in the jet transverse momentum resolution by up to 50% compared to the rule-based algorithm. The resulting model is portable across Nvidia, AMD and Habana hardware. Accurate and fast machine-learning based reconstruction can significantly improve future measurements at colliders.
△ Less
Submitted 16 July, 2024; v1 submitted 13 September, 2023;
originally announced September 2023.
-
Progress towards an improved particle flow algorithm at CMS with machine learning
Authors:
Farouk Mokhtar,
Joosep Pata,
Javier Duarte,
Eric Wulff,
Maurizio Pierini,
Jean-Roch Vlimant
Abstract:
The particle-flow (PF) algorithm, which infers particles based on tracks and calorimeter clusters, is of central importance to event reconstruction in the CMS experiment at the CERN LHC, and has been a focus of development in light of planned Phase-2 running conditions with an increased pileup and detector granularity. In recent years, the machine learned particle-flow (MLPF) algorithm, a graph ne…
▽ More
The particle-flow (PF) algorithm, which infers particles based on tracks and calorimeter clusters, is of central importance to event reconstruction in the CMS experiment at the CERN LHC, and has been a focus of development in light of planned Phase-2 running conditions with an increased pileup and detector granularity. In recent years, the machine learned particle-flow (MLPF) algorithm, a graph neural network that performs PF reconstruction, has been explored in CMS, with the possible advantages of directly optimizing for the physical quantities of interest, being highly reconfigurable to new conditions, and being a natural fit for deployment to heterogeneous accelerators. We discuss progress in CMS towards an improved implementation of the MLPF reconstruction, now optimized using generator/simulation-level particle information as the target for the first time. This paves the way to potentially improving the detector response in terms of physical quantities of interest. We describe the simulation-based training target, progress and studies on event-based loss terms, details on the model hyperparameter tuning, as well as physics validation with respect to the current PF algorithm in terms of high-level physical quantities such as the jet and missing transverse momentum resolutions. We find that the MLPF algorithm, trained on a generator/simulator level particle information for the first time, results in broadly compatible particle and jet reconstruction performance with the baseline PF, setting the stage for improving the physics performance by additional training statistics and model tuning.
△ Less
Submitted 30 March, 2023;
originally announced March 2023.
-
FAIR AI Models in High Energy Physics
Authors:
Javier Duarte,
Haoyang Li,
Avik Roy,
Ruike Zhu,
E. A. Huerta,
Daniel Diaz,
Philip Harris,
Raghav Kansal,
Daniel S. Katz,
Ishaan H. Kavoori,
Volodymyr V. Kindratenko,
Farouk Mokhtar,
Mark S. Neubauer,
Sang Eon Park,
Melissa Quinnan,
Roger Rusack,
Zhizhen Zhao
Abstract:
The findable, accessible, interoperable, and reusable (FAIR) data principles provide a framework for examining, evaluating, and improving how data is shared to facilitate scientific discovery. Generalizing these principles to research software and other digital products is an active area of research. Machine learning (ML) models -- algorithms that have been trained on data without being explicitly…
▽ More
The findable, accessible, interoperable, and reusable (FAIR) data principles provide a framework for examining, evaluating, and improving how data is shared to facilitate scientific discovery. Generalizing these principles to research software and other digital products is an active area of research. Machine learning (ML) models -- algorithms that have been trained on data without being explicitly programmed -- and more generally, artificial intelligence (AI) models, are an important target for this because of the ever-increasing pace with which AI is transforming scientific domains, such as experimental high energy physics (HEP). In this paper, we propose a practical definition of FAIR principles for AI models in HEP and describe a template for the application of these principles. We demonstrate the template's use with an example AI model applied to HEP, in which a graph neural network is used to identify Higgs bosons decaying to two bottom quarks. We report on the robustness of this FAIR AI model, its portability across hardware architectures and software frameworks, and its interpretability.
△ Less
Submitted 29 December, 2023; v1 submitted 9 December, 2022;
originally announced December 2022.
-
Machine Learning for Particle Flow Reconstruction at CMS
Authors:
Joosep Pata,
Javier Duarte,
Farouk Mokhtar,
Eric Wulff,
Jieun Yoo,
Jean-Roch Vlimant,
Maurizio Pierini,
Maria Girone
Abstract:
We provide details on the implementation of a machine-learning based particle flow algorithm for CMS. The standard particle flow algorithm reconstructs stable particles based on calorimeter clusters and tracks to provide a global event reconstruction that exploits the combined information of multiple detector subsystems, leading to strong improvements for quantities such as jets and missing transv…
▽ More
We provide details on the implementation of a machine-learning based particle flow algorithm for CMS. The standard particle flow algorithm reconstructs stable particles based on calorimeter clusters and tracks to provide a global event reconstruction that exploits the combined information of multiple detector subsystems, leading to strong improvements for quantities such as jets and missing transverse energy. We have studied a possible evolution of particle flow towards heterogeneous computing platforms such as GPUs using a graph neural network. The machine-learned PF model reconstructs particle candidates based on the full list of tracks and calorimeter clusters in the event. For validation, we determine the physics performance directly in the CMS software framework when the proposed algorithm is interfaced with the offline reconstruction of jets and missing transverse energy. We also report the computational performance of the algorithm, which scales approximately linearly in runtime and memory usage with the input size.
△ Less
Submitted 1 March, 2022;
originally announced March 2022.
-
Particle Graph Autoencoders and Differentiable, Learned Energy Mover's Distance
Authors:
Steven Tsan,
Raghav Kansal,
Anthony Aportela,
Daniel Diaz,
Javier Duarte,
Sukanya Krishna,
Farouk Mokhtar,
Jean-Roch Vlimant,
Maurizio Pierini
Abstract:
Autoencoders have useful applications in high energy physics in anomaly detection, particularly for jets - collimated showers of particles produced in collisions such as those at the CERN Large Hadron Collider. We explore the use of graph-based autoencoders, which operate on jets in their "particle cloud" representations and can leverage the interdependencies among the particles within a jet, for…
▽ More
Autoencoders have useful applications in high energy physics in anomaly detection, particularly for jets - collimated showers of particles produced in collisions such as those at the CERN Large Hadron Collider. We explore the use of graph-based autoencoders, which operate on jets in their "particle cloud" representations and can leverage the interdependencies among the particles within a jet, for such tasks. Additionally, we develop a differentiable approximation to the energy mover's distance via a graph neural network, which may subsequently be used as a reconstruction loss function for autoencoders.
△ Less
Submitted 24 November, 2021;
originally announced November 2021.
-
Explaining machine-learned particle-flow reconstruction
Authors:
Farouk Mokhtar,
Raghav Kansal,
Daniel Diaz,
Javier Duarte,
Joosep Pata,
Maurizio Pierini,
Jean-Roch Vlimant
Abstract:
The particle-flow (PF) algorithm is used in general-purpose particle detectors to reconstruct a comprehensive particle-level view of the collision by combining information from different subdetectors. A graph neural network (GNN) model, known as the machine-learned particle-flow (MLPF) algorithm, has been developed to substitute the rule-based PF algorithm. However, understanding the model's decis…
▽ More
The particle-flow (PF) algorithm is used in general-purpose particle detectors to reconstruct a comprehensive particle-level view of the collision by combining information from different subdetectors. A graph neural network (GNN) model, known as the machine-learned particle-flow (MLPF) algorithm, has been developed to substitute the rule-based PF algorithm. However, understanding the model's decision making is not straightforward, especially given the complexity of the set-to-set prediction task, dynamic graph building, and message-passing steps. In this paper, we adapt the layerwise-relevance propagation technique for GNNs and apply it to the MLPF algorithm to gauge the relevant nodes and features for its predictions. Through this process, we gain insight into the model's decision-making.
△ Less
Submitted 24 November, 2021;
originally announced November 2021.
-
Applications and Techniques for Fast Machine Learning in Science
Authors:
Allison McCarn Deiana,
Nhan Tran,
Joshua Agar,
Michaela Blott,
Giuseppe Di Guglielmo,
Javier Duarte,
Philip Harris,
Scott Hauck,
Mia Liu,
Mark S. Neubauer,
Jennifer Ngadiuba,
Seda Ogrenci-Memik,
Maurizio Pierini,
Thea Aarrestad,
Steffen Bahr,
Jurgen Becker,
Anne-Sophie Berthold,
Richard J. Bonventre,
Tomas E. Muller Bravo,
Markus Diefenthaler,
Zhen Dong,
Nick Fritzsche,
Amir Gholami,
Ekaterina Govorkova,
Kyle J Hazelwood
, et al. (62 additional authors not shown)
Abstract:
In this community review report, we discuss applications and techniques for fast machine learning (ML) in science -- the concept of integrating power ML methods into the real-time experimental data processing loop to accelerate scientific discovery. The material for the report builds on two workshops held by the Fast ML for Science community and covers three main areas: applications for fast ML ac…
▽ More
In this community review report, we discuss applications and techniques for fast machine learning (ML) in science -- the concept of integrating power ML methods into the real-time experimental data processing loop to accelerate scientific discovery. The material for the report builds on two workshops held by the Fast ML for Science community and covers three main areas: applications for fast ML across a number of scientific domains; techniques for training and implementing performant and resource-efficient ML algorithms; and computing architectures, platforms, and technologies for deploying these algorithms. We also present overlapping challenges across the multiple scientific domains where common solutions can be found. This community report is intended to give plenty of examples and inspiration for scientific discovery through integrated and accelerated ML solutions. This is followed by a high-level overview and organization of technical advances, including an abundance of pointers to source material, which can enable these breakthroughs.
△ Less
Submitted 25 October, 2021;
originally announced October 2021.