-
Reconstructing hadronically decaying tau leptons with a jet foundation model
Authors:
Laurits Tani,
Joosep Pata,
Joschka Birk
Abstract:
The limited availability and accuracy of simulated data have motivated the use of foundation models in high energy physics, with the idea to first train a task-agnostic model on large and potentially unlabeled datasets. This enables the subsequent fine-tuning of the learned representation for specific downstream tasks, potentially requiring much smaller dataset sizes to reach the performance of models trained from scratch. We study how OmniJet-$α$, one of the proposed foundation models for particle jets, can be used on a new set of tasks, and in a new dataset, in order to reconstruct hadronically decaying $τ$ leptons. We show that the pretraining can successfully be utilized for this multi-task problem, improving the resolution of momentum reconstruction by about 50\% when the pretrained weights are fine-tuned, compared to training the model from scratch. While much work remains ahead to develop generic foundation models for high-energy physics, this early result of generalizing an existing model to a new dataset and to previously unconsidered tasks highlights the importance of testing the approaches on a diverse set of datasets and tasks.
Submitted 23 May, 2025; v1 submitted 24 March, 2025;
originally announced March 2025.
-
OmniJet-$α_{C}$: Learning point cloud calorimeter simulations using generative transformers
Authors:
Joschka Birk,
Frank Gaede,
Anna Hallin,
Gregor Kasieczka,
Martina Mozzanica,
Henning Rose
Abstract:
We show the first use of generative transformers for generating calorimeter showers as point clouds in a high-granularity calorimeter. Using the tokenizer and generative part of the OmniJet-$α$ model, we represent the hits in the detector as sequences of integers. This model allows variable-length sequences, which means that it supports realistic shower development and does not need to be conditioned on the number of hits. Since the tokenization represents the showers as point clouds, the model learns the geometry of the showers without being restricted to any particular voxel grid.
Submitted 9 January, 2025;
originally announced January 2025.
-
Aspen Open Jets: Unlocking LHC Data for Foundation Models in Particle Physics
Authors:
Oz Amram,
Luca Anzalone,
Joschka Birk,
Darius A. Faroughy,
Anna Hallin,
Gregor Kasieczka,
Michael Krämer,
Ian Pang,
Humberto Reyes-Gonzalez,
David Shih
Abstract:
Foundation models are deep learning models that are pre-trained on large amounts of data and are capable of generalizing to multiple datasets and/or downstream tasks. This work demonstrates how data collected by the CMS experiment at the Large Hadron Collider can be useful in pre-training foundation models for HEP. Specifically, we introduce the AspenOpenJets dataset, consisting of approximately 180M high-$p_T$ jets derived from CMS 2016 Open Data. We show how pre-training the OmniJet-$α$ foundation model on AspenOpenJets improves performance on generative tasks with significant domain shift: generating boosted top and QCD jets from the simulated JetClass dataset. In addition to demonstrating the power of pre-training a jet-based foundation model on actual proton-proton collision data, we provide the ML-ready derived AspenOpenJets dataset for further public use.
Submitted 13 December, 2024;
originally announced December 2024.
-
OmniJet-$α$: The first cross-task foundation model for particle physics
Authors:
Joschka Birk,
Anna Hallin,
Gregor Kasieczka
Abstract:
Foundation models are multi-dataset and multi-task machine learning methods that, once pre-trained, can be fine-tuned for a large variety of downstream applications. The successful development of such general-purpose models for physics data would be a major breakthrough, as they could improve the achievable physics performance while at the same time drastically reducing the required amount of training time and data.
We report significant progress on this challenge on several fronts. First, a comprehensive set of evaluation methods is introduced to judge the quality of an encoding from physics data into a representation suitable for the autoregressive generation of particle jets with transformer architectures (the common backbone of foundation models). These measures motivate the choice of a higher-fidelity tokenization compared to previous works. Finally, we demonstrate transfer learning between an unsupervised problem (jet generation) and a classic supervised task (jet tagging) with our new OmniJet-$α$ model. This is the first successful transfer between two different and actively studied classes of tasks and constitutes a major step in the building of foundation models for particle physics.
Submitted 7 September, 2024; v1 submitted 8 March, 2024;
originally announced March 2024.
-
Flow Matching Beyond Kinematics: Generating Jets with Particle-ID and Trajectory Displacement Information
Authors:
Joschka Birk,
Erik Buhmann,
Cedric Ewen,
Gregor Kasieczka,
David Shih
Abstract:
We introduce the first generative model trained on the JetClass dataset. Our model generates jets at the constituent level, and it is a permutation-equivariant continuous normalizing flow (CNF) trained with the flow matching technique. It is conditioned on the jet type, so that a single model can be used to generate the ten different jet types of JetClass. For the first time, we also introduce a generative model that goes beyond the kinematic features of jet constituents. The JetClass dataset includes more features, such as particle-ID and track impact parameter, and we demonstrate that our CNF can accurately model all of these additional features as well. Our generative model for JetClass expands on the versatility of existing jet generation techniques, enhancing their potential utility in high-energy physics research, and offering a more comprehensive understanding of the generated jets.
Submitted 26 March, 2025; v1 submitted 30 November, 2023;
originally announced December 2023.