Showing 1–50 of 103 results for author: Shih, D

Searching in archive hep-ph.
  1. arXiv:2506.11192

    hep-ph

    Quirk SUEP

    Authors: David Curtin, Sascha Dreyer, Max Fusté Costa, Sarah Heim, Gregor Kasieczka, Louis Moureaux, David Rousso, David Shih, Manuel Sommerhalder

    Abstract: We propose searching for physics beyond the Standard Model in the low-transverse-momentum tracks accompanying hard-scatter events at the LHC. TeV-scale resonances connected to a dark QCD sector could be enhanced by selecting events with anomalies in the track distributions. As a benchmark, a quirk model with microscopic string lengths is developed, including a setup for event simulation. For this…

    Submitted 12 June, 2025; originally announced June 2025.

    Report number: DESY-25-081-0

  2. arXiv:2506.00119

    hep-ph cs.LG hep-ex

    Generator Based Inference (GBI)

    Authors: Chi Lung Cheng, Ranit Das, Runze Li, Radha Mastandrea, Vinicius Mikuni, Benjamin Nachman, David Shih, Gup Singh

    Abstract: Statistical inference in physics is often based on samples from a generator (sometimes referred to as a "forward model") that emulate experimental data and depend on parameters of the underlying theory. Modern machine learning has supercharged this workflow to enable high-dimensional and unbinned analyses to utilize much more information than ever before. We propose a general framework for descri…

    Submitted 30 May, 2025; originally announced June 2025.

    Comments: 9 pages, 9 figures

  3. arXiv:2412.14236

    astro-ph.GA hep-ph

    Mapping Dark Matter Through the Dust of the Milky Way Part I: Dust Correction and Phase Space Density

    Authors: Eric Putney, David Shih, Sung Hak Lim, Matthew R. Buckley

    Abstract: The Boltzmann equation relates the equilibrium phase space distribution of stars in the Milky Way to the Galaxy's gravitational potential. However, observations of stellar populations are biased by extinction from foreground dust, which complicates measurements of the potential in the disk and towards the Galactic center. Using the kinematics of Red Clump and Red Branch stars in Gaia DR3, we use m…

    Submitted 18 December, 2024; originally announced December 2024.

    Comments: 27 pages, 12 figures

    Report number: CTPU-PTC-24-38

  4. arXiv:2412.10504

    hep-ph cs.LG hep-ex stat.ML

    Aspen Open Jets: Unlocking LHC Data for Foundation Models in Particle Physics

    Authors: Oz Amram, Luca Anzalone, Joschka Birk, Darius A. Faroughy, Anna Hallin, Gregor Kasieczka, Michael Krämer, Ian Pang, Humberto Reyes-Gonzalez, David Shih

    Abstract: Foundation models are deep learning models pre-trained on large amounts of data which are capable of generalizing to multiple datasets and/or downstream tasks. This work demonstrates how data collected by the CMS experiment at the Large Hadron Collider can be useful in pre-training foundation models for HEP. Specifically, we introduce the AspenOpenJets dataset, consisting of approximately 180M hig…

    Submitted 13 December, 2024; originally announced December 2024.

    Comments: 11 pages, 4 figures, the AspenOpenJets dataset can be found at http://doi.org/10.25592/uhhfdm.16505

  5. arXiv:2411.00085

    hep-ph hep-ex physics.data-an

    Accurate and robust methods for direct background estimation in resonant anomaly detection

    Authors: Ranit Das, Thorben Finke, Marie Hein, Gregor Kasieczka, Michael Krämer, Alexander Mück, David Shih

    Abstract: Resonant anomaly detection methods have great potential for enhancing the sensitivity of traditional bump hunt searches. A key component of these methods is a high quality background template used to produce an anomaly score. Using the LHC Olympics R&D dataset, we demonstrate that this background template can also be repurposed to directly estimate the background expectation in a simple cut and co…

    Submitted 31 October, 2024; originally announced November 2024.

    Comments: 26 pages, 9 figures, 2 tables

    Report number: P3H-24-077, TTK-24-45

  6. arXiv:2410.21611

    physics.ins-det cs.LG hep-ex hep-ph

    CaloChallenge 2022: A Community Challenge for Fast Calorimeter Simulation

    Authors: Claudius Krause, Michele Faucci Giannelli, Gregor Kasieczka, Benjamin Nachman, Dalila Salamani, David Shih, Anna Zaborowska, Oz Amram, Kerstin Borras, Matthew R. Buckley, Erik Buhmann, Thorsten Buss, Renato Paulo Da Costa Cardoso, Anthony L. Caterini, Nadezda Chernyavskaya, Federico A. G. Corchia, Jesse C. Cresswell, Sascha Diefenbacher, Etienne Dreyer, Vijay Ekambaram, Engin Eren, Florian Ernst, Luigi Favaro, Matteo Franchini, Frank Gaede, et al. (44 additional authors not shown)

    Abstract: We present the results of the "Fast Calorimeter Simulation Challenge 2022" - the CaloChallenge. We study state-of-the-art generative models on four calorimeter shower datasets of increasing dimensionality, ranging from a few hundred voxels to a few tens of thousand voxels. The 31 individual submissions span a wide range of current popular generative architectures, including Variational AutoEncoder…

    Submitted 28 October, 2024; originally announced October 2024.

    Comments: 204 pages, 100+ figures, 30+ tables

    Report number: HEPHY-ML-24-05, FERMILAB-PUB-24-0728-CMS, TTK-24-43

  7. arXiv:2410.20537

    hep-ph cs.LG hep-ex physics.data-an

    SIGMA: Single Interpolated Generative Model for Anomalies

    Authors: Ranit Das, David Shih

    Abstract: A key step in any resonant anomaly detection search is accurate modeling of the background distribution in each signal region. Data-driven methods like CATHODE accomplish this by training separate generative models on the complement of each signal region, and interpolating them into their corresponding signal regions. Having to re-train the generative model on essentially the entire dataset for ea…

    Submitted 4 April, 2025; v1 submitted 27 October, 2024; originally announced October 2024.

    Comments: 12 pages, 7 figures, v2: added timing comparison and sample quality in other SRs

  8. arXiv:2407.20315

    hep-ph cs.LG hep-ex physics.data-an

    Universal New Physics Latent Space

    Authors: Anna Hallin, Gregor Kasieczka, Sabine Kraml, André Lessa, Louis Moureaux, Tore von Schwartz, David Shih

    Abstract: We develop a machine learning method for mapping data originating from both Standard Model processes and various theories beyond the Standard Model into a unified representation (latent) space while conserving information about the relationship between the underlying theories. We apply our method to three examples of new physics at the LHC of increasing complexity, showing that models can be clust…

    Submitted 22 January, 2025; v1 submitted 29 July, 2024; originally announced July 2024.

    Comments: 25 pages, 17 figures

    Journal ref: Phys. Rev. D 111, 016006 (2025)

  9. arXiv:2405.20407

    physics.ins-det cs.LG hep-ex hep-ph physics.data-an

    Convolutional L2LFlows: Generating Accurate Showers in Highly Granular Calorimeters Using Convolutional Normalizing Flows

    Authors: Thorsten Buss, Frank Gaede, Gregor Kasieczka, Claudius Krause, David Shih

    Abstract: In the quest to build generative surrogate models as computationally efficient alternatives to rule-based simulations, the quality of the generated samples remains a crucial frontier. So far, normalizing flows have been among the models with the best fidelity. However, as the latent space in such models is required to have the same dimensionality as the data space, scaling up normalizing flows to…

    Submitted 4 September, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Report number: HEPHY-ML-24-02

    Journal ref: 2024 JINST 19 P09003

  10. arXiv:2405.12131

    astro-ph.GA hep-ph physics.data-an

    SkyCURTAINs: Model agnostic search for Stellar Streams with Gaia data

    Authors: Debajyoti Sengupta, Stephen Mulligan, David Shih, John Andrew Raine, Tobias Golling

    Abstract: We present SkyCURTAINs, a data driven and model agnostic method to search for stellar streams in the Milky Way galaxy using data from the Gaia telescope. SkyCURTAINs is a weakly supervised machine learning algorithm that builds a background enriched template in the signal region by leveraging the correlation of the source's characterising features with their proper motion in the sky. This allows f…

    Submitted 20 May, 2024; originally announced May 2024.

  11. arXiv:2404.18992

    hep-ph hep-ex physics.data-an physics.ins-det stat.ML

    Unifying Simulation and Inference with Normalizing Flows

    Authors: Haoxing Du, Claudius Krause, Vinicius Mikuni, Benjamin Nachman, Ian Pang, David Shih

    Abstract: There have been many applications of deep neural networks to detector calibrations and a growing number of studies that propose deep generative models as automated fast detector simulators. We show that these two tasks can be unified by using maximum likelihood estimation (MLE) from conditional generative models for energy regression. Unlike direct regression techniques, the MLE approach is prior-…

    Submitted 11 April, 2025; v1 submitted 29 April, 2024; originally announced April 2024.

    Comments: 13 pages, 7 figures; v3: matches published version

    Report number: HEPHY-ML-24-01

    Journal ref: Phys. Rev. D 111, 076004 (2025)

  12. arXiv:2404.07258

    hep-ph hep-ex physics.data-an

    Complete Optimal Non-Resonant Anomaly Detection

    Authors: Gregor Kasieczka, John Andrew Raine, David Shih, Aman Upadhyay

    Abstract: We propose the first-ever complete, model-agnostic search strategy based on the optimal anomaly score, for new physics on the tails of distributions. Signal sensitivity is achieved via a classifier trained on auxiliary features in a weakly-supervised fashion, and backgrounds are predicted using the ABCD method in the classifier output and the primary tail feature. The independence between the clas…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: 9 pages, 9 figures

  13. arXiv:2312.11629

    hep-ph cs.LG hep-ex physics.data-an

    Residual ANODE

    Authors: Ranit Das, Gregor Kasieczka, David Shih

    Abstract: We present R-ANODE, a new method for data-driven, model-agnostic resonant anomaly detection that raises the bar for both performance and interpretability. The key to R-ANODE is to enhance the inductive bias of the anomaly detection task by fitting a normalizing flow directly to the small and unknown signal component, while holding fixed a background model (also a normalizing flow) learned from sid…

    Submitted 18 December, 2023; originally announced December 2023.

    Comments: 9 pages, 6 figures

  14. arXiv:2312.11618

    hep-ph astro-ph.IM hep-ex physics.data-an physics.ins-det

    Anomaly detection with flow-based fast calorimeter simulators

    Authors: Claudius Krause, Benjamin Nachman, Ian Pang, David Shih, Yunhao Zhu

    Abstract: Recently, several normalizing flow-based deep generative models have been proposed to accelerate the simulation of calorimeter showers. Using CaloFlow as an example, we show that these models can simultaneously perform unsupervised anomaly detection with no additional training cost. As a demonstration, we consider electromagnetic showers initiated by one (background) or multiple (signal) photons.…

    Submitted 29 August, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

    Comments: 14 pages, 8 figures

    Report number: HEPHY-ML-23-03

  15. Normalizing Flows for High-Dimensional Detector Simulations

    Authors: Florian Ernst, Luigi Favaro, Claudius Krause, Tilman Plehn, David Shih

    Abstract: Whenever invertible generative networks are needed for LHC physics, normalizing flows show excellent performance. In this work, we investigate their performance for fast calorimeter shower simulations with increasing phase space dimension. We use fast and expressive coupling spline transformations applied to the CaloChallenge datasets. In addition to the base flow architecture we also employ a VAE…

    Submitted 13 January, 2025; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: 34 pages, 14 figures, 7 tables, journal version

    Journal ref: SciPost Phys. 18, 081 (2025)

  16. arXiv:2312.00123

    hep-ph cs.LG hep-ex physics.data-an

    Flow Matching Beyond Kinematics: Generating Jets with Particle-ID and Trajectory Displacement Information

    Authors: Joschka Birk, Erik Buhmann, Cedric Ewen, Gregor Kasieczka, David Shih

    Abstract: We introduce the first generative model trained on the JetClass dataset. Our model generates jets at the constituent level, and it is a permutation-equivariant continuous normalizing flow (CNF) trained with the flow matching technique. It is conditioned on the jet type, so that a single model can be used to generate the ten different jet types of JetClass. For the first time, we also introduce a g…

    Submitted 26 March, 2025; v1 submitted 30 November, 2023; originally announced December 2023.

    Journal ref: Phys. Rev. D 111, 052008 (2025)

  17. arXiv:2310.12209

    astro-ph.IM astro-ph.HE cs.LG gr-qc hep-ph

    Fast Parameter Inference on Pulsar Timing Arrays with Normalizing Flows

    Authors: David Shih, Marat Freytsis, Stephen R. Taylor, Jeff A. Dror, Nolan Smyth

    Abstract: Pulsar timing arrays (PTAs) perform Bayesian posterior inference with expensive MCMC methods. Given a dataset of ~10-100 pulsars and O(10^3) timing residuals each, producing a posterior distribution for the stochastic gravitational wave background (SGWB) can take days to a week. The computational bottleneck arises because the likelihood evaluation required for MCMC is extremely costly when conside…

    Submitted 18 October, 2023; originally announced October 2023.

    Comments: 8 pages, 3 figures

  18. arXiv:2310.06897

    hep-ph hep-ex physics.data-an

    Full Phase Space Resonant Anomaly Detection

    Authors: Erik Buhmann, Cedric Ewen, Gregor Kasieczka, Vinicius Mikuni, Benjamin Nachman, David Shih

    Abstract: Physics beyond the Standard Model that is resonant in one or more dimensions has been a longstanding focus of countless searches at colliders and beyond. Recently, many new strategies for resonant anomaly detection have been developed, where sideband information can be used in conjunction with modern machine learning, in order to generate synthetic datasets representing the Standard Model backgrou…

    Submitted 9 February, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: 10 pages, 7 figures

    Journal ref: Phys. Rev. D 109, 055015 (2024)

  19. arXiv:2310.00049

    hep-ph cs.LG

    EPiC-ly Fast Particle Cloud Generation with Flow-Matching and Diffusion

    Authors: Erik Buhmann, Cedric Ewen, Darius A. Faroughy, Tobias Golling, Gregor Kasieczka, Matthew Leigh, Guillaume Quétant, John Andrew Raine, Debajyoti Sengupta, David Shih

    Abstract: Jets at the LHC, typically consisting of a large number of highly correlated particles, are a fascinating laboratory for deep generative modeling. In this paper, we present two novel methods that generate LHC jets as point clouds efficiently and accurately. We introduce EPiC-JeDi, which combines score-matching diffusion models with the Equivariant Point Cloud (EPiC) architecture based on the deep s…

    Submitted 29 September, 2023; originally announced October 2023.

    Comments: 21 pages, 8 figures

  20. arXiv:2309.13111

    hep-ph hep-ex physics.data-an

    Back To The Roots: Tree-Based Algorithms for Weakly Supervised Anomaly Detection

    Authors: Thorben Finke, Marie Hein, Gregor Kasieczka, Michael Krämer, Alexander Mück, Parada Prangchaikul, Tobias Quadfasel, David Shih, Manuel Sommerhalder

    Abstract: Weakly supervised methods have emerged as a powerful tool for model-agnostic anomaly detection at the Large Hadron Collider (LHC). While these methods have shown remarkable performance on specific signatures such as di-jet resonances, their application in a more model-agnostic manner requires dealing with a larger number of potentially noisy input features. In this paper, we show that using booste…

    Submitted 22 September, 2023; originally announced September 2023.

    Comments: 11 pages, 9 figures

    Report number: TTK-23-26

  21. Combining Resonant and Tail-based Anomaly Detection

    Authors: Gerrit Bickendorf, Manuel Drees, Gregor Kasieczka, Claudius Krause, David Shih

    Abstract: In many well-motivated models of the electroweak scale, cascade decays of new particles can result in highly boosted hadronic resonances (e.g. $Z/W/h$). This can make these models rich and promising targets for recently developed resonant anomaly detection methods powered by modern machine learning. We demonstrate this using the state-of-the-art CATHODE method applied to supersymmetry scenarios wi…

    Submitted 28 May, 2024; v1 submitted 22 September, 2023; originally announced September 2023.

    Comments: 13 pages, 15 figures

  22. arXiv:2308.11700

    physics.ins-det cs.LG hep-ex hep-ph physics.data-an

    Calorimeter shower superresolution

    Authors: Ian Pang, John Andrew Raine, David Shih

    Abstract: Calorimeter shower simulation is a major bottleneck in the Large Hadron Collider computational pipeline. There have been recent efforts to employ deep-generative surrogate models to overcome this challenge. However, many of the best-performing models have training and generation times that do not scale well to high-dimensional calorimeter showers. In this work, we introduce SuperCalo, a flow-based sup…

    Submitted 15 May, 2024; v1 submitted 22 August, 2023; originally announced August 2023.

    Comments: 16 pages, 13 figures, v3: title changed, matches published version

    Journal ref: Phys. Rev. D 109, 092009 (2024)

  23. arXiv:2307.11157

    hep-ph hep-ex physics.data-an

    The Interplay of Machine Learning--based Resonant Anomaly Detection Methods

    Authors: Tobias Golling, Gregor Kasieczka, Claudius Krause, Radha Mastandrea, Benjamin Nachman, John Andrew Raine, Debajyoti Sengupta, David Shih, Manuel Sommerhalder

    Abstract: Machine learning--based anomaly detection (AD) methods are promising tools for extending the coverage of searches for physics beyond the Standard Model (BSM). One class of AD methods that has received significant attention is resonant anomaly detection, where the BSM is assumed to be localized in at least one known variable. While there have been many methods proposed to identify such a BSM signal…

    Submitted 14 March, 2024; v1 submitted 20 July, 2023; originally announced July 2023.

    Comments: 27 pages, 21 figures. Updated with revisions for journal acceptance

  24. How to Understand Limitations of Generative Networks

    Authors: Ranit Das, Luigi Favaro, Theo Heimel, Claudius Krause, Tilman Plehn, David Shih

    Abstract: Well-trained classifiers and their complete weight distributions provide us with a well-motivated and practicable method to test generative networks in particle physics. We illustrate their benefits for distribution-shifted jets, calorimeter showers, and reconstruction-level events. In all cases, the classifier weights make for a powerful test of the generative network, identify potential problems…

    Submitted 7 December, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: 32 pages, 19 figures

    Journal ref: SciPost Phys. 16, 031 (2024)

  25. Mapping Dark Matter in the Milky Way using Normalizing Flows and Gaia DR3

    Authors: Sung Hak Lim, Eric Putney, Matthew R. Buckley, David Shih

    Abstract: We present a novel, data-driven analysis of Galactic dynamics, using unsupervised machine learning -- in the form of density estimation with normalizing flows -- to learn the underlying phase space distribution of 6 million nearby stars from the Gaia DR3 catalog. Solving the equilibrium collisionless Boltzmann equation, we calculate -- for the first time ever -- a model-free, unbinned estimate of…

    Submitted 21 January, 2025; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: 33 pages, 17 figures, 3 tables, version published in JCAP

    Report number: CTPU-PTC-25-01

    Journal ref: JCAP01(2025)021

  26. arXiv:2305.11934

    physics.ins-det cs.LG hep-ex hep-ph physics.data-an

    Inductive Simulation of Calorimeter Showers with Normalizing Flows

    Authors: Matthew R. Buckley, Claudius Krause, Ian Pang, David Shih

    Abstract: Simulating particle detector response is the single most expensive step in the Large Hadron Collider computational pipeline. Recently it was shown that normalizing flows can accelerate this process while achieving unprecedented levels of accuracy, but scaling this approach up to higher resolutions relevant for future detector upgrades leads to prohibitive memory constraints. To overcome this probl…

    Submitted 13 February, 2024; v1 submitted 19 May, 2023; originally announced May 2023.

    Comments: 19 pages, 15 figures; v2: title changed, matches published version

    Journal ref: Phys. Rev. D 109, 033006 (2024)

  27. arXiv:2305.03761

    astro-ph.GA cs.LG hep-ph physics.data-an

    Weakly-Supervised Anomaly Detection in the Milky Way

    Authors: Mariel Pettee, Sowmya Thanvantri, Benjamin Nachman, David Shih, Matthew R. Buckley, Jack H. Collins

    Abstract: Large-scale astrophysics datasets present an opportunity for new machine learning techniques to identify regions of interest that might otherwise be overlooked by traditional searches. To this end, we use Classification Without Labels (CWoLa), a weakly-supervised anomaly detection method, to identify cold stellar streams within the more than one billion Milky Way stars observed by the Gaia satelli…

    Submitted 5 May, 2023; originally announced May 2023.

  28. arXiv:2303.01529

    astro-ph.GA hep-ph

    Via Machinae 2.0: Full-Sky, Model-Agnostic Search for Stellar Streams in Gaia DR2

    Authors: David Shih, Matthew R. Buckley, Lina Necib

    Abstract: We present an update to Via Machinae, an automated stellar stream-finding algorithm based on the deep learning anomaly detector ANODE. Via Machinae identifies stellar streams within Gaia, using only angular positions, proper motions, and photometry, without reference to a model of the Milky Way potential for orbit integration or stellar distances. This new version, Via Machinae 2.0, includes many…

    Submitted 2 March, 2023; originally announced March 2023.

    Comments: 22 pages, 24 figures

  29. arXiv:2302.11594

    physics.ins-det hep-ex hep-ph physics.data-an

    L2LFlows: Generating High-Fidelity 3D Calorimeter Images

    Authors: Sascha Diefenbacher, Engin Eren, Frank Gaede, Gregor Kasieczka, Claudius Krause, Imahn Shekhzadeh, David Shih

    Abstract: We explore the use of normalizing flows to emulate Monte Carlo detector simulations of photon showers in a high-granularity electromagnetic calorimeter prototype for the International Large Detector (ILD). Our proposed method -- which we refer to as "Layer-to-Layer-Flows" (L2LFlows) -- is an evolution of the CaloFlow architecture adapted to a higher-dimensional setting (30 layers of…

    Submitted 20 October, 2023; v1 submitted 22 February, 2023; originally announced February 2023.

    Comments: v2: 28 pages, 13 figures; matches version accepted for publication in JINST. Neither SISSA Medialab Srl nor IOP Publishing Ltd is responsible for any errors or omissions in this version of the manuscript or any version derived from it. Published version available via DOI

    Journal ref: 2023 JINST 18 P10017

  30. arXiv:2212.00046

    hep-ph cs.LG hep-ex physics.data-an

    Feature Selection with Distance Correlation

    Authors: Ranit Das, Gregor Kasieczka, David Shih

    Abstract: Choosing which properties of the data to use as input to multivariate decision algorithms -- a.k.a. feature selection -- is an important step in solving any problem with machine learning. While there is a clear trend towards training sophisticated deep networks on large numbers of relatively unprocessed inputs (so-called automated feature engineering), for many tasks in physics, sets of theoretica…

    Submitted 30 November, 2022; originally announced December 2022.

    Comments: 14 pages, 8 figures, 3 tables

  31. arXiv:2211.11765

    astro-ph.GA astro-ph.IM hep-ph

    GalaxyFlow: Upsampling Hydrodynamical Simulations for Realistic Mock Stellar Catalogs

    Authors: Sung Hak Lim, Kailash A. Raman, Matthew R. Buckley, David Shih

    Abstract: Cosmological N-body simulations of galaxies operate at the level of "star particles" with a mass resolution on the scale of thousands of solar masses. Turning these simulations into stellar mock catalogs requires "upsampling" the star particles into individual stars following the same phase-space density. In this paper, we introduce two new upsampling methods. First, we describe GalaxyFlow, a soph…

    Submitted 19 August, 2024; v1 submitted 21 November, 2022; originally announced November 2022.

    Comments: 22 pages, 14 figures, version published in MNRAS

    Journal ref: Monthly Notices of the Royal Astronomical Society, Volume 533, Issue 1, September 2024, Pages 143-164

  32. arXiv:2210.14924

    hep-ph hep-ex physics.data-an

    Resonant anomaly detection without background sculpting

    Authors: Anna Hallin, Gregor Kasieczka, Tobias Quadfasel, David Shih, Manuel Sommerhalder

    Abstract: We introduce a new technique named Latent CATHODE (LaCATHODE) for performing "enhanced bump hunts", a type of resonant anomaly search that combines conventional one-dimensional bump hunts with a model-agnostic anomaly score in an auxiliary feature space where potential signals could also be localized. The main advantage of LaCATHODE over existing methods is that it provides an anomaly score that i…

    Submitted 10 July, 2023; v1 submitted 26 October, 2022; originally announced October 2022.

    Comments: 11 pages, 8 figures; v2 (published version): referencing code and minor style updates

    Journal ref: Phys. Rev. D 107, 114012 (2023)

  33. arXiv:2210.14245

    physics.ins-det cs.LG hep-ex hep-ph physics.data-an

    CaloFlow for CaloChallenge Dataset 1

    Authors: Claudius Krause, Ian Pang, David Shih

    Abstract: CaloFlow is a new and promising approach to fast calorimeter simulation based on normalizing flows. Applying CaloFlow to the photon and charged pion Geant4 showers of Dataset 1 of the Fast Calorimeter Simulation Challenge 2022, we show how it can produce high-fidelity samples with a sampling time that is several orders of magnitude faster than Geant4. We demonstrate the fidelity of the samples usi…

    Submitted 15 May, 2024; v1 submitted 25 October, 2022; originally announced October 2022.

    Comments: 36 pages, 21 figures, v3: match published version

    Journal ref: SciPost Phys. 16, 126 (2024)

  34. arXiv:2209.06225

    hep-ph hep-ex physics.data-an

    Anomaly Detection under Coordinate Transformations

    Authors: Gregor Kasieczka, Radha Mastandrea, Vinicius Mikuni, Benjamin Nachman, Mariel Pettee, David Shih

    Abstract: There is a growing need for machine learning-based anomaly detection strategies to broaden the search for Beyond-the-Standard-Model (BSM) physics at the Large Hadron Collider (LHC) and elsewhere. The first step of any anomaly detection approach is to specify observables and then use them to decide on a set of anomalous events. One common choice is to select events that have low probability density…

    Submitted 13 September, 2022; originally announced September 2022.

    Comments: 10 pages, 6 figures

  35. arXiv:2209.05518

    hep-ph hep-ex

    VBF vs. GGF Higgs with Full-Event Deep Learning: Towards a Decay-Agnostic Tagger

    Authors: Cheng-Wei Chiang, David Shih, Shang-Fu Wei

    Abstract: We study the benefits of jet- and event-level deep learning methods in distinguishing vector boson fusion (VBF) from gluon-gluon fusion (GGF) Higgs production at the LHC. We show that a variety of classifiers (CNNs, attention-based networks) trained on the complete low-level inputs of the full event achieve significant performance gains over shallow machine learning methods (BDTs) trained on jet k…

    Submitted 4 November, 2022; v1 submitted 12 September, 2022; originally announced September 2022.

    Comments: 21 pages+appendices, 16 figures; added references, updated Pythia shower scheme for VBF, and added Appendix C for version 2

  36. arXiv:2205.01129

    astro-ph.GA hep-ph

    Measuring Galactic Dark Matter through Unsupervised Machine Learning

    Authors: Matthew R. Buckley, Sung Hak Lim, Eric Putney, David Shih

    Abstract: Measuring the density profile of dark matter in the Solar neighborhood has important implications for both dark matter theory and experiment. In this work, we apply autoregressive flows to stars from a realistic simulation of a Milky Way-type galaxy to learn -- in an unsupervised way -- the stellar phase space density and its derivatives. With these as inputs, and under the assumption of dynamic e…

    Submitted 2 May, 2022; originally announced May 2022.

    Comments: 23 pages, 9 figures

  37. arXiv:2203.08806

    hep-ph cs.LG hep-ex physics.comp-ph physics.ins-det

    New directions for surrogate models and differentiable programming for High Energy Physics detector simulation

    Authors: Andreas Adelmann, Walter Hopkins, Evangelos Kourlitis, Michael Kagan, Gregor Kasieczka, Claudius Krause, David Shih, Vinicius Mikuni, Benjamin Nachman, Kevin Pedro, Daniel Winklehner

    Abstract: The computational cost for high energy physics detector simulation in future experimental facilities is going to exceed the current available resources. To overcome this challenge, new ideas on surrogate models using machine learning methods are being explored to replace computationally expensive components. Additionally, differentiable programming has been proposed as a complementary approach, pr…

    Submitted 15 March, 2022; originally announced March 2022.

    Comments: contribution to Snowmass 2021

    Report number: FERMILAB-CONF-22-199-SCD

  38. Machine Learning and LHC Event Generation

    Authors: Anja Butter, Tilman Plehn, Steffen Schumann, Simon Badger, Sascha Caron, Kyle Cranmer, Francesco Armando Di Bello, Etienne Dreyer, Stefano Forte, Sanmay Ganguly, Dorival Gonçalves, Eilam Gross, Theo Heimel, Gudrun Heinrich, Lukas Heinrich, Alexander Held, Stefan Höche, Jessica N. Howard, Philip Ilten, Joshua Isaacson, Timo Janßen, Stephen Jones, Marumi Kado, Michael Kagan, Gregor Kasieczka, et al. (26 additional authors not shown)

    Abstract: First-principle simulations are at the heart of the high-energy physics research program. They link the vast data output of multi-purpose detectors with fundamental theory predictions and interpretation. This review illustrates a wide range of applications of modern machine learning to event generation and simulation-based inference, including conceptual developments driven by the specific requi…

    Submitted 28 December, 2022; v1 submitted 14 March, 2022; originally announced March 2022.

    Comments: Review article based on a Snowmass 2021 contribution

    Journal ref: SciPost Phys. 14, 079 (2023)

  39. arXiv:2202.09375  [pdf, other]

    hep-ph hep-ex physics.data-an

    Ephemeral Learning -- Augmenting Triggers with Online-Trained Normalizing Flows

    Authors: Anja Butter, Sascha Diefenbacher, Gregor Kasieczka, Benjamin Nachman, Tilman Plehn, David Shih, Ramon Winterhalder

    Abstract: The large data rates at the LHC require an online trigger system to select relevant collisions. Rather than compressing individual events, we propose to compress an entire data set at once. We use a normalizing flow as a deep generative model to learn the probability density of the data online. The events are then represented by the generative neural network and can be inspected offline for anomal…

    Submitted 28 June, 2022; v1 submitted 18 February, 2022; originally announced February 2022.

    Comments: 17 pages, 9 figures, minor changes to text, addressed referee comments

    Report number: CP3-22-10

    Journal ref: SciPost Phys. 13, 087 (2022)

  40. Dark Photons and Displaced Vertices at the MUonE Experiment

    Authors: Iftah Galon, David Shih, Isaac R. Wang

    Abstract: MUonE is a proposed experiment designed to measure the hadronic vacuum polarization contribution to muon $g-2$ through elastic $μ-e$ scattering. As such it employs an extremely high-resolution tracking apparatus. We point out that this makes MUonE also a very promising experiment to search for displaced vertices from light, weakly-interacting new particles. We demonstrate its potential by showing…

    Submitted 5 May, 2023; v1 submitted 17 February, 2022; originally announced February 2022.

    Comments: PRD version, 10 pages, 5 figures

  41. Resolving Combinatorial Ambiguities in Dilepton $t \bar t$ Event Topologies with Neural Networks

    Authors: Haider Alhazmi, Zhongtian Dong, Li Huang, Jeong Han Kim, Kyoungchul Kong, David Shih

    Abstract: We study the potential of deep learning to resolve the combinatorial problem in SUSY-like events with two invisible particles at the LHC. As a concrete example, we focus on dileptonic $t \bar t$ events, where the combinatorial problem becomes an issue of binary classification: pairing the correct lepton with each $b$ quark coming from the decays of the tops. We investigate the performance of a num…

    Submitted 27 June, 2022; v1 submitted 11 February, 2022; originally announced February 2022.

    Comments: 22 pages, 15 figures, 1 table, matches the published version

    Journal ref: Phys.Rev.D 105 (2022) 11, 115011

  42. arXiv:2112.03769  [pdf, other]

    hep-ph hep-ex physics.data-an stat.ML

    Machine Learning in the Search for New Fundamental Physics

    Authors: Georgia Karagiorgi, Gregor Kasieczka, Scott Kravitz, Benjamin Nachman, David Shih

    Abstract: Machine learning plays a crucial role in enhancing and accelerating the search for new fundamental physics. We review the state of machine learning methods and applications for new physics searches in the context of terrestrial high energy physics experiments, including the Large Hadron Collider, rare event searches, and neutrino experiments. While machine learning has a long history in these fiel…

    Submitted 7 December, 2021; originally announced December 2021.

    Comments: Preprint of article submitted to Nature Reviews Physics, 19 pages, 1 figure

  43. arXiv:2111.06417  [pdf, other]

    cs.LG hep-ex hep-ph physics.acc-ph physics.data-an

    Online-compatible Unsupervised Non-resonant Anomaly Detection

    Authors: Vinicius Mikuni, Benjamin Nachman, David Shih

    Abstract: There is a growing need for anomaly detection methods that can broaden the search for new particles in a model-agnostic manner. Most proposals for new methods focus exclusively on signal sensitivity. However, it is not enough to select anomalous events - there must also be a strategy to provide context to the selected events. We propose the first complete strategy for unsupervised detection of non…

    Submitted 11 November, 2021; originally announced November 2021.

    Comments: 9 pages, 3 figures

  44. arXiv:2110.11377  [pdf, other]

    physics.ins-det cs.LG hep-ex hep-ph physics.data-an

    CaloFlow II: Even Faster and Still Accurate Generation of Calorimeter Showers with Normalizing Flows

    Authors: Claudius Krause, David Shih

    Abstract: Recently, we introduced CaloFlow, a high-fidelity generative model for GEANT4 calorimeter shower emulation based on normalizing flows. Here, we present CaloFlow v2, an improvement on our original framework that speeds up shower generation by a further factor of 500 relative to the original. The improvement is based on a technique called Probability Density Distillation, originally developed for sp…

    Submitted 5 May, 2023; v1 submitted 21 October, 2021; originally announced October 2021.

    Comments: 24 pages, 15 figures, 4 tables; v2: matches accepted version

  45. arXiv:2109.00546  [pdf, other]

    hep-ph hep-ex physics.data-an

    Classifying Anomalies THrough Outer Density Estimation (CATHODE)

    Authors: Anna Hallin, Joshua Isaacson, Gregor Kasieczka, Claudius Krause, Benjamin Nachman, Tobias Quadfasel, Matthias Schlaffer, David Shih, Manuel Sommerhalder

    Abstract: We propose a new model-agnostic search strategy for physics beyond the standard model (BSM) at the LHC, based on a novel application of neural density estimation to anomaly detection. Our approach, which we call Classifying Anomalies THrough Outer Density Estimation (CATHODE), assumes the BSM signal is localized in a signal region (defined e.g. using invariant mass). By training a conditional dens…

    Submitted 11 September, 2022; v1 submitted 1 September, 2021; originally announced September 2021.

    Comments: 17 pages, 12 figures; v2: minor updates; v3 (published version): added study of background sculpting and minor fixes

    Report number: EFI-20-5, FERMILAB-PUB-21-389-T

    Journal ref: Phys. Rev. D 106, 055006 (2022)

  46. arXiv:2107.02821  [pdf, other]

    stat.ML cs.LG hep-ex hep-ph

    New Methods and Datasets for Group Anomaly Detection From Fundamental Physics

    Authors: Gregor Kasieczka, Benjamin Nachman, David Shih

    Abstract: The identification of anomalous overdensities in data - group or collective anomaly detection - is a rich problem with a large number of real world applications. However, it has received relatively little attention in the broader ML community, as compared to point anomalies or other types of single instance outliers. One reason for this is the lack of powerful benchmark datasets. In this paper, we…

    Submitted 6 July, 2021; originally announced July 2021.

    Comments: Accepted for ANDEA (Anomaly and Novelty Detection, Explanation and Accommodation) Workshop at KDD 2021

  47. arXiv:2106.05285  [pdf, other]

    physics.ins-det cs.LG hep-ex hep-ph physics.data-an

    CaloFlow: Fast and Accurate Generation of Calorimeter Showers with Normalizing Flows

    Authors: Claudius Krause, David Shih

    Abstract: We introduce CaloFlow, a fast detector simulation framework based on normalizing flows. For the first time, we demonstrate that normalizing flows can reproduce many-channel calorimeter showers with extremely high fidelity, providing a fresh alternative to computationally expensive GEANT4 simulations, as well as other state-of-the-art fast simulation frameworks based on GANs and VAEs. Besides the u…

    Submitted 5 May, 2023; v1 submitted 9 June, 2021; originally announced June 2021.

    Comments: 33 pages, 19 figures, 5 tables; v2: improved handling of datasets, conclusions unchanged; v3: matches accepted version

  48. arXiv:2104.12789  [pdf, other]

    astro-ph.GA hep-ph physics.data-an

    Via Machinae: Searching for Stellar Streams using Unsupervised Machine Learning

    Authors: David Shih, Matthew R. Buckley, Lina Necib, John Tamanas

    Abstract: We develop a new machine learning algorithm, Via Machinae, to identify cold stellar streams in data from the Gaia telescope. Via Machinae is based on ANODE, a general method that uses conditional density estimation and sideband interpolation to detect local overdensities in the data in a model agnostic way. By applying ANODE to the positions, proper motions, and photometry of stars observed by Gai…

    Submitted 28 December, 2021; v1 submitted 26 April, 2021; originally announced April 2021.

    Comments: 17 pages, 17 figures, v2: references added, minor corrections, v3: published version

  49. arXiv:2104.02092  [pdf, other]

    hep-ph hep-ex physics.data-an stat.ML

    Comparing Weak- and Unsupervised Methods for Resonant Anomaly Detection

    Authors: Jack H. Collins, Pablo Martín-Ramiro, Benjamin Nachman, David Shih

    Abstract: Anomaly detection techniques are growing in importance at the Large Hadron Collider (LHC), motivated by the increasing need to search for new physics in a model-agnostic way. In this work, we provide a detailed comparative study between a well-studied unsupervised method called the autoencoder (AE) and a weakly-supervised approach based on the Classification Without Labels (CWoLa) technique. We ex…

    Submitted 5 April, 2021; originally announced April 2021.

    Comments: 39 pages, 17 figures

  50. arXiv:2101.08320  [pdf, other]

    hep-ph hep-ex physics.data-an

    The LHC Olympics 2020: A Community Challenge for Anomaly Detection in High Energy Physics

    Authors: Gregor Kasieczka, Benjamin Nachman, David Shih, Oz Amram, Anders Andreassen, Kees Benkendorfer, Blaz Bortolato, Gustaaf Brooijmans, Florencia Canelli, Jack H. Collins, Biwei Dai, Felipe F. De Freitas, Barry M. Dillon, Ioan-Mihail Dinu, Zhongtian Dong, Julien Donini, Javier Duarte, D. A. Faroughy, Julia Gonski, Philip Harris, Alan Kahn, Jernej F. Kamenik, Charanjit K. Khosa, Patrick Komiske, Luc Le Pottier, et al. (22 additional authors not shown)

    Abstract: A new paradigm for data-driven, model-agnostic new physics searches at colliders is emerging, and aims to leverage recent breakthroughs in anomaly detection and machine learning. In order to develop and benchmark new anomaly detection methods within this framework, it is essential to have standard datasets. To this end, we have created the LHC Olympics 2020, a community challenge accompanied by a…

    Submitted 20 January, 2021; originally announced January 2021.

    Comments: 108 pages, 53 figures, 3 tables