-
DeepPD: Joint Phase and Object Estimation from Phase Diversity with Neural Calibration of a Deformable Mirror
Authors:
Magdalena C. Schneider,
Courtney Johnson,
Cedric Allier,
Larissa Heinrich,
Diane Adjavon,
Joren Husic,
Patrick La Rivière,
Stephan Saalfeld,
Hari Shroff
Abstract:
Sample-induced aberrations and optical imperfections limit the resolution of fluorescence microscopy. Phase diversity is a powerful technique that leverages complementary phase information in sequentially acquired images with deliberately introduced aberrations--the phase diversities--to enable phase and object reconstruction and restore diffraction-limited resolution. These phase diversities are typically introduced into the optical path via a deformable mirror. Existing phase-diversity-based methods are limited to Zernike modes, require large numbers of diversity images, or depend on accurate mirror calibration--which are all suboptimal. We present DeepPD, a deep learning-based framework that combines neural representations of the object and wavefront with a learned model of the deformable mirror to jointly estimate both object and phase from only five images. DeepPD improves robustness and reconstruction quality over previous approaches, even under severe aberrations. We demonstrate its performance on calibration targets and biological samples, including immunolabeled myosin in fixed PtK2 cells.
Submitted 18 April, 2025;
originally announced April 2025.
-
Strategic White Paper on AI Infrastructure for Particle, Nuclear, and Astroparticle Physics: Insights from JENA and EuCAIF
Authors:
Sascha Caron,
Andreas Ipp,
Gert Aarts,
Gábor Bíró,
Daniele Bonacorsi,
Elena Cuoco,
Caterina Doglioni,
Tommaso Dorigo,
Julián García Pardiñas,
Stefano Giagu,
Tobias Golling,
Lukas Heinrich,
Ik Siong Heng,
Paula Gina Isar,
Karolos Potamianos,
Liliana Teodorescu,
John Veitch,
Pietro Vischia,
Christoph Weniger
Abstract:
Artificial intelligence (AI) is transforming scientific research, with deep learning methods playing a central role in data analysis, simulations, and signal detection across particle, nuclear, and astroparticle physics. Within the JENA communities (ECFA, NuPECC, and APPEC) and as part of the EuCAIF initiative, AI integration is advancing steadily. However, broader adoption remains constrained by challenges such as limited computational resources, a lack of expertise, and difficulties in transitioning from research and development (R&D) to production. This white paper provides a strategic roadmap, informed by a community survey, to address these barriers. It outlines critical infrastructure requirements, prioritizes training initiatives, and proposes funding strategies to scale AI capabilities across fundamental physics over the next five years.
Submitted 18 March, 2025;
originally announced March 2025.
-
Flow Annealed Importance Sampling Bootstrap meets Differentiable Particle Physics
Authors:
Annalena Kofler,
Vincent Stimper,
Mikhail Mikhasenko,
Michael Kagan,
Lukas Heinrich
Abstract:
High-energy physics requires the generation of large numbers of simulated data samples from complex but analytically tractable distributions called matrix elements. Surrogate models, such as normalizing flows, are gaining popularity for this task due to their computational efficiency. We adopt an approach based on Flow Annealed Importance Sampling Bootstrap (FAB) that evaluates the differentiable target density during training and helps avoid the costly generation of training data in advance. We show that FAB reaches higher sampling efficiency with fewer target evaluations in high dimensions in comparison to other methods.
Submitted 25 May, 2025; v1 submitted 25 November, 2024;
originally announced November 2024.
-
Is Tokenization Needed for Masked Particle Modelling?
Authors:
Matthew Leigh,
Samuel Klein,
François Charton,
Tobias Golling,
Lukas Heinrich,
Michael Kagan,
Inês Ochoa,
Margarita Osadchy
Abstract:
In this work, we significantly enhance masked particle modeling (MPM), a self-supervised learning scheme for constructing highly expressive representations of unordered sets relevant to developing foundation models for high-energy physics. In MPM, a model is trained to recover the missing elements of a set, a learning objective that requires no labels and can be applied directly to experimental data. We achieve significant performance improvements over previous work on MPM by addressing inefficiencies in the implementation and incorporating a more powerful decoder. We compare several pre-training tasks and introduce new reconstruction methods that utilize conditional generative models without data tokenization or discretization. We show that these new methods outperform the tokenized learning objective from the original MPM on a new test bed for foundation models for jets, which includes using a wide variety of downstream tasks relevant to jet physics, such as classification, secondary vertex finding, and track identification.
Submitted 1 October, 2024; v1 submitted 19 September, 2024;
originally announced September 2024.
-
DaCapo: a modular deep learning framework for scalable 3D image segmentation
Authors:
William Patton,
Jeff L. Rhoades,
Marwan Zouinkhi,
David G. Ackerman,
Caroline Malin-Mayor,
Diane Adjavon,
Larissa Heinrich,
Davis Bennett,
Yurii Zubov,
CellMap Project Team,
Aubrey V. Weigel,
Jan Funke
Abstract:
DaCapo is a specialized deep learning library tailored to expedite the training and application of existing machine learning approaches on large, near-isotropic image data. In this correspondence, we introduce DaCapo's unique features optimized for this specific domain, highlighting its modular structure, efficient experiment management tools, and scalable deployment capabilities. We discuss its potential to improve access to large-scale, isotropic image segmentation and invite the community to explore and contribute to this open-source initiative.
Submitted 5 August, 2024;
originally announced August 2024.
-
Decomposing heterogeneous dynamical systems with graph neural networks
Authors:
Cédric Allier,
Magdalena C. Schneider,
Michael Innerberger,
Larissa Heinrich,
John A. Bogovic,
Stephan Saalfeld
Abstract:
Natural physical, chemical, and biological dynamical systems are often complex, with heterogeneous components interacting in diverse ways. We show that graph neural networks can be designed to jointly learn the interaction rules and the structure of the heterogeneity from data alone. The learned latent structure and dynamics can be used to virtually decompose the complex system, which is necessary to parameterize and infer the underlying governing equations. We tested the approach with simulation experiments of moving particles and vector fields that interact with each other. While our current aim is to better understand and validate the approach with simulated data, we anticipate it to become a generally applicable tool to uncover the governing rules underlying complex dynamics observed in nature.
Submitted 27 July, 2024;
originally announced July 2024.
-
Scalable ATLAS pMSSM computational workflows using containerised REANA reusable analysis platform
Authors:
Marco Donadoni,
Matthew Feickert,
Lukas Heinrich,
Yang Liu,
Audrius Mečionis,
Vladyslav Moisieienkov,
Tibor Šimko,
Giordon Stark,
Marco Vidal García
Abstract:
In this paper we describe the development of a streamlined framework for large-scale ATLAS pMSSM reinterpretations of LHC Run-2 analyses using containerised computational workflows. The project is looking to assess the global coverage of BSM physics and requires running O(5k) computational workflows representing pMSSM model points. Following ATLAS Analysis Preservation policies, many analyses have been preserved as containerised Yadage workflows, and after validation were added to a curated selection for the pMSSM study. To run the workflows at scale, we utilised the REANA reusable analysis platform. We describe how the REANA platform was enhanced to ensure the best concurrent throughput by internal service scheduling changes. We discuss the scalability of the approach on Kubernetes clusters from 500 to 5000 cores. Finally, we demonstrate a possibility of using additional ad-hoc public cloud infrastructure resources by running the same workflows on the Google Cloud Platform.
Submitted 6 March, 2024;
originally announced March 2024.
-
Combined track finding with GNN & CKF
Authors:
Lukas Heinrich,
Benjamin Huth,
Andreas Salzburger,
Tilo Wettig
Abstract:
The application of Graph Neural Networks (GNN) in track reconstruction is a promising approach to cope with the challenges arising at the High-Luminosity upgrade of the Large Hadron Collider (HL-LHC). GNNs show good track-finding performance in high-multiplicity scenarios and are naturally parallelizable on heterogeneous compute architectures.
Typical high-energy-physics detectors have high resolution in the innermost layers to support vertex reconstruction but lower resolution in the outer parts. GNNs mainly rely on 3D space-point information, which can cause reduced track-finding performance in the outer regions.
In this contribution, we present a novel combination of GNN-based track finding with the classical Combinatorial Kalman Filter (CKF) algorithm to circumvent this issue: The GNN resolves the track candidates in the inner pixel region, where 3D space points can represent measurements very well. These candidates are then picked up by the CKF in the outer regions, where the CKF performs well even for 1D measurements.
Using the ACTS infrastructure, we present a proof of concept based on truth tracking in the pixels as well as a dedicated GNN pipeline trained on $t\bar{t}$ events with pile-up 200 in the OpenDataDetector.
Submitted 29 January, 2024;
originally announced January 2024.
-
Masked Particle Modeling on Sets: Towards Self-Supervised High Energy Physics Foundation Models
Authors:
Tobias Golling,
Lukas Heinrich,
Michael Kagan,
Samuel Klein,
Matthew Leigh,
Margarita Osadchy,
John Andrew Raine
Abstract:
We propose masked particle modeling (MPM) as a self-supervised method for learning generic, transferable, and reusable representations on unordered sets of inputs for use in high energy physics (HEP) scientific data. This work provides a novel scheme to perform masked modeling based pre-training to learn permutation invariant functions on sets. More generally, this work provides a step towards building large foundation models for HEP that can be generically pre-trained with self-supervised learning and later fine-tuned for a variety of down-stream tasks. In MPM, particles in a set are masked and the training objective is to recover their identity, as defined by a discretized token representation of a pre-trained vector quantized variational autoencoder. We study the efficacy of the method in samples of high energy jets at collider physics experiments, including studies on the impact of discretization, permutation invariance, and ordering. We also study the fine-tuning capability of the model, showing that it can be adapted to tasks such as supervised and weakly supervised jet classification, and that the model can transfer efficiently with small fine-tuning data sets to new classes and new data domains.
Submitted 11 July, 2024; v1 submitted 24 January, 2024;
originally announced January 2024.
-
Finetuning Foundation Models for Joint Analysis Optimization
Authors:
Matthias Vigl,
Nicole Hartman,
Lukas Heinrich
Abstract:
In this work we demonstrate that significant gains in performance and data efficiency can be achieved in High Energy Physics (HEP) by moving beyond the standard paradigm of sequential optimization of reconstruction and analysis components. We conceptually connect HEP reconstruction and analysis to modern machine learning workflows such as pretraining, finetuning, domain adaptation and high-dimensional embedding spaces and quantify the gains in the example use case of searches for heavy resonances decaying via an intermediate di-Higgs system to four $b$-jets.
Submitted 25 January, 2024; v1 submitted 24 January, 2024;
originally announced January 2024.
-
Branches of a Tree: Taking Derivatives of Programs with Discrete and Branching Randomness in High Energy Physics
Authors:
Michael Kagan,
Lukas Heinrich
Abstract:
We propose to apply several gradient estimation techniques to enable the differentiation of programs with discrete randomness in High Energy Physics. Such programs are common in High Energy Physics due to the presence of branching processes and clustering-based analysis. Thus differentiating such programs can open the way for gradient based optimization in the context of detector design optimization, simulator tuning, or data analysis and reconstruction optimization. We discuss several possible gradient estimation strategies, including the recent Stochastic AD method, and compare them in simplified detector design experiments. In doing so we develop, to the best of our knowledge, the first fully differentiable branching program.
Submitted 31 August, 2023;
originally announced August 2023.
-
Hierarchical Neural Simulation-Based Inference Over Event Ensembles
Authors:
Lukas Heinrich,
Siddharth Mishra-Sharma,
Chris Pollard,
Philipp Windischhofer
Abstract:
When analyzing real-world data it is common to work with event ensembles, which comprise sets of observations that collectively constrain the parameters of an underlying model of interest. Such models often have a hierarchical structure, where "local" parameters impact individual events and "global" parameters influence the entire dataset. We introduce practical approaches for frequentist and Bayesian dataset-wide probabilistic inference in cases where the likelihood is intractable, but simulations can be realized via a hierarchical forward model. We construct neural estimators for the likelihood(-ratio) or posterior and show that explicitly accounting for the model's hierarchical structure can lead to significantly tighter parameter constraints. We ground our discussion using case studies from the physical sciences, focusing on examples from particle physics and cosmology.
Submitted 21 February, 2024; v1 submitted 21 June, 2023;
originally announced June 2023.
-
Potential of the Julia programming language for high energy physics computing
Authors:
J. Eschle,
T. Gal,
M. Giordano,
P. Gras,
B. Hegner,
L. Heinrich,
U. Hernandez Acosta,
S. Kluth,
J. Ling,
P. Mato,
M. Mikhasenko,
A. Moreno Briceño,
J. Pivarski,
K. Samaras-Tsakiris,
O. Schulz,
G. A. Stewart,
J. Strube,
V. Vassilev
Abstract:
Research in high energy physics (HEP) requires huge amounts of computing and storage, putting strong constraints on the code speed and resource usage. To meet these requirements, a compiled high-performance language is typically used; for physicists, however, who focus on the application when developing the code, better research productivity calls for a high-level programming language. A popular approach consists of combining Python, used for the high-level interface, and C++, used for the computing-intensive part of the code. A more convenient and efficient approach would be to use a language that provides both high-level programming and high performance. The Julia programming language, developed at MIT especially to allow the use of a single language in research activities, has followed this path. In this paper the applicability of using the Julia language for HEP research is explored, covering the different aspects that are important for HEP code development: runtime performance, handling of large projects, interface with legacy code, distributed computing, training, and ease of programming. The study shows that the HEP community would benefit from a large-scale adoption of this programming language. The HEP-specific foundation libraries that would need to be consolidated are identified.
Submitted 6 October, 2023; v1 submitted 6 June, 2023;
originally announced June 2023.
-
Configurable calorimeter simulation for AI applications
Authors:
Francesco Armando Di Bello,
Anton Charkin-Gorbulin,
Kyle Cranmer,
Etienne Dreyer,
Sanmay Ganguly,
Eilam Gross,
Lukas Heinrich,
Lorenzo Santi,
Marumi Kado,
Nilotpal Kakati,
Patrick Rieck,
Matteo Tusoni
Abstract:
A configurable calorimeter simulation for AI (COCOA) applications is presented, based on the Geant4 toolkit and interfaced with the Pythia event generator. This open-source project aims to support the development of machine learning algorithms in high energy physics that rely on realistic particle shower descriptions, such as reconstruction, fast simulation, and low-level analysis. Specifications such as the granularity and material of its nearly hermetic geometry are user-configurable. The tool is supplemented with simple event processing including topological clustering, jet algorithms, and a nearest-neighbors graph construction. Formatting is also provided to visualise events using the Phoenix event display software.
Submitted 8 March, 2023; v1 submitted 3 March, 2023;
originally announced March 2023.
-
FAIR for AI: An interdisciplinary and international community building perspective
Authors:
E. A. Huerta,
Ben Blaiszik,
L. Catherine Brinson,
Kristofer E. Bouchard,
Daniel Diaz,
Caterina Doglioni,
Javier M. Duarte,
Murali Emani,
Ian Foster,
Geoffrey Fox,
Philip Harris,
Lukas Heinrich,
Shantenu Jha,
Daniel S. Katz,
Volodymyr Kindratenko,
Christine R. Kirkpatrick,
Kati Lassila-Perini,
Ravi K. Madduri,
Mark S. Neubauer,
Fotis E. Psomopoulos,
Avik Roy,
Oliver Rübel,
Zhizhen Zhao,
Ruike Zhu
Abstract:
A foundational set of findable, accessible, interoperable, and reusable (FAIR) principles was proposed in 2016 as a prerequisite for proper data management and stewardship, with the goal of enabling the reusability of scholarly data. The principles were also meant to apply to other digital assets, at a high level, and over time, the FAIR guiding principles have been re-interpreted or extended to include the software, tools, algorithms, and workflows that produce data. FAIR principles are now being adapted in the context of AI models and datasets. Here, we present the perspectives, vision, and experiences of researchers from different countries, disciplines, and backgrounds who are leading the definition and adoption of FAIR principles in their communities of practice, and discuss outcomes that may result from pursuing and incentivizing FAIR AI research. The material for this report builds on the FAIR for AI Workshop held at Argonne National Laboratory on June 7, 2022.
Submitted 1 August, 2023; v1 submitted 30 September, 2022;
originally announced October 2022.
-
neos: End-to-End-Optimised Summary Statistics for High Energy Physics
Authors:
Nathan Simpson,
Lukas Heinrich
Abstract:
The advent of deep learning has yielded powerful tools to automatically compute gradients of computations. This is because training a neural network equates to iteratively updating its parameters using gradient descent to find the minimum of a loss function. Deep learning is then a subset of a broader paradigm; a workflow with free parameters that is end-to-end optimisable, provided one can keep track of the gradients all the way through. This work introduces neos: an example implementation following this paradigm of a fully differentiable high-energy physics workflow, capable of optimising a learnable summary statistic with respect to the expected sensitivity of an analysis. Doing this results in an optimisation process that is aware of the modelling and treatment of systematic uncertainties.
Submitted 10 March, 2022;
originally announced March 2022.
-
Differentiable Matrix Elements with MadJax
Authors:
Lukas Heinrich,
Michael Kagan
Abstract:
MadJax is a tool for generating and evaluating differentiable matrix elements of high energy scattering processes. As such, it is a step towards a differentiable programming paradigm in high energy physics that facilitates the incorporation of high energy physics domain knowledge, encoded in simulation software, into gradient based learning and optimization pipelines. MadJax comprises two components: (a) a plugin to the general purpose matrix element generator MadGraph that integrates matrix element and phase space sampling code with the JAX differentiable programming framework, and (b) a standalone wrapping API for accessing the matrix element code and its gradients, which are computed with automatic differentiation. The MadJax implementation and example applications of simulation based inference and normalizing flow based matrix element modeling, with capabilities enabled uniquely with differentiable matrix elements, are presented.
Submitted 28 February, 2022;
originally announced March 2022.
-
Distributed statistical inference with pyhf enabled through funcX
Authors:
Matthew Feickert,
Lukas Heinrich,
Giordon Stark,
Ben Galewsky
Abstract:
In High Energy Physics, facilities that provide High Performance Computing environments offer an opportunity to efficiently perform the statistical inference required for analysis of data from the Large Hadron Collider, but can pose problems with orchestration and efficient scheduling. The compute architectures at these facilities do not easily support the Python compute model, and the configuration scheduling of batch jobs for physics often requires expertise in multiple job scheduling services. The combination of the pure-Python libraries pyhf and funcX reduces the common problem in HEP analyses of performing statistical inference with binned models, which would traditionally take multiple hours and bespoke scheduling, to an on-demand (fitting) "function as a service" that can scalably execute across workers in just a few minutes, offering reduced time to insight and inference. We demonstrate execution of a scalable workflow using funcX to simultaneously fit 125 signal hypotheses from a published ATLAS search for new physics using pyhf with a wall time of under 3 minutes. We additionally show performance comparisons for other physics analyses with openly published probability models and argue for a blueprint of fitting-as-a-service systems at HPC centers.
Submitted 31 August, 2021; v1 submitted 3 March, 2021;
originally announced March 2021.
-
Etalumis: Bringing Probabilistic Programming to Scientific Simulators at Scale
Authors:
Atılım Güneş Baydin,
Lei Shao,
Wahid Bhimji,
Lukas Heinrich,
Lawrence Meadows,
Jialin Liu,
Andreas Munk,
Saeid Naderiparizi,
Bradley Gram-Hansen,
Gilles Louppe,
Mingfei Ma,
Xiaohui Zhao,
Philip Torr,
Victor Lee,
Kyle Cranmer,
Prabhat,
Frank Wood
Abstract:
Probabilistic programming languages (PPLs) are receiving widespread attention for performing Bayesian inference in complex generative models. However, applications to science remain limited because of the impracticability of rewriting complex scientific simulators in a PPL, the computational cost of inference, and the lack of scalable implementations. To address these, we present a novel PPL framework that couples directly to existing scientific simulators through a cross-platform probabilistic execution protocol and provides Markov chain Monte Carlo (MCMC) and deep-learning-based inference compilation (IC) engines for tractable inference. To guide IC inference, we perform distributed training of a dynamic 3DCNN-LSTM architecture with a PyTorch-MPI-based framework on 1,024 32-core CPU nodes of the Cori supercomputer with a global minibatch size of 128k, achieving a performance of 450 Tflop/s through enhancements to PyTorch. We demonstrate a Large Hadron Collider (LHC) use-case with the C++ Sherpa simulator and achieve the largest-scale posterior inference in a Turing-complete PPL.
Submitted 27 August, 2019; v1 submitted 7 July, 2019;
originally announced July 2019.
-
Multi-Domain Adversarial Learning
Authors:
Alice Schoenauer-Sebag,
Louise Heinrich,
Marc Schoenauer,
Michele Sebag,
Lani F. Wu,
Steve J. Altschuler
Abstract:
Multi-domain learning (MDL) aims at obtaining a model with minimal average risk across multiple domains. Our empirical motivation is automated microscopy data, where cultured cells are imaged after being exposed to known and unknown chemical perturbations, and each dataset displays significant experimental bias. This paper presents a multi-domain adversarial learning approach, MuLANN, to leverage multiple datasets with overlapping but distinct class sets, in a semi-supervised setting. Our contributions include: i) a bound on the average- and worst-domain risk in MDL, obtained using the H-divergence; ii) a new loss to accommodate semi-supervised multi-domain learning and domain adaptation; iii) the experimental validation of the approach, improving on the state of the art on two standard image benchmarks, and a novel bioimage dataset, Cell.
Submitted 21 March, 2019;
originally announced March 2019.
-
Efficient Probabilistic Inference in the Quest for Physics Beyond the Standard Model
Authors:
Atılım Güneş Baydin,
Lukas Heinrich,
Wahid Bhimji,
Lei Shao,
Saeid Naderiparizi,
Andreas Munk,
Jialin Liu,
Bradley Gram-Hansen,
Gilles Louppe,
Lawrence Meadows,
Philip Torr,
Victor Lee,
Prabhat,
Kyle Cranmer,
Frank Wood
Abstract:
We present a novel probabilistic programming framework that couples directly to existing large-scale simulators through a cross-platform probabilistic execution protocol, which allows general-purpose inference engines to record and control random number draws within simulators in a language-agnostic way. The execution of existing simulators as probabilistic programs enables highly interpretable posterior inference in the structured model defined by the simulator code base. We demonstrate the technique in particle physics, on a scientifically accurate simulation of the tau lepton decay, which is a key ingredient in establishing the properties of the Higgs boson. Inference efficiency is achieved via inference compilation where a deep recurrent neural network is trained to parameterize proposal distributions and control the stochastic simulator in a sequential importance sampling scheme, at a fraction of the computational cost of a Markov chain Monte Carlo baseline.
Submitted 17 February, 2020; v1 submitted 20 July, 2018;
originally announced July 2018.
-
Machine Learning in High Energy Physics Community White Paper
Authors:
Kim Albertsson,
Piero Altoe,
Dustin Anderson,
John Anderson,
Michael Andrews,
Juan Pedro Araque Espinosa,
Adam Aurisano,
Laurent Basara,
Adrian Bevan,
Wahid Bhimji,
Daniele Bonacorsi,
Bjorn Burkle,
Paolo Calafiura,
Mario Campanelli,
Louis Capps,
Federico Carminati,
Stefano Carrazza,
Yi-fan Chen,
Taylor Childers,
Yann Coadou,
Elias Coniavitis,
Kyle Cranmer,
Claire David,
Douglas Davis,
Andrea De Simone,
et al. (103 additional authors not shown)
Abstract:
Machine learning has been applied to several problems in particle physics research, beginning with applications to high-level physics analysis in the 1990s and 2000s, followed by an explosion of applications in particle and event identification and reconstruction in the 2010s. In this document we discuss promising future research and development areas for machine learning in particle physics. We detail a roadmap for their implementation, software and hardware resource requirements, collaborative initiatives with the data science community, academia and industry, and training the particle physics community in data science. The main objective of the document is to connect and motivate these areas of research and development with the physics drivers of the High-Luminosity Large Hadron Collider and future neutrino experiments and identify the resource needs for their implementation. Additionally we identify areas where collaboration with external communities will be of great benefit.
Submitted 16 May, 2019; v1 submitted 8 July, 2018;
originally announced July 2018.
-
Synaptic Cleft Segmentation in Non-Isotropic Volume Electron Microscopy of the Complete Drosophila Brain
Authors:
Larissa Heinrich,
Jan Funke,
Constantin Pape,
Juan Nunez-Iglesias,
Stephan Saalfeld
Abstract:
Neural circuit reconstruction at single synapse resolution is increasingly recognized as crucially important to decipher the function of biological nervous systems. Volume electron microscopy in serial transmission or scanning mode has been demonstrated to provide the necessary resolution to segment or trace all neurites and to annotate all synaptic connections.
Automatic annotation of synaptic connections has been done successfully in near isotropic electron microscopy of vertebrate model organisms. Results on non-isotropic data in insect models, however, are not yet on par with human annotation.
We designed a new 3D-U-Net architecture to optimally represent isotropic fields of view in non-isotropic data. We used regression on a signed distance transform of manually annotated synaptic clefts of the CREMI challenge dataset to train this model and observed significant improvement over the state of the art.
We developed open-source software for optimized parallel prediction on very large volumetric datasets and applied our model to predict synaptic clefts in a 50-teravoxel dataset of the complete Drosophila brain. Our model generalizes well to areas far away from where training data was available.
Submitted 7 May, 2018;
originally announced May 2018.
-
Improvements to Inference Compilation for Probabilistic Programming in Large-Scale Scientific Simulators
Authors:
Mario Lezcano Casado,
Atilim Gunes Baydin,
David Martinez Rubio,
Tuan Anh Le,
Frank Wood,
Lukas Heinrich,
Gilles Louppe,
Kyle Cranmer,
Karen Ng,
Wahid Bhimji,
Prabhat
Abstract:
We consider the problem of Bayesian inference in the family of probabilistic models implicitly defined by stochastic generative models of data. In scientific fields ranging from population biology to cosmology, low-level mechanistic components are composed to create complex generative models. These models lead to intractable likelihoods and are typically non-differentiable, which poses challenges for traditional approaches to inference. We extend previous work in "inference compilation", which combines universal probabilistic programming and deep learning methods, to large-scale scientific simulators, and introduce a C++ based probabilistic programming library called CPProb. We successfully use CPProb to interface with SHERPA, a large code-base used in particle physics. Here we describe the technical innovations realized and planned for this library.
Submitted 21 December, 2017;
originally announced December 2017.
-
Deep Learning for Isotropic Super-Resolution from Non-Isotropic 3D Electron Microscopy
Authors:
Larissa Heinrich,
John A. Bogovic,
Stephan Saalfeld
Abstract:
The most sophisticated existing methods to generate 3D isotropic super-resolution (SR) from non-isotropic electron microscopy (EM) are based on learned dictionaries. Unfortunately, none of the existing methods generate practically satisfying results. For 2D natural images, recently developed super-resolution methods that use deep learning have been shown to significantly outperform the previous state of the art.
We have adapted one of the most successful architectures (FSRCNN) for 3D super-resolution, and compared its performance to a 3D U-Net architecture that has not been used previously to generate super-resolution.
We trained both architectures on artificially downscaled isotropic ground truth from focused ion beam milling scanning EM (FIB-SEM) and tested the performance for various hyperparameter settings.
Our results indicate that both architectures can successfully generate 3D isotropic super-resolution from non-isotropic EM, with the U-Net performing consistently better. We propose several promising directions for practical application.
Submitted 9 June, 2017;
originally announced June 2017.
-
HEPData: a repository for high energy physics data
Authors:
Eamonn Maguire,
Lukas Heinrich,
Graeme Watt
Abstract:
The Durham High Energy Physics Database (HEPData) has been built up over the past four decades as a unique open-access repository for scattering data from experimental particle physics papers. It comprises data points underlying several thousand publications. Over the last two years, the HEPData software has been completely rewritten using modern computing technologies as an overlay on the Invenio v3 digital library framework. The software is open source with the new site available at https://hepdata.net now replacing the previous site at http://hepdata.cedar.ac.uk. In this write-up, we describe the development of the new site and explain some of the advantages it offers over the previous platform.
Submitted 18 April, 2017;
originally announced April 2017.