Search | arXiv e-print repository

doi 10.21468/SciPostPhys.18.4.117

Full Event Particle-Level Unfolding with Variable-Length Latent Variational Diffusion

Authors: Alexander Shmakov, Kevin Greif, Michael James Fenton, Aishik Ghosh, Pierre Baldi, Daniel Whiteson

Abstract: The measurements performed by particle physics experiments must account for the imperfect response of the detectors used to observe the interactions. One approach, unfolding, statistically adjusts the experimental data for detector effects. Recently, generative machine learning models have shown promise for performing unbinned unfolding in a high number of dimensions. However, all current generative approaches are limited to unfolding a fixed set of observables, making them unable to perform full-event unfolding in the variable dimensional environment of collider data. A novel modification to the variational latent diffusion model (VLD) approach to generative unfolding is presented, which allows for unfolding of high- and variable-dimensional feature spaces. The performance of this method is evaluated in the context of semi-leptonic top quark pair production at the Large Hadron Collider.

Submitted 23 January, 2025; v1 submitted 22 April, 2024; originally announced April 2024.

Comments: Submission to SciPost

Journal ref: SciPost Phys. 18, 117 (2025)

arXiv:2309.01886 [pdf]

doi 10.1038/s42005-024-01627-4

Reconstruction of Unstable Heavy Particles Using Deep Symmetry-Preserving Attention Networks

Authors: Michael James Fenton, Alexander Shmakov, Hideki Okawa, Yuji Li, Ko-Yang Hsiao, Shih-Chieh Hsu, Daniel Whiteson, Pierre Baldi

Abstract: Reconstructing unstable heavy particles requires sophisticated techniques to sift through the large number of possible permutations for assignment of detector objects to the underlying partons. An approach based on a generalized attention mechanism, symmetry preserving attention networks (SPA-NET), has been previously applied to top quark pair decays at the Large Hadron Collider which produce only hadronic jets. Here we extend the SPA-NET architecture to consider multiple input object types, such as leptons, as well as global event features, such as the missing transverse momentum. In addition, we provide regression and classification outputs to supplement the parton assignment. We explore the performance of the extended capability of SPA-NET in the context of semi-leptonic decays of top quark pairs as well as top quark pairs produced in association with a Higgs boson. We find significant improvements in the power of three representative studies: a search for ttH, a measurement of the top quark mass, and a search for a heavy Z' decaying to top quark pairs. We present ablation studies to provide insight on what the network has learned in each case.

Submitted 30 April, 2024; v1 submitted 4 September, 2023; originally announced September 2023.

Comments: Accepted by Nature Communications Physics, replaced with published version

Journal ref: Commun Phys 7, 139 (2024)

arXiv:2301.01704 [pdf, other]

TRASH: Tandem Rover and Aerial Scrap Harvester

Authors: Lee Milburn, John Chiaramonte, Jack Fenton, Taskin Padir

Abstract: Addressing the challenge of roadside litter in the United States, which has traditionally relied on costly and ineffective manual cleanup methods, this paper presents an autonomous multi-robot system for highway litter monitoring and collection. Our solution integrates an aerial vehicle to scan and gather data across highway stretches with a terrestrial robot equipped with a Convolutional Neural Network (CNN) for litter detection and mapping. Upon detecting litter, the ground robot navigates to each pinpointed location, re-assesses the vicinity, and employs a "greedy pickup" approach to address potential mapping inaccuracies or litter misplacements. Through simulation studies and real-world robotic trials, this work highlights the potential of our proposed system for highway cleanliness and management in the context of Robotics, Automation, and Artificial Intelligence.

Submitted 20 November, 2023; v1 submitted 4 January, 2023; originally announced January 2023.

Comments: To be published in RAAI 2023 Conference

arXiv:2201.08239 [pdf, other]

LaMDA: Language Models for Dialog Applications

Authors: Romal Thoppilan, Daniel De Freitas, Jamie Hall, Noam Shazeer, Apoorv Kulshreshtha, Heng-Tze Cheng, Alicia Jin, Taylor Bos, Leslie Baker, Yu Du, YaGuang Li, Hongrae Lee, Huaixiu Steven Zheng, Amin Ghafouri, Marcelo Menegali, Yanping Huang, Maxim Krikun, Dmitry Lepikhin, James Qin, Dehao Chen, Yuanzhong Xu, Zhifeng Chen, Adam Roberts, Maarten Bosma, Vincent Zhao, et al. (35 additional authors not shown)

Abstract: We present LaMDA: Language Models for Dialog Applications. LaMDA is a family of Transformer-based neural language models specialized for dialog, which have up to 137B parameters and are pre-trained on 1.56T words of public dialog data and web text. While model scaling alone can improve quality, it shows less improvements on safety and factual grounding. We demonstrate that fine-tuning with annotated data and enabling the model to consult external knowledge sources can lead to significant improvements towards the two key challenges of safety and factual grounding. The first challenge, safety, involves ensuring that the model's responses are consistent with a set of human values, such as preventing harmful suggestions and unfair bias. We quantify safety using a metric based on an illustrative set of human values, and we find that filtering candidate responses using a LaMDA classifier fine-tuned with a small amount of crowdworker-annotated data offers a promising approach to improving model safety. The second challenge, factual grounding, involves enabling the model to consult external knowledge sources, such as an information retrieval system, a language translator, and a calculator. We quantify factuality using a groundedness metric, and we find that our approach enables the model to generate responses grounded in known sources, rather than responses that merely sound plausible. Finally, we explore the use of LaMDA in the domains of education and content recommendations, and analyze their helpfulness and role consistency.

Submitted 10 February, 2022; v1 submitted 20 January, 2022; originally announced January 2022.

arXiv:2106.03898 [pdf, other]

doi 10.21468/SciPostPhys.12.5.178

SPANet: Generalized Permutationless Set Assignment for Particle Physics using Symmetry Preserving Attention

Authors: Alexander Shmakov, Michael James Fenton, Ta-Wei Ho, Shih-Chieh Hsu, Daniel Whiteson, Pierre Baldi

Abstract: The creation of unstable heavy particles at the Large Hadron Collider is the most direct way to address some of the deepest open questions in physics. Collisions typically produce variable-size sets of observed particles which have inherent ambiguities complicating the assignment of observed particles to the decay products of the heavy particles. Current strategies for tackling these challenges in the physics community ignore the physical symmetries of the decay products, consider all possible assignment permutations, and do not scale to complex configurations. Attention-based deep learning methods for sequence modelling have achieved state-of-the-art performance in natural language processing, but they lack built-in mechanisms to deal with the unique symmetries found in physical set-assignment problems. We introduce a novel method for constructing symmetry-preserving attention networks which reflect the problem's natural invariances to efficiently find assignments without evaluating all permutations. This general approach is applicable to arbitrarily complex configurations and significantly outperforms current methods, improving reconstruction efficiency by between 19% and 35% on typical benchmark problems while decreasing inference time by two to five orders of magnitude on the most complex events, making many important and previously intractable cases tractable. A full code repository containing a general library, the specific configuration used, and a complete dataset release is available at https://github.com/Alexanders101/SPANet

Submitted 22 July, 2022; v1 submitted 7 June, 2021; originally announced June 2021.

Comments: published in SciPost

Journal ref: SciPost Phys. 12, 178 (2022)

arXiv:2010.09206 [pdf]

doi 10.1103/PhysRevD.105.112008

Permutationless Many-Jet Event Reconstruction with Symmetry Preserving Attention Networks

Authors: Michael James Fenton, Alexander Shmakov, Ta-Wei Ho, Shih-Chieh Hsu, Daniel Whiteson, Pierre Baldi

Abstract: Top quarks, produced in large numbers at the Large Hadron Collider, have a complex detector signature and require special reconstruction techniques. The most common decay mode, the "all-jet" channel, results in a 6-jet final state which is particularly difficult to reconstruct in $pp$ collisions due to the large number of permutations possible. We present a novel approach to this class of problem, based on neural networks using a generalized attention mechanism, that we call Symmetry Preserving Attention Networks (SPA-Net). We train one such network to identify the decay products of each top quark unambiguously and without combinatorial explosion as an example of the power of this technique. This approach significantly outperforms existing state-of-the-art methods, correctly assigning all jets in $93.0\%$ of $6$-jet, $87.8\%$ of $7$-jet, and $82.6\%$ of $\geq 8$-jet events respectively.

Submitted 14 July, 2022; v1 submitted 19 October, 2020; originally announced October 2020.

Comments: replaced with final published version

Journal ref: Phys. Rev. D 105, 112008, published 15 June 2022

Showing 1–6 of 6 results for author: Fenton, J