Inferring correlated distributions: boosted top jets

Authors: Ezequiel Alvarez, Manuel Szewc, Alejandro Szynkman, Santiago Tanco, Tatiana Tarutina

Abstract: Improving the understanding of signal and background distributions in the signal region is key to enhancing any analysis in collider physics. This is usually a difficult task because, among other reasons, signal and backgrounds are hard to discriminate in the signal region, simulations may reach a limit of reliability if they need to model non-perturbative QCD, and distributions are multi-dimensional and are often correlated within each class. Bayesian density estimation is a technique that leverages prior knowledge and data correlations to effectively extract information from data in the signal region. In this work we extend previous works on data-driven mixture models for meaningful unsupervised signal extraction in collider physics to incorporate correlations between features. Using a standard dataset of top and QCD jets, we show how simulators, despite having an expected bias, can be used to inject sufficient inductive nuance into an inference model in terms of priors to then be corrected by data and estimate the true correlated distributions between features within each class. We compare the model with and without correlations to show how the signal extraction is sensitive to their inclusion and we quantify the improvement due to the inclusion of correlations using both supervised and unsupervised metrics.

Submitted 16 May, 2025; originally announced May 2025.

Comments: 25 pages, 12 figures, 2 tables

arXiv:2503.05667

Characterizing the hadronization of parton showers using the HOMER method

Authors: Benoit Assi, Christian Bierlich, Philip Ilten, Tony Menzo, Stephen Mrenna, Manuel Szewc, Michael K. Wilkinson, Ahmed Youssef, Jure Zupan

Abstract: We update the HOMER method, a technique to solve a restricted version of the inverse problem of hadronization -- extracting the Lund string fragmentation function $f(z)$ from data using only observable information. Here, we demonstrate its utility by extracting $f(z)$ from synthetic Pythia simulations using high-level observables constructed on an event-by-event basis, such as multiplicities and shape variables. Four cases of increasing complexity are considered, corresponding to $e^+e^-$ collisions at a center-of-mass energy of $90$ GeV producing either a string stretched between a $q$ and $\bar{q}$ containing no gluons; the same string containing one gluon $g$ with fixed kinematics; the same but the gluon has varying kinematics; and the most realistic case, strings with an unrestricted number of gluons that is the end-result of a parton shower. We demonstrate the extraction of $f(z)$ in each case, with the result of only a relatively modest degradation in performance of the HOMER method with the increased complexity of the string system.

Submitted 7 March, 2025; originally announced March 2025.

Comments: 43 pages, 27 figures

Report number: FERMILAB-PUB-25-0133-CSAID

arXiv:2411.02194

Rejection Sampling with Autodifferentiation - Case study: Fitting a Hadronization Model

Authors: Nick Heller, Phil Ilten, Tony Menzo, Stephen Mrenna, Benjamin Nachman, Andrzej Siodmok, Manuel Szewc, Ahmed Youssef

Abstract: We present an autodifferentiable rejection sampling algorithm termed Rejection Sampling with Autodifferentiation (RSA). In conjunction with reweighting, we show that RSA can be used for efficient parameter estimation and model exploration. Additionally, this approach facilitates the use of unbinned machine-learning-based observables, allowing for more precise, data-driven fits. To showcase these capabilities, we apply an RSA-based parameter fit to a simplified hadronization model.

Submitted 6 December, 2024; v1 submitted 4 November, 2024; originally announced November 2024.

Comments: 12 pages, 5 figures

Report number: FERMILAB-PUB-24-0784-CSAID, MCNET-24-18

arXiv:2410.06342

doi 10.21468/SciPostPhys.18.2.054

Describing Hadronization via Histories and Observables for Monte-Carlo Event Reweighting

Authors: Christian Bierlich, Phil Ilten, Tony Menzo, Stephen Mrenna, Manuel Szewc, Michael K. Wilkinson, Ahmed Youssef, Jure Zupan

Abstract: We introduce a novel method for extracting a fragmentation model directly from experimental data without requiring an explicit parametric form, called Histories and Observables for Monte-Carlo Event Reweighting (HOMER), consisting of three steps: the training of a classifier between simulation and data, the inference of single fragmentation weights, and the calculation of the weight for the full hadronization chain. We illustrate the use of HOMER on a simplified hadronization problem, a $q\bar{q}$ string fragmenting into pions, and extract a modified Lund string fragmentation function $f(z)$. We then demonstrate the use of HOMER on three types of experimental data: (i) binned distributions of high level observables, (ii) unbinned event-by-event distributions of these observables, and (iii) full particle cloud information. After demonstrating that $f(z)$ can be extracted from data (the inverse of hadronization), we also show that, at least in this limited setup, the fidelity of the extracted $f(z)$ suffers only limited loss when moving from (i) to (ii) to (iii). Public code is available at https://gitlab.com/uchep/mlhad.

Submitted 10 January, 2025; v1 submitted 8 October, 2024; originally announced October 2024.

Comments: 42 pages, 21 figures. Updated version with minor revision recommended by SciPost Physics. Public code available

Report number: FERMILAB-PUB-23-414-CSAID

Journal ref: SciPost Phys. 18, 054 (2025)

arXiv:2405.08880

doi 10.1007/JHEP11(2024)017

Direct CKM determination from W decays at future lepton colliders

Authors: David Marzocca, Manuel Szewc, Michele Tammaro

Abstract: We project the reach of future lepton colliders for measuring CKM elements from direct observations of $W$ decays. We focus our attention on $|V_{cs}|$ and $|V_{cb}|$ determinations, using FCC-ee as a case study. We employ state-of-the-art jet flavor taggers to obtain the projected sensitivity, and scan over tagger performances to show their effect. We conclude that future lepton colliders can sizeably improve the sensitivity on $|V_{cs}|$ and $|V_{cb}|$, although the achievable reach will strongly depend on the level of systematic uncertainties on tagger parameters.

Submitted 12 November, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

Comments: 16 pages, 3 figures. Matches version published in JHEP

Journal ref: J. High Energ. Phys. 2024, 17 (2024)

arXiv:2402.08001

doi 10.21468/SciPostPhysCore.7.3.043

Improvement and generalization of ABCD method with Bayesian inference

Authors: Ezequiel Alvarez, Leandro Da Rold, Manuel Szewc, Alejandro Szynkman, Santiago A. Tanco, Tatiana Tarutina

Abstract: To find New Physics or to refine our knowledge of the Standard Model at the LHC is an enterprise that involves many factors. We focus on taking advantage of available information and put our effort into rethinking the usual data-driven ABCD method to improve it and to generalize it using Bayesian Machine Learning tools. We propose that a dataset consisting of a signal and many backgrounds is well described through a mixture model. Signal, backgrounds and their relative fractions in the sample can be well extracted by exploiting the prior knowledge and the dependence between the different observables at the event-by-event level with Bayesian tools. We show how, in contrast to the ABCD method, one can take advantage of understanding some properties of the different backgrounds and of having more than two independent observables to measure in each event. In addition, instead of regions defined through hard cuts, the Bayesian framework uses the information of continuous distributions to obtain soft assignments of the events, which are statistically more robust. To compare both methods we use a toy problem inspired by $pp\to hh\to b\bar b b \bar b$, selecting a reduced and simplified number of processes and analysing the flavor of the four jets and the invariant mass of the jet-pairs, modeled with simplified distributions.
Taking advantage of all this information, and starting from a combination of biased and agnostic priors, leads us to a very good posterior once we use the Bayesian framework to exploit the data and the mutual information of the observables at the event-by-event level. We show how, in this simplified model, the Bayesian framework outperforms the sensitivity of the ABCD method in obtaining the signal fraction in scenarios with $1\%$ and $0.5\%$ true signal fractions in the dataset. We also show that the method is robust against the absence of signal.

Submitted 24 September, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

Comments: 24 pages, 9 figures. Matches published version

Journal ref: SciPost Phys. Core 7, 043 (2024)

arXiv:2312.00806

Accessing CKM suppressed top decays at the LHC

Authors: Manuel Szewc

Abstract: We present a strategy for measuring the off-diagonal elements of the third row of the CKM matrix $|V_{tq}|$ through the branching fractions of top quark decays $t\to q W$, where $q$ is a light quark jet. This strategy is an extension of existing measurements, with the improvement rooted in the use of orthogonal $b$- and $q$-taggers that add a new observable, the number of light-quark-tagged jets, to the already commonly used observable, the fraction of $b$-tagged jets in an event. Careful inclusion of the additional complementary observable significantly increases the expected statistical power of the analysis, with the possibility of excluding a null $|V_{td}|^2+|V_{ts}|^2$ at $95\%$ C.L. at the HL-LHC.

Submitted 24 November, 2023; originally announced December 2023.

Comments: Talk at the 16th International Workshop on Top Quark Physics (Top2023), 24-29 September 2023

arXiv:2311.09296

doi 10.21468/SciPostPhys.17.2.045

Towards a data-driven model of hadronization using normalizing flows

Authors: Christian Bierlich, Phil Ilten, Tony Menzo, Stephen Mrenna, Manuel Szewc, Michael K. Wilkinson, Ahmed Youssef, Jure Zupan

Abstract: We introduce a model of hadronization based on invertible neural networks that faithfully reproduces a simplified version of the Lund string model for meson hadronization. Additionally, we introduce a new training method for normalizing flows, termed MAGIC, that improves the agreement between simulated and experimental distributions of high-level (macroscopic) observables by adjusting single-emission (microscopic) dynamics. Our results constitute an important step toward realizing a machine-learning based model of hadronization that utilizes experimental data during training. Finally, we demonstrate how a Bayesian extension to this normalizing-flow architecture can be used to provide analysis of statistical and modeling uncertainties on the generated observable distributions.

Submitted 19 April, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

Comments: 26 pages, 9 figures, public code available

Report number: FERMILAB-PUB-23-698-CSAID

Journal ref: SciPost Phys. 17, 045 (2024)

arXiv:2212.13583

Exploring unsupervised top tagging using Bayesian inference

Authors: Ezequiel Alvarez, Manuel Szewc, Alejandro Szynkman, Santiago A. Tanco, Tatiana Tarutina

Abstract: Recognizing hadronically decaying top-quark jets in a sample of jets, or even their total fraction in the sample, is an important step in many LHC searches for Standard Model and Beyond Standard Model physics. Although there exist outstanding top-tagger algorithms, their construction and their expected performance rely on Monte Carlo simulations, which may induce potential biases. For these reasons we develop two simple unsupervised top-tagger algorithms based on performing Bayesian inference on a mixture model. In one of them we use as the observed variable a new geometrically-based observable $\tilde{A}_{3}$, and in the other we consider the more traditional $τ_{3}/τ_{2}$ $N$-subjettiness ratio, which yields a better performance. As expected, we find that the unsupervised tagger performance is below existing supervised taggers, reaching expected Area Under Curve AUC $\sim 0.80-0.81$ and accuracies of about 69% $-$ 75% in a full range of sample purity. However, these performances are more robust to possible biases in the Monte Carlo than their supervised counterparts. Our findings are a step towards exploring and considering simpler and unbiased taggers.

Submitted 23 June, 2023; v1 submitted 27 December, 2022; originally announced December 2022.

Comments: 15 pages, 6 figures; Published version

arXiv:2210.02226

doi 10.1016/j.physletb.2023.137836

Null Hypothesis Test for Anomaly Detection

Authors: Jernej F. Kamenik, Manuel Szewc

Abstract: We extend the use of Classification Without Labels for anomaly detection with a hypothesis test designed to exclude the background-only hypothesis. By testing for statistical independence of the two discriminating dataset regions, we are able to exclude the background-only hypothesis without relying on fixed anomaly score cuts or extrapolations of background estimates between regions. The method relies on the assumption of conditional independence of anomaly score features and dataset regions, which can be ensured using existing decorrelation techniques. As a benchmark example, we consider the LHC Olympics dataset where we show that mutual information represents a suitable test for statistical independence and our method exhibits excellent and robust performance at different signal fractions even in the presence of realistic feature correlations.

Submitted 15 March, 2023; v1 submitted 5 October, 2022; originally announced October 2022.

Comments: 10 pages, 3 figures, 1 Table. Matches published version at Physics Letters B. All code is available at https://github.com/ManuelSzewc/Null_Hypothesis_Test_for_Anomaly_Detection. Comments welcome!

Journal ref: Physics Letters B, 840, 2023, 137836

arXiv:2209.01222

doi 10.21468/SciPostPhys.16.5.131

Accessing CKM suppressed top decays at the LHC

Authors: Darius A. Faroughy, Jernej F. Kamenik, Manuel Szewc, Jure Zupan

Abstract: We propose an extension of the existing experimental strategy for measuring branching fractions of top quark decays, targeting specifically $t\to j_q W$, where $j_q$ is a light quark jet. The improved strategy uses orthogonal $b$- and $q$-taggers, and adds a new observable, the number of light-quark-tagged jets, to the already commonly used observable, the fraction of $b$-tagged jets in an event. Careful inclusion of the additional complementary observable significantly increases the expected statistical power of the analysis, with the possibility of excluding $|V_{tb}|=1$ at $95\%$ C.L. at the HL-LHC, and accessing directly the standard model value of $|V_{td}|^2+|V_{ts}|^2$.

Submitted 2 September, 2022; originally announced September 2022.

Comments: 19 pages, 7 figures

Journal ref: SciPost Phys. 16, 131 (2024)

arXiv:2205.05952

doi 10.1140/epjc/s10052-022-10944-3

A method for approximating optimal statistical significances with machine-learned likelihoods

Authors: Ernesto Arganda, Xabier Marcano, Víctor Martín Lozano, Anibal D. Medina, Andres D. Perez, Manuel Szewc, Alejandro Szynkman

Abstract: Machine-learning techniques have become fundamental in high-energy physics and, for new physics searches, it is crucial to know their performance in terms of experimental sensitivity, understood as the statistical significance of the signal-plus-background hypothesis over the background-only one. We present here a simple method that combines the power of current machine-learning techniques to face high-dimensional data with the likelihood-based inference tests used in traditional analyses, which allows us to estimate the sensitivity for both discovery and exclusion limits through a single parameter of interest, the signal strength. Based on supervised learning techniques, it can perform well also with high-dimensional data, when traditional techniques cannot. We apply the method to a toy model first, so we can explore its potential, and then to an LHC study of new physics particles in dijet final states. Considering as the optimal statistical significance the one we would obtain if the true generative functions were known, we show that our method provides a better approximation than the usual naive counting experimental results.

Submitted 9 November, 2022; v1 submitted 12 May, 2022; originally announced May 2022.

Comments: 24 pages, 8 figures; matches version published in Eur. Phys. J. C

Report number: IFT-UAM/CSIC-22-52

Journal ref: Eur. Phys. J. C 82 (2022) 11, 993

arXiv:2107.00668

doi 10.1103/PhysRevD.105.092001

Bayesian Probabilistic Modelling for Four-Tops at the LHC

Authors: Ezequiel Alvarez, Barry M. Dillon, Darius A. Faroughy, Jernej F. Kamenik, Federico Lamagna, Manuel Szewc

Abstract: Monte Carlo (MC) generators are crucial for analyzing data in particle collider experiments. However, often even a small mismatch between the MC simulations and the measurements can undermine the interpretation of the results. This is particularly important in the context of LHC searches for rare physics processes within and beyond the standard model (SM). One of the ultimate rare processes in the SM currently being explored at the LHC, $pp\to t\bar tt \bar t$ with its large multi-dimensional phase-space is an ideal testing ground to explore new ways to reduce the impact of potential MC mismodelling on experimental results. We propose a novel statistical method capable of disentangling the 4-top signal from the dominant backgrounds in the same-sign dilepton channel, while simultaneously correcting for possible MC imperfections in modelling of the most relevant discriminating observables -- the jet multiplicity distributions. A Bayesian mixture of multinomials is used to model the light-jet and $b$-jet multiplicities under the assumption of their conditional independence. The signal and background distributions generated from a deliberately mistuned MC simulator are used as model priors. The posterior distributions, as well as the signal and background fractions, are then learned from the data using Bayesian inference.
We demonstrate that our method can mitigate the effects of large MC mismodellings in the context of a realistic $t\bar tt\bar t$ search, leading to corrected posterior distributions that better approximate the underlying truth-level spectra.

Submitted 21 April, 2022; v1 submitted 1 July, 2021; originally announced July 2021.

Comments: Matches accepted version at PRD. 11 pages, 8 figures. This expanded version includes a toy model and explicit tests of the model validity for Four-Tops. Code available at https://github.com/ManuelSzewc/bayes-4tops

Report number: ZU-TH 30/21

arXiv:2101.08320

doi 10.1088/1361-6633/ac36b9

The LHC Olympics 2020: A Community Challenge for Anomaly Detection in High Energy Physics

Authors: Gregor Kasieczka, Benjamin Nachman, David Shih, Oz Amram, Anders Andreassen, Kees Benkendorfer, Blaz Bortolato, Gustaaf Brooijmans, Florencia Canelli, Jack H. Collins, Biwei Dai, Felipe F. De Freitas, Barry M. Dillon, Ioan-Mihail Dinu, Zhongtian Dong, Julien Donini, Javier Duarte, D. A. Faroughy, Julia Gonski, Philip Harris, Alan Kahn, Jernej F. Kamenik, Charanjit K. Khosa, Patrick Komiske, Luc Le Pottier, et al. (22 additional authors not shown)

Abstract: A new paradigm for data-driven, model-agnostic new physics searches at colliders is emerging, and aims to leverage recent breakthroughs in anomaly detection and machine learning. In order to develop and benchmark new anomaly detection methods within this framework, it is essential to have standard datasets. To this end, we have created the LHC Olympics 2020, a community challenge accompanied by a set of simulated collider events. Participants in these Olympics have developed their methods using an R&D dataset and then tested them on black boxes: datasets with an unknown anomaly (or not). This paper will review the LHC Olympics 2020 challenge, including an overview of the competition, a description of methods deployed in the competition, lessons learned from the experience, and implications for data analyses with future datasets as well as future colliders.

Submitted 20 January, 2021; originally announced January 2021.

Comments: 108 pages, 53 figures, 3 tables

arXiv:2011.06514

doi 10.1007/JHEP05(2021)125

Topping-up multilepton plus b-jets anomalies at the LHC with a $Z'$ boson

Authors: Ezequiel Alvarez, Aurelio Juste, Manuel Szewc, Tamara Vazquez Schroeder

Abstract: In recent years ATLAS and CMS have reported a number of slight to mild discrepancies in signatures of multileptons plus $b$-jets in analyses such as $t\bar t H$, $t\bar t W^\pm$, $t\bar t Z$ and $t\bar t t\bar t$. Among them, a recent ATLAS result on $t\bar t H$ production has also reported an excess in the charge asymmetry in the same-sign dilepton channel with two or more $b$-tagged jets. Motivated by these tantalizing discrepancies, we study a phenomenological New Physics model consisting of a $Z'$ boson that couples to up-type quarks via right-handed currents: $t_Rγ^μ\bar t_R$, $t_Rγ^μ\bar c_R$, and $t_R γ^μ\bar u_R$. The latter vertex allows one to translate the charge asymmetry of the LHC initial-state protons to a final state with top quarks which, decaying to a positive lepton and a $b$-jet, provides a crucial contribution to some of the observed discrepancies. Through an analysis at detector level, we select the region in parameter space of our model that best reproduces the data in the aforementioned $t\bar t H$ study, and in a recent ATLAS $t\bar t t \bar t$ search. We find that our model provides a better fit to the experimental data than the Standard Model for a New Physics scale of approximately 500 GeV, and with a hierarchical coupling of the $Z'$ boson that favours the top quark and the presence of FCNC currents. In order to estimate the LHC sensitivity to this signal, we design a broadband search featuring many kinematic regions with different signal-to-background ratios, and perform a global analysis.
We also define signal-enhanced regions and study observables that could further distinguish signal from background. We find that the region in parameter space of our model that best fits the analysed data could be probed with a significance exceeding 3 standard deviations with just the full Run-2 dataset.

Submitted 19 May, 2021; v1 submitted 12 November, 2020; originally announced November 2020.

Comments: 33 pages, 16 figures, appendix included. Matches JHEP version. Changes in text and in Figures. References added. Comments welcome

Report number: ICAS 055/20

Journal ref: J. High Energ. Phys. 2021, 125 (2021)

arXiv:2005.12319

doi 10.1007/JHEP10(2020)206

Learning the latent structure of collider events

Authors: Barry M. Dillon, Darius A. Faroughy, Jernej F. Kamenik, Manuel Szewc

Abstract: We describe a technique to learn the underlying structure of collider events directly from the data, without having a particular theoretical model in mind. It allows one to infer aspects of the theoretical model that may have given rise to this structure, and can be used to cluster or classify the events for analysis purposes. The unsupervised machine-learning technique is based on the probabilistic (Bayesian) generative model of Latent Dirichlet Allocation. We pair the model with an approximate inference algorithm called Variational Inference, which we then use to extract the latent probability distributions describing the learned underlying structure of collider events. We provide a detailed systematic study of the technique using two example scenarios to learn the latent structure of di-jet event samples made up of QCD background events and either $t\bar{t}$ or hypothetical $W' \to (φ\to WW) W$ signal events.

Submitted 25 May, 2020; originally announced May 2020.

Comments: 38 pages, 23 figures

arXiv:1911.09699

doi 10.1007/JHEP01(2020)049

Topic Model for four-top at the LHC

Authors: Ezequiel Alvarez, Federico Lamagna, Manuel Szewc

Abstract: We study the implementation of a Topic Model algorithm in four-top searches at the LHC as a test-probe of a non-ideal system for applying this technique. We study the behavior of this Topic Model as its different hypotheses, such as mutual reducibility and equal distribution in all samples, cease to hold. The four-top final state at the LHC is not only relevant because it does not fulfill these conditions, but also because it is a difficult and inefficient system to reconstruct and current Monte Carlo modeling of signal and backgrounds suffers from non-negligible uncertainties. We implement this Topic Model algorithm in the Same-Sign lepton channel where S/B is of order one and all backgrounds cannot have more than two b-jets at parton level. We define different mixtures according to the number of b-jets and we use the total number of jets to demix. Since only the background has an anchor bin, we find that we can reconstruct the background in the signal region independently of Monte Carlo. We propose to use this information to tune the Monte Carlo in the signal region and then compare signal prediction with data. We also explore Machine Learning techniques applied to this Topic Model algorithm and find slight improvements as well as potential roads to investigate. Although our findings indicate that even with the full LHC Run 3 data the implementation would be challenging, we aim through this work to find ways to reduce the impact of Monte Carlo simulations in four-top searches at the LHC.

Submitted 12 March, 2020; v1 submitted 21 November, 2019; originally announced November 2019.

Comments: 26 pages, 10 figures, appendix included. Minor corrections to match JHEP version. As suggested by Referee, all the code needed to reproduce the results in the paper is uploaded to https://github.com/ManuelSzewc/Topic-Model-for-four-top-at-the-LHC/. Comments welcome

Report number: ICAS 045/19

Journal ref: J. High Energ. Phys. 2020, 49 (2020)

arXiv:1811.05944

doi 10.1103/PhysRevD.99.095004

Non-resonant Leptoquark with multigeneration couplings to $μμjj$ and $μνjj$ at LHC

Authors: Ezequiel Alvarez, Manuel Szewc

Abstract: CMS has recently reported a moderate excess in the $μνjj$ final state in a second generation Leptoquark search, but they have disregarded it because the excess is not present in the $μμjj$ final state and because they do not observe the expected resonant peak in the distributions. As a proof of concept we show that a simple Leptoquark model including second and third generation couplings with non-negligible single- and non-resonant production in addition to usual pair production could explain the data: excess ($μνjj$), lack of excess ($μμjj$) and missing peak in the distributions; while being in agreement with collider constraints. We take this result and analysis as a starting point of a reconsideration of the ATLAS and CMS second generation Leptoquark searches. We also discuss what the consequences would be and what modifications should be made to the searches to test whether this deviation corresponds to a New Physics signal. We observe that low-energy flavor constraints can be avoided by adding heavier particles to the model.

Submitted 11 March, 2019; v1 submitted 14 November, 2018; originally announced November 2018.

Comments: Matches accepted version in PRD. An appendix has been added to address low energy flavour constraints in non-resonant scenarios with large couplings of O(1). 28 pages, 11 figures

Report number: ICAS 036/18

Journal ref: Phys. Rev. D 99, 095004 (2019)

Showing 1–18 of 18 results for author: Szewc, M