-
Fundamental limits to learning closed-form mathematical models from data
Authors:
Oscar Fajardo-Fontiveros,
Ignasi Reichardt,
Harry R. De Los Rios,
Jordi Duch,
Marta Sales-Pardo,
Roger Guimera
Abstract:
Given a finite and noisy dataset generated with a closed-form mathematical model, when is it possible to learn the true generating model from the data alone? This is the question we investigate here. We show that this model-learning problem displays a transition from a low-noise phase in which the true model can be learned, to a phase in which the observation noise is too high for the true model to be learned by any method. Both in the low-noise phase and in the high-noise phase, probabilistic model selection leads to optimal generalization to unseen data. This is in contrast to standard machine learning approaches, including artificial neural networks, which in this particular problem are limited, in the low-noise phase, by their ability to interpolate. In the transition region between the learnable and unlearnable phases, generalization is hard for all approaches including probabilistic model selection.
Submitted 16 December, 2022; v1 submitted 6 April, 2022;
originally announced April 2022.
-
Bayesian machine scientist to compare data collapses for the Nikuradse dataset
Authors:
Ignasi Reichardt,
Jordi Pallares,
Marta Sales-Pardo,
Roger Guimera
Abstract:
Ever since Nikuradse's experiments on turbulent friction in 1933, there have been theoretical attempts to describe his measurements by collapsing the data into single-variable functions. However, this approach, which is common in other areas of physics and in other fields, is limited by the lack of rigorous quantitative methods to compare alternative data collapses. Here, we address this limitation by using an unsupervised method to find analytic functions that optimally describe each of the data collapses for the Nikuradse dataset. By descaling these analytic functions, we show that a low dispersion of the scaled data does not guarantee that a data collapse is a good description of the original data. In fact, we find that, out of all the proposed data collapses, the original one proposed by Prandtl and Nikuradse over 80 years ago provides the best description of the data so far, and that it also agrees well with recent experimental data, provided that some model parameters are allowed to vary across experiments.
Submitted 25 April, 2020;
originally announced April 2020.
-
A Bayesian machine scientist to aid in the solution of challenging scientific problems
Authors:
Roger Guimera,
Ignasi Reichardt,
Antoni Aguilar-Mogas,
Francesco A Massucci,
Manuel Miranda,
Jordi Pallares,
Marta Sales-Pardo
Abstract:
Closed-form, interpretable mathematical models have been instrumental for advancing our understanding of the world; with the data revolution, we may now be in a position to uncover new such models for many systems from physics to the social sciences. However, to deal with increasing amounts of data, we need "machine scientists" that are able to extract these models automatically from data. Here, we introduce a Bayesian machine scientist, which establishes the plausibility of models using explicit approximations to the exact marginal posterior over models and establishes its prior expectations about models by learning from a large empirical corpus of mathematical expressions. It explores the space of models using Markov chain Monte Carlo. We show that this approach uncovers accurate models for synthetic and real data and provides out-of-sample predictions that are more accurate than those of existing approaches and of other nonparametric methods.
Submitted 25 April, 2020;
originally announced April 2020.
-
Very High-Energy Gamma-Ray Follow-Up Program Using Neutrino Triggers from IceCube
Authors:
IceCube Collaboration,
M. G. Aartsen,
K. Abraham,
M. Ackermann,
J. Adams,
J. A. Aguilar,
M. Ahlers,
M. Ahrens,
D. Altmann,
K. Andeen,
T. Anderson,
I. Ansseau,
G. Anton,
M. Archinger,
C. Arguelles,
J. Auffenberg,
S. Axani,
X. Bai,
S. W. Barwick,
V. Baum,
R. Bay,
J. J. Beatty,
J. Becker-Tjus,
K. -H. Becker,
S. BenZvi, et al. (519 additional authors not shown)
Abstract:
We describe and report the status of a neutrino-triggered program in IceCube that generates real-time alerts for gamma-ray follow-up observations by atmospheric-Cherenkov telescopes (MAGIC and VERITAS). While IceCube is capable of monitoring the whole sky continuously, high-energy gamma-ray telescopes have restricted fields of view and in general are unlikely to be observing a potential neutrino-flaring source at the time such neutrinos are recorded. The use of neutrino-triggered alerts thus aims at increasing the availability of simultaneous multi-messenger data during potential neutrino flaring activity, which can increase the discovery potential and constrain the phenomenological interpretation of the high-energy emission of selected source classes (e.g. blazars). The requirements of a fast and stable online analysis of potential neutrino signals and its operation are presented, along with first results of the program operating between 14 March 2012 and 31 December 2015.
Submitted 12 November, 2016; v1 submitted 6 October, 2016;
originally announced October 2016.
-
Silicon Photomultiplier Research and Development Studies for the Large Size Telescope of the Cherenkov Telescope Array
Authors:
Riccardo Rando,
Daniele Corti,
Francesco Dazzi,
Alessandro De Angelis,
Antonios Dettlaff,
Daniela Dorner,
David Fink,
Nadia Fouque,
Felix Grundner,
Werner Haberer,
Alexander Hahn,
Richard Hermel,
Samo Korpar,
Gašper Kukec Mezek,
Ronald Maier,
Christian Manea,
Mosè Mariotti,
Daniel Mazin,
Fatima Mehrez,
Razmik Mirzoyan,
Sergey Podkladkin,
Ignasi Reichardt,
Wolfgang Rhode,
Sylvie Rosier,
Cornelia Schultz, et al. (4 additional authors not shown)
Abstract:
The Cherenkov Telescope Array (CTA) is the next-generation facility of imaging atmospheric Cherenkov telescopes; two sites will cover both hemispheres. CTA will reach unprecedented sensitivity, energy and angular resolution in very-high-energy gamma-ray astronomy. Each CTA array will include four Large Size Telescopes (LSTs), designed to cover the low-energy range of the CTA sensitivity ($\sim$20 GeV to 200 GeV). In the baseline LST design, the focal-plane camera will be instrumented with 265 photodetector clusters; each will include seven photomultiplier tubes (PMTs), with an entrance window of 1.5 inches in diameter. The PMT design is based on mature and reliable technology. Recently, silicon photomultipliers (SiPMs) have been emerging as a competitor. Currently, SiPMs have advantages (e.g. lower operating voltage and tolerance to high illumination levels) and disadvantages (e.g. higher capacitance and cross-talk rates), but this technology is still young and rapidly evolving. SiPM technology has strong potential to become superior to PMT technology in terms of photon detection efficiency and price per square mm of detector area. While the advantage of SiPMs has been proven for high-density, small-size cameras, it is yet to be demonstrated for large-area cameras such as that of the LST. We are working to develop a SiPM-based module for the LST camera, in view of a possible camera upgrade. We will describe the solutions we are exploring in order to balance a competitive performance with a minimal impact on the overall LST camera design.
Submitted 28 August, 2015;
originally announced August 2015.