-
PeakWeather: MeteoSwiss Weather Station Measurements for Spatiotemporal Deep Learning
Authors:
Daniele Zambon,
Michele Cattaneo,
Ivan Marisca,
Jonas Bhend,
Daniele Nerini,
Cesare Alippi
Abstract:
Accurate weather forecasts are essential for supporting a wide range of activities and decision-making processes, as well as mitigating the impacts of adverse weather events. While traditional numerical weather prediction (NWP) remains the cornerstone of operational forecasting, machine learning is emerging as a powerful alternative for fast, flexible, and scalable predictions. We introduce PeakWeather, a high-quality dataset of surface weather observations collected every 10 minutes over more than 8 years from the ground stations of the Federal Office of Meteorology and Climatology MeteoSwiss's measurement network. The dataset includes a diverse set of meteorological variables from 302 station locations distributed across Switzerland's complex topography and is complemented with topographical indices derived from digital height models for context. Ensemble forecasts from the currently operational high-resolution NWP model are provided as a baseline forecast against which to evaluate new approaches. The dataset's richness supports a broad spectrum of spatiotemporal tasks, including time series forecasting at various scales, graph structure learning, imputation, and virtual sensing. As such, PeakWeather serves as a real-world benchmark to advance foundational machine learning research, meteorology, and sensor-based applications.
Submitted 16 June, 2025;
originally announced June 2025.
-
How Memory in Optimization Algorithms Implicitly Modifies the Loss
Authors:
Matias D. Cattaneo,
Boris Shigida
Abstract:
In modern optimization methods used in deep learning, each update depends on the history of previous iterations, often referred to as memory, and this dependence decays fast as the iterates go further into the past. For example, gradient descent with momentum has exponentially decaying memory through exponentially averaged past gradients. We introduce a general technique for identifying a memoryless algorithm that approximates an optimization algorithm with memory. It is obtained by replacing all past iterates in the update by the current one, and then adding a correction term arising from memory (also a function of the current iterate). This correction term can be interpreted as a perturbation of the loss, and the nature of this perturbation can inform how memory implicitly (anti-)regularizes the optimization dynamics. As an application of our theory, we find that Lion does not have the kind of implicit anti-regularization induced by memory that AdamW does, providing a theory-based explanation for the recently documented better generalization performance of Lion.
Submitted 4 February, 2025;
originally announced February 2025.
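The "exponentially decaying memory" of momentum mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the exponential-moving-average form of the momentum buffer, m_t = beta * m_(t-1) + (1 - beta) * g_t with m_0 = 0, which is one common convention and not necessarily the exact parameterization used in the paper. The function names and the sample gradients are hypothetical and chosen only for illustration.

```python
def momentum_buffer(grads, beta):
    """Run the recursion m_t = beta * m_{t-1} + (1 - beta) * g_t, starting from m_0 = 0."""
    m = 0.0
    for g in grads:
        m = beta * m + (1.0 - beta) * g
    return m


def explicit_memory(grads, beta):
    """Write the same buffer as a weighted sum over the full gradient history.

    The gradient observed j steps before the latest one gets weight
    (1 - beta) * beta**j, so older gradients are discounted exponentially.
    """
    t = len(grads)
    return sum((1.0 - beta) * beta ** (t - 1 - k) * g for k, g in enumerate(grads))


# Hypothetical scalar gradients, oldest first.
grads = [1.0, -2.0, 0.5, 3.0]
beta = 0.9

# Both expressions describe the same quantity up to floating-point rounding.
print(momentum_buffer(grads, beta), explicit_memory(grads, beta))
```

The unrolled sum makes the dependence on past gradients explicit: the recursion stores a single number, yet that number is a function of every earlier gradient, with weights that shrink geometrically in the age of the gradient.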
-
On the Implicit Bias of Adam
Authors:
Matias D. Cattaneo,
Jason M. Klusowski,
Boris Shigida
Abstract:
In previous literature, backward error analysis was used to find ordinary differential equations (ODEs) approximating the gradient descent trajectory. It was found that finite step sizes implicitly regularize solutions because terms appearing in the ODEs penalize the two-norm of the loss gradients. We prove that the existence of similar implicit regularization in RMSProp and Adam depends on their hyperparameters and the training stage, but with a different "norm" involved: the corresponding ODE terms either penalize the (perturbed) one-norm of the loss gradients or, conversely, impede its reduction (the latter case being typical). We also conduct numerical experiments and discuss how the proven facts can influence generalization.
Submitted 16 June, 2024; v1 submitted 31 August, 2023;
originally announced September 2023.
-
Adversarial AI in Insurance: Pervasiveness and Resilience
Authors:
Elisa Luciano,
Matteo Cattaneo,
Ron Kenett
Abstract:
The rapid and dynamic pace of Artificial Intelligence (AI) and Machine Learning (ML) is revolutionizing the insurance sector. AI offers significant and very welcome advantages to insurance companies, and is fundamental to their customer-centricity strategy. It also poses challenges in the project and implementation phases. Among those, we study Adversarial Attacks, which consist of the creation of modified input data to deceive an AI system and produce false outputs. We provide examples of attacks on insurance AI applications, categorize them, and discuss defence methods and precautionary systems, considering that they can involve few-shot and zero-shot multilabelling. A related topic, with growing interest, is the validation and verification of systems incorporating AI and ML components. These topics are discussed in various sections of this paper.
Submitted 17 January, 2023;
originally announced January 2023.
-
On the Pointwise Behavior of Recursive Partitioning and Its Implications for Heterogeneous Causal Effect Estimation
Authors:
Matias D. Cattaneo,
Jason M. Klusowski,
Peter M. Tian
Abstract:
Decision tree learning is increasingly being used for pointwise inference. Important applications include causal heterogeneous treatment effects and dynamic policy decisions, as well as conditional quantile regression and design of experiments, where tree estimation and inference are conducted at specific values of the covariates. In this paper, we call into question the use of decision trees (trained by adaptive recursive partitioning) for such purposes by demonstrating that they can fail to achieve polynomial rates of convergence in uniform norm with non-vanishing probability, even with pruning. Instead, the convergence may be arbitrarily slow or, in some important special cases, such as honest regression trees, fail completely. We show that random forests can remedy the situation, turning poorly performing trees into nearly optimal procedures, at the cost of losing interpretability and introducing two additional tuning parameters. The two hallmarks of random forests, subsampling and the random feature selection mechanism, are seen to each distinctively contribute to achieving nearly optimal performance for the model class considered.
Submitted 6 February, 2024; v1 submitted 19 November, 2022;
originally announced November 2022.
-
Mandibular Teeth Movement Variations in Tipping Scenario: A Finite Element Study on Several Patients
Authors:
Torkan Gholamalizadeh,
Sune Darkner,
Paolo Maria Cattaneo,
Peter Søndergaard,
Kenny Erleben
Abstract:
Previous studies on computational modeling of tooth movement in orthodontic treatments are limited to a single model and fail to generalize the simulation results to other patients. To this end, we consider multiple patients and focus on tooth movement variations under identical load and boundary conditions for both intra- and inter-patient analyses. We introduce a novel computational analysis tool based on finite element models (FEMs) addressing how to assess initial tooth displacement in the mandibular dentition across different patients for uncontrolled tipping scenarios with different load magnitudes applied to the mandibular dentition. This is done by modeling the movement of each patient's tooth as a nonlinear function of both load and tooth size. As tooth size can affect the resulting tooth displacement, a combination of two clinical biomarkers obtained from the tooth anatomy, i.e., crown height and root volume, is considered to make the proposed model generalizable to different patients and teeth.
Submitted 11 October, 2020;
originally announced October 2020.
-
Status Report of the DPHEP Study Group: Towards a Global Effort for Sustainable Data Preservation in High Energy Physics
Authors:
Z. Akopov,
Silvia Amerio,
David Asner,
Eduard Avetisyan,
Olof Barring,
James Beacham,
Matthew Bellis,
Gregorio Bernardi,
Siegfried Bethke,
Amber Boehnlein,
Travis Brooks,
Thomas Browder,
Rene Brun,
Concetta Cartaro,
Marco Cattaneo,
Gang Chen,
David Corney,
Kyle Cranmer,
Ray Culbertson,
Sunje Dallmeier-Tiessen,
Dmitri Denisov,
Cristinel Diaconu,
Vitaliy Dodonov,
Tony Doyle,
Gregory Dubois-Felsmann,
et al. (65 additional authors not shown)
Abstract:
Data from high-energy physics (HEP) experiments are collected with significant financial and human effort and are mostly unique. An inter-experimental study group on HEP data preservation and long-term analysis was convened as a panel of the International Committee for Future Accelerators (ICFA). The group was formed by large collider-based experiments and investigated the technical and organisational aspects of HEP data preservation. An intermediate report was released in November 2009 addressing the general issues of data preservation in HEP. This paper includes and extends the intermediate report. It provides an analysis of the research case for data preservation and a detailed description of the various projects at experiment, laboratory and international levels. In addition, the paper provides a concrete proposal for an international organisation in charge of the data management and policies in high-energy physics.
Submitted 21 May, 2012;
originally announced May 2012.