-
Optimal kernel regression bounds under energy-bounded noise
Authors:
Amon Lahr,
Johannes Köhler,
Anna Scampicchio,
Melanie N. Zeilinger
Abstract:
Non-conservative uncertainty bounds are key for both assessing an estimation algorithm's accuracy and in view of downstream tasks, such as its deployment in safety-critical contexts. In this paper, we derive a tight, non-asymptotic uncertainty bound for kernel-based estimation, which can also handle correlated noise sequences. Its computation relies on a mild norm-boundedness assumption on the unknown function and the noise, returning the worst-case function realization within the hypothesis class at an arbitrary query input location. The value of this function is shown to be given in terms of the posterior mean and covariance of a Gaussian process for an optimal choice of the measurement noise covariance. By rigorously analyzing the proposed approach and comparing it with other results in the literature, we show its effectiveness in returning tight and easy-to-compute bounds for kernel-based estimates.
Submitted 28 May, 2025;
originally announced May 2025.
-
Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models
Authors:
Mehdi Ali,
Manuel Brack,
Max Lübbering,
Elias Wendt,
Abbas Goher Khan,
Richard Rutmann,
Alex Jude,
Maurice Kraus,
Alexander Arno Weber,
David Kaczér,
Florian Mai,
Lucie Flek,
Rafet Sifa,
Nicolas Flores-Herr,
Joachim Köhler,
Patrick Schramowski,
Michael Fromm,
Kristian Kersting
Abstract:
High-quality multilingual training data is essential for effectively pretraining large language models (LLMs). Yet, the availability of suitable open-source multilingual datasets remains limited. Existing state-of-the-art datasets mostly rely on heuristic filtering methods, restricting both their cross-lingual transferability and scalability. Here, we introduce JQL, a systematic approach that efficiently curates diverse and high-quality multilingual data at scale while significantly reducing computational demands. JQL distills LLMs' annotation capabilities into lightweight annotators based on pretrained multilingual embeddings. These models exhibit robust multilingual and cross-lingual performance, even for languages and scripts unseen during training. Evaluated empirically across 35 languages, the resulting annotation pipeline substantially outperforms current heuristic filtering methods like Fineweb2. JQL notably enhances downstream model training quality and increases data retention rates. Our research provides practical insights and valuable resources for multilingual data curation, raising the standards of multilingual dataset development.
Submitted 31 May, 2025; v1 submitted 28 May, 2025;
originally announced May 2025.
-
A model-free approach to control barrier functions using funnel control
Authors:
Lukas Lanza,
Johannes Köhler,
Dario Dennstädt,
Thomas Berger,
Karl Worthmann
Abstract:
Control barrier functions (CBFs) are a popular approach to design feedback laws that achieve safety guarantees for nonlinear systems. The CBF-based controller design relies on the availability of a model to select feasible inputs from the set of CBF-based controls. In this paper, we develop a model-free approach to design CBF-based control laws, eliminating the need for knowledge of system dynamics or parameters. Specifically, we address safety requirements characterized by a time-varying distance to a reference trajectory in the output space and construct a CBF that depends only on the measured output. Utilizing this particular CBF, we determine a subset of CBF-based controls without relying on a model of the dynamics by using techniques from funnel control. The latter is a model-free high-gain adaptive control methodology, which achieves tracking guarantees via reactive feedback. In this paper, we discover and establish a connection between the modular controller synthesis via zeroing CBFs and model-free reactive feedback. The theoretical results are illustrated by a numerical simulation.
Submitted 23 May, 2025;
originally announced May 2025.
-
viper: High-precision radial velocities from the optical to the infrared (Reaching 3 m/s in the K band of CRIRES+ with telluric modelling)
Authors:
J. Köhler,
M. Zechmeister,
A. Hatzes,
S. Chamarthi,
E. Nagel,
U. Seemann,
P. Ballester,
P. Bristow,
P. Chaturvedi,
R. J. Dorn,
E. Guenther,
V. D. Ivanov,
Y. Jung,
O. Kochukhov,
T. Marquart,
L. Nortmann,
R. Palsa,
N. Piskunov,
A. Reiners,
F. Rodler,
J. V. Smoker
Abstract:
In recent years, a number of new instruments and data reduction pipelines have been developed to obtain high-precision radial velocities (RVs). In particular in the optical, considerable progress has been made and RV precision below 50 cm/s has been reached. Yet, the RV precision in the near-infrared (NIR) is trailing behind. This is due to a number of factors, such as imprinted atmospheric absorption lines, lower stellar information content, different types of detectors, and usable calibration lamps. However, observations in the NIR are important for the search and study of exoplanets around cool low-mass stars that are faint at optical wavelengths. Not only are M dwarfs brightest in the NIR, the signal of stellar activity is also reduced at longer wavelengths. In this paper we introduce the RV pipeline viper (Velocity and IP EstimatoR). The philosophy of viper is to offer a publicly available and user-friendly code that is able to process data from various spectrographs. Originally designed to handle data from optical instruments, the code has now been extended to enable the processing of NIR data. viper uses least-squares fitting to model the stellar RV as well as the temporally and spatially variable IP. We have improved upon this method by adding a term for the telluric spectrum that enables the forward modelling of molecules present in the Earth's atmosphere. In this paper we use CRIRES+ observations in the K band to demonstrate viper's ability to handle data in the NIR. We show that it is possible to achieve an RV accuracy of 3 m/s over a period of 2.5 years with the use of a gas cell. Additionally, we present a study of the stability of atmospheric lines in the NIR. With viper it is possible to handle data taken with or without a gas cell, and we show that a long-term RV precision of around 10 m/s can be achieved when using only telluric lines for the wavelength calibration.
Submitted 13 May, 2025;
originally announced May 2025.
-
Finite-Sample-Based Reachability for Safe Control with Gaussian Process Dynamics
Authors:
Manish Prajapat,
Johannes Köhler,
Amon Lahr,
Andreas Krause,
Melanie N. Zeilinger
Abstract:
Gaussian Process (GP) regression is shown to be effective for learning unknown dynamics, enabling efficient and safety-aware control strategies across diverse applications. However, existing GP-based model predictive control (GP-MPC) methods either rely on approximations, thus lacking guarantees, or are overly conservative, which limits their practical utility. To close this gap, we present a sampling-based framework that efficiently propagates the model's epistemic uncertainty while avoiding conservatism. We establish a novel sample complexity result that enables the construction of a reachable set using a finite number of dynamics functions sampled from the GP posterior. Building on this, we design a sampling-based GP-MPC scheme that is recursively feasible and guarantees closed-loop safety and stability with high probability. Finally, we showcase the effectiveness of our method on two numerical examples, highlighting accurate reachable set over-approximation and safe closed-loop performance.
Submitted 12 May, 2025;
originally announced May 2025.
-
Abundance analysis of benchmark M dwarfs
Authors:
T. Olander,
U. Heiter,
N. Piskunov,
J. Köhler,
O. Kochukhov
Abstract:
Abundances of M dwarfs, being the most numerous stellar type in the Galaxy, can enhance our understanding of planet formation processes. They can also be used to study the chemical evolution of the Galaxy, where in particular alpha-capture elements play an important role. We aim to obtain abundances for Fe, Ti, and Ca for a small sample of well-known M dwarfs for which interferometric measurements are available. These stars and their abundances are intended to serve as a benchmark for future large-scale spectroscopic studies. We analysed spectra obtained with the GIANO-B spectrograph. Turbospectrum and the wrapper TSFitPy were used with MARCS atmospheric models in order to fit synthetic spectra to the observed spectra. We performed a differential abundance analysis in which we also analysed a solar spectrum with the same method and then subtracted the derived abundances line-by-line. The median was taken as the final abundance for each element and each star. Our abundances of Fe, Ti, and Ca agree mostly within uncertainties when comparing with other values from the literature. However, there are few studies to compare with.
Submitted 11 May, 2025;
originally announced May 2025.
-
No Planet around the K Giant Star 42 Draconis
Authors:
Artie P. Hatzes,
Volker Perdelwitz,
Marie Karjalainen,
Jana Köhler,
Michael Hartmann
Abstract:
Published radial velocity (RV) measurements of the K giant star 42 Dra reveal variations consistent with a 3.9 M_Jup mass companion in a 479-d orbit. This exoplanet can be confirmed if these variations are long-lived and coherent. Continued monitoring may also reveal other companions. We have acquired additional RV measurements of 42 Dra spanning fifteen years. Periodogram analyses were used to investigate the stability of the planet RV signal. We also investigated variations in the spectral line shapes using the bisector velocity span as well as infrared photometry from the COBE mission. The new RV measurements do not follow the published planet orbit. An orbital solution using the 2004-2011 data yields a period and eccentricity consistent with the published values, but the RV amplitude has decreased by a factor of four from the earlier measurements. Including some additional RV measurements taken between 2014 and 2018 reveals the presence of a second period at 530 d. The beating of this period with the one at 479 d may account for the observed amplitude variations. The planet hypothesis is conclusively ruled out by COBE/DIRBE 1.25 micron photometry that shows variations with the planet orbital period as well as an additional 170 d period. The amplitude variations in the RV as well as the COBE/DIRBE photometry firmly establish that there is no giant planet around 42 Dra. The presence of multi-periodic variations suggests that these may be stellar oscillations, most likely oscillatory convection modes. These oscillations may account for some of the long-period RV variations attributed to planets around K giant stars. This may skew the statistics of planet occurrence around intermediate mass stars. Long-term monitoring with excellent sampling is required to exclude amplitude variations in the long periods found in the radial velocities of K giant stars.
Submitted 8 May, 2025;
originally announced May 2025.
-
Autoregressive Distillation of Diffusion Transformers
Authors:
Yeongmin Kim,
Sotiris Anagnostidis,
Yuming Du,
Edgar Schönfeld,
Jonas Kohler,
Markos Georgopoulos,
Albert Pumarola,
Ali Thabet,
Artsiom Sanakoyeu
Abstract:
Diffusion models with transformer architectures have demonstrated promising capabilities in generating high-fidelity images and scalability for high resolution. However, the iterative sampling process required for synthesis is very resource-intensive. A line of work has focused on distilling solutions to probability flow ODEs into few-step student models. Nevertheless, existing methods have been limited by their reliance on the most recent denoised samples as input, rendering them susceptible to exposure bias. To address this limitation, we propose AutoRegressive Distillation (ARD), a novel approach that leverages the historical trajectory of the ODE to predict future steps. ARD offers two key benefits: 1) it mitigates exposure bias by utilizing a predicted historical trajectory that is less susceptible to accumulated errors, and 2) it leverages the previous history of the ODE trajectory as a more effective source of coarse-grained information. ARD modifies the teacher transformer architecture by adding token-wise time embedding to mark each input from the trajectory history and employs a block-wise causal attention mask for training. Furthermore, incorporating historical inputs only in lower transformer layers enhances performance and efficiency. We validate the effectiveness of ARD in class-conditioned generation on ImageNet and T2I synthesis. Our model achieves a $5\times$ reduction in FID degradation compared to the baseline methods while requiring only 1.1\% extra FLOPs on ImageNet-256. Moreover, ARD reaches an FID of 1.84 on ImageNet-256 in merely 4 steps and outperforms the publicly available 1024p text-to-image distilled models in prompt adherence score with a minimal drop in FID compared to the teacher. Project page: https://github.com/alsdudrla10/ARD.
Submitted 15 April, 2025;
originally announced April 2025.
-
Robust MPC for Uncertain Linear Systems -- Combining Model Adaptation and Iterative Learning
Authors:
Hannes Petrenz,
Johannes Köhler,
Francesco Borrelli
Abstract:
This paper presents a robust adaptive learning Model Predictive Control (MPC) framework for linear systems with parametric uncertainties and additive disturbances performing iterative tasks. The approach iteratively refines the parameter estimates using set membership estimation. Performance enhancement over iterations is achieved by learning the terminal cost from data. Safety is enforced using a terminal set, which is also learned iteratively. The proposed method guarantees recursive feasibility, constraint satisfaction, and a robust bound on the closed-loop cost. Numerical simulations on a mass-spring-damper system demonstrate improved computational efficiency and control performance compared to an existing robust adaptive MPC approach.
Submitted 16 April, 2025; v1 submitted 15 April, 2025;
originally announced April 2025.
-
Storybooth: Training-free Multi-Subject Consistency for Improved Visual Storytelling
Authors:
Jaskirat Singh,
Junshen Kevin Chen,
Jonas Kohler,
Michael Cohen
Abstract:
Training-free consistent text-to-image generation depicting the same subjects across different images is a topic of widespread recent interest. Existing works in this direction predominantly rely on cross-frame self-attention, which improves subject consistency by allowing tokens in each frame to pay attention to tokens in other frames during self-attention computation. While useful for single subjects, we find that it struggles when scaling to multiple characters. In this work, we first analyze the reason for these limitations. Our exploration reveals that the primary issue stems from self-attention leakage, which is exacerbated when trying to ensure consistency across multiple characters. This happens when tokens from one subject pay attention to other characters, causing them to appear like each other (e.g., a dog appearing like a duck). Motivated by these findings, we propose StoryBooth: a training-free approach for improving multi-character consistency. In particular, we first leverage multi-modal chain-of-thought reasoning and region-based generation to a priori localize the different subjects across the desired story outputs. The final outputs are then generated using a modified diffusion model which consists of two novel layers: 1) a bounded cross-frame self-attention layer for reducing inter-character attention leakage, and 2) a token-merging layer for improving consistency of fine-grained subject details. Through both qualitative and quantitative results we find that the proposed approach surpasses the prior state of the art, exhibiting improved consistency across both multiple characters and fine-grained subject details.
Submitted 8 April, 2025;
originally announced April 2025.
-
Beyond Asymptotics: Targeted exploration with finite-sample guarantees
Authors:
Janani Venkatasubramanian,
Johannes Köhler,
Frank Allgöwer
Abstract:
In this paper, we introduce a targeted exploration strategy for the non-asymptotic, finite-time case. The proposed strategy is applicable to uncertain linear time-invariant systems subject to sub-Gaussian disturbances. As the main result, the proposed approach provides a priori guarantees, ensuring that the optimized exploration inputs achieve a desired accuracy of the model parameters. The technical derivation of the strategy (i) leverages existing non-asymptotic identification bounds with self-normalized martingales, (ii) utilizes spectral lines to predict the effect of sinusoidal excitation, and (iii) effectively accounts for spectral transient error and parametric uncertainty. A numerical example illustrates how the finite exploration time influences the required exploration energy.
Submitted 3 April, 2025;
originally announced April 2025.
-
Output-feedback model predictive control under dynamic uncertainties using integral quadratic constraints
Authors:
Lukas Schwenkel,
Johannes Köhler,
Matthias A. Müller,
Frank Allgöwer
Abstract:
In this work, we propose an output-feedback tube-based model predictive control (MPC) scheme for linear systems under dynamic uncertainties that are described via integral quadratic constraints (IQC). By leveraging IQCs, a large class of nonlinear and dynamic uncertainties can be addressed. We leverage recent IQC synthesis tools to design a dynamic controller and an observer that are robust to these uncertainties and minimize the size of the resulting constraint tightening in the MPC. Thereby, we show that the robust estimation problem using IQCs with peak-to-peak performance can be convexified. We guarantee recursive feasibility, robust constraint satisfaction, and input-to-state stability of the resulting MPC scheme.
Submitted 31 March, 2025;
originally announced April 2025.
-
Multi-objective robust controller synthesis with integral quadratic constraints in discrete-time
Authors:
Lukas Schwenkel,
Johannes Köhler,
Matthias A. Müller,
Carsten W. Scherer,
Frank Allgöwer
Abstract:
This article presents a novel framework for the robust controller synthesis problem in discrete-time systems using dynamic Integral Quadratic Constraints (IQCs). We present an algorithm to minimize closed-loop performance measures such as the $\mathcal H_\infty$-norm, the energy-to-peak gain, the peak-to-peak gain, or a multi-objective mix thereof. While IQCs provide a powerful tool for modeling structured uncertainties and nonlinearities, existing synthesis methods are limited to the $\mathcal H_\infty$-norm, continuous-time systems, or special system structures. By minimizing the energy-to-peak and peak-to-peak gain, the proposed synthesis can be utilized to bound the peak of the output, which is crucial in many applications requiring robust constraint satisfaction, input-to-state stability, reachability analysis, or other pointwise-in-time bounds. Numerical examples demonstrate the robustness and performance of the controllers synthesized with the proposed algorithm.
Submitted 28 March, 2025;
originally announced March 2025.
-
A Deep Learning Pipeline for Large Earthquake Analysis using High-Rate Global Navigation Satellite System Data
Authors:
Claudia Quinteros-Cartaya,
Javier Quintero-Arenas,
Andrea Padilla-Lafarga,
Carlos Moraila,
Johannes Faber,
Wei Li,
Jonas Köhler,
Nishtha Srivastava
Abstract:
Deep learning techniques for processing large and complex datasets have unlocked new opportunities for fast and reliable earthquake analysis using Global Navigation Satellite System (GNSS) data. This work presents a deep learning model, MagEs, to estimate earthquake magnitudes using data from high-rate GNSS stations. Furthermore, MagEs is integrated with the DetEQ model for earthquake detection within the SAIPy package, creating a comprehensive pipeline for earthquake detection and magnitude estimation using HR-GNSS data. The MagEs model provides magnitude estimates within seconds of detection when using stations within 3 degrees of the epicenter, which are the most relevant for real-time applications. However, since it has been trained on data from stations up to 7.5 degrees away, it can also analyze data from larger distances. The model can process data from a single station at a time or combine data from up to three stations. The model was trained using synthetic data reflecting rupture scenarios in the Chile subduction zone, and the results confirm strong performance for Chilean earthquakes. Although tests from other tectonic regions also yielded good results, incorporating regional data through transfer learning could further improve its performance in diverse seismic settings. The model has not yet been deployed in an operational real-time monitoring system, but simulation tests that update data in a second-by-second manner demonstrate its potential for future real-time adaptation.
Submitted 26 March, 2025;
originally announced March 2025.
-
Stochastic Model Predictive Control for Sub-Gaussian Noise
Authors:
Yunke Ao,
Johannes Köhler,
Manish Prajapat,
Yarden As,
Melanie Zeilinger,
Philipp Fürnstahl,
Andreas Krause
Abstract:
We propose a stochastic Model Predictive Control (MPC) framework that ensures closed-loop chance constraint satisfaction for linear systems with general sub-Gaussian process and measurement noise. By considering sub-Gaussian noise, we can provide guarantees for a large class of distributions, including time-varying distributions. Specifically, we first provide a new characterization of sub-Gaussian random vectors using matrix variance proxies, which can more accurately represent the predicted state distribution. We then derive tail bounds under linear propagation for the new characterization, enabling tractable computation of probabilistic reachable sets of linear systems. Lastly, we utilize these probabilistic reachable sets to formulate a stochastic MPC scheme that provides closed-loop guarantees for general sub-Gaussian noise. We further demonstrate our approach in simulations, including a challenging task of surgical planning from image observations.
Submitted 11 March, 2025;
originally announced March 2025.
-
FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute
Authors:
Sotiris Anagnostidis,
Gregor Bachmann,
Yeongmin Kim,
Jonas Kohler,
Markos Georgopoulos,
Artsiom Sanakoyeu,
Yuming Du,
Albert Pumarola,
Ali Thabet,
Edgar Schönfeld
Abstract:
Despite their remarkable performance, modern Diffusion Transformers are hindered by substantial resource requirements during inference, stemming from the fixed and large amount of compute needed for each denoising step. In this work, we revisit the conventional static paradigm that allocates a fixed compute budget per denoising iteration and propose a dynamic strategy instead. Our simple and sample-efficient framework enables pre-trained DiT models to be converted into \emph{flexible} ones -- dubbed FlexiDiT -- allowing them to process inputs at varying compute budgets. We demonstrate how a single \emph{flexible} model can generate images without any drop in quality, while reducing the required FLOPs by more than $40$\% compared to their static counterparts, for both class-conditioned and text-conditioned image generation. Our method is general and agnostic to input and conditioning modalities. We show how our approach can be readily extended for video generation, where FlexiDiT models generate samples with up to $75$\% less compute without compromising performance.
Submitted 27 February, 2025;
originally announced February 2025.
-
Stochastic MPC with Online-optimized Policies and Closed-loop Guarantees
Authors:
Marcell Bartos,
Alexandre Didier,
Jerome Sieber,
Johannes Köhler,
Melanie N. Zeilinger
Abstract:
This paper proposes a stochastic model predictive control method for linear systems affected by additive Gaussian disturbances. Closed-loop satisfaction of probabilistic constraints and recursive feasibility of the underlying convex optimization problem are guaranteed. Optimization over feedback policies online increases performance and reduces conservatism compared to fixed-feedback approaches. The central mechanism is a finitely determined maximal admissible set for probabilistic constraints, together with the reconditioning of the predicted probabilistic constraints on the current knowledge at every time step. The proposed method's reduced conservatism and improved performance in terms of the achieved closed-loop cost are demonstrated in a numerical example.
Submitted 10 February, 2025;
originally announced February 2025.
-
A search for the anomalous events detected by ANITA using the Pierre Auger Observatory
Authors:
The Pierre Auger Collaboration,
A. Abdul Halim,
P. Abreu,
M. Aglietta,
I. Allekotte,
K. Almeida Cheminant,
A. Almela,
R. Aloisio,
J. Alvarez-Muñiz,
J. Ammerman Yebra,
G. A. Anastasi,
L. Anchordoqui,
B. Andrada,
S. Andringa,
L. Apollonio,
C. Aramo,
P. R. Araújo Ferreira,
E. Arnone,
J. C. Arteaga Velázquez,
P. Assis,
G. Avila,
E. Avocone,
A. Bakalova,
F. Barbato,
A. Bartz Mocellin
, et al. (352 additional authors not shown)
Abstract:
A dedicated search for upward-going air showers at zenith angles exceeding $110^\circ$ and energies $E>0.1$ EeV has been performed using the Fluorescence Detector of the Pierre Auger Observatory. The search is motivated by two "anomalous" radio pulses observed by the ANITA flights I and III which appear inconsistent with the Standard Model of particle physics. Using simulations of both regular cosmic ray showers and upward-going events, a selection procedure has been defined to separate potential upward-going candidate events and the corresponding exposure has been calculated in the energy range [0.1-33] EeV. One event has been found in the search period between 1 Jan 2004 and 31 Dec 2018, consistent with an expected background of $0.27 \pm 0.12$ events from mis-reconstructed cosmic ray showers. This translates to an upper bound on the integral flux of $(7.2 \pm 0.2) \times 10^{-21}$ cm$^{-2}$ sr$^{-1}$ y$^{-1}$ and $(3.6 \pm 0.2) \times 10^{-20}$ cm$^{-2}$ sr$^{-1}$ y$^{-1}$ for an $E^{-1}$ and $E^{-2}$ spectrum, respectively. An upward-going flux of showers normalized to the ANITA observations is shown to predict over 34 events for an $E^{-3}$ spectrum and over 8.1 events for a conservative $E^{-5}$ spectrum, in strong disagreement with the interpretation of the anomalous events as upward-going showers.
Submitted 6 February, 2025;
originally announced February 2025.
-
Search for a diffuse flux of photons with energies above tens of PeV at the Pierre Auger Observatory
Authors:
The Pierre Auger Collaboration,
A. Abdul Halim,
P. Abreu,
M. Aglietta,
I. Allekotte,
K. Almeida Cheminant,
A. Almela,
R. Aloisio,
J. Alvarez-Muñiz,
A. Ambrosone,
J. Ammerman Yebra,
G. A. Anastasi,
L. Anchordoqui,
B. Andrada,
L. Andrade Dourado,
S. Andringa,
L. Apollonio,
C. Aramo,
E. Arnone,
J. C. Arteaga Velázquez,
P. Assis,
G. Avila,
E. Avocone,
A. Bakalova,
F. Barbato
, et al. (337 additional authors not shown)
Abstract:
Diffuse photons of energy above 0.1 PeV, produced through the interactions between cosmic rays and either interstellar matter or background radiation fields, are powerful tracers of the distribution of cosmic rays in the Galaxy. Furthermore, the measurement of a diffuse photon flux would be an important probe to test models of super-heavy dark matter decaying into gamma-rays. In this work, we search for a diffuse photon flux in the energy range between 50 PeV and 200 PeV using data from the Pierre Auger Observatory. For the first time, we combine the air-shower measurements from a 2 km$^2$ surface array consisting of 19 water-Cherenkov surface detectors, spaced at 433 m, with the muon measurements from an array of buried scintillators placed in the same area. Using 15 months of data, collected while the array was still under construction, we derive upper limits to the integral photon flux ranging from 13.3 to 13.8 km$^{-2}$ sr$^{-1}$ yr$^{-1}$ above tens of PeV. We extend the Pierre Auger Observatory photon search program towards lower energies, covering more than three decades of cosmic-ray energy. This work lays the foundation for future diffuse photon searches: with the data from the next 10 years of operation of the Observatory, this limit is expected to improve by a factor of $\sim$20.
Submitted 17 March, 2025; v1 submitted 4 February, 2025;
originally announced February 2025.
-
Judge Decoding: Faster Speculative Sampling Requires Going Beyond Model Alignment
Authors:
Gregor Bachmann,
Sotiris Anagnostidis,
Albert Pumarola,
Markos Georgopoulos,
Artsiom Sanakoyeu,
Yuming Du,
Edgar Schönfeld,
Ali Thabet,
Jonas Kohler
Abstract:
The performance of large language models (LLMs) is closely linked to their underlying size, leading to ever-growing networks and hence slower inference. Speculative decoding has been proposed as a technique to accelerate autoregressive generation, leveraging a fast draft model to propose candidate tokens, which are then verified in parallel based on their likelihood under the target model. While this approach is guaranteed to reproduce the target output, it incurs a substantial penalty: many high-quality draft tokens are rejected, even when they represent objectively valid continuations. Indeed, we show that even powerful draft models such as GPT-4o, as well as human text, cannot achieve high acceptance rates under the standard verification scheme. This severely limits the speedup potential of current speculative decoding methods, as an early rejection becomes overwhelmingly likely when solely relying on alignment of draft and target.
We thus ask the following question: Can we adapt verification to recognize correct, but non-aligned replies? To this end, we draw inspiration from the LLM-as-a-judge framework, which demonstrated that LLMs are able to rate answers in a versatile way. We carefully design a dataset to elicit the same capability in the target model by training a compact module on top of the embeddings to produce "judgements" of the current continuation. We showcase our strategy on the Llama-3.1 family, where our 8B/405B-Judge achieves a speedup of 9x over Llama-405B, while maintaining its quality on a large range of benchmarks. These benefits remain present even in optimized inference frameworks, where our method reaches up to 141 tokens/s for 8B/70B-Judge and 129 tokens/s for 8B/405B on 2 and 8 H100s respectively.
Submitted 31 January, 2025;
originally announced January 2025.
-
Robust targeted exploration for systems with non-stochastic disturbances
Authors:
Janani Venkatasubramanian,
Johannes Köhler,
Mark Cannon,
Frank Allgöwer
Abstract:
In this paper, we introduce a novel targeted exploration strategy designed specifically for uncertain linear time-invariant systems with energy-bounded disturbances, i.e., without making any assumptions on the distribution of the disturbances. We use classical results characterizing the set of non-falsified parameters consistent with energy-bounded disturbances. We derive a semidefinite program which computes an exploration strategy that guarantees a desired accuracy of the parameter estimate. This design is based on sufficient conditions on the spectral content of the exploration data that robustly account for initial parametric uncertainty. Finally, we highlight the applicability of the exploration strategy through a numerical example involving an unmodeled nonlinearity.
△ Less
Submitted 29 December, 2024;
originally announced December 2024.
-
Adaptive Economic Model Predictive Control: Performance Guarantees for Nonlinear Systems
Authors:
Maximilian Degner,
Raffaele Soloperto,
Melanie N. Zeilinger,
John Lygeros,
Johannes Köhler
Abstract:
We consider the problem of optimizing the economic performance of nonlinear constrained systems subject to uncertain time-varying parameters and bounded disturbances. In particular, we propose an adaptive economic model predictive control (MPC) framework that: (i) directly minimizes transient economic costs, (ii) addresses parametric uncertainty through online model adaptation, (iii) determines optimal setpoints online, and (iv) ensures robustness by using a tube-based approach. The proposed design ensures recursive feasibility, robust constraint satisfaction, and a transient performance bound. In case the disturbances have a finite energy and the parameter variations have a finite path length, the asymptotic average performance is (approximately) not worse than the performance obtained when operating at the best reachable steady-state. We highlight performance benefits in a numerical example involving a chemical reactor with unknown time-invariant and time-varying parameters.
Submitted 10 February, 2025; v1 submitted 17 December, 2024;
originally announced December 2024.
-
Online convex optimization for constrained control of nonlinear systems
Authors:
Marko Nonhoff,
Johannes Köhler,
Matthias A. Müller
Abstract:
This paper investigates the problem of controlling nonlinear dynamical systems subject to state and input constraints while minimizing time-varying and a priori unknown cost functions. We propose a modular approach that combines the online convex optimization framework and reference governors to solve this problem. Our method is general in the sense that we do not limit our analysis to a specific choice of online convex optimization algorithm or reference governor. We show that the dynamic regret of the proposed framework is bounded linearly in both the dynamic regret and the path length of the chosen online convex optimization algorithm, even though the online convex optimization algorithm does not account for the underlying dynamics. We prove that a linear bound with respect to the online convex optimization algorithm's dynamic regret is optimal, i.e., cannot be improved upon. Furthermore, for a standard class of online convex optimization algorithms, our proposed framework attains a bound on its dynamic regret that is linear only in the variation of the cost functions, which is known to be an optimal bound. Finally, we demonstrate implementation and flexibility of the proposed framework by comparing different combinations of online convex optimization algorithms and reference governors to control a nonlinear chemical reactor in a numerical experiment.
Submitted 1 December, 2024;
originally announced December 2024.
-
High Magnitude Earthquake Identification Using an Anomaly Detection Approach on HR GNSS Data
Authors:
Javier Quintero Arenas,
Claudia Quinteros Cartaya,
Andrea Padilla Lafarga,
Carlos Moraila,
Johannes Faber,
Jonas Koehler,
Nishtha Srivastava
Abstract:
Earthquake early warning systems are crucial for protecting areas that are subject to these natural disasters. An essential part of these systems is the detection procedure. Traditionally these systems work with seismograph data, but high-rate GNSS data has become a promising alternative for use in large earthquake early warning systems. Besides traditional methods, deep learning approaches have recently gained popularity in this field, as they are able to leverage the large amounts of real and synthetic seismic data. Nevertheless, the use of deep learning on GNSS data remains a comparatively new topic. This work contributes to the field of early warning systems by proposing an autoencoder-based deep learning pipeline that aims to be lightweight and customizable for the detection of anomalies, viz. high magnitude earthquakes, in GNSS data. This model, DetEQ, is trained using the noise data recordings from nine stations located in Chile. The detection pipeline encompasses: (i) the generation of an anomaly score using the ground truth and reconstructed output from the autoencoder, (ii) the detection of relevant seismic events through an appropriate threshold, and (iii) the filtering of local events that would lead to false positives. Robustness of the model was tested on the real HR GNSS data of the 2011 Mw 6.8 Concepcion earthquake recorded at six stations. The results highlight the potential of GNSS-based deep learning models for effective earthquake detection.
Submitted 29 November, 2024;
originally announced December 2024.
-
Causal Data Fusion for Panel Data without Pre-Intervention Period
Authors:
Zou Yang,
Seung Hee Lee,
Julia R. Köhler,
AmirEmad Ghassami
Abstract:
Traditional panel data causal inference frameworks, such as difference-in-differences and synthetic control methods, rely on pre-intervention data to estimate counterfactuals. However, such data may not be available in real-world settings when interventions are implemented in response to sudden events, such as public health crises or epidemiological shocks. In this paper, we introduce two data fusion methods for causal inference from panel data in scenarios where pre-intervention data is unavailable. These methods leverage auxiliary reference domains with related panel data to estimate causal effects in the target domain, overcoming the limitations imposed by the absence of pre-intervention data. We show the efficacy of these methods by obtaining converging bounds on the bias as well as through a simulation study. Our proposed methodology renders causal inference feasible in urgent and data-constrained environments where the assumptions of the existing causal inference frameworks are not met. As an application of the proposed methodology, we study the causal effect of the community organization activity on the COVID-19 vaccination rate among the Hispanic sub-population in the city of Chelsea, Massachusetts.
Submitted 8 March, 2025; v1 submitted 21 October, 2024;
originally announced October 2024.
-
Movie Gen: A Cast of Media Foundation Models
Authors:
Adam Polyak,
Amit Zohar,
Andrew Brown,
Andros Tjandra,
Animesh Sinha,
Ann Lee,
Apoorv Vyas,
Bowen Shi,
Chih-Yao Ma,
Ching-Yao Chuang,
David Yan,
Dhruv Choudhary,
Dingkang Wang,
Geet Sethi,
Guan Pang,
Haoyu Ma,
Ishan Misra,
Ji Hou,
Jialiang Wang,
Kiran Jagadeesh,
Kunpeng Li,
Luxin Zhang,
Mannat Singh,
Mary Williamson,
Matt Le
, et al. (63 additional authors not shown)
Abstract:
We present Movie Gen, a cast of foundation models that generates high-quality, 1080p HD videos with different aspect ratios and synchronized audio. We also show additional capabilities such as precise instruction-based video editing and generation of personalized videos based on a user's image. Our models set a new state-of-the-art on multiple tasks: text-to-video synthesis, video personalization, video editing, video-to-audio generation, and text-to-audio generation. Our largest video generation model is a 30B parameter transformer trained with a maximum context length of 73K video tokens, corresponding to a generated video of 16 seconds at 16 frames-per-second. We show multiple technical innovations and simplifications on the architecture, latent spaces, training objectives and recipes, data curation, evaluation protocols, parallelization techniques, and inference optimizations that allow us to reap the benefits of scaling pre-training data, model size, and training compute for training large scale media generation models. We hope this paper helps the research community to accelerate progress and innovation in media generation models. All videos from this paper are available at https://go.fb.me/MovieGenResearchVideos.
Submitted 26 February, 2025; v1 submitted 17 October, 2024;
originally announced October 2024.
-
Towards Multilingual LLM Evaluation for European Languages
Authors:
Klaudia Thellmann,
Bernhard Stadler,
Michael Fromm,
Jasper Schulze Buschhoff,
Alex Jude,
Fabio Barth,
Johannes Leveling,
Nicolas Flores-Herr,
Joachim Köhler,
René Jäkel,
Mehdi Ali
Abstract:
The rise of Large Language Models (LLMs) has revolutionized natural language processing across numerous languages and tasks. However, evaluating LLM performance in a consistent and meaningful way across multiple European languages remains challenging, especially due to the scarcity of language-parallel multilingual benchmarks. We introduce a multilingual evaluation approach tailored for European languages. We employ translated versions of five widely-used benchmarks to assess the capabilities of 40 LLMs across 21 European languages. Our contributions include examining the effectiveness of translated benchmarks, assessing the impact of different translation services, and offering a multilingual evaluation framework for LLMs that includes newly created datasets: EU20-MMLU, EU20-HellaSwag, EU20-ARC, EU20-TruthfulQA, and EU20-GSM8K. The benchmarks and results are made publicly available to encourage further research in multilingual LLM evaluation.
Submitted 17 October, 2024; v1 submitted 11 October, 2024;
originally announced October 2024.
-
Data Processing for the OpenGPT-X Model Family
Authors:
Nicolo' Brandizzi,
Hammam Abdelwahab,
Anirban Bhowmick,
Lennard Helmer,
Benny Jörg Stein,
Pavel Denisov,
Qasid Saleem,
Michael Fromm,
Mehdi Ali,
Richard Rutmann,
Farzad Naderi,
Mohamad Saif Agy,
Alexander Schwirjow,
Fabian Küch,
Luzian Hahn,
Malte Ostendorff,
Pedro Ortiz Suarez,
Georg Rehm,
Dennis Wegener,
Nicolas Flores-Herr,
Joachim Köhler,
Johannes Leveling
Abstract:
This paper presents a comprehensive overview of the data preparation pipeline developed for the OpenGPT-X project, a large-scale initiative aimed at creating open and high-performance multilingual large language models (LLMs). The project goal is to deliver models that cover all major European languages, with a particular focus on real-world applications within the European Union. We explain all data processing steps, starting with the data selection and requirement definition to the preparation of the final datasets for model training. We distinguish between curated data and web data, as each of these categories is handled by distinct pipelines, with curated data undergoing minimal filtering and web data requiring extensive filtering and deduplication. This distinction guided the development of specialized algorithmic solutions for both pipelines. In addition to describing the processing methodologies, we provide an in-depth analysis of the datasets, increasing transparency and alignment with European data regulations. Finally, we share key insights and challenges faced during the project, offering recommendations for future endeavors in large-scale multilingual data preparation for LLMs.
Submitted 28 April, 2025; v1 submitted 11 October, 2024;
originally announced October 2024.
-
Music-triggered fashion design: from songs to the metaverse
Authors:
Martina Delgado,
Marta Llopart,
Eva Sarabia,
Sandra Taboada,
Pol Vierge,
Fernando Vilariño,
Joan Moya Kohler,
Julieta Grimberg Golijov,
Matías Bilkis
Abstract:
The advent of ever-growing virtual realities poses unprecedented opportunities and challenges to different societies. Artistic collectives are no exception, and we here aim to pay special attention to musicians. Compositions, lyrics and even show advertisements are constituents of a message that artists transmit about their reality. As such, artistic creations are ultimately linked to feelings and emotions, with aesthetics playing a crucial role when it comes to transmitting artists' intentions. In this context, we here analyze how virtual realities can help to broaden the opportunities for musicians to bridge with their audiences, by devising a dynamical fashion-design recommendation system inspired by sound stimulus. We present our first steps towards re-defining musical experiences in the metaverse, opening up alternative opportunities for artists to connect with both real and virtual audiences (e.g., machine-learning agents operating in the metaverse) in potentially broader ways.
Submitted 7 October, 2024;
originally announced October 2024.
-
Teuken-7B-Base & Teuken-7B-Instruct: Towards European LLMs
Authors:
Mehdi Ali,
Michael Fromm,
Klaudia Thellmann,
Jan Ebert,
Alexander Arno Weber,
Richard Rutmann,
Charvi Jain,
Max Lübbering,
Daniel Steinigen,
Johannes Leveling,
Katrin Klug,
Jasper Schulze Buschhoff,
Lena Jurkschat,
Hammam Abdelwahab,
Benny Jörg Stein,
Karl-Heinz Sylla,
Pavel Denisov,
Nicolo' Brandizzi,
Qasid Saleem,
Anirban Bhowmick,
Lennard Helmer,
Chelsea John,
Pedro Ortiz Suarez,
Malte Ostendorff,
Alex Jude
, et al. (14 additional authors not shown)
Abstract:
We present two multilingual LLMs designed to embrace Europe's linguistic diversity by supporting all 24 official languages of the European Union. Trained on a dataset comprising around 60% non-English data and utilizing a custom multilingual tokenizer, our models address the limitations of existing LLMs that predominantly focus on English or a few high-resource languages. We detail the models' development principles, i.e., data composition, tokenizer optimization, and training methodologies. The models demonstrate competitive performance across multilingual benchmarks, as evidenced by their performance on European versions of ARC, HellaSwag, MMLU, and TruthfulQA.
Submitted 15 October, 2024; v1 submitted 30 September, 2024;
originally announced October 2024.
-
Stochastic Data-Driven Predictive Control: Chance-Constraint Satisfaction with Identified Multi-step Predictors
Authors:
Haldun Balim,
Andrea Carron,
Melanie N. Zeilinger,
Johannes Köhler
Abstract:
We propose a novel data-driven stochastic model predictive control framework for uncertain linear systems with noisy output measurements. Our approach leverages multi-step predictors to efficiently propagate uncertainty, ensuring chance constraint satisfaction. In particular, we present a strategy to identify multi-step predictors and quantify the associated uncertainty using a surrogate (data-driven) state space model. Then, we utilize the derived distribution to formulate a constraint tightening that ensures chance constraint satisfaction despite the parametric uncertainty. A numerical example highlights the reduced conservatism of handling parametric uncertainty in the proposed method compared to state-of-the-art solutions.
Submitted 15 March, 2025; v1 submitted 16 September, 2024;
originally announced September 2024.
-
Towards safe and tractable Gaussian process-based MPC: Efficient sampling within a sequential quadratic programming framework
Authors:
Manish Prajapat,
Amon Lahr,
Johannes Köhler,
Andreas Krause,
Melanie N. Zeilinger
Abstract:
Learning uncertain dynamics models using Gaussian process (GP) regression has been demonstrated to enable high-performance and safety-aware control strategies for challenging real-world applications. Yet, for computational tractability, most approaches for Gaussian process-based model predictive control (GP-MPC) are based on approximations of the reachable set that are either overly conservative or impede the controller's safety guarantees. To address these challenges, we propose a robust GP-MPC formulation that guarantees constraint satisfaction with high probability. For its tractable implementation, we propose a sampling-based GP-MPC approach that iteratively generates consistent dynamics samples from the GP within a sequential quadratic programming framework. We highlight the improved reachable set approximation compared to existing methods, as well as real-time feasible computation times, using two numerical examples.
Submitted 13 September, 2024;
originally announced September 2024.
-
The Giant Radio Array for Neutrino Detection (GRAND) Collaboration -- Contributions to the 10th International Workshop on Acoustic and Radio EeV Neutrino Detection Activities (ARENA 2024)
Authors:
Rafael Alves Batista,
Aurélien Benoit-Lévy,
Teresa Bister,
Martina Bohacova,
Mauricio Bustamante,
Washington Carvalho,
Yiren Chen,
LingMei Cheng,
Simon Chiche,
Jean-Marc Colley,
Pablo Correa,
Nicoleta Cucu Laurenciu,
Zigao Dai,
Rogerio M. de Almeida,
Beatriz de Errico,
Sijbrand de Jong,
João R. T. de Mello Neto,
Krijn D de Vries,
Valentin Decoene,
Peter B. Denton,
Bohao Duan,
Kaikai Duan,
Ralph Engel,
William Erba,
Yizhong Fan
, et al. (100 additional authors not shown)
Abstract:
This is an index of the contributions by the Giant Radio Array for Neutrino Detection (GRAND) Collaboration to the 10th International Workshop on Acoustic and Radio EeV Neutrino Detection Activities (ARENA 2024, University of Chicago, June 11-14, 2024). The contributions include an overview of GRAND in its present and future incarnations, methods of radio-detection that are being developed for them, and ongoing joint work between the GRAND and BEACON experiments.
Submitted 5 September, 2024;
originally announced September 2024.
-
Highly Accurate Real-space Electron Densities with Neural Networks
Authors:
Lixue Cheng,
P. Bernát Szabó,
Zeno Schätzle,
Derk P. Kooi,
Jonas Köhler,
Klaas J. H. Giesbertz,
Frank Noé,
Jan Hermann,
Paola Gori-Giorgi,
Adam Foster
Abstract:
Variational ab-initio methods in quantum chemistry stand out among other methods in providing direct access to the wave function. This allows in principle straightforward extraction of any other observable of interest, besides the energy, but in practice this extraction is often technically difficult and computationally impractical. Here, we consider the electron density as a central observable in quantum chemistry and introduce a novel method to obtain accurate densities from real-space many-electron wave functions by representing the density with a neural network that captures known asymptotic properties and is trained from the wave function by score matching and noise-contrastive estimation. We use variational quantum Monte Carlo with deep-learning ansätze (deep QMC) to obtain highly accurate wave functions free of basis set errors, and from them, using our novel method, correspondingly accurate electron densities, which we demonstrate by calculating dipole moments, nuclear forces, contact densities, and other density-based properties.
Submitted 1 November, 2024; v1 submitted 2 September, 2024;
originally announced September 2024.
-
Next-Generation Triggering: A Novel Event-Level Approach
Authors:
Jelena Köhler,
Aurélien Benoit-Lévy,
Pablo Correa,
Arsène Ferriere,
Tim Huege,
Kumiko Kotera,
Olivier Martineau-Huynh,
Simon Prunet,
Markus Roth
Abstract:
Large-scale cosmic-ray detectors like the Giant Radio Array for Neutrino Detection (GRAND) are pushing the boundaries of our ability to identify air shower events. Existing trigger schemes rely solely on the timing of signals detected by individual antennas, which brings many challenges in distinguishing true air shower signals from background. This work explores novel event-level radio trigger methods specifically designed for GRAND, but also applicable to other systems, such as the Radio Detector (RD) of the Pierre Auger Observatory. In addition to an upgraded plane wave front reconstruction technique, we introduce orthogonal and complementary approaches that analyze the radio-emission footprint, the spatial distribution of signal strength across triggered antennas, to refine event selection. We test our methods on mock data sets constructed with simulated showers and real background noise measured with the GRAND prototype, to assess the performance potential in terms of sensitivity and background rejection in GRAND. Our preliminary results are a first step to identifying the most discriminating radio signal features at event-level, and optimizing the techniques for future implementation on experimental data.
Submitted 30 August, 2024;
originally announced August 2024.
-
GRANDlib: A simulation pipeline for the Giant Radio Array for Neutrino Detection (GRAND)
Authors:
GRAND Collaboration,
Rafael Alves Batista,
Aurélien Benoit-Lévy,
Teresa Bister,
Martina Bohacova,
Mauricio Bustamante,
Washington Carvalho,
Yiren Chen,
LingMei Cheng,
Simon Chiche,
Jean-Marc Colley,
Pablo Correa,
Nicoleta Cucu Laurenciu,
Zigao Dai,
Rogerio M. de Almeida,
Beatriz de Errico,
Sijbrand de Jong,
João R. T. de Mello Neto,
Krijn D. de Vries,
Valentin Decoene,
Peter B. Denton,
Bohao Duan,
Kaikai Duan,
Ralph Engel,
William Erba
, et al. (90 additional authors not shown)
Abstract:
The operation of upcoming ultra-high-energy cosmic-ray, gamma-ray, and neutrino radio-detection experiments, like the Giant Radio Array for Neutrino Detection (GRAND), poses significant computational challenges involving the production of numerous simulations of particle showers and their detection, and a high data throughput. GRANDlib is an open-source software tool designed to meet these challenges. Its primary goal is to perform end-to-end simulations of the detector operation, from the interaction of ultra-high-energy particles, through -- by interfacing with external air-shower simulations -- the ensuing particle shower development and its radio emission, to its detection by antenna arrays and its processing by data-acquisition systems. Additionally, GRANDlib manages the visualization, storage, and retrieval of experimental and simulated data. We present an overview of GRANDlib to serve as the basis of future GRAND analyses.
Submitted 11 December, 2024; v1 submitted 20 August, 2024;
originally announced August 2024.
-
Analytical Uncertainty-Based Loss Weighting in Multi-Task Learning
Authors:
Lukas Kirchdorfer,
Cathrin Elich,
Simon Kutsche,
Heiner Stuckenschmidt,
Lukas Schott,
Jan M. Köhler
Abstract:
With the rise of neural networks in various domains, multi-task learning (MTL) has gained significant relevance. A key challenge in MTL is balancing individual task losses during neural network training to improve performance and efficiency through knowledge sharing across tasks. To address these challenges, we propose a novel task-weighting method by building on the most prevalent approach of Uncertainty Weighting and computing analytically optimal uncertainty-based weights, normalized by a softmax function with tunable temperature. Our approach yields comparable results to the combinatorially prohibitive, brute-force approach of Scalarization while offering a more cost-effective yet high-performing alternative. We conduct an extensive benchmark on various datasets and architectures. Our method consistently outperforms six other common weighting methods. Furthermore, we report noteworthy experimental findings for the practical application of MTL. For example, larger networks diminish the influence of weighting methods, and tuning the weight decay has a low impact compared to the learning rate.
Submitted 15 August, 2024;
originally announced August 2024.
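The weighting idea summarized in the abstract above (analytically optimal uncertainty-based weights, normalized by a softmax with tunable temperature) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the common Uncertainty Weighting objective L/(2 s^2) + log s per task, whose minimizer over s is s^2 = L, so the weight is proportional to 1/L. The function name `softmax_uncertainty_weights` and the use of 1/L as the softmax input are assumptions made for illustration.

```python
import numpy as np


def softmax_uncertainty_weights(losses, temperature=1.0):
    """Return task weights that sum to 1 (hypothetical helper, not from the paper).

    Assumption: minimizing L / (2 s^2) + log(s) over s gives s^2 = L,
    so each task's weight is taken proportional to 1 / L before the softmax.
    """
    losses = np.asarray(losses, dtype=float)
    inverse_losses = 1.0 / np.maximum(losses, 1e-12)  # guard against division by zero
    logits = inverse_losses / temperature
    logits = logits - logits.max()  # shift for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()


if __name__ == "__main__":
    # Tasks with smaller current loss receive larger weight; a high
    # temperature flattens the weights towards a uniform distribution.
    print(softmax_uncertainty_weights([1.0, 2.0, 4.0], temperature=1.0))
    print(softmax_uncertainty_weights([1.0, 2.0, 4.0], temperature=100.0))
```

In a training loop the weights would typically be computed from detached loss values and treated as constants when forming the weighted sum, though that detail is likewise an assumption here.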
-
Large-scale cosmic ray anisotropies with 19 years of data from the Pierre Auger Observatory
Authors:
The Pierre Auger Collaboration,
A. Abdul Halim,
P. Abreu,
M. Aglietta,
I. Allekotte,
K. Almeida Cheminant,
A. Almela,
R. Aloisio,
J. Alvarez-Muñiz,
A. Ambrosone,
J. Ammerman Yebra,
G. A. Anastasi,
L. Anchordoqui,
B. Andrada,
L. Andrade Dourado,
S. Andringa,
L. Apollonio,
C. Aramo,
P. R. Araújo Ferreira,
E. Arnone,
J. C. Arteaga Velázquez,
P. Assis,
G. Avila,
E. Avocone,
A. Bakalova
, et al. (333 additional authors not shown)
Abstract:
Results are presented for the measurement of large-scale anisotropies in the arrival directions of ultra-high-energy cosmic rays detected at the Pierre Auger Observatory during 19 years of operation, prior to AugerPrime, the upgrade of the Observatory. The 3D dipole amplitude and direction are reconstructed above $4\,$EeV in four energy bins. Besides the established dipolar anisotropy in right ascension above $8\,$EeV, the Fourier amplitude of the $8$ to $16\,$EeV energy bin is now also above the $5σ$ discovery level. No time variation of the dipole moment above $8\,$EeV is found, setting an upper limit to the rate of change of such variations of $0.3\%$ per year at the $95\%$ confidence level. Additionally, the results for the angular power spectrum are shown, demonstrating no other statistically significant multipoles. The results for the equatorial dipole component down to $0.03\,$EeV are presented, using for the first time a data set obtained with a trigger that has been optimized for lower energies. Finally, model predictions are discussed and compared with observations, based on two source emission scenarios obtained in the combined fit of spectrum and composition above $0.6\,$EeV.
Submitted 23 January, 2025; v1 submitted 9 August, 2024;
originally announced August 2024.
-
From Data to Predictive Control: A Framework for Stochastic Linear Systems with Output Measurements
Authors:
Haldun Balim,
Andrea Carron,
Melanie N. Zeilinger,
Johannes Köhler
Abstract:
We introduce data to predictive control, D2PC, a framework to facilitate the design of robust and predictive controllers from data. The proposed framework is designed for discrete-time stochastic linear systems with output measurements and provides a principled design of a predictive controller based on data. The framework starts with a parameter identification method based on the Expectation-Maximization algorithm, which incorporates pre-defined structural constraints. Additionally, we provide an asymptotically correct method to quantify uncertainty in parameter estimates. Next, we develop a strategy to synthesize robust dynamic output-feedback controllers tailored to the derived uncertainty characterization. Finally, we introduce a predictive control scheme that guarantees recursive feasibility and satisfaction of chance constraints. This framework marks a significant advancement in integrating data into robust and predictive control schemes. We demonstrate the efficacy of D2PC through a numerical example involving a 10-dimensional spring-mass-damper system.
Submitted 5 March, 2025; v1 submitted 24 July, 2024;
originally announced July 2024.
-
Predictive control for nonlinear stochastic systems: Closed-loop guarantees with unbounded noise
Authors:
Johannes Köhler,
Melanie N. Zeilinger
Abstract:
We present a stochastic model predictive control framework for nonlinear systems subject to unbounded process noise with closed-loop guarantees. First, we provide a conceptual shrinking-horizon framework that utilizes general probabilistic reachable sets and minimizes the expected cost. Then, we provide a tractable receding-horizon formulation that uses a nominal state to minimize a deterministic quadratic cost and satisfy tightened constraints. Our theoretical analysis demonstrates recursive feasibility, satisfaction of chance constraints, and bounds on the expected cost for the resulting closed-loop system. We provide a constructive design for probabilistic reachable sets of nonlinear continuously differentiable systems using stochastic contraction metrics and an assumed bound on the covariance matrices. Numerical simulations highlight the computational efficiency and theoretical guarantees of the proposed method. Overall, this paper provides a framework for computationally tractable stochastic predictive control with closed-loop guarantees for nonlinear systems with unbounded noise.
Submitted 4 June, 2025; v1 submitted 18 July, 2024;
originally announced July 2024.
-
The flux of ultra-high-energy cosmic rays along the supergalactic plane measured at the Pierre Auger Observatory
Authors:
The Pierre Auger Collaboration,
A. Abdul Halim,
P. Abreu,
M. Aglietta,
I. Allekotte,
K. Almeida Cheminant,
A. Almela,
R. Aloisio,
J. Alvarez-Muñiz,
J. Ammerman Yebra,
G. A. Anastasi,
L. Anchordoqui,
B. Andrada,
L. Andrade Dourado,
S. Andringa,
L. Apollonio,
C. Aramo,
P. R. Araújo Ferreira,
E. Arnone,
J. C. Arteaga Velázquez,
P. Assis,
G. Avila,
E. Avocone,
A. Bakalova,
F. Barbato
, et al. (342 additional authors not shown)
Abstract:
Ultra-high-energy cosmic rays are known to be mainly of extragalactic origin, and their propagation is limited by energy losses, so their arrival directions are expected to correlate with the large-scale structure of the local Universe. In this work, we investigate the possible presence of intermediate-scale excesses in the flux of the most energetic cosmic rays from the direction of the supergalactic plane region using events with energies above 20 EeV recorded with the surface detector array of the Pierre Auger Observatory up to 31 December 2022, with a total exposure of 135,000 km^2 sr yr. The strongest indication for an excess that we find, with a post-trial significance of 3.1σ, is in the Centaurus region, as in our previous reports, and it extends down to lower energies than previously studied. We do not find any strong hints of excesses from any other region of the supergalactic plane at the same angular scale. In particular, our results do not confirm the reports by the Telescope Array collaboration of excesses from two regions in the Northern Hemisphere at the edge of the field of view of the Pierre Auger Observatory. With a comparable exposure, our results in those regions are in good agreement with the expectations from an isotropic distribution.
Submitted 9 July, 2024;
originally announced July 2024.
-
Understanding building blocks of photonic logic gates: Reversible, read-write-erase cycling using photoswitchable beads in micropatterned arrays
Authors:
Heyou Zhang,
Pankaj Dharpure,
Michael Philipp,
Paul Mulvaney,
Mukundan Thelakkat,
Jürgen Köhler
Abstract:
Using surface-templated electrophoretic deposition, we have created arrays of polymer beads (photonic units) incorporating photo-switchable DAE molecules, which can be reversibly and individually switched between high and low emission states by direct photo-excitation, without any energy or electron transfer processes within the molecular system. The micropatterned array of these photonic units is spectroscopically characterized in detail and optimized with respect to both signal contrast and cross-talk. The optimum optical parameters including laser intensity, wavelength and duration of irradiation are elucidated and ideal conditions for creating reversible on/off cycles in a micropatterned array are determined. 500 such cycles are demonstrated with no obvious on/off contrast attenuation. The ability to process binary information is demonstrated by selectively writing information to the given photonic unit, reading the resultant emissive signal pattern and finally erasing the information again, which in turn demonstrates the possibility of continuous recording. This basic study paves the way for building complex circuits using spatially well-arranged photonic units.
Submitted 20 June, 2024;
originally announced June 2024.
-
Embedded Hierarchical MPC for Autonomous Navigation
Authors:
Dennis Benders,
Johannes Köhler,
Thijs Niesten,
Robert Babuška,
Javier Alonso-Mora,
Laura Ferranti
Abstract:
To efficiently deploy robotic systems in society, mobile robots must move autonomously and safely through complex environments. Nonlinear model predictive control (MPC) methods provide a natural way to find a dynamically feasible trajectory through the environment without colliding with nearby obstacles. However, the limited computation power available on typical embedded robotic systems, such as quadrotors, poses a challenge to running MPC in real time, including its most expensive tasks: constraints generation and optimization. To address this problem, we propose a novel hierarchical MPC scheme that consists of a planning and a tracking layer. The planner constructs a trajectory with a long prediction horizon at a slow rate, while the tracker ensures trajectory tracking at a relatively fast rate. We prove that the proposed framework avoids collisions and is recursively feasible. Furthermore, we demonstrate its effectiveness in simulations and lab experiments with a quadrotor that needs to reach a goal position in a complex static environment. The code is efficiently implemented on the quadrotor's embedded computer to ensure real-time feasibility. Compared to a state-of-the-art single-layer MPC formulation, this allows us to increase the planning horizon by a factor of 5, which results in significantly better performance.
Submitted 9 May, 2025; v1 submitted 17 June, 2024;
originally announced June 2024.
-
Search for photons above 10$^{18}$ eV by simultaneously measuring the atmospheric depth and the muon content of air showers at the Pierre Auger Observatory
Authors:
The Pierre Auger Collaboration,
A. Abdul Halim,
P. Abreu,
M. Aglietta,
I. Allekotte,
K. Almeida Cheminant,
A. Almela,
R. Aloisio,
J. Alvarez-Muñiz,
J. Ammerman Yebra,
G. A. Anastasi,
L. Anchordoqui,
B. Andrada,
L. Andrade Dourado,
S. Andringa,
L. Apollonio,
C. Aramo,
P. R. Araújo Ferreira,
E. Arnone,
J. C. Arteaga Velázquez,
P. Assis,
G. Avila,
E. Avocone,
A. Bakalova,
F. Barbato
, et al. (342 additional authors not shown)
Abstract:
The Pierre Auger Observatory is the most sensitive instrument to detect photons with energies above $10^{17}$ eV. It measures extensive air showers generated by ultra-high-energy cosmic rays using a hybrid technique that exploits the combination of a fluorescence detector with a ground array of particle detectors. The signatures of a photon-induced air shower are a larger atmospheric depth of the shower maximum ($X_{max}$) and a steeper lateral distribution function, along with a lower number of muons with respect to the bulk of hadron-induced cascades. In this work, a new analysis technique in the energy interval between 1 and 30 EeV (1 EeV = $10^{18}$ eV) has been developed by combining the fluorescence detector-based measurement of $X_{max}$ with the specific features of the surface detector signal through a parameter related to the air shower muon content, derived from the universality of the air shower development. No evidence of a statistically significant signal due to photon primaries was found using data collected in about 12 years of operation. Thus, upper bounds to the integral photon flux have been set using a detailed calculation of the detector exposure, in combination with a data-driven background estimation. The derived 95% confidence level upper limits are 0.0403, 0.01113, 0.0035, 0.0023, and 0.0021 km$^{-2}$ sr$^{-1}$ yr$^{-1}$ above 1, 2, 3, 5, and 10 EeV, respectively, leading to the most stringent upper limits on the photon flux in the EeV range. Compared with past results, the upper limits were improved by about 40% for the lowest energy threshold and by a factor of 3 above 3 EeV, where no candidates were found and the expected background is negligible. The presented limits can be used to probe the assumptions on the chemical composition of ultra-high-energy cosmic rays and allow for the constraint of the mass and lifetime phase space of super-heavy dark matter particles.
Submitted 11 June, 2024;
originally announced June 2024.
-
Measurement of the Depth of Maximum of Air-Shower Profiles with energies between $\mathbf{10^{18.5}}$ and $\mathbf{10^{20}}$ eV using the Surface Detector of the Pierre Auger Observatory and Deep Learning
Authors:
The Pierre Auger Collaboration,
A. Abdul Halim,
P. Abreu,
M. Aglietta,
I. Allekotte,
K. Almeida Cheminant,
A. Almela,
R. Aloisio,
J. Alvarez-Muñiz,
J. Ammerman Yebra,
G. A. Anastasi,
L. Anchordoqui,
B. Andrada,
L. Andrade Dourado,
S. Andringa,
L. Apollonio,
C. Aramo,
P. R. Araújo Ferreira,
E. Arnone,
J. C. Arteaga Velázquez,
P. Assis,
G. Avila,
E. Avocone,
A. Bakalova,
F. Barbato
, et al. (342 additional authors not shown)
Abstract:
We report an investigation of the mass composition of cosmic rays with energies from 3 to 100 EeV (1 EeV=$10^{18}$ eV) using the distributions of the depth of shower maximum $X_\mathrm{max}$. The analysis relies on ${\sim}50,000$ events recorded by the Surface Detector of the Pierre Auger Observatory and a deep-learning-based reconstruction algorithm. Above energies of 5 EeV, the data set offers a 10-fold increase in statistics with respect to fluorescence measurements at the Observatory. After cross-calibration using the Fluorescence Detector, this enables the first measurement of the evolution of the mean and the standard deviation of the $X_\mathrm{max}$ distributions up to 100 EeV. Our findings are threefold:
(1.) The evolution of the mean logarithmic mass towards a heavier composition with increasing energy can be confirmed and is extended to 100 EeV.
(2.) The evolution of the fluctuations of $X_\mathrm{max}$ towards a heavier and purer composition with increasing energy can be confirmed with high statistics. We report a rather heavy composition and small fluctuations in $X_\mathrm{max}$ at the highest energies.
(3.) We find indications for a characteristic structure beyond a constant change in the mean logarithmic mass, featuring three breaks that are observed in proximity to the ankle, instep, and suppression features in the energy spectrum.
Submitted 6 February, 2025; v1 submitted 10 June, 2024;
originally announced June 2024.
-
Inference of the Mass Composition of Cosmic Rays with energies from $\mathbf{10^{18.5}}$ to $\mathbf{10^{20}}$ eV using the Pierre Auger Observatory and Deep Learning
Authors:
The Pierre Auger Collaboration,
A. Abdul Halim,
P. Abreu,
M. Aglietta,
I. Allekotte,
K. Almeida Cheminant,
A. Almela,
R. Aloisio,
J. Alvarez-Muñiz,
J. Ammerman Yebra,
G. A. Anastasi,
L. Anchordoqui,
B. Andrada,
L. Andrade Dourado,
S. Andringa,
L. Apollonio,
C. Aramo,
P. R. Araújo Ferreira,
E. Arnone,
J. C. Arteaga Velázquez,
P. Assis,
G. Avila,
E. Avocone,
A. Bakalova,
F. Barbato
, et al. (342 additional authors not shown)
Abstract:
We present measurements of the atmospheric depth of the shower maximum $X_\mathrm{max}$, inferred for the first time on an event-by-event level using the Surface Detector of the Pierre Auger Observatory. Using deep learning, we were able to extend measurements of the $X_\mathrm{max}$ distributions up to energies of 100 EeV ($10^{20}$ eV), not yet revealed by current measurements, providing new insights into the mass composition of cosmic rays at extreme energies. Gaining a 10-fold increase in statistics compared to the Fluorescence Detector data, we find evidence that the rate of change of the average $X_\mathrm{max}$ with the logarithm of energy features three breaks at $6.5\pm0.6~(\mathrm{stat})\pm1~(\mathrm{sys})$ EeV, $11\pm 2~(\mathrm{stat})\pm1~(\mathrm{sys})$ EeV, and $31\pm5~(\mathrm{stat})\pm3~(\mathrm{sys})$ EeV, in the vicinity of the three prominent features (ankle, instep, suppression) of the cosmic-ray flux. The energy evolution of the mean and standard deviation of the measured $X_\mathrm{max}$ distributions indicates that the mass composition becomes increasingly heavier and purer, thus being incompatible with a large fraction of light nuclei between 50 EeV and 100 EeV.
Submitted 6 February, 2025; v1 submitted 10 June, 2024;
originally announced June 2024.
-
Model predictive control for tracking using artificial references: Fundamentals, recent results and practical implementation
Authors:
Pablo Krupa,
Johannes Köhler,
Antonio Ferramosca,
Ignacio Alvarado,
Melanie N. Zeilinger,
Teodoro Alamo,
Daniel Limon
Abstract:
This paper provides a comprehensive tutorial on a family of Model Predictive Control (MPC) formulations, known as MPC for tracking, which are characterized by including an artificial reference as part of the decision variables in the optimization problem. These formulations have several benefits with respect to the classical MPC formulations, including guaranteed recursive feasibility under online reference changes, as well as asymptotic stability and an increased domain of attraction. This tutorial paper introduces the concept of using an artificial reference in MPC, presenting the benefits and theoretical guarantees obtained by its use. We then provide a survey of the main advances and extensions of the original linear MPC for tracking, including its non-linear extension. Additionally, we discuss its application to learning-based MPC, and discuss optimization aspects related to its implementation.
Submitted 9 December, 2024; v1 submitted 10 June, 2024;
originally announced June 2024.
-
Challenge-Device-Synthesis: A multi-disciplinary approach for the development of social innovation competences for students of Artificial Intelligence
Authors:
Matías Bilkis,
Joan Moya Kohler,
Fernando Vilariño
Abstract:
The advent of Artificial Intelligence is expected to imply profound changes in the short-term. It is therefore imperative for Academia, and particularly for the Computer Science scope, to develop cross-disciplinary tools that bond AI developments to their social dimension. To this aim, we introduce the Challenge-Device-Synthesis methodology (CDS), in which a specific challenge is presented to the students of AI, who are required to develop a device as a solution for the challenge. The device becomes the object of study for the different dimensions of social transformation, and the conclusions addressed by the students during the discussion around the device are presented in a synthesis piece in the shape of a 10-page scientific paper. The latter is evaluated taking into account both the depth of analysis and the level to which it genuinely reflects the social transformations associated with the proposed AI-based device. We provide data obtained during the pilot for the implementation phase of CDS within the subject of Social Innovation, a 6-ECTS subject from the 6th semester of the Degree of Artificial Intelligence, UAB-Barcelona. We provide details on temporalisation, task distribution, methodological tools used and assessment delivery procedure, as well as qualitative analysis of the results obtained.
Submitted 29 May, 2024;
originally announced May 2024.
-
Adaptive tracking MPC for nonlinear systems via online linear system identification
Authors:
Tatiana Strelnikova,
Johannes Köhler,
Julian Berberich
Abstract:
This paper presents an adaptive tracking model predictive control (MPC) scheme to control unknown nonlinear systems based on an adaptively estimated linear model. The model is determined based on linear system identification using a moving window of past measurements, and it serves as a local approximation of the underlying nonlinear dynamics. We prove that the presented scheme ensures practical exponential stability of the (unknown) optimal reachable equilibrium for a given output setpoint. Finally, we apply the proposed scheme in simulation and compare it to an alternative direct data-driven MPC scheme based on the Fundamental Lemma.
Submitted 16 May, 2024;
originally announced May 2024.
-
Imagine Flash: Accelerating Emu Diffusion Models with Backward Distillation
Authors:
Jonas Kohler,
Albert Pumarola,
Edgar Schönfeld,
Artsiom Sanakoyeu,
Roshan Sumbaly,
Peter Vajda,
Ali Thabet
Abstract:
Diffusion models are a powerful generative framework, but come with expensive inference. Existing acceleration methods often compromise image quality or fail under complex conditioning when operating in an extremely low-step regime. In this work, we propose a novel distillation framework tailored to enable high-fidelity, diverse sample generation using just one to three steps. Our approach comprises three key components: (i) Backward Distillation, which mitigates training-inference discrepancies by calibrating the student on its own backward trajectory; (ii) Shifted Reconstruction Loss that dynamically adapts knowledge transfer based on the current time step; and (iii) Noise Correction, an inference-time technique that enhances sample quality by addressing singularities in noise prediction. Through extensive experiments, we demonstrate that our method outperforms existing competitors in quantitative metrics and human evaluations. Remarkably, it achieves performance comparable to the teacher model using only three denoising steps, enabling efficient high-quality generation.
Submitted 8 May, 2024;
originally announced May 2024.