MRSD: Multi-Resolution Skill Discovery for HRL Agents

Authors: Shashank Sharma, Janina Hoffmann, Vinay Namboodiri

Abstract: Hierarchical reinforcement learning (HRL) relies on abstract skills to solve long-horizon tasks efficiently. While existing skill discovery methods learn these skills automatically, they are limited to a single skill per task. In contrast, humans learn and use both fine-grained and coarse motor skills simultaneously. Inspired by human motor control, we propose Multi-Resolution Skill Discovery (MRSD), an HRL framework that learns multiple skill encoders at different temporal resolutions in parallel. A high-level manager dynamically selects among these skills, enabling adaptive control strategies over time. We evaluate MRSD on tasks from the DeepMind Control Suite and show that it outperforms prior state-of-the-art skill discovery and HRL methods, achieving faster convergence and higher final performance. Our findings highlight the benefits of integrating multi-resolution skills in HRL, paving the way for more versatile and efficient agents.

Submitted 27 May, 2025; originally announced May 2025.

arXiv:2505.19078

doi 10.4204/EPTCS.420

Proceedings 16th International Workshop on Programming Language Approaches to Concurrency and Communication-cEntric Software

Authors: Farzaneh Derakhshan, Jan Hoffmann

Abstract: This volume contains the proceedings of PLACES 2025, the 16th edition of the Workshop on Programming Language Approaches to Concurrency and Communication-cEntric Software. The workshop is scheduled to take place in Hamilton, Canada, on May 4, 2025, as a satellite event of ETAPS, the European Joint Conferences on Theory and Practice of Software. PLACES offers a forum for exchanging new ideas on how to address the challenges of concurrent and distributed programming and how to improve the foundations of modern and future computer applications. PLACES welcomes researchers from various fields, and its topics include the design of new programming languages, models for concurrent and distributed systems, type systems, program verification, and applications in various areas (e.g., microservices, sensor networks, blockchains, event processing, business process management).

Submitted 25 May, 2025; originally announced May 2025.

Journal ref: EPTCS 420, 2025

arXiv:2505.01353 [pdf, other]

Differentiable Nonlinear Model Predictive Control

Authors: Jonathan Frey, Katrin Baumgärtner, Gianluca Frison, Dirk Reinhardt, Jasper Hoffmann, Leonard Fichtner, Sebastien Gros, Moritz Diehl

Abstract: The efficient computation of parametric solution sensitivities is a key challenge in the integration of learning-enhanced methods with nonlinear model predictive control (MPC), as their availability is crucial for many learning algorithms. While approaches presented in the machine learning community are limited to convex or unconstrained formulations, this paper discusses the computation of solution sensitivities of general nonlinear programs (NLPs) using the implicit function theorem (IFT) and smoothed optimality conditions treated in interior-point methods (IPM). We detail sensitivity computation within a sequential quadratic programming (SQP) method which employs an IPM for the quadratic subproblems. The publication is accompanied by an efficient open-source implementation within the framework, providing both forward and adjoint sensitivities for general optimal control problems, achieving speedups exceeding 3x over the state-of-the-art solver mpc.pytorch.

Submitted 2 May, 2025; originally announced May 2025.

Comments: 19 pages, 4 figures, 2 tables

arXiv:2505.00439 [pdf, other]

Per-Domain Generalizing Policies: On Validation Instances and Scaling Behavior

Authors: Timo P. Gros, Nicola J. Müller, Daniel Fiser, Isabel Valera, Verena Wolf, Jörg Hoffmann

Abstract: Recent work has shown that successful per-domain generalizing action policies can be learned. Scaling behavior, from small training instances to large test instances, is the key objective; and the use of validation instances larger than training instances is one key to achieve it. Prior work has used fixed validation sets. Here, we introduce a method generating the validation set dynamically, on the fly, increasing instance size so long as informative and feasible. We also introduce refined methodology for evaluating scaling behavior, generating test instances systematically to guarantee a given confidence in coverage performance for each instance size. In experiments, dynamic validation improves scaling behavior of GNN policies in all 9 domains used.

Submitted 1 May, 2025; originally announced May 2025.

Comments: 7 pages, 3 tables, 3 figures, 3 algorithms

arXiv:2504.01431 [pdf, other]

Multi-convex Programming for Discrete Latent Factor Models Prototyping

Authors: Hao Zhu, Shengchao Yan, Jasper Hoffmann, Joschka Boedecker

Abstract: Discrete latent factor models (DLFMs) are widely used in various domains such as machine learning, economics, neuroscience, psychology, etc. Currently, fitting a DLFM to some dataset relies on a customized solver for individual models, which requires lots of effort to implement and is limited to the targeted specific instance of DLFMs. In this paper, we propose a generic framework based on CVXPY, which allows users to specify and solve the fitting problem of a wide range of DLFMs, including both regression and classification models, within a very short script. Our framework is flexible and inherently supports the integration of regularization terms and constraints on the DLFM parameters and latent factors, such that the users can easily prototype the DLFM structure according to their dataset and application scenario. We introduce our open-source Python implementation and illustrate the framework in several examples.

Submitted 2 April, 2025; originally announced April 2025.

MSC Class: 90C25 (Primary); 90C59; 90C90

arXiv:2503.05662 [pdf, other]

On Mitigating Affinity Bias through Bandits with Evolving Biased Feedback

Authors: Matthew Faw, Constantine Caramanis, Jessica Hoffmann

Abstract: Unconscious bias has been shown to influence how we assess our peers, with consequences for hiring, promotions and admissions. In this work, we focus on affinity bias, the component of unconscious bias which leads us to prefer people who are similar to us, despite no deliberate intention of favoritism. In a world where the people hired today become part of the hiring committee of tomorrow, we are particularly interested in understanding (and mitigating) how affinity bias affects this feedback loop. This problem has two distinctive features: 1) we only observe the biased value of a candidate, but we want to optimize with respect to their real value; 2) the bias towards a candidate with a specific set of traits depends on the fraction of people in the hiring committee with the same set of traits. We introduce a new bandits variant that exhibits those two features, which we call affinity bandits. Unsurprisingly, classical algorithms such as UCB often fail to identify the best arm in this setting. We prove a new instance-dependent regret lower bound, which is larger than that in the standard bandit setting by a multiplicative function of $K$. Since we treat rewards that are time-varying and dependent on the policy's past actions, deriving this lower bound requires developing proof techniques beyond the standard bandit techniques. Finally, we design an elimination-style algorithm which nearly matches this regret, despite never observing the real rewards.

Submitted 7 March, 2025; originally announced March 2025.

arXiv:2503.03654 [pdf, other]

Improving Neutral Point of View Text Generation through Parameter-Efficient Reinforcement Learning and a Small-Scale High-Quality Dataset

Authors: Jessica Hoffmann, Christiane Ahlheim, Zac Yu, Aria Walfrand, Jarvis Jin, Marie Tano, Ahmad Beirami, Erin van Liemt, Nithum Thain, Hakim Sidahmed, Lucas Dixon

Abstract: This paper describes the construction of a dataset and the evaluation of training methods to improve generative large language models' (LLMs) ability to answer queries on sensitive topics with a Neutral Point of View (NPOV), i.e., to provide significantly more informative, diverse and impartial answers. The dataset, the SHQ-NPOV dataset, comprises 300 high-quality, human-written quadruplets: a query on a sensitive topic, an answer, an NPOV rating, and a set of links to source texts elaborating the various points of view. The first key contribution of this paper is a new methodology to create such datasets through iterative rounds of human peer-critique and annotator training, which we release alongside the dataset. The second key contribution is the identification of a highly effective training regime for parameter-efficient reinforcement learning (PE-RL) to improve NPOV generation. We compare and extensively evaluate PE-RL and multiple baselines, including LoRA finetuning (a strong baseline), SFT and RLHF. PE-RL not only improves on overall NPOV quality compared to the strongest baseline ($97.06\%\rightarrow 99.08\%$), but also scores much higher on features linguists identify as key to separating good answers from the best answers ($60.25\%\rightarrow 85.21\%$ for presence of supportive details, $68.74\%\rightarrow 91.43\%$ for absence of oversimplification). A qualitative analysis corroborates this. Finally, our evaluation finds no statistical differences between results on topics that appear in the training dataset and those on separated evaluation topics, which provides strong evidence that our approach to training PE-RL exhibits very effective out-of-topic generalization.

Submitted 5 March, 2025; originally announced March 2025.

arXiv:2502.15088 [pdf, other]

doi 10.1103/PhysRevB.111.134108

Soft phonon and the central peak at the cubic-to-tetragonal phase transition in SrTiO$_3$

Authors: Avishek Maity, Klaus Habicht, Michael Merz, Ayman H. Said, Christo Guguschev, Danny Kojda, Britta Ryll, Jan-Ekkehard Hoffmann, Andrea Dittmar, Thomas Keller, Frank Weber

Abstract: The continuous displacive phase transition in SrTiO$_3$ near $T_c \approx 105$ K features a central elastic peak in neutron scattering investigations at temperatures above $T_c$, i.e., before the corresponding soft phonon mode is overdamped upon cooling. The origin of this central peak is still not understood. Here, we report an inelastic x-ray scattering investigation of the cubic-to-tetragonal phase transition in SrTiO$_3$. We quantitatively compare measurements of the soft phonon mode on two differently grown samples and discuss the findings regarding results from thermodynamic and transport probes such as specific heat and thermal conductivity. Furthermore, we use inelastic x-ray scattering to perform elastic scans with both high momentum- and milli-electronvolt energy-resolution and are thus able to separate elastic intensities of the central peak from low-energy quasielastic phonon scattering. Our results indicate that the evolution of the soft mode is similar in both samples though the intensities of the central peak differ by a factor of four. Measurements revealing anisotropic correlation lengths on cooling towards $T_c$ indicate that local properties of the crystals to which collective lattice excitations are insensitive are likely at the origin of the central elastic line in SrTiO$_3$.

Submitted 29 March, 2025; v1 submitted 20 February, 2025; originally announced February 2025.

Comments: Manuscript contains 9 pages, 4 figures. Supplementary information contains 31 pages, 17 figures, 3 tables

Journal ref: Physical Review B, 111, 134108 (2025)

arXiv:2502.12676 [pdf]

A thin film source in a solid-state diffusion experiment: CoO on SrTiO3

Authors: Qian Ma, Jan Erik Rybak, Natalie Jacqueline Ottinger, Timo Kassubek, Jörg Hoffmann, Karl-Michael Weitzel, Cynthia A. Volkert, Christian Jooss

Abstract: To realize a chemical diffusion experiment for simple quantitative analysis of one-dimensional diffusion profiles requires the fabrication of a planar and chemically sharp interface between two phases, one serving as the diffusion source and the other as the material to be studied. We demonstrate a thin film source on top of single crystals or epitaxial films for the example of cobalt (II) oxide (CoO) grown on top of SrTiO3 (STO) by ion beam sputtering. After deposition at room temperature, a nanocrystalline film with flat and chemically sharp interface is present. Diffusion annealing leads to a partial formation of the Co3O4 phase and recrystallization accompanied by a strong increase of the surface and the interface roughness. We report the conditions under which compact and stable CoO layers with flat interface can be maintained, serving as a constant source for Co diffusion. As an example, the formation of a Co-diffusion profile is demonstrated after annealing for 240 h at 1163 K and comparatively studied using three different methods: energy dispersive x-ray spectroscopy (EDX) in a transmission electron microscope (TEM), atom probe tomography (APT) and time of flight secondary ion mass spectroscopy (TOF SIMS). Local and rather macroscopic concentration profiling agree well within error.

Submitted 18 February, 2025; originally announced February 2025.

arXiv:2502.02133 [pdf, other]

Synthesis of Model Predictive Control and Reinforcement Learning: Survey and Classification

Authors: Rudolf Reiter, Jasper Hoffmann, Dirk Reinhardt, Florian Messerer, Katrin Baumgärtner, Shamburaj Sawant, Joschka Boedecker, Moritz Diehl, Sebastien Gros

Abstract: The fields of MPC and RL consider two successful control techniques for Markov decision processes. Both approaches are derived from similar fundamental principles, and both are widely used in practical applications, including robotics, process control, energy systems, and autonomous driving. Despite their similarities, MPC and RL follow distinct paradigms that emerged from diverse communities and different requirements. Various technical discrepancies, particularly the role of an environment model as part of the algorithm, lead to methodologies with nearly complementary advantages. Due to their orthogonal benefits, research interest in combination methods has recently increased significantly, leading to a large and growing set of complex ideas leveraging MPC and RL. This work illuminates the differences, similarities, and fundamentals that allow for different combination algorithms and categorizes existing work accordingly. Particularly, we focus on the versatile actor-critic RL approach as a basis for our categorization and examine how the online optimization approach of MPC can be used to improve the overall closed-loop performance of a policy.

Submitted 4 February, 2025; originally announced February 2025.

arXiv:2502.01956 [pdf, other]

DHP: Discrete Hierarchical Planning for Hierarchical Reinforcement Learning Agents

Authors: Shashank Sharma, Janina Hoffmann, Vinay Namboodiri

Abstract: Hierarchical Reinforcement Learning (HRL) agents often struggle with long-horizon visual planning due to their reliance on error-prone distance metrics. We propose Discrete Hierarchical Planning (DHP), a method that replaces continuous distance estimates with discrete reachability checks to evaluate subgoal feasibility. DHP recursively constructs tree-structured plans by decomposing long-term goals into sequences of simpler subtasks, using a novel advantage estimation strategy that inherently rewards shorter plans and generalizes beyond training depths. In addition, to address the data efficiency challenge, we introduce an exploration strategy that generates targeted training examples for the planning modules without needing expert data. Experiments in 25-room navigation environments demonstrate $100\%$ success rate (vs $82\%$ baseline) and $73$-step average episode length (vs $158$-step baseline). The method also generalizes to momentum-based control tasks and requires only $\log N$ steps for replanning. Theoretical analysis and ablations validate our design choices.

Submitted 27 May, 2025; v1 submitted 3 February, 2025; originally announced February 2025.

arXiv:2501.16188 [pdf, other]

$\bar{b}\bar{b}ud$ Tetraquarks with $I(J^P)=0(1^-)$ and $\bar{b}\bar{c}ud$ Tetraquarks with $I(J^P)=0(0^+)$ and $I(J^P)=0(1^+)$ from Lattice QCD Antistatic-Antistatic Potentials

Authors: Jakob Hoffmann, Lasse Müller, Marc Wagner

Abstract: We study heavy spin effects in $\bar{b}\bar{b}ud$ and $\bar{b}\bar{c}ud$ four-quark systems using the Born-Oppenheimer approximation and existing antistatic-antistatic potentials computed with lattice QCD. We report on a recent refined investigation of the $\bar{b}\bar{b}ud$ system with $I(J^P)=0(1^-)$, where we predicted a tetraquark resonance slightly above the $B^{*}B^{*}$ threshold. Furthermore, we extend our Born-Oppenheimer approach to $\bar{b}\bar{c}ud$ four-quark systems. For quantum numbers $I(J^P)=0(0^+)$ as well as $I(J^P)=0(1^+)$ we find virtual bound states rather far away from the lowest meson-meson thresholds.

Submitted 27 January, 2025; originally announced January 2025.

Comments: 9 pages, 2 figures

arXiv:2412.06607 [pdf, other]

doi 10.1103/PhysRevD.111.054507

Prediction of an $I(J^{P})=0(1^{-})$ $\bar{b}\bar{b}ud$ Tetraquark Resonance Close to the $B^\ast B^\ast$ Threshold Using Lattice QCD Potentials

Authors: Jakob Hoffmann, Marc Wagner

Abstract: We use antistatic-antistatic potentials computed with lattice QCD and a coupled-channel Born-Oppenheimer approach to explore the existence of a $\bar{b} \bar{b} u d$ tetraquark resonance with quantum numbers $I(J^P) = 0(1^-)$. A pole in the $\mbox{T}$ matrix signals a resonance with mass $m = 2 m_B + 94.0^{+1.3}_{-5.4} \, \text{MeV}$ and decay width $\Gamma = 140^{+86}_{-66} \, \text{MeV}$, i.e. very close to the $B^\ast B^\ast$ threshold. We also compute branching ratios, which clearly indicate that this resonance is mainly composed of a $B^\ast B^\ast$ meson pair with a significantly smaller $B B$ contribution. By varying the potential matrix responsible for the coupling of the $B B$ and the $B^\ast B^\ast$ channel as well as the $b$ quark mass, we provide additional insights and understanding concerning the formation and existence of the resonance. We also comment on the importance of our findings and the main takeaways for a possible future full lattice QCD investigation of this $I(J^P) = 0(1^-)$ $\bar{b} \bar{b} u d$ tetraquark resonance.

Submitted 21 March, 2025; v1 submitted 9 December, 2024; originally announced December 2024.

Comments: 20 pages, 6 figures

Journal ref: Phys. Rev. D 111 (2025), 054507

arXiv:2409.11207 [pdf, other]

doi 10.1109/TIM.2024.3378310

Comparison of Impedance Matching Networks for Scanning Microwave Microscopy

Authors: Johannes Hoffmann, Sophie de Preville, Bruno Eckmann, Hung-Ju Lin, Benedikt Herzog, Kamel Haddadi, Didier Theron, Georg Gramse, Damien Richert, Jose Moran-Meza, Francois Piquemal

Abstract: In this paper, a definition of the gain and added noise of impedance matching networks for scanning microwave microscopy is given. This definition can be used to compare different impedance matching techniques independently of the instrument used to measure the S-parameter. As a demonstration, impedance matching devices consisting of a Beatty line, a tuner, and interferometric setups with and without amplifiers have been investigated. Measurement frequencies up to 28 GHz are used, and the maximal resulting gain found was 9504.7 per Siemens.

Submitted 17 September, 2024; originally announced September 2024.

Comments: IEEE Transactions on Instrumentation and Measurement (2024)

arXiv:2408.06876 [pdf, other]

Decision-Focused Learning to Predict Action Costs for Planning

Authors: Jayanta Mandi, Marco Foschini, Daniel Holler, Sylvie Thiebaux, Jorg Hoffmann, Tias Guns

Abstract: In many automated planning applications, action costs can be hard to specify. An example is the time needed to travel through a certain road segment, which depends on many factors, such as the current weather conditions. A natural way to address this issue is to learn to predict these parameters based on input features (e.g., weather forecasts) and use the predicted action costs in automated planning afterward. Decision-Focused Learning (DFL) has been successful in learning to predict the parameters of combinatorial optimization problems in a way that optimizes solution quality rather than prediction quality. This approach yields better results than treating prediction and optimization as separate tasks. In this paper, we investigate for the first time the challenges of implementing DFL for automated planning in order to learn to predict the action costs. There are two main challenges to overcome: (1) planning systems are called during gradient descent learning, to solve planning problems with negative action costs, which are not supported in planning. We propose novel methods for gradient computation to avoid this issue. (2) DFL requires repeated planner calls during training, which can limit the scalability of the method. We experiment with different methods approximating the optimal plan as well as an easy-to-implement caching mechanism to speed up the learning process. As the first work that addresses DFL for automated planning, we demonstrate that the proposed gradient computation consistently yields significantly better plans than predictions aimed at minimizing prediction error; and that caching can temper the computation requirements.

Submitted 26 August, 2024; v1 submitted 13 August, 2024; originally announced August 2024.

arXiv:2408.05937 [pdf, other]

doi 10.1093/mnras/stae131

The impact of the FREDDA dedispersion algorithm on $H_0$ estimations with FRBs

Authors: Jordan Hoffmann, Clancy W. James, Hao Qiu, Marcin Glowacki, Keith W. Bannister, Vivek Gupta, Jason X. Prochaska, Apurba Bera, Adam T. Deller, Kelly Gourdji, Lachlan Marnoch, Stuart D. Ryder, Danica R. Scott, Ryan M. Shannon, Nicolas Tejos

Abstract: Fast radio bursts (FRBs) are transient radio signals of extragalactic origins that are subjected to propagation effects such as dispersion and scattering. It follows then that these signals hold information regarding the medium they have traversed and are hence useful as cosmological probes of the Universe. Recently, FRBs were used to make an independent measure of the Hubble Constant $H_0$, promising to resolve the Hubble tension given a sufficient number of detected FRBs. Such cosmological studies are dependent on FRB population statistics, cosmological parameters and detection biases, and thus it is important to accurately characterise each of these. In this work, we empirically characterise the sensitivity of the Fast Real-time Engine for Dedispersing Amplitudes (FREDDA) which is the current detection system for the Australian Square Kilometer Array Pathfinder (ASKAP). We coherently redisperse high-time resolution data of 13 ASKAP-detected FRBs and inject them into FREDDA to determine the recovered signal-to-noise ratios as a function of dispersion measure (DM). We find that for 11 of the 13 FRBs, these results are consistent with injecting idealised pulses. Approximating this sensitivity function with theoretical predictions results in a systematic error of 0.3$\,$km$\,$s$^{-1}\,$Mpc$^{-1}$ on $H_0$ when it is the only free parameter. Allowing additional parameters to vary could increase this systematic by up to $\sim1\,$km$\,$s$^{-1}\,$Mpc$^{-1}$. We estimate that this systematic will not be relevant until $\sim$400 localised FRBs have been detected, but will likely be significant in resolving the Hubble tension.

Submitted 12 August, 2024; originally announced August 2024.

Comments: 8 pages, 6 figures, Published in MNRAS

arXiv:2408.04878 [pdf, other]

Modelling DSA, FAST and CRAFT surveys in a z-DM analysis and constraining a minimum FRB energy

Authors: Jordan Hoffmann, Clancy W. James, Marcin Glowacki, Jason X. Prochaska, Alexa C. Gordon, Adam T. Deller, Ryan M. Shannon, Stuart D. Ryder

Abstract: Fast radio burst (FRB) science primarily revolves around two facets: the origin of these bursts and their use in cosmological studies. This work follows from previous redshift-dispersion measure ($z$-DM) analyses in which we model instrumental biases and simultaneously fit population parameters and cosmological parameters to the observed population of FRBs. This sheds light on both the progenitors of FRBs and cosmological questions. Previously, we have completed similar analyses with data from the Australian Square Kilometer Array Pathfinder (ASKAP) and the Murriyang (Parkes) Multibeam system. With this manuscript, we additionally incorporate data from the Deep Synoptic Array (DSA) and the Five-hundred-meter Aperture Spherical Telescope (FAST), invoke a Markov chain Monte Carlo (MCMC) sampler and implement uncertainty in the Galactic DM contributions. The latter leads to larger uncertainties in derived model parameters than previous estimates despite the additional data. We provide refined constraints on FRB population parameters and derive a new constraint on the minimum FRB energy of log$\,E_{\mathrm{min}}$(erg)=39.49$^{+0.39}_{-1.48}$ which is significantly higher than bursts detected from strong repeaters. This result may indicate a low-energy turnover in the luminosity function or may suggest that strong repeaters have a different luminosity function to single bursts. We also predict that FAST will detect 25-41% of their FRBs at $z \gtrsim 2$ and DSA will detect 2-12% of their FRBs at $z \gtrsim 1$.

Submitted 9 August, 2024; originally announced August 2024.

Comments: 17 pages, 7 figures, submitted to PASA

arXiv:2406.03995 [pdf, other]

AC4MPC: Actor-Critic Reinforcement Learning for Nonlinear Model Predictive Control

Authors: Rudolf Reiter, Andrea Ghezzi, Katrin Baumgärtner, Jasper Hoffmann, Robert D. McAllister, Moritz Diehl

Abstract: Model predictive control (MPC) and reinforcement learning (RL) are two powerful control strategies with, arguably, complementary advantages. In this work, we show how actor-critic RL techniques can be leveraged to improve the performance of MPC. The RL critic is used as an approximation of the optimal value function, and an actor roll-out provides an initial guess for primal variables of the MPC. A parallel control architecture is proposed where each MPC instance is solved twice for different initial guesses. Besides the actor roll-out initialization, a shifted initialization from the previous solution is used. Thereafter, the actor and the critic are again used to approximately evaluate the infinite horizon cost of these trajectories. The control actions from the lowest-cost trajectory are applied to the system at each time step. We establish that the proposed algorithm is guaranteed to outperform the original RL policy plus an error term that depends on the accuracy of the critic and decays with the horizon length of the MPC formulation. Moreover, we do not require globally optimal solutions for these guarantees to hold. The approach is demonstrated on an illustrative toy example and an autonomous driving (AD) overtaking scenario.

Submitted 6 June, 2024; originally announced June 2024.

arXiv:2405.21008 [pdf, other]

Continuation of Bianchi Spacetimes Through The Big Bang

Authors: Josh Hoffmann, David Sloan

Abstract: In this paper we present a framework in which the relational description of General Relativity can be used to smoothly continue cosmological dynamical systems through the Big Bang without invoking quantum gravity effects. Cosmological spacetimes contain as a key dynamical variable a notion of scale through the volume factor $ν$. However, no cosmological observer is ever able to separate their measuring apparatus from the system they are measuring; in that sense every measurement is a relative one and measurable dynamical variables are in fact dimensionless ratios. This is manifest in the identification of a scaling symmetry or "Dynamical Similarity" in the Einstein-Hilbert action associated with the volume factor. By quotienting out this scaling symmetry, we form a relational system defined on a contact manifold whose dynamical variables are decoupled from scale. When the phase space is reduced to shape space, we show that there exist unique solutions to the equations of motion that pass smoothly through the initial cosmological singularity in flat FLRW, Bianchi I and Quiescent Bianchi IX cosmologies.

Submitted 31 May, 2024; originally announced May 2024.

Comments: 52 pages, 38 figures

arXiv:2405.08178 [pdf, ps, other]

A Theoretical Framework for Self-Gravitating k-Form Boson Stars with Internal Symmetries

Authors: Jakob Hoffmann, Cédric Jockel

Abstract: Current boson star models are largely restricted to global symmetries and lower spin fields. In this work, we generalize these systems of self-gravitating bosonic fields to allow for arbitrary totally antisymmetric tensor fields and arbitrary internal gauge symmetries. We construct a generalized formalism for Yang-Mills-like theories, which allows for arbitrary k-form fields, instead of just vector fields. The k-form fields have gauge symmetries described by semisimple, compact Lie groups. We further derive equations of motion for the k-form fields and connection coefficients of the Lie group. Extensions and applications are also discussed. We present a novel way to fix the group connection using a spacetime connection. As an example, we derive explicitly the connection coefficients for SU(2) in a spherically symmetric spacetime using rectangular vielbeins. The combination of methods presented leads to a powerful, adaptable and practical framework. As a proof of concept, we derive ordinary differential equations for a 0-form field with an SU(2) symmetry. Our framework can be used to model self-gravitating (multi) particle states with internal symmetries, such as pion condensates or dark matter. It is also suited as a tool to approach open problems in modified gravity and string theory.

Submitted 13 May, 2024; originally announced May 2024.

Comments: 58 pages including appendix, both authors are first authors

arXiv:2405.02598 [pdf, other]

UDUC: An Uncertainty-driven Approach for Learning-based Robust Control

Authors: Yuan Zhang, Jasper Hoffmann, Joschka Boedecker

Abstract: Learning-based techniques have become popular in both model predictive control (MPC) and reinforcement learning (RL). Probabilistic ensemble (PE) models offer a promising approach for modelling system dynamics, showcasing the ability to capture uncertainty and scalability in high-dimensional control scenarios. However, PE models are susceptible to mode collapse, resulting in non-robust control when faced with environments slightly different from the training set. In this paper, we introduce the uncertainty-driven robust control (UDUC) loss as an alternative objective for training PE models, drawing inspiration from contrastive learning. We analyze the robustness of UDUC loss through the lens of robust optimization and evaluate its performance on the challenging Real-world Reinforcement Learning (RWRL) benchmark, which involves significant environmental mismatches between the training and testing environments.

Submitted 4 May, 2024; originally announced May 2024.

arXiv:2404.18863 [pdf, other]

PlanNetX: Learning an Efficient Neural Network Planner from MPC for Longitudinal Control

Authors: Jasper Hoffmann, Diego Fernandez, Julien Brosseit, Julian Bernhard, Klemens Esterle, Moritz Werling, Michael Karg, Joschka Boedecker

Abstract: Model predictive control (MPC) is a powerful, optimization-based approach for controlling dynamical systems. However, the computational complexity of online optimization can be problematic on embedded devices, especially when we need to guarantee fixed control frequencies. Thus, previous work proposed to reduce the computational burden using imitation learning (IL), approximating the MPC policy by a neural network. In this work, we instead learn the whole planned trajectory of the MPC. We introduce a combination of a novel neural network architecture, PlanNetX, and a simple loss function based on the state trajectory that leverages the parameterized optimal control structure of the MPC. We validate our approach in the context of autonomous driving by learning a longitudinal planner and benchmarking it extensively in the CommonRoad simulator using synthetic scenarios and scenarios derived from real data. Our experimental results show that we can learn the open-loop MPC trajectory with high accuracy while improving the closed-loop performance of the learned control policy over other baselines like behavior cloning.

Submitted 22 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

Comments: 6th Annual Learning for Dynamics & Control Conference (L4DC 2024)

arXiv:2403.10704 [pdf, other]

Parameter Efficient Reinforcement Learning from Human Feedback

Authors: Hakim Sidahmed, Samrat Phatale, Alex Hutcheson, Zhuonan Lin, Zhang Chen, Zac Yu, Jarvis Jin, Simral Chaudhary, Roman Komarytsia, Christiane Ahlheim, Yonghao Zhu, Bowen Li, Saravanan Ganesh, Bill Byrne, Jessica Hoffmann, Hassan Mansoor, Wei Li, Abhinav Rastogi, Lucas Dixon

Abstract: While Reinforcement Learning from Human Feedback (RLHF) effectively aligns pretrained Large Language and Vision-Language Models (LLMs and VLMs) with human preferences, its computational cost and complexity hamper its wider adoption. To alleviate some of the computational burden of fine-tuning, parameter efficient methods like LoRA were introduced. In this work, we empirically evaluate the setup of Parameter Efficient Reinforcement Learning from Human Feedback (PE-RLHF) that leverages LoRA fine-tuning for Reward Modeling and Reinforcement Learning. We benchmark the PE-RLHF setup on six diverse datasets spanning summarization, harmless/helpful response generation, UI automation, and visual question answering in terms of effectiveness of the trained models and the training resources required. Our findings show, for the first time, that PE-RLHF achieves comparable performance to RLHF, while significantly reducing training time (up to 90% faster for reward models, and 30% faster for RL) and memory footprint (up to 50% reduction for reward models, and 27% for RL). We provide comprehensive ablations across LoRA ranks and model sizes for both reward modeling and reinforcement learning. By mitigating the computational burden associated with RLHF, we push for a broader adoption of PE-RLHF as an alignment technique for LLMs and VLMs.

Submitted 12 September, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

arXiv:2403.08904 [pdf, other]

Detecting Hallucination and Coverage Errors in Retrieval Augmented Generation for Controversial Topics

Authors: Tyler A. Chang, Katrin Tomanek, Jessica Hoffmann, Nithum Thain, Erin van Liemt, Kathleen Meier-Hellstern, Lucas Dixon

Abstract: We explore a strategy to handle controversial topics in LLM-based chatbots based on Wikipedia's Neutral Point of View (NPOV) principle: acknowledge the absence of a single true answer and surface multiple perspectives. We frame this as retrieval augmented generation, where perspectives are retrieved from a knowledge base and the LLM is tasked with generating a fluent and faithful response from the given perspectives. As a starting point, we use a deterministic retrieval system and then focus on common LLM failure modes that arise during this approach to text generation, namely hallucination and coverage errors. We propose and evaluate three methods to detect such errors based on (1) word-overlap, (2) salience, and (3) LLM-based classifiers. Our results demonstrate that LLM-based classifiers, even when trained only on synthetic errors, achieve high error detection performance, with ROC AUC scores of 95.3% for hallucination and 90.5% for coverage error detection on unambiguous error cases. We show that when no training data is available, our other methods still yield good results on hallucination (84.0%) and coverage error (85.2%) detection.

Submitted 13 March, 2024; originally announced March 2024.

Comments: Accepted at LREC-COLING 2024

arXiv:2402.02992 [pdf, other]

Decoding-time Realignment of Language Models

Authors: Tianlin Liu, Shangmin Guo, Leonardo Bianco, Daniele Calandriello, Quentin Berthet, Felipe Llinares, Jessica Hoffmann, Lucas Dixon, Michal Valko, Mathieu Blondel

Abstract: Aligning language models with human preferences is crucial for reducing errors and biases in these models. Alignment techniques, such as reinforcement learning from human feedback (RLHF), are typically cast as optimizing a tradeoff between human preference rewards and a proximity regularization term that encourages staying close to the unaligned model. Selecting an appropriate level of regularization is critical: insufficient regularization can lead to reduced model capabilities due to reward hacking, whereas excessive regularization hinders alignment. Traditional methods for finding the optimal regularization level require retraining multiple models with varying regularization strengths. This process, however, is resource-intensive, especially for large models. To address this challenge, we propose decoding-time realignment (DeRa), a simple method to explore and evaluate different regularization strengths in aligned models without retraining. DeRa enables control over the degree of alignment, allowing users to smoothly transition between unaligned and aligned models. It also enhances the efficiency of hyperparameter tuning by enabling the identification of effective regularization strengths using a validation dataset.

Submitted 24 May, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

Comments: In Proceedings of the 41st International Conference on Machine Learning (ICML 2024)

arXiv:2312.11805 [pdf, other]

Gemini: A Family of Highly Capable Multimodal Models

Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1326 additional authors not shown)

Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI.

Submitted 9 May, 2025; v1 submitted 18 December, 2023; originally announced December 2023.

arXiv:2312.11270 [pdf, other]

Modelling the Lymphatic Metastatic Progression Pathways of OPSCC from Multi-Institutional Datasets

Authors: Roman Ludwig, Adrian Schubert, Dorothea Barbatei, Lauence Bauwens, Jean-Marc Hoffmann, Sandrine Werlen, Olgun Elicin, Matthias Dettmer, Philippe Zrounba, Bertrand Pouymayou, Panagiotis Balermpas, Vincent Grégoire, Roland Giger, Jan Unkelbach

Abstract: The elective clinical target volume (CTV-N) in oropharyngeal squamous cell carcinoma (OPSCC) is currently based mostly on the prevalence of lymph node metastases in different lymph node levels (LNLs) for a given primary tumor location. We present a probabilistic model for ipsilateral lymphatic spread that can quantify the microscopic nodal involvement risk based on an individual patient's T-category and clinical involvement of LNLs at diagnosis. We extend a previously published hidden Markov model (HMM), which models the LNLs (I, II, III, IV, V, and VII) as hidden binary random variables (RVs). Each represents a patient's true state of lymphatic involvement. Clinical involvement at diagnosis represents the observed binary RVs linked to the true state via sensitivity and specificity. The primary tumor and the hidden RVs are connected in a graph. Each edge represents the conditional probability of metastatic spread per abstract time-step, given disease at the edge's starting node. To learn these probabilities, we draw Markov chain Monte Carlo samples from the likelihood of a dataset (686 OPSCC patients) from three institutions. We compute the model evidence using thermodynamic integration for different graphs to determine which describes the data best. The graph maximizing the model evidence connects the tumor to each LNL and the LNLs I through V in order. It predicts the risk of occult disease in level IV is below 5% if level III is clinically negative, and that the risk of occult disease in level V is below 5% except for advanced T-category (T3 and T4) patients with clinical involvement of levels II, III, and IV. The provided statistical model of nodal involvement in OPSCC patients trained on multi-institutional data may guide the design of clinical trials on volume-deescalated treatment of OPSCC and contribute to more personal guidelines on elective nodal treatment.

Submitted 21 December, 2023; v1 submitted 18 December, 2023; originally announced December 2023.

Comments: 17 pages, 12 figures, 7 tables, submitted to Physics in Medicine and Biology

arXiv:2311.09830 [pdf, other]

Automating the Generation of Prompts for LLM-based Action Choice in PDDL Planning

Authors: Katharina Stein, Daniel Fišer, Jörg Hoffmann, Alexander Koller

Abstract: Large language models (LLMs) have revolutionized a large variety of NLP tasks. An active debate is to what extent they can do reasoning and planning. Prior work has assessed the latter in the specific context of PDDL planning, based on manually converting three PDDL domains into natural language (NL) prompts. Here we automate this conversion step, showing how to leverage an LLM to automatically generate NL prompts from PDDL input. Our automatically generated NL prompts result in similar LLM-planning performance as the previous manually generated ones. Beyond this, the automation enables us to run much larger experiments, providing for the first time a broad evaluation of LLM planning performance in PDDL. Our NL prompts yield better performance than PDDL prompts and simple template-based NL prompts. Compared to symbolic planners, LLM planning lags far behind; but in some domains, our best LLM configuration scales up further than A$^\star$ using LM-cut.

Submitted 2 May, 2025; v1 submitted 16 November, 2023; originally announced November 2023.

Comments: Extended version of the paper from the ICAPS'25 proceedings (same main part + additional appendix)

arXiv:2310.10199 [pdf, other]

Impact of Data Synthesis Strategies for the Classification of Craniosynostosis

Authors: Matthias Schaufelberger, Reinald Peter Kühle, Andreas Wachter, Frederic Weichel, Niclas Hagen, Friedemann Ringwald, Urs Eisenmann, Jürgen Hoffmann, Michael Engel, Christian Freudlsperger, Werner Nahm

Abstract: Introduction: Photogrammetric surface scans provide a radiation-free option to assess and classify craniosynostosis. Due to the low prevalence of craniosynostosis and high patient restrictions, clinical data is rare. Synthetic data could support or even replace clinical data for the classification of craniosynostosis, but this has never been studied systematically. Methods: We test the combinations of three different synthetic data sources: a statistical shape model (SSM), a generative adversarial network (GAN), and image-based principal component analysis for a convolutional neural network (CNN)-based classification of craniosynostosis. The CNN is trained only on synthetic data, but validated and tested on clinical data. Results: The combination of an SSM and a GAN achieved an accuracy of more than 0.96 and an F1-score of more than 0.95 on the unseen test set. The difference to training on clinical data was smaller than 0.01. Including a second image modality improved classification performance for all data sources. Conclusion: Without a single clinical training sample, a CNN was able to classify head deformities as accurately as if it had been trained on clinical data. Using multiple data sources was key for a good classification based on synthetic data alone. Synthetic data might play an important future role in the assessment of craniosynostosis.

Submitted 16 October, 2023; originally announced October 2023.

arXiv:2309.08042 [pdf, other]

Towards Large-scale Building Attribute Mapping using Crowdsourced Images: Scene Text Recognition on Flickr and Problems to be Solved

Authors: Yao Sun, Anna Kruspe, Liqiu Meng, Yifan Tian, Eike J Hoffmann, Stefan Auer, Xiao Xiang Zhu

Abstract: Crowdsourced platforms provide huge amounts of street-view images that contain valuable building information. This work addresses the challenges in applying Scene Text Recognition (STR) in crowdsourced street-view images for building attribute mapping. We use Flickr images, particularly examining texts on building facades. A Berlin Flickr dataset is created, and pre-trained STR models are used for text detection and recognition. Manual checking on a subset of STR-recognized images demonstrates high accuracy. We examined the correlation between STR results and building functions, and analysed instances where texts were recognized on residential buildings but not on commercial ones. Further investigation revealed significant challenges associated with this task, including small text regions in street-view images, the absence of ground truth labels, and mismatches in buildings in Flickr images and building footprints in OpenStreetMap (OSM). To develop city-wide mapping beyond urban hotspot locations, we suggest differentiating the scenarios where STR proves effective while developing appropriate algorithms or bringing in additional data for handling other cases. Furthermore, interdisciplinary collaboration should be undertaken to understand the motivation behind building photography and labeling. The STR-on-Flickr results are publicly available at https://github.com/ya0-sun/STR-Berlin.

Submitted 14 September, 2023; originally announced September 2023.

arXiv:2309.01261 [pdf, other]

doi 10.46298/lmcs-20(4:26)2024

Worst-Case Input Generation for Concurrent Programs under Non-Monotone Resource Metrics

Authors: Long Pham, Jan Hoffmann

Abstract: Worst-case input generation aims to automatically generate inputs that exhibit the worst-case performance of programs. It has several applications, and can, for example, detect vulnerabilities to denial-of-service (DoS) attacks. However, it is non-trivial to generate worst-case inputs for concurrent programs, particularly for resources like memory where the peak cost depends on how processes are scheduled. This article presents the first sound worst-case input generation algorithm for concurrent programs under non-monotone resource metrics like memory. The key insight is to leverage resource-annotated session types and symbolic execution. Session types describe communication protocols on channels in process calculi. Equipped with resource annotations, resource-annotated session types not only encode cost bounds but also indicate how many resources can be reused and transferred between processes. This information is critical for identifying a worst-case execution path during symbolic execution. The algorithm is sound: if it returns any input, it is guaranteed to be a valid worst-case input. The algorithm is also relatively complete: as long as resource-annotated session types are sufficiently expressive and the background theory for SMT solving is decidable, a worst-case input is guaranteed to be returned. A simple case study of a web server's memory usage demonstrates the utility of the worst-case input generation algorithm.

Submitted 21 December, 2024; v1 submitted 3 September, 2023; originally announced September 2023.

Journal ref: Logical Methods in Computer Science, Volume 20, Issue 4 (December 24, 2024) lmcs:12242

arXiv:2308.15470 [pdf, other]

Policy composition in reinforcement learning via multi-objective policy optimization

Authors: Shruti Mishra, Ankit Anand, Jordan Hoffmann, Nicolas Heess, Martin Riedmiller, Abbas Abdolmaleki, Doina Precup

Abstract: We enable reinforcement learning agents to learn successful behavior policies by utilizing relevant pre-existing teacher policies. The teacher policies are introduced as objectives, in addition to the task objective, in a multi-objective policy optimization setting. Using the Multi-Objective Maximum a Posteriori Policy Optimization algorithm (Abdolmaleki et al. 2020), we show that teacher policies can help speed up learning, particularly in the absence of shaping rewards. In two domains with continuous observation and action spaces, our agents successfully compose teacher policies in sequence and in parallel, and are also able to further extend the policies of the teachers in order to solve the task. Depending on the specified combination of task and teacher(s), teacher(s) may naturally act to limit the final performance of an agent. The extent to which agents are required to adhere to teacher policies is determined by hyperparameters, which determine both the effect of teachers on learning speed and the eventual performance of the agent on the task. In the humanoid domain (Tassa et al. 2018), we also equip agents with the ability to control the selection of teachers. With this ability, agents are able to meaningfully compose from the teacher policies to achieve a higher task reward on the walk task than in cases without access to the teacher policies. We show the resemblance of composed task policies with the corresponding teacher policies through videos.

Submitted 30 August, 2023; v1 submitted 29 August, 2023; originally announced August 2023.

arXiv:2308.03667 [pdf, other]

doi 10.1007/s10208-024-09684-5

Computing the noncommutative inner rank by means of operator-valued free probability theory

Authors: Johannes Hoffmann, Tobias Mai, Roland Speicher

Abstract: We address the noncommutative version of Edmonds' problem, which asks to determine the inner rank of a matrix in noncommuting variables. We provide an algorithm for the calculation of this inner rank by relating the problem to the distribution of a basic object in free probability theory, namely operator-valued semicircular elements. We have to solve a matrix-valued quadratic equation, for which we provide precise analytical and numerical control on the fixed point algorithm for solving the equation. Numerical examples show the efficiency of the algorithm.

Submitted 28 June, 2024; v1 submitted 7 August, 2023; originally announced August 2023.

Comments: In the second version we have not only improved the presentation of the results, but we supply in addition now actually also a certificate for the termination of our algorithm (this relies on recent theoretical results in the paper arxiv.org/abs/2406.15922)

MSC Class: 46L54; 65J15; 12E15

Journal ref: Found Comput Math (2024)

arXiv:2308.02041 [pdf]

Regulating AI: Applying insights from behavioural economics and psychology to the application of article 5 of the EU AI Act

Authors: Huixin Zhong, Eamonn O'Neill, Janina A. Hoffmann

Abstract: Article 5 of the European Union's Artificial Intelligence Act is intended to regulate AI use to prevent potentially harmful consequences. Nevertheless, applying this legislation practically is likely to be challenging because of ambiguously used terminologies and because it fails to specify which manipulation techniques may be invoked by AI, potentially leading to significant harm. This paper aims to bridge this gap by defining key terms and demonstrating how AI may invoke these techniques, drawing from insights in psychology and behavioural economics. First, this paper provides definitions of the terms "subliminal techniques", "manipulative techniques" and "deceptive techniques". Secondly, we identify from the literature in cognitive psychology and behavioural economics three subliminal and five manipulative techniques and exemplify how AI might implement these techniques to manipulate users in real-world case scenarios. These illustrations may serve as a practical guide for stakeholders to detect cases of AI manipulation and consequently devise preventive measures. Article 5 has also been criticised for offering inadequate protection. We critically assess the protection offered by Article 5, proposing specific revisions to paragraph 1, points (a) and (b) of Article 5 to increase its protective effectiveness.

Submitted 25 February, 2024; v1 submitted 24 July, 2023; originally announced August 2023.

Comments: This paper was accepted for publication at AAAI 2024 in December 2023

arXiv:2305.17300 [pdf, other]

Exploiting Large Neuroimaging Datasets to Create Connectome-Constrained Approaches for more Robust, Efficient, and Adaptable Artificial Intelligence

Authors: Erik C. Johnson, Brian S. Robinson, Gautam K. Vallabha, Justin Joyce, Jordan K. Matelsky, Raphael Norman-Tenazas, Isaac Western, Marisel Villafañe-Delgado, Martha Cervantes, Michael S. Robinette, Arun V. Reddy, Lindsey Kitchell, Patricia K. Rivlin, Elizabeth P. Reilly, Nathan Drenkow, Matthew J. Roos, I-Jeng Wang, Brock A. Wester, William R. Gray-Roncal, Joan A. Hoffmann

Abstract: Despite the progress in deep learning networks, efficient learning at the edge (enabling adaptable, low-complexity machine learning solutions) remains a critical need for defense and commercial applications. We envision a pipeline to utilize large neuroimaging datasets, including maps of the brain which capture neuron and synapse connectivity, to improve machine learning approaches. We have pursued different approaches within this pipeline structure. First, as a demonstration of data-driven discovery, the team has developed a technique for discovery of repeated subcircuits, or motifs. These were incorporated into a neural architecture search approach to evolve network architectures. Second, we have conducted analysis of the heading direction circuit in the fruit fly, which performs fusion of visual and angular velocity features, to explore augmenting existing computational models with new insight. Our team discovered a novel pattern of connectivity, implemented a new model, and demonstrated sensor fusion on a robotic platform. Third, the team analyzed circuitry for memory formation in the fruit fly connectome, enabling the design of a novel generative replay approach. Finally, the team has begun analysis of connectivity in mammalian cortex to explore potential improvements to transformer networks. These constraints increased network robustness on the most challenging examples in the CIFAR-10-C computer vision robustness benchmark task, while reducing learnable attention parameters by over an order of magnitude.
Taken together, these results demonstrate multiple potential approaches to utilize insight from neural systems for developing robust and efficient machine learning techniques.

Submitted 26 May, 2023; originally announced May 2023.

Comments: 11 pages, 4 figures

arXiv:2304.13627 [pdf, ps, other]

Automatic Amortized Resource Analysis with Regular Recursive Types

Authors: Jessie Grosen, David M. Kahn, Jan Hoffmann

Abstract: The goal of automatic resource bound analysis is to statically infer symbolic bounds on the resource consumption of the evaluation of a program. A longstanding challenge for automatic resource analysis is the inference of bounds that are functions of complex custom data structures. This article builds on type-based automatic amortized resource analysis (AARA) to address this challenge. AARA is based on the potential method of amortized analysis and reduces bound inference to standard type inference with additional linear constraint solving, even when deriving non-linear bounds. A key component of AARA is resource functions that generate the space of possible bounds for values of a given type while enjoying necessary closure properties. Existing work on AARA defined such functions for many data structures such as lists of lists but the question of whether such functions exist for arbitrary data structures remained open. This work answers this question positively by uniformly constructing resource polynomials for algebraic data structures defined by regular recursive types. These functions are a generalization of all previously proposed polynomial resource functions and can be seen as a general notion of polynomials for values of a given recursive type. A resource type system for FPC, a core language with recursive types, demonstrates how resource polynomials can be integrated with AARA while preserving all benefits of past techniques.
The article also proposes the use of new techniques useful for stating the rules of this type system and proving it sound. First, multivariate potential annotations are stated in terms of free semimodules, substantially abstracting details of the presentation of annotations and the proofs of their properties. Second, a logical relation giving semantic meaning to resource types enables a proof of soundness by a single induction on typing derivations.

Submitted 26 April, 2023; originally announced April 2023.

Comments: 15 pages, 5 figures; to be published in LICS'23

arXiv:2302.06541 [pdf, other]

Towards Agile Text Classifiers for Everyone

Authors: Maximilian Mozes, Jessica Hoffmann, Katrin Tomanek, Muhamed Kouate, Nithum Thain, Ann Yuan, Tolga Bolukbasi, Lucas Dixon

Abstract: Text-based safety classifiers are widely used for content moderation and increasingly to tune generative language model behavior - a topic of growing concern for the safety of digital assistants and chatbots. However, different policies require different classifiers, and safety policies themselves improve from iteration and adaptation. This paper introduces and evaluates methods for agile text classification, whereby classifiers are trained using small, targeted datasets that can be quickly developed for a particular policy. Experimenting with 7 datasets from three safety-related domains, comprising 15 annotation schemes, led to our key finding: prompt-tuning large language models, like PaLM 62B, with a labeled dataset of as few as 80 examples can achieve state-of-the-art performance. We argue that this enables a paradigm shift for text classification, especially for models supporting safer online discourse. Instead of collecting millions of examples to attempt to create universal safety classifiers over months or years, classifiers could be tuned using small datasets, created by individuals or small organizations, tailored for specific use cases, and iterated on and adapted in the time-span of a day.

Submitted 21 October, 2023; v1 submitted 13 February, 2023; originally announced February 2023.

Comments: Findings of EMNLP 2023

arXiv:2212.01607 [pdf, other]

A Hierarchical Approach for Strategic Motion Planning in Autonomous Racing

Authors: Rudolf Reiter, Jasper Hoffmann, Joschka Boedecker, Moritz Diehl

Abstract: We present an approach for safe trajectory planning, where a strategic task related to autonomous racing is learned sample-efficiently within a simulation environment. A high-level policy, represented as a neural network, outputs a reward specification that is used within the cost function of a parametric nonlinear model predictive controller (NMPC). By including constraints and vehicle kinematics in the NLP, we are able to guarantee safe and feasible trajectories related to the used model. Compared to classical reinforcement learning (RL), our approach restricts the exploration to safe trajectories, starts with a good prior performance and yields full trajectories that can be passed to a tracking lowest-level controller. We do not address the lowest-level controller in this work and assume perfect tracking of feasible trajectories. We show the superior performance of our algorithm on simulated racing tasks that include high-level decision making. The vehicle learns to efficiently overtake slower vehicles and to avoid getting overtaken by blocking faster vehicles.

Submitted 3 December, 2022; originally announced December 2022.

arXiv:2211.15765 [pdf, other]

Inclusion of heavy spin effects in the $u d \bar{b} \bar{b}$ $I(J^{P})=0(1^{-})$ four-quark channel in the Born-Oppenheimer approximation

Authors: Jakob Hoffmann, André Zimermmane-Santos, Marc Wagner

Abstract: We refine our previous study of a $u d \bar{b} \bar{b}$ tetraquark resonance with quantum numbers $I(J^{P})=0(1^{-})$, which is based on antiheavy-antiheavy lattice QCD potentials, by including heavy quark spin effects via the mass difference of the $B$ and the $B^{*}$ meson. This leads to a coupled channel Schrödinger equation, where the two channels correspond to $BB$ and $B^{*}B^{*}$, respectively. We search for $\mbox{T}$ matrix poles in the complex energy plane, but do not find any indication for the existence of a tetraquark resonance in this refined coupled channel approach. We also vary the antiheavy-antiheavy potentials as well as the $b$ quark mass to further understand the dynamics of this four-quark system.

Submitted 28 November, 2022; originally announced November 2022.

Comments: 9 pages, 4 figures, talk given at "The 39th International Symposium on Lattice Field Theory", 08th-13th August 2022, Bonn, Germany

arXiv:2211.00543 [pdf]

Geo-Information Harvesting from Social Media Data

Authors: Xiao Xiang Zhu, Yuanyuan Wang, Mrinalini Kochupillai, Martin Werner, Matthias Häberle, Eike Jens Hoffmann, Hannes Taubenböck, Devis Tuia, Alex Levering, Nathan Jacobs, Anna Kruspe, Karam Abdulahhad

Abstract: As unconventional sources of geo-information, massive imagery and text messages from open platforms and social media form a temporally quasi-seamless, spatially multi-perspective stream, but with unknown and diverse quality. Due to its complementarity to remote sensing data, geo-information from these sources offers promising perspectives, but harvesting is not trivial due to its data characteristics. In this article, we address key aspects in the field, including data availability, analysis-ready data preparation and data management, geo-information extraction from social media text messages and images, and the fusion of social media and remote sensing data. We then showcase some exemplary geographic applications. In addition, we present the first extensive discussion of ethical considerations of social media data in the context of geo-information harvesting and geographic applications. With this effort, we wish to stimulate curiosity and lay the groundwork for researchers who intend to explore social media data for geo-applications. We encourage the community to join forces by sharing their code and data.

Submitted 1 November, 2022; originally announced November 2022.

Comments: Accepted for publication IEEE Geoscience and Remote Sensing Magazine

arXiv:2208.09390 [pdf, other]

doi 10.1103/PhysRevD.107.023502

Regularization of Single Field Inflation Models

Authors: Josh Hoffmann, David Sloan

Abstract: There are many single field inflationary models that are consistent with the recent Planck 2018 measurements of the spectral index $n_s$ and tensor-to-scalar ratio $r$. Despite good agreement with observational data, some of these models suffer from having unregularized potentials which would produce a collapsing universe shortly after the end of inflation. In this paper we show that how one chooses to correct the behaviour of the potential towards the end of inflation can have a significant effect on the inflationary predictions of the model, specifically in the case of quartic hilltop and radiatively corrected Higgs inflation.

Submitted 23 November, 2022; v1 submitted 19 August, 2022; originally announced August 2022.

Comments: 25 pages, 25 figures

arXiv:2206.06054 [pdf, other]

Specifying and Testing $k$-Safety Properties for Machine-Learning Models

Authors: Maria Christakis, Hasan Ferit Eniser, Jörg Hoffmann, Adish Singla, Valentin Wüstholz

Abstract: Machine-learning models are becoming increasingly prevalent in our lives, for instance assisting in image-classification or decision-making tasks. Consequently, the reliability of these models is of critical importance and has resulted in the development of numerous approaches for validating and verifying their robustness and fairness. However, beyond such specific properties, it is challenging to specify, let alone check, general functional-correctness expectations from models. In this paper, we take inspiration from specifications used in formal methods, expressing functional-correctness properties by reasoning about $k$ different executions, so-called $k$-safety properties. Considering a credit-screening model of a bank, the expected property that "if a person is denied a loan and their income decreases, they should still be denied the loan" is a 2-safety property. Here, we show the wide applicability of $k$-safety properties for machine-learning models and present the first specification language for expressing them. We also operationalize the language in a framework for automatically validating such properties using metamorphic testing. Our experiments show that our framework is effective in identifying property violations, and that detected bugs could be used to train better models.

Submitted 13 June, 2022; originally announced June 2022.
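
The 2-safety loan property quoted in the abstract above can be illustrated with a small metamorphic test. This is only a sketch under stated assumptions: `toy_model`, `check_denial_monotonic`, and the applicant fields `income` and `payment` are hypothetical names invented for illustration, and the code does not reproduce the paper's specification language or framework.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Applicant = Dict[str, float]  # hypothetical record with "income" and "payment"


def toy_model(applicant: Applicant) -> bool:
    """Hypothetical credit model: approve when income covers 3x the payment."""
    return applicant["income"] >= 3 * applicant["payment"]


def check_denial_monotonic(
    model: Callable[[Applicant], bool],
    applicants: Iterable[Applicant],
    decreases: Iterable[float],
) -> List[Tuple[Applicant, Applicant]]:
    """Return pairs (a, b) that violate the 2-safety property:
    a is denied, b equals a with a lower income, yet b is approved."""
    decreases = list(decreases)
    violations = []
    for a in applicants:
        if model(a):
            continue  # the property only constrains denied applicants
        for d in decreases:
            b = dict(a, income=a["income"] - d)  # metamorphic follow-up input
            if model(b):
                violations.append((a, b))
    return violations


if __name__ == "__main__":
    people = [{"income": 600.0, "payment": 1000.0}]
    print(check_denial_monotonic(toy_model, people, [100.0, 200.0]))
```

The property relates two executions of the same model, which is why a single input-output check cannot express it; the test instead derives the second input from the first and compares the two decisions.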

arXiv:2204.08524 [pdf, other]

So2Sat POP -- A Curated Benchmark Data Set for Population Estimation from Space on a Continental Scale

Authors: Sugandha Doda, Yuanyuan Wang, Matthias Kahl, Eike Jens Hoffmann, Kim Ouan, Hannes Taubenböck, Xiao Xiang Zhu

Abstract: Obtaining a dynamic population distribution is key to many decision-making processes such as urban planning, disaster management and most importantly helping the government to better allocate socio-technical supply. To achieve these objectives, good population data is essential. The traditional method of collecting population data through the census is expensive and tedious. In recent years, statistical and machine learning methods have been developed to estimate population distribution. Most of the methods use data sets that are either developed on a small scale or not publicly available yet. Thus, the development and evaluation of new methods become challenging. We fill this gap by providing a comprehensive data set for population estimation in 98 European cities. The data set comprises a digital elevation model, local climate zone, land use proportions, nighttime lights in combination with multi-spectral Sentinel-2 imagery, and data from the Open Street Map initiative. We anticipate that it would be a valuable addition to the research community for the development of sophisticated approaches in the field of population estimation.

Submitted 10 November, 2022; v1 submitted 7 April, 2022; originally announced April 2022.

arXiv:2203.15556 [pdf, other]

Training Compute-Optimal Large Language Models

Authors: Jordan Hoffmann, Sebastian Borgeaud, Arthur Mensch, Elena Buchatskaya, Trevor Cai, Eliza Rutherford, Diego de Las Casas, Lisa Anne Hendricks, Johannes Welbl, Aidan Clark, Tom Hennigan, Eric Noland, Katie Millican, George van den Driessche, Bogdan Damoc, Aurelia Guy, Simon Osindero, Karen Simonyan, Erich Elsen, Jack W. Rae, Oriol Vinyals, Laurent Sifre

Abstract: We investigate the optimal model size and number of tokens for training a transformer language model under a given compute budget. We find that current large language models are significantly undertrained, a consequence of the recent focus on scaling language models whilst keeping the amount of training data constant. By training over 400 language models ranging from 70 million to over 16 billion parameters on 5 to 500 billion tokens, we find that for compute-optimal training, the model size and the number of training tokens should be scaled equally: for every doubling of model size the number of training tokens should also be doubled. We test this hypothesis by training a predicted compute-optimal model, Chinchilla, that uses the same compute budget as Gopher but with 70B parameters and 4$\times$ more data. Chinchilla uniformly and significantly outperforms Gopher (280B), GPT-3 (175B), Jurassic-1 (178B), and Megatron-Turing NLG (530B) on a large range of downstream evaluation tasks. This also means that Chinchilla uses substantially less compute for fine-tuning and inference, greatly facilitating downstream usage. As a highlight, Chinchilla reaches a state-of-the-art average accuracy of 67.5% on the MMLU benchmark, greater than a 7% improvement over Gopher.

Submitted 29 March, 2022; originally announced March 2022.
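
The equal-scaling rule in the abstract above can be worked through numerically. The sketch below assumes the common approximation that training compute is proportional to parameters times tokens (written here as 6·N·D FLOPs); that constant and the function names are assumptions for illustration and are not taken from the abstract.

```python
import math


def approx_flops(n_params: float, n_tokens: float) -> float:
    """Assumed rule of thumb: training FLOPs ~ 6 * parameters * tokens."""
    return 6.0 * n_params * n_tokens


def scale_compute_optimal(n_params: float, n_tokens: float, compute_factor: float):
    """Scale a reference (parameters, tokens) pair to a larger budget.

    Under equal scaling, both quantities grow by sqrt(compute_factor),
    so the tokens-per-parameter ratio of the reference is preserved.
    """
    s = math.sqrt(compute_factor)
    return n_params * s, n_tokens * s


if __name__ == "__main__":
    # Hypothetical reference point: 1e9 parameters trained on 2e10 tokens.
    n, d = scale_compute_optimal(1e9, 2e10, compute_factor=4.0)
    print(n, d)  # doubling the model size also doubles the tokens

    # Consistency check using only the abstract's figures: a 70B model
    # on 4x the data matches the budget of a 280B model on 1x the data.
    tokens = 1.0  # arbitrary unit; only the ratio matters
    print(math.isclose(approx_flops(280e9, tokens), approx_flops(70e9, 4 * tokens)))
```

Under this assumption, quadrupling compute doubles both model size and token count, which matches the abstract's statement that every doubling of model size should be accompanied by a doubling of training tokens.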

arXiv:2203.09361 [pdf, other]

Expressivity of Planning with Horn Description Logic Ontologies (Technical Report)

Authors: Stefan Borgwardt, Jörg Hoffmann, Alisa Kovtunova, Markus Krötzsch, Bernhard Nebel, Marcel Steinmetz

Abstract: State constraints in AI Planning globally restrict the legal environment states. Standard planning languages make closed-domain and closed-world assumptions. Here we address open-world state constraints formalized by planning over a description logic (DL) ontology. Previously, this combination of DL and planning has been investigated for the light-weight DL DL-Lite. Here we propose a novel compilation scheme into standard PDDL with derived predicates, which applies to more expressive DLs and is based on the rewritability of DL queries into Datalog with stratified negation. We also provide a new rewritability result for the DL Horn-ALCHOIQ, which allows us to apply our compilation scheme to quite expressive ontologies. In contrast, we show that in the slight extension Horn-SROIQ no such compilation is possible unless the weak exponential hierarchy collapses. Finally, we show that our approach can outperform previous work on existing benchmarks for planning with DL ontologies, and is feasible on new benchmarks taking advantage of more expressive ontologies. This is an extended version of a paper accepted at AAAI 22.

Submitted 17 March, 2022; originally announced March 2022.

Comments: 16 pages with appendix

MSC Class: 68 ACM Class: I.2.4; I.2.8

arXiv:2202.07315 [pdf, other]

Using Social Media Images for Building Function Classification

Authors: Eike Jens Hoffmann, Karam Abdulahhad, Xiao Xiang Zhu

Abstract: Urban land use on a building instance level is crucial geo-information for many applications, yet difficult to obtain. An intuitive approach to close this gap is predicting building functions from ground level imagery. Social media image platforms contain billions of images, with a large variety of motifs including but not limited to street perspectives. To cope with this issue this study proposes a filtering pipeline to yield high quality, ground level imagery from large social media image datasets. The pipeline ensures that all resulting images have full and valid geotags with a compass direction to relate image content and spatial objects from maps. We analyze our method on a culturally diverse social media dataset from Flickr with more than 28 million images from 42 cities around the world. The obtained dataset is then evaluated in the context of a 3-class building function classification task. The three building classes that are considered in this study are: commercial, residential, and other. Fine-tuned state-of-the-art architectures yield F1-scores of up to 0.51 on the filtered images. Our analysis shows that the performance is highly limited by the quality of the labels obtained from OpenStreetMap, as the metrics increase by 0.2 if only human validated labels are considered. Therefore, we consider these labels to be weak and publish the resulting images from our pipeline together with the buildings they are showing as a weakly labeled dataset.

Submitted 15 February, 2022; originally announced February 2022.

arXiv:2202.01169 [pdf, other]

Unified Scaling Laws for Routed Language Models

Authors: Aidan Clark, Diego de las Casas, Aurelia Guy, Arthur Mensch, Michela Paganini, Jordan Hoffmann, Bogdan Damoc, Blake Hechtman, Trevor Cai, Sebastian Borgeaud, George van den Driessche, Eliza Rutherford, Tom Hennigan, Matthew Johnson, Katie Millican, Albin Cassirer, Chris Jones, Elena Buchatskaya, David Budden, Laurent Sifre, Simon Osindero, Oriol Vinyals, Jack Rae, Erich Elsen, Koray Kavukcuoglu, et al. (1 additional author not shown)

Abstract: The performance of a language model has been shown to be effectively modeled as a power-law in its parameter count. Here we study the scaling behaviors of Routing Networks: architectures that conditionally use only a subset of their parameters while processing an input. For these models, parameter count and computational requirement form two independent axes along which an increase leads to better performance. In this work we derive and justify scaling laws defined on these two variables which generalize those known for standard language models and describe the performance of a wide range of routing architectures trained via three different techniques. Afterwards we provide two applications of these laws: first deriving an Effective Parameter Count along which all models scale at the same rate, and then using the scaling coefficients to give a quantitative comparison of the three routing techniques considered. Our analysis derives from an extensive evaluation of Routing Networks across five orders of magnitude of size, including models with hundreds of experts and hundreds of billions of parameters.

Submitted 9 February, 2022; v1 submitted 2 February, 2022; originally announced February 2022.

Comments: Fixing typos and affiliation clarity

arXiv:2201.03744 [pdf, other]

doi 10.1093/mnras/stac333

AT2019azh: an unusually long-lived, radio-bright thermal tidal disruption event

Authors: A. J. Goodwin, S. van Velzen, J. C. A. Miller-Jones, A. Mummery, M. F. Bietenholz, A. Wederfoort, E. Hammerstein, C. Bonnerot, J. Hoffmann, L. Yan

Abstract: Tidal disruption events (TDEs) occur when a star is destroyed by a supermassive black hole at the center of a galaxy, temporarily increasing the accretion rate onto the black hole and producing a bright flare across the electromagnetic spectrum. Radio observations of TDEs trace outflows and jets that may be produced. Radio detections of the outflows from TDEs are uncommon, with only about one third of TDEs discovered to date having published radio detections. Here we present over two years of comprehensive, multi-radio frequency monitoring observations of the tidal disruption event AT2019azh taken with the Very Large Array (VLA) and MeerKAT radio telescopes from approximately 10 days pre-optical peak to 810 days post-optical peak. AT2019azh shows unusual radio emission for a thermal TDE, as it brightened very slowly over two years, and showed fluctuations in the synchrotron energy index of the optically thin synchrotron emission from 450 days post-disruption. Based on the radio properties, we deduce that the outflow in this event is likely non-relativistic and could be explained by a spherical outflow arising from self-stream intersections, or a mildly collimated outflow from accretion onto the supermassive black hole. This data-set provides a significant contribution to the observational database of outflows from TDEs, including the earliest radio detection of a non-relativistic TDE to date, relative to the optical discovery.

Submitted 10 January, 2022; originally announced January 2022.

Comments: 17 pages, 8 figures. Submitted to MNRAS. Comments welcome!

arXiv:2201.03288 [pdf, other]

A statistical shape model for radiation-free assessment and classification of craniosynostosis

Authors: Matthias Schaufelberger, Reinald Peter Kühle, Andreas Wachter, Frederic Weichel, Niclas Hagen, Friedemann Ringwald, Urs Eisenmann, Jürgen Hoffmann, Michael Engel, Christian Freudlsperger, Werner Nahm

Abstract: The assessment of craniofacial deformities requires patient data which is sparsely available. Statistical shape models provide realistic and synthetic data enabling comparisons of existing methods on a common dataset. We build the first publicly available statistical 3D head model of craniosynostosis patients and the first model focusing on infants younger than 1.5 years. We further present a shape-model-based classification pipeline to distinguish between three different classes of craniosynostosis and a control group on photogrammetric surface scans. To the best of our knowledge, our study uses the largest dataset of craniosynostosis patients in a classification study for craniosynostosis and statistical shape modeling to date. We demonstrate that our shape model performs similarly to other statistical shape models of the human head. Craniosynostosis-specific pathologies are represented in the first eigenmodes of the model. Regarding the automatic classification of craniosynostosis, our classification approach yields an accuracy of 97.8%, comparable to other state-of-the-art methods using both computed tomography scans and stereophotogrammetry. Our publicly available, craniosynostosis-specific statistical shape model enables the assessment of craniosynostosis on realistic and synthetic data. We further present a state-of-the-art shape-model-based classification approach for a radiation-free diagnosis of craniosynostosis.

Submitted 28 March, 2022; v1 submitted 10 January, 2022; originally announced January 2022.

arXiv:2112.11446

Scaling Language Models: Methods, Analysis & Insights from Training Gopher

Authors: Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, Eliza Rutherford, Tom Hennigan, Jacob Menick, Albin Cassirer, Richard Powell, George van den Driessche, Lisa Anne Hendricks, Maribeth Rauh, Po-Sen Huang, Amelia Glaese, Johannes Welbl, Sumanth Dathathri, Saffron Huang, Jonathan Uesato, John Mellor, et al. (55 additional authors not shown)

Abstract: Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and understand the world. In this paper, we present an analysis of Transformer-based language model performance across a wide range of model scales -- from models with tens of millions of parameters up to a 280 billion parameter model called Gopher. These models are evaluated on 152 diverse tasks, achieving state-of-the-art performance across the majority. Gains from scale are largest in areas such as reading comprehension, fact-checking, and the identification of toxic language, but logical and mathematical reasoning see less benefit. We provide a holistic analysis of the training dataset and model's behaviour, covering the intersection of model scale with bias and toxicity. Finally, we discuss the application of language models to AI safety and the mitigation of downstream harms.

Submitted 21 January, 2022; v1 submitted 8 December, 2021; originally announced December 2021.

Comments: 120 pages
