-
Designing Corrosion-Resistant CoCrNi Medium Entropy Alloys via Short-Range Order Modification
Authors:
Elaf A. Anber,
Debashish Sur,
Annie K. Barnett,
Daniel L. Foley,
Andrew M. Minor,
Brian L. DeCost,
Howie Joress,
Anatoly I. Frenkel,
Michael L. Falk,
John R. Scully,
Mitra L. Taheri
Abstract:
Equiatomic CoCrNi medium entropy alloys are known for their unique properties linked to chemical short-range order (CSRO), crucial in both percolation processes and/or nucleation and growth processes influencing alloy passivation in aqueous environments. This study combines extended x-ray absorption fine structure, atomistic simulations, electrochemical methods, x-ray photoelectron spectroscopy, and transmission electron microscopy to explore CSRO evolution, passive film formation, as well as its characteristics in the as-homogenized CoCrNi condition, both before and after aging treatment. Results reveal a shift in local alloying element bonding environments post-aging, with simulations indicating increased Cr-Cr CSRO in 2nd nearest neighbor shells. Enhanced passive film formation kinetics and superior protection of the aged alloy in harsh acidified 3 mol/L NaCl solution indicate improved aqueous passivation correlated with Cr-Cr CSRO. This work establishes a direct connection between alloy CSRO and aqueous passivation in CoCrNi, highlighting its potential for tailored corrosion-resistant applications.
Submitted 11 June, 2025;
originally announced June 2025.
-
Counterculture Stars: Slow and Retrograde Stars with Low-Alpha Disk Abundances
Authors:
Carrie Filion,
Michael S. Petersen,
Danny Horta,
Kathryne J. Daniel,
Madeline Lucey,
Adrian M. Price-Whelan
Abstract:
The Milky Way is home to a thin disk that can be defined via kinematics and/or elemental abundances. The elemental abundance-defined thin disk, also called the low-alpha disk, is generally thought to be comprised of stars on planar, circular orbits that approximate the circular velocity curve. While this is an apt description for the majority of stars with thin-disk-like abundances, there are a number of interesting exceptions. In this analysis, we identify and investigate $\sim 70$ stars with thin-disk-like abundances and very slow or retrograde Galactocentric azimuthal velocities. These stars could be kinematical outliers of the thin disk or elemental abundance outliers of the halo. Focusing first on the former, we introduce a number of mechanisms that could alter a thin disk orbit and cause the azimuthal velocity to become slow or retrograde. We then determine signatures for each mechanism and assess whether that mechanism is unlikely, plausible, or consistent given each star's reported properties. We find that at least one mechanism is plausible for each star, and the mechanism with the highest number of consistent candidate stars is dynamical ejection from stellar clusters. We next discuss scenarios that could produce halo stars with thin disk abundances, and again identify stars that could be connected to these mechanisms. With this sample we investigate rare processes, such as binary disruption by the central supermassive black hole, while also providing a unique perspective into the chemo-dynamics and structural components of the Milky Way.
Submitted 11 June, 2025;
originally announced June 2025.
-
The Emergence of Abstract Thought in Large Language Models Beyond Any Language
Authors:
Yuxin Chen,
Yiran Zhao,
Yang Zhang,
An Zhang,
Kenji Kawaguchi,
Shafiq Joty,
Junnan Li,
Tat-Seng Chua,
Michael Qizhe Shieh,
Wenxuan Zhang
Abstract:
As large language models (LLMs) continue to advance, their capacity to function effectively across a diverse range of languages has shown marked improvement. Preliminary studies observe that the hidden activations of LLMs often resemble English, even when responding to non-English prompts. This has led to the widespread assumption that LLMs may "think" in English. However, more recent results showing strong multilingual performance, even surpassing English performance on specific tasks in other languages, challenge this view. In this work, we find that LLMs progressively develop a core language-agnostic parameter space: a remarkably small subset of parameters whose deactivation results in significant performance degradation across all languages. This compact yet critical set of parameters underlies the model's ability to generalize beyond individual languages, supporting the emergence of abstract thought that is not tied to any specific linguistic system. Specifically, we identify language-related neurons, those that are consistently activated during the processing of particular languages, and categorize them as either shared (active across multiple languages) or exclusive (specific to one). As LLMs undergo continued development over time, we observe a marked increase in both the proportion and functional importance of shared neurons, while exclusive neurons progressively diminish in influence. These shared neurons constitute the backbone of the core language-agnostic parameter space, supporting the emergence of abstract thought. Motivated by these insights, we propose neuron-specific training strategies tailored to LLMs' language-agnostic levels at different development stages. Experiments across diverse LLM families support our approach.
Submitted 11 June, 2025;
originally announced June 2025.
-
Radon Transforms and the SYK model
Authors:
Michael Stone
Abstract:
Motivated by recent work on the Sachdev-Ye-Kitaev (SYK) model, we consider the effect of Radon, or X-ray, transformations on the Laplace eigenfunctions in hyperbolic Bolyai-Lobachevsky space. We show that the Radon map from this space to Lorentzian-signature Anti-de Sitter or de Sitter space is easier to interpret if we use the Poincaré disc model and eigenfunctions rather than the upper-half-plane model. In particular, this version of the transform reveals the geometric origin of the boundary conditions imposed on the eigenfunctions that are involved in calculating the SYK four-point function.
Submitted 20 June, 2025; v1 submitted 11 June, 2025;
originally announced June 2025.
-
IntPhys 2: Benchmarking Intuitive Physics Understanding In Complex Synthetic Environments
Authors:
Florian Bordes,
Quentin Garrido,
Justine T Kao,
Adina Williams,
Michael Rabbat,
Emmanuel Dupoux
Abstract:
We present IntPhys 2, a video benchmark designed to evaluate the intuitive physics understanding of deep learning models. Building on the original IntPhys benchmark, IntPhys 2 focuses on four core principles related to macroscopic objects: Permanence, Immutability, Spatio-Temporal Continuity, and Solidity. These conditions are inspired by research into intuitive physical understanding emerging during early childhood. IntPhys 2 offers a comprehensive suite of tests, based on the violation of expectation framework, that challenge models to differentiate between possible and impossible events within controlled and diverse virtual environments. Alongside the benchmark, we provide performance evaluations of several state-of-the-art models. Our findings indicate that while these models demonstrate basic visual understanding, they face significant challenges in grasping intuitive physics across the four principles in complex scenes, with most models performing at chance levels (50%), in stark contrast to human performance, which achieves near-perfect accuracy. This underscores the gap between current models and human-like intuitive physics understanding, highlighting the need for advancements in model architectures and training methodologies.
Submitted 11 June, 2025;
originally announced June 2025.
-
Cross-Channel Unlabeled Sensing over a Union of Signal Subspaces
Authors:
Taulant Koka,
Manolis C. Tsakiris,
Benjamín Béjar Haro,
Michael Muma
Abstract:
Cross-channel unlabeled sensing addresses the problem of recovering a multi-channel signal from measurements that were shuffled across channels. This work expands the cross-channel unlabeled sensing framework to signals that lie in a union of subspaces. The extension allows for handling more complex signal structures and broadens the framework to tasks like compressed sensing. These mismatches between samples and channels often arise in applications such as whole-brain calcium imaging of freely moving organisms or multi-target tracking. We improve over previous models by deriving tighter bounds on the required number of samples for unique reconstruction, while supporting more general signal types. The approach is validated through an application in whole-brain calcium imaging, where organism movements disrupt sample-to-neuron mappings. This demonstrates the utility of our framework in real-world settings with imprecise sample-channel associations, achieving accurate signal reconstruction.
Submitted 11 June, 2025;
originally announced June 2025.
-
Don't be Afraid of Cell Complexes! An Introduction from an Applied Perspective
Authors:
Josef Hoppe,
Vincent P. Grande,
Michael T. Schaub
Abstract:
Cell complexes (CCs) are a higher-order network model deeply rooted in algebraic topology that has gained interest in signal processing and network science recently. However, while the processing of signals supported on CCs can be described in terms of easily-accessible algebraic or combinatorial notions, the commonly presented definition of CCs is grounded in abstract concepts from topology and remains disconnected from the signal processing methods developed for CCs. In this paper, we aim to bridge this gap by providing a simplified definition of CCs that is accessible to a wider audience and can be used in practical applications. Specifically, we first introduce a simplified notion of abstract regular cell complexes (ARCCs). These ARCCs only rely on notions from algebra and can be shown to be equivalent to regular cell complexes for most practical applications. Second, using this new definition we provide an accessible introduction to (abstract) cell complexes from a perspective of network science and signal processing. Furthermore, as many practical applications work with CCs of dimension 2 and below, we provide an even simpler definition for this case that significantly simplifies understanding and working with CCs in practice.
Submitted 11 June, 2025;
originally announced June 2025.
-
Non-Euclidean dual gradient ascent for entropically regularized linear and semidefinite programming
Authors:
Yuhang Cai,
Michael Lindsey
Abstract:
We present an optimization framework that exhibits dimension-independent convergence on a broad class of semidefinite programs (SDPs). Our approach first regularizes the primal problem with the von Neumann entropy, then solves the regularized problem using dual gradient ascent with respect to a problem-adapted norm. In particular, we show that the dual gradient norm converges to zero at a rate independent of the ambient dimension and, via rounding arguments, construct primal-feasible solutions in certain special cases. We also derive explicit convergence rates for the objective. In order to achieve optimal computational scaling, we must accommodate the use of stochastic gradients constructed via randomized trace estimators. Throughout, we illustrate the generality of our framework via three important special cases: the Goemans-Williamson SDP relaxation of the Max-Cut problem, the optimal transport linear program, and several SDP relaxations of the permutation synchronization problem. Numerical experiments confirm that our methods achieve dimension-independent convergence in practice.
Submitted 11 June, 2025;
originally announced June 2025.
-
Multi-Qubit Parity Gates for Rydberg Atoms in Various Configurations
Authors:
Javad Kazemi,
Michael Schuler,
Christian Ertler,
Wolfgang Lechner
Abstract:
We present a native approach for realizing multi-qubit parity phase gates in neutral atom systems through global phase modulation of a Rydberg excitation laser. By shaping the temporal profile of the laser's phase, we enable high fidelity, time efficient entangling operations between multiple qubits without requiring individual qubit addressing. To mitigate intrinsic noise sources including spontaneous decay and motional effects, we develop a noise-aware optimal control framework that reduces gate errors under the presence of noise while maintaining smooth pulse profiles suitable for experimental implementation. In addition to equidistant qubit arrangements, we explore the impact of non-equidistant atomic configurations, where interaction inhomogeneity becomes significant. In these cases, the flexibility of our control approach helps to compensate for such variations, supporting reliable gate performance across different spatial layouts. These results facilitate the practical implementation of complex, multi-qubit quantum operations in near-term neutral atom quantum processors.
Submitted 11 June, 2025;
originally announced June 2025.
-
SyncFed: Time-Aware Federated Learning through Explicit Timestamping and Synchronization
Authors:
Baran Can Gül,
Stefanos Tziampazis,
Nasser Jazdi,
Michael Weyrich
Abstract:
As Federated Learning (FL) expands to larger and more distributed environments, consistency in training is challenged by network-induced delays, clock unsynchronicity, and variability in client updates. This combination of factors may contribute to misaligned contributions that undermine model reliability and convergence. Existing methods like staleness-aware aggregation and model versioning address lagging updates heuristically, yet lack mechanisms to quantify staleness, especially in latency-sensitive and cross-regional deployments. In light of these considerations, we introduce SyncFed, a time-aware FL framework that employs explicit synchronization and timestamping to establish a common temporal reference across the system. Staleness is quantified numerically based on exchanged timestamps under the Network Time Protocol (NTP), enabling the server to reason about the relative freshness of client updates and apply temporally informed weighting during aggregation. Our empirical evaluation on a geographically distributed testbed shows that, under SyncFed, the global model evolves within a stable temporal context, resulting in improved accuracy and information freshness compared to round-based baselines devoid of temporal semantics.
Submitted 11 June, 2025;
originally announced June 2025.
-
Brillouin-Mandelstam scattering in telecommunications optical fiber at millikelvin temperatures
Authors:
E. A. Cryer-Jenkins,
A. C. Leung,
H. Rathee,
A. K. C. Tan,
K. D. Major,
M. R. Vanner
Abstract:
Brillouin-Mandelstam scattering is a strong and readily accessible optical nonlinearity enabling a wide array of applications and research directions. For instance, the three-wave mixing process has been employed to great success for narrow-linewidth lasers, sensing applications, microscopy, and signal processing. While most of these avenues focus on room temperature operation, there is now increasing interest in cryogenic operation owing to the scattering mechanism's significant potential for applications and fundamental physics at low temperatures. Here, we measure the Brillouin scattering spectrum in standard single-mode telecommunications optical fiber at millikelvin temperatures using a closed-cycle dilution refrigerator and optical heterodyne detection. Our experiments are performed with a cryostat temperature from 50 mK to 27 K, extending previously reported measurements that utilized liquid helium-4 cryostats with temperatures greater than 1 K. At millikelvin temperatures, our experiment observes coherent acoustic interaction with microscopic defects of the amorphous material - two-level-systems (TLS) - which has not been previously observed in optical fiber. The measured behaviour of the linewidth with temperature is in agreement with well-established models of ultrasonic attenuation in amorphous materials comprising a background intrinsic scattering, thermally-activated scattering, and incoherent and coherent TLS interaction. This work provides a foundation for a wide range of applications and further research including sensing applications, new approaches to investigate TLS physics, and Brillouin-scattering-based quantum science and technology.
Submitted 11 June, 2025;
originally announced June 2025.
-
Causal effects on non-terminal event time with application to antibiotic usage and future resistance
Authors:
Tamir Zehavi,
Uri Obolski,
Michal Chowers,
Daniel Nevo
Abstract:
Comparing future antibiotic resistance levels resulting from different antibiotic treatments is challenging because some patients may survive only under one of the antibiotic treatments. We embed this problem within a semi-competing risks approach to study the causal effect on resistant infection, treated as a non-terminal event time. We argue that existing principal stratification estimands for such problems exclude patients for whom a causal effect is well-defined and is of clinical interest. Therefore, we present a new principal stratum, the infected-or-survivors (ios). The ios is the subpopulation of patients who would have survived or been infected under both antibiotic treatments. This subpopulation is more inclusive than previously defined subpopulations. We target the causal effect among these patients, which we term the feasible-infection causal effect (FICE). We develop large-sample bounds under novel assumptions, and discuss the plausibility of these assumptions in our application. As an alternative, we derive FICE identification using two illness-death models with a bivariate frailty random variable. These two models are connected by a cross-world correlation parameter. Estimation is performed by an expectation-maximization algorithm followed by a Monte Carlo procedure. We apply our methods to detailed clinical data obtained from a hospital setting.
Submitted 11 June, 2025;
originally announced June 2025.
-
Disorder-induced suppression of superconductivity in infinite-layer nickelates
Authors:
Abhishek Ranna,
Romain Grasset,
Martin Gonzalez,
Kyuho Lee,
Bai Yang Wang,
Edgar Abarca Morales,
Florian Theuss,
Zuzanna H. Filipiak,
Michal Moravec,
Marcin Konczykowski,
Harold Y. Hwang,
Andrew P. Mackenzie,
Berit H. Goodge
Abstract:
The pairing symmetry of superconducting infinite-layer nickelates is a fundamental yet experimentally challenging question. We employ high-energy electron irradiation to induce disorder in superconducting Nd$_{0.825}$Sr$_{0.175}$NiO$_2$ thin films and examine the impact of pair-breaking defects on superconductivity and elucidate the nature of the superconducting gap. Our measurements reveal a complete suppression of superconductivity with increasing disorder, suggesting an unconventional, sign-changing order parameter.
Submitted 11 June, 2025;
originally announced June 2025.
-
Apparent motion of penumbral grains in a sunspot simulation
Authors:
Michal Sobotka,
Markus Schmassmann
Abstract:
Context. The bright heads of penumbral filaments, penumbral grains (PGs), are manifestations of hot plasma flows rising to the surface. They are observed to move horizontally toward the sunspot umbra or away from it. Recent analyses of observations indicate that the direction of this motion is related to the inclination of the surrounding magnetic field. Aims. The penumbra of a sunspot simulated by the radiative magnetohydrodynamic code MURaM is analysed to get typical physical conditions in PGs, compare them to those in the surroundings, describe their spatial distribution, and study their evolution. Methods. We use time series of images that map intensity, temperature, magnetic field vector, and velocity vector in horizontal slices at the visible surface, in subsurface layers, and in vertical cuts through the simulation box to track PGs and compare, statistically and in individual cases, the physical quantities inside them with those in the surroundings. Results. The statistical analysis of simulation results provides average values of temperature, magnetic field strength and inclination, vertical velocity, and their changes with radial distance from the spot centre. We find a subtle difference between simulated PGs with opposite directions of motions when comparing the magnetic field inclinations inside and outside the PGs. The case studies, documented by movies, show that the differences of inclinations and the direction of motions may change during the lifetime of some PGs and that the turbulence in the surface layers introduces some randomness in the apparent motions of PGs.
Submitted 11 June, 2025;
originally announced June 2025.
-
Covert Entanglement Generation over Bosonic Channels
Authors:
Evan J. D. Anderson,
Michael S. Bullock,
Ohad Kimelfeld,
Christopher K. Eyre,
Filip Rozpędek,
Uzi Pereg,
Boulat A. Bash
Abstract:
We explore covert entanglement generation over the lossy thermal-noise bosonic channel, which is a quantum-mechanical model of many practical settings, including optical, microwave, and radio-frequency (RF) channels. Covert communication ensures that an adversary is unable to detect the presence of transmissions, which are concealed in channel noise. We show that a square root law (SRL) holds for covert entanglement generation, similar to that for classical covert communication: $L_{\rm EG}\sqrt{n}$ entangled bits (ebits) can be generated covertly and reliably over $n$ uses of a bosonic channel. We report a single-letter expression for the optimal $L_{\rm EG}$ as well as an achievable method. We additionally analyze the performance of covert entanglement generation using single- and dual-rail photonic qubits, which may be more practical for physical implementation.
Submitted 11 June, 2025;
originally announced June 2025.
-
The Fast and the Frame-Dragging: Efficient waveforms for asymmetric-mass eccentric equatorial inspirals into rapidly-spinning black holes
Authors:
Christian E. A. Chapman-Bird,
Lorenzo Speri,
Zachary Nasipak,
Ollie Burke,
Michael L. Katz,
Alessandro Santini,
Shubham Kejriwal,
Philip Lynch,
Josh Mathews,
Hassan Khalvati,
Jonathan E. Thompson,
Soichiro Isoyama,
Scott A. Hughes,
Niels Warburton,
Alvin J. K. Chua,
Maxime Pigou
Abstract:
Observations of gravitational-wave signals emitted by compact binary inspirals provide unique insights into their properties, but their analysis requires accurate and efficient waveform models. Intermediate- and extreme-mass-ratio inspirals (I/EMRIs), with mass ratios $q \gtrsim 10^2$, are promising sources for future detectors such as the Laser Interferometer Space Antenna (LISA). Modelling waveforms for these asymmetric-mass binaries is challenging, entailing the tracking of many harmonic modes over thousands to millions of cycles. The FastEMRIWaveforms (FEW) modelling framework addresses this need, leveraging precomputation of mode data and interpolation to rapidly compute adiabatic waveforms for eccentric inspirals into zero-spin black holes. In this work, we extend FEW to model eccentric equatorial inspirals into black holes with spin magnitudes $|a| \leq 0.999$. Our model supports eccentricities $e < 0.9$ and semi-latus recta $p < 200$, enabling the generation of long-duration IMRI waveforms, and produces waveforms in $\sim 100$ ms with hardware acceleration. Characterising systematic errors, we estimate that our model attains mismatches of $\sim 10^{-5}$ (for LISA sensitivity) with respect to error-free adiabatic waveforms over most of parameter space. We find that kludge models introduce errors in signal-to-noise ratios (SNRs) as great as $^{+60\%}_{-40\%}$ and induce marginal biases of up to $\sim 1\sigma$ in parameter estimation. We show LISA's horizon redshift for I/EMRI signals varies significantly with $a$, reaching a redshift of $3$ ($15$) for EMRIs (IMRIs) with only minor $(\sim10\%)$ dependence on $e$ for an SNR threshold of 20. For signals with SNR $\sim 50$, spin and eccentricity-at-plunge are measured with uncertainties of $\delta a \sim 10^{-7}$ and $\delta e_f \sim 10^{-5}$. This work advances the state-of-the-art in waveform generation for asymmetric-mass binaries.
Submitted 11 June, 2025;
originally announced June 2025.
-
When Is Diversity Rewarded in Cooperative Multi-Agent Learning?
Authors:
Michael Amir,
Matteo Bettini,
Amanda Prorok
Abstract:
The success of teams in robotics, nature, and society often depends on the division of labor among diverse specialists; however, a principled explanation for when such diversity surpasses a homogeneous team is still missing. Focusing on multi-agent task allocation problems, our goal is to study this question from the perspective of reward design: what kinds of objectives are best suited for heterogeneous teams? We first consider an instantaneous, non-spatial setting where the global reward is built by two generalized aggregation operators: an inner operator that maps the $N$ agents' effort allocations on individual tasks to a task score, and an outer operator that merges the $M$ task scores into the global team reward. We prove that the curvature of these operators determines whether heterogeneity can increase reward, and that for broad reward families this collapses to a simple convexity test. Next, we ask what incentivizes heterogeneity to emerge when embodied, time-extended agents must learn an effort allocation policy. To study heterogeneity in such settings, we use multi-agent reinforcement learning (MARL) as our computational paradigm, and introduce Heterogeneous Environment Design (HED), a gradient-based algorithm that optimizes the parameter space of underspecified MARL environments to find scenarios where heterogeneity is advantageous. Experiments in matrix games and an embodied Multi-Goal-Capture environment show that, despite the difference in settings, HED rediscovers the reward regimes predicted by our theory to maximize the advantage of heterogeneity, both validating HED and connecting our theoretical insights to reward design in MARL. Together, these results help us understand when behavioral diversity delivers a measurable benefit.
Submitted 11 June, 2025;
originally announced June 2025.
-
Consistent Infill Estimability of the Regression Slope Between Gaussian Random Fields Under Spatial Confounding
Authors:
Abhirup Datta,
Michael L. Stein
Abstract:
The problem of estimating the slope parameter in regression between two spatial processes under confounding by an unmeasured spatial process has received widespread attention in the recent statistical literature. Yet, a fundamental question remains unsolved: when is this slope consistently estimable under spatial confounding, with existing insights being largely empirical or estimator-specific. In this manuscript, we characterize conditions for consistent estimability of the regression slope between Gaussian random fields (GRFs). Under fixed-domain (infill) asymptotics, we give sufficient conditions for consistent estimability using a novel characterization of the regression slope as the ratio of principal irregular terms of covariances, dictating the relative local behavior of the exposure and confounder processes. When estimability holds, we provide consistent estimators of the slope using local differencing (taking discrete differences or Laplacians of the processes of suitable order). Using functional analysis results on Paley-Wiener spaces, we then provide an easy-to-verify necessary condition for consistent estimability of the slope in terms of the relative spectral tail decays of the confounder and exposure. As a by-product, we establish a novel and general spectral condition on the equivalence of measures on the paths of multivariate GRFs with component fields of varying smoothnesses, a result of independent importance. We show that for the Matérn, power-exponential, generalized Cauchy, and coregionalization families, the necessary and sufficient conditions become identical, thereby providing a complete characterization of consistent estimability of the slope under spatial confounding. The results are extended to accommodate measurement error using local-averaging-and-differencing based estimators. Finite sample behavior is explored via numerical experiments.
Submitted 10 June, 2025;
originally announced June 2025.
-
Spin-lattice entanglement in $\mathbf{CoPS}_3$
Authors:
Thuc T. Mai,
Amber McCreary,
K. F. Garrity,
Rebecca L. Dally,
Sambridhi Shah,
Bryan C. Chakoumakos,
Md Nasim Afroj Taj,
Jeffrey W. Lynn,
Michael A. McGuire,
Benjamin S. Conner,
Mona Zebarjadi,
Janice L. Musfeldt,
Angela R. Hight Walker,
Rahul Rao,
Michael A. Susner
Abstract:
Complex chalcogenides in the $M$PS$_3$ family of materials ($M$ = Mn, Fe, Co, and Ni) display remarkably different phase progressions depending upon the metal center orbital filling, character of the P-P linkage, and size of the van der Waals gap. There is also a stacking pattern and spin state difference between the lighter and heavier transition metal-containing systems that places CoPS$_3$ at the nexus of these activities. Despite these unique properties, this compound is under-explored. Here, we bring together Raman scattering spectroscopy and infrared absorption spectroscopy with X-ray techniques to identify a structural component to the 119 K magnetic ordering transition as well as a remarkable lower temperature set of magnon-phonon pairs that engage in avoided crossings along with a magnetic scattering continuum that correlates with phonon lifetime effects. These findings point to strong spin-phonon entanglement as well as opportunities to control these effects under external stimuli.
Submitted 16 June, 2025; v1 submitted 10 June, 2025;
originally announced June 2025.
-
Down But Not Out: The Case of Long-Period Comet C/2021 O3 (Panstarrs)
Authors:
David Jewitt,
Jing Li,
Michael Jaeger,
Yoonyoung Kim
Abstract:
We combine ground- and space-based observations of long-period comet C/2021 O3 (Panstarrs) (perihelion distance 0.287 au) in order to investigate its reported near-perihelion destruction. Pre-perihelion photometric observations show a remarkably small heliocentric dependence of the scattered light, $\propto r_H^{-s}$ with $s = 2.59\pm0.21$, distinct from values reported in other long-period comets, for which $s$ = 4 is the canonical standard. The index is smaller than expected of coma production by equilibrium sublimation of either supervolatiles (for which $s \sim$ 4 is expected), or water ice ($s \sim$ 6 to 8) across the $\sim$4 au to 2 au range. The absolute magnitude deduced from the pre-perihelion data is $H$ = 13.0$\pm$0.3 (coma scattering cross-section $\sim$225 km$^2$ for an assumed geometric albedo 0.04) while, after perihelion, the cross-section fades by a factor of 25 to $H$ = 16.5 ($\sim$9 km$^2$). STEREO spacecraft observations near perihelion show a long debris trail whose properties are consistent with forward scattering from radius $\sim$7 $μ$m particles. The data show that the nucleus of C/2021 O3 was not destroyed at perihelion. Although the lightcurve from 3.9 au inbound to 0.8 au outbound cannot be uniquely interpreted, a simple and plausible explanation is provided by seasonal dimming on a nucleus having high obliquity and an asymmetric distribution of near-surface volatiles. The survival of the nucleus against rotational disruption suggests a pre-perihelion nucleus radius $r_n \gtrsim$ 1.0 km while the photometric limit to the radius of the nucleus after perihelion is $r_n < 1.7$ km (geometric albedo 0.04 assumed).
Submitted 10 June, 2025;
originally announced June 2025.
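The magnitudes quoted in the C/2021 O3 abstract above can be cross-checked with a short calculation. The sketch below is not taken from the paper: it assumes the commonly used relation C = (k / p) · 10^(−0.4 H) between absolute magnitude H, geometric albedo p, and scattering cross-section C in km², with k = 1.5 × 10^6 as an assumed constant that depends on the filter and the adopted solar magnitude. The function names are hypothetical.

```python
def fading_factor(h_before, h_after):
    """Brightness ratio implied by a change in absolute magnitude."""
    return 10 ** (0.4 * (h_after - h_before))


def cross_section_km2(h_abs, albedo=0.04, k=1.5e6):
    """Scattering cross-section in km^2 for absolute magnitude h_abs.

    k is an assumed filter-dependent constant, not a value from the abstract.
    """
    return k / albedo * 10 ** (-0.4 * h_abs)


if __name__ == "__main__":
    # Pre- and post-perihelion absolute magnitudes quoted in the abstract.
    print(fading_factor(13.0, 16.5))
    print(cross_section_km2(13.0), cross_section_km2(16.5))
```

Under these assumptions the 3.5 magnitude drop corresponds to a fading factor of about 25, matching the abstract, and the two cross-sections come out near the quoted values of roughly 225 km² and 9 km². Exact agreement is not expected because the constant used by the authors is not stated in the abstract.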
-
Self-Anchored Attention Model for Sample-Efficient Classification of Prosocial Text Chat
Authors:
Zhuofang Li,
Rafal Kocielnik,
Fereshteh Soltani,
Penphob Boonyarungsrit,
Animashree Anandkumar,
R. Michael Alvarez
Abstract:
Millions of players engage daily in competitive online games, communicating through in-game chat. Prior research has focused on detecting relatively small volumes of toxic content using various Natural Language Processing (NLP) techniques for the purpose of moderation. However, recent studies emphasize the importance of detecting prosocial communication, which can be as crucial as identifying toxic interactions. Recognizing prosocial behavior allows for its analysis, rewarding, and promotion. Unlike toxicity, there are limited datasets, models, and resources for identifying prosocial behaviors in game-chat text. In this work, we employed unsupervised discovery combined with game domain expert collaboration to identify and categorize prosocial player behaviors from game chat. We further propose a novel Self-Anchored Attention Model (SAAM) which gives a 7.9% improvement compared to the best existing technique. The approach utilizes the entire training set as "anchors" to help improve model performance under the scarcity of training data. This approach led to the development of the first automated system for classifying prosocial behaviors in in-game chats, particularly given the low-resource settings where large-scale labeled data is not available. Our methodology was applied to one of the most popular online gaming titles - Call of Duty(R): Modern Warfare(R) II, showcasing its effectiveness. This research is novel in applying NLP techniques to discover and classify prosocial behaviors in player in-game chat communication. It can help shift the focus of moderation from solely penalizing toxicity to actively encouraging positive interactions on online platforms.
Submitted 10 June, 2025;
originally announced June 2025.
-
CFMI: Flow Matching for Missing Data Imputation
Authors:
Vaidotas Simkus,
Michael U. Gutmann
Abstract:
We introduce conditional flow matching for imputation (CFMI), a new general-purpose method to impute missing data. The method combines continuous normalising flows, flow-matching, and shared conditional modelling to deal with intractabilities of traditional multiple imputation. Our comparison with nine classical and state-of-the-art imputation methods on 24 small to moderate-dimensional tabular data sets shows that CFMI matches or outperforms both traditional and modern techniques across a wide range of metrics. Applying the method to zero-shot imputation of time-series data, we find that it matches the accuracy of a related diffusion-based method while outperforming it in terms of computational efficiency. Overall, CFMI performs at least as well as traditional methods on lower-dimensional data while remaining scalable to high-dimensional settings, matching or exceeding the performance of other deep learning-based approaches, making it a go-to imputation method for a wide range of data types and dimensionalities.
Submitted 10 June, 2025;
originally announced June 2025.
-
Improved H2-He and H2-H2 Collision-Induced Absorption Models and Application to Outer-Planet Atmospheres
Authors:
Glenn S. Orton,
Magnus Gustafsson,
Leigh N. Fletcher,
Michael T. Roman,
James A. Sinclair
Abstract:
Using state-of-the-art ab initio interaction-induced dipole and potential-energy surfaces for hydrogen-helium (H2-He) pairs, we compute the rototranslational collision-induced absorption coefficient at 40-400 K for frequencies covering 0-4000 cm-1. The quantum mechanical scattering calculations account for the full anisotropic interaction potential, replacing the isotropic approximation. The absorption data are expected to be accurate with an uncertainty of 2% or better up to 2500 cm-1. The uncertainty is slightly higher at the highest frequencies where the rototranslational absorption is largely obscured by the rovibrational band. Our improved agreement with measurements at 200-800 cm-1 results from the improvement of the potential energy surface. The previously available rototranslational data set for H2-H2 pairs (Fletcher et al., Astrophys. J. Supp. 235, 24 (2018)) is also extended up to 4000 cm-1. In the rovibrational band previous isotropic potential calculations for H2-He (Gustafsson et al. J. Chem. Physics. 113, 3641 (2000)) and H2-H2 (Borysow, Icarus 92, 273 (1992)) have been extended to complement the rototranslational data set. The absorption coefficients are tabulated for ortho-to-para ratios from normal-H2 to pure para-H2, as well as equilibrium-H2, over 40-400 K. The effects of these updates are simulated for the cold atmosphere of Uranus and warmer atmosphere of Jupiter. They are equivalent to a brightness temperature difference of a fraction of a degree in the rototranslational region but up to 4 degrees in the rovibrational region. Our state-of-the-art modifications correct an otherwise +2% error in determining the He/H2 ratio in Uranus from its spectrum alone.
Submitted 10 June, 2025;
originally announced June 2025.
-
Aluminum oxide coatings on Co-rich cathodes and interactions with organic electrolyte
Authors:
M. D. Hashan C. Peiris,
Michael Woodcox,
Diana Liepinya,
Robert Shephard,
Hao Liu,
Manuel Smeu
Abstract:
Lithium-ion batteries (LIBs) have become essential in modern energy storage; however, their performance is often limited by the stability and efficiency of their components, particularly the cathode and electrolyte. Transition metal layered oxide cathodes, a popular choice for lithium-ion batteries (LIBs), suffer from several degradation mechanisms, including capacity fading, reactions with the electrolyte, unstable cathode-electrolyte interfaces, and lattice breakdown during cycling. In recent years, oxide coating, such as alumina, has emerged as a promising strategy to enhance the durability of cathodes by forming a protective layer that mitigates detrimental reactions and improves the stability of the cathode electrolyte interphase (CEI). This study employs ab initio molecular dynamics (AIMD) simulations to investigate the chemical and mechanical behavior of LiCoO2 cathodes with and without aluminum oxide coatings in contact with an organic electrolyte. We examine the interactions between electrolyte molecules with both bare and coated cathode surfaces, focusing on the decomposition of ethylene carbonate (EC) and dimethyl carbonate (DMC), the formation of oxygen species, and solvation dynamics, and evaluate the mechanical robustness of the cathode-coating interface using calculations of axial strain and cleavage energy. Our findings reveal that alumina coatings effectively reduce electrolyte degradation and stabilize the cathode structure, particularly under high-charge states. The coating's thickness and structural orientation are crucial in enhancing mechanical strength and minimizing detrimental reactions at the cathode-electrolyte interface. These insights contribute to the development of more durable LIBs by optimizing the interface chemistry and mechanical properties, providing a pathway toward higher energy densities and longer cycle life.
Submitted 10 June, 2025;
originally announced June 2025.
-
In Crowd Veritas: Leveraging Human Intelligence To Fight Misinformation
Authors:
Michael Soprano
Abstract:
The spread of online misinformation poses serious threats to democratic societies. Traditionally, expert fact-checkers verify the truthfulness of information through investigative processes. However, the volume and immediacy of online content present major scalability challenges. Crowdsourcing offers a promising alternative by leveraging non-expert judgments, but it introduces concerns about bias, accuracy, and interpretability. This thesis investigates how human intelligence can be harnessed to assess the truthfulness of online information, focusing on three areas: misinformation assessment, cognitive biases, and automated fact-checking systems. Through large-scale crowdsourcing experiments and statistical modeling, it identifies key factors influencing human judgments and introduces a model for the joint prediction and explanation of truthfulness. The findings show that non-expert judgments often align with expert assessments, particularly when factors such as timing and experience are considered. By deepening our understanding of human judgment and bias in truthfulness assessment, this thesis contributes to the development of more transparent, trustworthy, and interpretable systems for combating misinformation.
Submitted 10 June, 2025;
originally announced June 2025.
-
The Curious Language Model: Strategic Test-Time Information Acquisition
Authors:
Michael Cooper,
Rohan Wadhawan,
John Michael Giorgi,
Chenhao Tan,
Davis Liang
Abstract:
Decision-makers often possess insufficient information to render a confident decision. In these cases, the decision-maker can often undertake actions to acquire the necessary information about the problem at hand, e.g., by consulting knowledgeable authorities or by conducting experiments. Importantly, different levers of information acquisition come with different costs, posing the challenge of selecting the actions that are both informative and cost-effective. In this work, we propose CuriosiTree, a heuristic-based, test-time policy for zero-shot information acquisition in large language models (LLMs). CuriosiTree employs a greedy tree search to estimate the expected information gain of each action and strategically chooses actions based on a balance of anticipated information gain and associated cost. Empirical validation in a clinical diagnosis simulation shows that CuriosiTree enables cost-effective integration of heterogeneous sources of information, and outperforms baseline action selection strategies in selecting action sequences that enable accurate diagnosis.
Submitted 10 June, 2025;
originally announced June 2025.
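The trade-off between expected information gain and cost described in the CuriosiTree abstract above can be illustrated with a toy calculation. This is a minimal sketch under stated assumptions, not the authors' algorithm: it assumes a discrete set of hypotheses with a known prior, a known table of outcome likelihoods per action, and a single-step greedy choice scored as gain minus a weighted cost. All names (`expected_information_gain`, `choose_action`, `trade_off`) are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expected_information_gain(prior, likelihood):
    """Expected entropy reduction from observing one action's outcome.

    likelihood[o][h] is the assumed probability of outcome o under hypothesis h.
    """
    gain = entropy(prior)
    for row in likelihood:
        p_outcome = sum(l * p for l, p in zip(row, prior))
        if p_outcome == 0:
            continue  # an impossible outcome contributes nothing
        posterior = [l * p / p_outcome for l, p in zip(row, prior)]
        gain -= p_outcome * entropy(posterior)
    return gain


def choose_action(prior, actions, trade_off=1.0):
    """Pick the action with the best gain-minus-cost score.

    actions maps a name to a (likelihood, cost) pair; trade_off converts
    cost units into bits and is an assumption of this sketch.
    """
    def score(name):
        likelihood, cost = actions[name]
        return expected_information_gain(prior, likelihood) - trade_off * cost

    return max(actions, key=score)


if __name__ == "__main__":
    prior = [0.5, 0.5]
    actions = {
        "decisive_test": ([[1.0, 0.0], [0.0, 1.0]], 0.2),
        "uninformative_test": ([[0.5, 0.5], [0.5, 0.5]], 0.0),
    }
    print(choose_action(prior, actions))
```

In this toy setting the decisive test yields one full bit of information, so it is chosen while its cost stays below that; raising its cost above one bit flips the choice to the free but uninformative action.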
-
The RSNA Lumbar Degenerative Imaging Spine Classification (LumbarDISC) Dataset
Authors:
Tyler J. Richards,
Adam E. Flanders,
Errol Colak,
Luciano M. Prevedello,
Robyn L. Ball,
Felipe Kitamura,
John Mongan,
Maryam Vazirabad,
Hui-Ming Lin,
Anne Kendell,
Thanat Kanthawang,
Salita Angkurawaranon,
Emre Altinmakas,
Hakan Dogan,
Paulo Eduardo de Aguiar Kuriki,
Arjuna Somasundaram,
Christopher Ruston,
Deniz Bulja,
Naida Spahovic,
Jennifer Sommer,
Sirui Jiang,
Eduardo Moreno Judice de Mattos Farina,
Eduardo Caminha Nunes,
Michael Brassil,
Megan McNamara, et al. (11 additional authors not shown)
Abstract:
The Radiological Society of North America (RSNA) Lumbar Degenerative Imaging Spine Classification (LumbarDISC) dataset is the largest publicly available dataset of adult MRI lumbar spine examinations annotated for degenerative changes. The dataset includes 2,697 patients with a total of 8,593 image series from 8 institutions across 6 countries and 5 continents. The dataset is available for free for non-commercial use via Kaggle and RSNA Medical Imaging Resource of AI (MIRA). The dataset was created for the RSNA 2024 Lumbar Spine Degenerative Classification competition where competitors developed deep learning models to grade degenerative changes in the lumbar spine. The degree of spinal canal, subarticular recess, and neural foraminal stenosis was graded at each intervertebral disc level in the lumbar spine. The images were annotated by expert volunteer neuroradiologists and musculoskeletal radiologists from the RSNA, American Society of Neuroradiology, and the American Society of Spine Radiology. This dataset aims to facilitate research and development in machine learning and lumbar spine imaging to lead to improved patient care and clinical efficiency.
Submitted 10 June, 2025;
originally announced June 2025.
-
CAIRe: Cultural Attribution of Images by Retrieval-Augmented Evaluation
Authors:
Arnav Yayavaram,
Siddharth Yayavaram,
Simran Khanuja,
Michael Saxon,
Graham Neubig
Abstract:
As text-to-image models become increasingly prevalent, ensuring their equitable performance across diverse cultural contexts is critical. Efforts to mitigate cross-cultural biases have been hampered by trade-offs, including a loss in performance, factual inaccuracies, or offensive outputs. Despite widespread recognition of these challenges, an inability to reliably measure these biases has stalled progress. To address this gap, we introduce CAIRe, a novel evaluation metric that assesses the degree of cultural relevance of an image, given a user-defined set of labels. Our framework grounds entities and concepts in the image to a knowledge base and uses factual information to give independent graded judgments for each culture label. On a manually curated dataset of culturally salient but rare items built using language models, CAIRe surpasses all baselines by 28% F1 points. Additionally, we construct two datasets for culturally universal concepts, one comprising T2I-generated outputs and another retrieved from naturally occurring data. CAIRe achieves Pearson's correlations of 0.56 and 0.66 with human ratings on these sets, based on a 5-point Likert scale of cultural relevance. This demonstrates its strong alignment with human judgment across diverse image sources.
Submitted 10 June, 2025;
originally announced June 2025.
-
The spatially variable effects of mangroves on flood depths and losses from storm surges in Florida
Authors:
Siddharth Narayan,
Christopher J. Thomas,
Kechi Nzerem,
Joss Matthewman,
Christine Shepard,
Laura Geselbracht,
Michael W. Beck
Abstract:
Mangroves modify storm surges with impacts to property damages from tropical cyclones (TC), but the magnitude of these effects and their spatial variability are not well understood, especially at sub-county scales. We use high-resolution storm surge flood and loss models to examine variation in the effects of mangroves on these losses spatially and by storm intensity in Florida. We estimate that mangroves reduce storm surge losses to properties by $67.5 million annually in Collier County in western Florida with over half the cumulative benefits from storm surges with return periods under 30 years. We estimate the benefits of mangroves during hurricanes Irma (2017) and Ian (2022) as US$725 million and US$4.1 billion, respectively. We show that flood depths and losses are always lower for properties landward of mangroves but can increase for properties seaward or between mangroves, underlining the importance of nuanced descriptions of variability in mangrove effects during storms.
Submitted 13 June, 2025; v1 submitted 9 June, 2025;
originally announced June 2025.
-
Fault-Tolerant Stabilizer Measurements in Surface Codes with Three-Qubit Gates
Authors:
Josias Old,
Stephan Tasler,
Michael J. Hartmann,
Markus Müller
Abstract:
Quantum error correction (QEC) is considered a deciding component in enabling practical quantum computing. Stabilizer codes, and in particular topological surface codes, are promising candidates for implementing QEC by redundantly encoding quantum information. While it is widely believed that a strictly fault-tolerant protocol can only be implemented using single- and two-qubit gates, several quantum computing platforms, based on trapped ions, neutral atoms and also superconducting qubits support native multi-qubit operations, e.g. using multi-ion entangling gates, Rydberg blockade or parallelized tunable couplers, respectively. In this work, we show that stabilizer measurement circuits for unrotated surface codes can be fault-tolerant using single auxiliary qubits and three-qubit gates. These gates enable lower-depth circuits leading to fewer fault locations and potentially shorter QEC cycle times. Concretely, we find that in an optimistic parameter regime where fidelities of three-qubit gates are the same as those of two-qubit gates, the logical error rate can be up to one order of magnitude lower and the threshold can be significantly higher, increasing from $\approx 0.76 \%$ to $\approx 1.05 \%$. Our results, which are applicable to a wide range of platforms, thereby motivate further investigation into multi-qubit gates as components for fault-tolerant QEC, as they can lead to substantial advantages in terms of time and physical qubit resources required to reach a target logical error rate.
Submitted 10 June, 2025;
originally announced June 2025.
-
Optimizing Superconducting Three-Qubit Gates for Surface-Code Error Correction
Authors:
Stephan Tasler,
Josias Old,
Lukas Heunisch,
Verena Feulner,
Timo Eckstein,
Markus Müller,
Michael J. Hartmann
Abstract:
Quantum error correction (QEC) is one of the crucial building blocks for developing quantum computers that have significant potential for reaching a quantum advantage in applications. Prominent candidates for QEC are stabilizer codes for which periodic readout of stabilizer operators is typically implemented via successive two-qubit entangling gates, and is repeated many times during a computation. To improve QEC performance, it is thus beneficial to make the stabilizer readout faster and less prone to fault-tolerance-breaking errors. Here we design a 3-qubit CZZ gate for superconducting transmon qubits that maps the parity of two data qubits onto one measurement qubit in a single step. We find that the gate can be executed in a duration of $35\,$ns with a fidelity of F$=99.96 \, \%$. To optimize the gate, we use an error model obtained from the microscopic gate simulation to systematically suppress Pauli errors that are particularly harmful to the QEC protocol. Using this error model, we investigate the implementation of this 3-qubit gate in a surface code syndrome readout schedule. We find that for the rotated surface code, the implementation of CZZ gates increases the error threshold by nearly 50% to $\approx 1.2\,\%$ and decreases the logical error rate, in the experimentally relevant regime, by up to one order of magnitude, compared to the standard CZ readout protocol. We also show that for the unrotated surface code, strictly fault-tolerant readout schedules can be found. This opens a new perspective for below-threshold surface-code error correction, where it can be advantageous to use multi-qubit gates instead of two-qubit gates to obtain a better QEC performance.
Submitted 10 June, 2025;
originally announced June 2025.
-
Fine-Grained Spatially Varying Material Selection in Images
Authors:
Julia Guerrero-Viu,
Michael Fischer,
Iliyan Georgiev,
Elena Garces,
Diego Gutierrez,
Belen Masia,
Valentin Deschaintre
Abstract:
Selection is the first step in many image editing processes, enabling faster and simpler modifications of all pixels sharing a common modality. In this work, we present a method for material selection in images, robust to lighting and reflectance variations, which can be used for downstream editing tasks. We rely on vision transformer (ViT) models and leverage their features for selection, proposing a multi-resolution processing strategy that yields finer and more stable selection results than prior methods. Furthermore, we enable selection at two levels: texture and subtexture, leveraging a new two-level material selection (DuMaS) dataset which includes dense annotations for over 800,000 synthetic images, both on the texture and subtexture levels.
Submitted 11 June, 2025; v1 submitted 10 June, 2025;
originally announced June 2025.
-
Pre-trained Language Models Learn Remarkably Accurate Representations of Numbers
Authors:
Marek Kadlčík,
Michal Štefánik,
Timothee Mickus,
Michal Spiegel,
Josef Kuchař
Abstract:
Pretrained language models (LMs) are prone to arithmetic errors. Existing work showed limited success in probing numeric values from models' representations, indicating that these errors can be attributed to the inherent unreliability of distributionally learned embeddings in representing exact quantities. However, we observe that previous probing methods are inadequate for the emergent structure of learned number embeddings with sinusoidal patterns.
In response, we propose a novel probing technique that decodes numeric values from input embeddings with near-perfect accuracy across a range of open-source LMs. This proves that after pre-training alone, LMs represent numbers with remarkable precision. Finally, we find that the precision of the embeddings, as judged by our probe's accuracy, explains a large portion of LMs' errors in elementary arithmetic, and show that aligning the embeddings with the pattern discovered by our probe can mitigate these errors.
Submitted 10 June, 2025;
originally announced June 2025.
-
Scaling Portfolio Diversification with Quantum Circuit Cutting Techniques
Authors:
Vicente P. Soloviev,
Antonio Márquez Romero,
Josh Kirsopp,
Michal Krompiec
Abstract:
Quantum Approximate Optimization Algorithms (QAOA) have demonstrated strong potential in addressing graph-based optimization problems. However, the execution of large-scale quantum circuits remains constrained by the limitations of current quantum hardware. In this work, we introduce QuantCut, an automatic framework for circuit cutting that enables efficient execution of large quantum circuits by decomposing entangling two-qubit gates into manageable sub-circuits. Specifically, we focus on gate-cutting techniques. We apply QuantCut to a 71-qubit QAOA circuit ansatz for portfolio diversification in the S&P 500 stock market, aiming to maximize asset diversification. Our approach iteratively optimizes the expectation value while leveraging circuit-cutting strategies to reduce the qubit register size. To validate our framework, we first conduct experiments on a toy model using quantum noise simulations for the Max-Cut problem, analyzing performance improvements with an increasing number of layers. Subsequently, we extend our methodology to a real-world financial optimization scenario, showing competitive results. The results suggest that QuantCut effectively facilitates large-scale quantum computations with circuit-cutting technologies.
Submitted 10 June, 2025;
originally announced June 2025.
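For readers unfamiliar with the Max-Cut toy problem mentioned in the abstract above, the following is a minimal classical sketch of the objective that a QAOA circuit would try to maximize. It is not the QuantCut framework and involves no quantum simulation; the graph `EDGES` and the function names are hypothetical, chosen only for illustration.

```python
from itertools import product

def cut_value(edges, bits):
    """Number of edges whose endpoints fall on opposite sides of the partition."""
    return sum(1 for u, v in edges if bits[u] != bits[v])

def brute_force_max_cut(n_nodes, edges):
    """Exhaustively search all 2^n partitions (only feasible for tiny graphs)."""
    best_bits = max(product((0, 1), repeat=n_nodes),
                    key=lambda b: cut_value(edges, b))
    return cut_value(edges, best_bits), best_bits

# Hypothetical toy graph: a 4-cycle with one diagonal.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]

best_value, best_bits = brute_force_max_cut(4, EDGES)
print(best_value, best_bits)
```

The exhaustive search scales as 2^n in the number of nodes, which is one reason heuristic approaches such as QAOA are studied for larger instances.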
-
Congruence conditions for the mod $λ$ values of the Fourier coefficients of classical eigenforms
Authors:
Michael A. Daas
Abstract:
We classify all instances of the condition $a_{p}(f) \equiv x \bmod λ$ being related to a congruence on the prime $p$, where $a_{p}(f)$ denotes the $p$th Fourier coefficient of a classical normalised cuspidal eigenform $f$ and $λ$ is a prime in the number field generated by the Fourier coefficients of $f$. This classification is done in terms of the (projective) image of the mod $λ$ Galois representation associated with $f$ and extends work by Swinnerton-Dyer. We highlight that for $x = 0$, this condition is more often implied by a congruence on the prime $p$ than the general value of $a_{p}(f) \bmod λ$. Finally, we illustrate various instances of these congruences through examples from the setting of weight 2 newforms attached to rational elliptic curves.
Submitted 10 June, 2025;
originally announced June 2025.
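As a small illustration of the kind of phenomenon the abstract above describes (this example is not taken from the paper), consider the elliptic curve $E: y^2 = x^3 - x$, which has complex multiplication by $\mathbb{Z}[i]$. For this curve it is a classical fact that $a_p = 0$ whenever $p \equiv 3 \bmod 4$, so the condition $a_p \equiv 0$ is implied by a congruence on $p$. The sketch below checks this by brute-force point counting; the function name `a_p` is an assumption made for illustration.

```python
def a_p(p: int) -> int:
    """Trace of Frobenius a_p = p + 1 - #E(F_p) for E: y^2 = x^3 - x.

    Assumes p is an odd prime (the curve has good reduction away from 2).
    """
    # sqrt_count[r] = number of y in F_p with y^2 = r
    sqrt_count = [0] * p
    for y in range(p):
        sqrt_count[(y * y) % p] += 1
    # Affine points: for each x, count the y with y^2 = x^3 - x
    affine = sum(sqrt_count[(x**3 - x) % p] for x in range(p))
    return p + 1 - (affine + 1)  # +1 for the point at infinity

for p in [3, 5, 7, 11, 13, 17, 19, 23]:
    print(f"p = {p:2d}  p mod 4 = {p % 4}  a_p = {a_p(p)}")
```

The counting loop is O(p), which is adequate for small primes; dedicated point-counting algorithms would be needed for large ones.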
-
Fast Estimation of Globally Optimal Independent Contact Regions for Robust Grasping and Manipulation
Authors:
Jonathan P. King,
Harnoor Ahluwalia,
Michael Zhang,
Nancy S. Pollard
Abstract:
This work presents a fast anytime algorithm for computing globally optimal independent contact regions (ICRs). ICRs are regions such that one contact within each region enables a valid grasp. Locations of ICRs can provide guidance for grasp and manipulation planning, learning, and policy transfer. However, ICRs for modern applications have been little explored, in part due to the expense of computing them, as they have a search space exponential in the number of contacts. We present a divide-and-conquer algorithm based on incremental n-dimensional Delaunay triangulation that produces results with bounded suboptimality in times sufficient for real-time planning. This paper presents the base algorithm for grasps where contacts lie within a plane. Our experiments show substantial benefits over competing grasp quality metrics and speedups of 100X or more over competing approaches to computing ICRs. We explore robustness of a policy guided by ICRs and outline a path to general 3D implementation. Code will be released on publication to facilitate further development and applications.
Submitted 10 June, 2025;
originally announced June 2025.
-
Adapting Vision-Language Foundation Model for Next Generation Medical Ultrasound Image Analysis
Authors:
Jingguo Qu,
Xinyang Han,
Tonghuan Xiao,
Jia Ai,
Juan Wu,
Tong Zhao,
Jing Qin,
Ann Dorothy King,
Winnie Chiu-Wing Chu,
Jing Cai,
Michael Tin-Cheung Ying
Abstract:
Medical ultrasonography is an essential imaging technique for examining superficial organs and tissues, including lymph nodes, breast, and thyroid. It employs high-frequency ultrasound waves to generate detailed images of the internal structures of the human body. However, manually contouring regions of interest in these images is a labor-intensive task that demands expertise and often results in inconsistent interpretations among individuals. Vision-language foundation models, which have excelled in various computer vision applications, present new opportunities for enhancing ultrasound image analysis. Yet, their performance is hindered by the significant differences between natural and medical imaging domains. This research seeks to overcome these challenges by developing domain adaptation methods for vision-language foundation models. In this study, we explore the fine-tuning pipeline for vision-language foundation models by utilizing a large language model as a text refiner with specially designed adaptation strategies and task-driven heads. Our approach has been extensively evaluated on six ultrasound datasets and two tasks: segmentation and classification. The experimental results show that our method can effectively improve the performance of vision-language foundation models for ultrasound image analysis, and outperforms the existing state-of-the-art vision-language and pure foundation models. The source code of this study is available at https://github.com/jinggqu/NextGen-UIA.
Submitted 10 June, 2025; v1 submitted 10 June, 2025;
originally announced June 2025.
-
A multi-physics model for dislocation driven spontaneous grain nucleation and microstructure evolution in polycrystals
Authors:
Izzet Tarik Tandogan,
Michael Budnitzki,
Stefan Sandfeld
Abstract:
The granular microstructure of metals evolves significantly during thermomechanical processing through viscoplastic deformation and recrystallization. Microstructural features such as grain boundaries (GBs), subgrains, localized deformation bands, and non-uniform dislocation distributions critically influence grain nucleation and growth during recrystallization. Traditionally, modeling this coupled evolution involves separate, specialized frameworks for mechanical deformation and microstructural kinetics, typically used in a staggered manner. Nucleation is often introduced ad hoc, with nuclei seeded at predefined sites based on criteria like critical dislocation density, stress or strain. This is a consequence of the inherent limitations of the staggered approach, where newly formed GBs or grains have to be incorporated with additional processing. In this work, we propose a unified, thermodynamically consistent field theory that enables spontaneous nucleation driven by stored dislocations at GBs. The model integrates Cosserat crystal plasticity with the Henry-Mellenthin-Plapp orientation phase field approach, allowing the simulation of key microstructural defects, as well as curvature- and stored energy-driven grain boundary migration. The unified approach enables seamless identification of GBs that emerge from deformation and nucleation. Nucleation is activated through a coupling function that links dislocation-related free energy contributions to the phase field. Dislocation recovery occurs both at newly formed nuclei and behind migrating GBs. The model's capabilities are demonstrated using periodic bicrystal and polycrystal simulations, where mechanisms such as strain-induced boundary migration, subgrain growth, and coalescence are captured. The proposed spontaneous nucleation mechanism offers a novel addition to the capabilities of phase field models for recrystallization simulation.
Submitted 10 June, 2025;
originally announced June 2025.
-
SPECULOOS: five years hunting terrestrial planets around ultra-cool dwarfs
Authors:
Sebastián Zúñiga-Fernández,
Michael Gillon,
SPECULOOS consortium
Abstract:
The SPECULOOS (Search for habitable Planets EClipsing ULtra-cOOl Stars) project aims to detect temperate terrestrial planets transiting nearby ultracool dwarfs, including late M-dwarf stars and brown dwarfs, which are well-suited for atmospheric characterization using the James Webb Space Telescope (JWST) and upcoming giant telescopes like the European Extremely Large Telescope (ELT). Led by the University of Liège, SPECULOOS is conducted in partnership with the University of Cambridge, the University of Birmingham, the Massachusetts Institute of Technology, the University of Bern, and ETH Zurich. The project operates a network of robotic telescopes at two main observatories: SPECULOOS-South in Chile, with four telescopes, and SPECULOOS-North in Tenerife, currently with one telescope (soon to be two). This network is complemented by the SAINT-EX telescope located in San Pedro Mártir, Mexico. In this paper, we review the status of our facilities after five years of operations, the current challenges and development plans, and our latest scientific results.
Submitted 10 June, 2025;
originally announced June 2025.
-
Numerical stability of force-gradient integrators and their Hessian-free variants in lattice QCD simulations
Authors:
Kevin Schäfers,
Jacob Finkenrath,
Michael Günther,
Francesco Knechtli
Abstract:
We investigate the numerical stability of force-gradient integrators and their Hessian-free variants within the molecular dynamics step of the Hamiltonian Monte Carlo algorithm in lattice QCD simulations. A linear stability analysis of (Hessian-free) force-gradient integrators is conducted by investigating the harmonic oscillator as a test equation. By performing detailed stability investigations for the entire family of self-adjoint integrators with up to eleven exponentials per time step, we detect promising integrator variants that provide a good trade-off between accuracy and numerical stability. Simulations for the two-dimensional Schwinger model demonstrate that there are no significant differences in the stability domain of a force-gradient integrator and its Hessian-free counterpart. Furthermore, lattice QCD simulations are conducted to emphasize the significance of numerical stability as a metric for evaluating the computational efficiency of integrators when applied to lattice QCD simulations.
Submitted 10 June, 2025;
originally announced June 2025.
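To make the idea of a linear stability analysis on the harmonic oscillator concrete, here is a minimal sketch for the standard leapfrog (velocity Verlet) scheme, which is swapped in for simplicity and is not one of the force-gradient integrators studied in the paper. One integration step acts linearly on $(q, p)$, so it can be written as a 2x2 matrix; because the map is area-preserving, the step is stable when the absolute value of the trace is below 2. For leapfrog this gives the well-known bound $\omega h < 2$. All function names are assumptions made for illustration.

```python
def leapfrog_step(q, p, h, omega=1.0):
    """One velocity-Verlet step for H = p^2/2 + omega^2 q^2/2."""
    p_half = p - 0.5 * h * omega**2 * q      # half kick
    q_new = q + h * p_half                   # full drift
    p_new = p_half - 0.5 * h * omega**2 * q_new  # half kick
    return q_new, p_new

def step_matrix(step, h, omega=1.0):
    """Build the 2x2 propagation matrix by applying the step to basis vectors."""
    q0, p0 = step(1.0, 0.0, h, omega)  # first column
    q1, p1 = step(0.0, 1.0, h, omega)  # second column
    return ((q0, q1), (p0, p1))

def is_linearly_stable(step, h, omega=1.0):
    """Area-preserving 2x2 map: eigenvalues lie on the unit circle iff |trace| < 2."""
    m = step_matrix(step, h, omega)
    return abs(m[0][0] + m[1][1]) < 2.0

for h in (0.5, 1.9, 2.1):
    print(h, is_linearly_stable(leapfrog_step, h))
```

Because `step_matrix` only needs a step function, the same check could in principle be applied to any other linear one-step scheme by passing a different `step`.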
-
Confidence Boosts Trust-Based Resilience in Cooperative Multi-Robot Systems
Authors:
Luca Ballotta,
Áron Vékássy,
Stephanie Gil,
Michal Yemini
Abstract:
Wireless communication-based multi-robot systems open the door to cyberattacks that can disrupt safety and performance of collaborative robots. The physical channel supporting inter-robot communication offers an attractive opportunity to decouple the detection of malicious robots from task-relevant data exchange between legitimate robots. Yet, trustworthiness indications coming from physical channels are uncertain and must be handled with this in mind. In this paper, we propose a resilient protocol for multi-robot operation wherein a parameter $λ_t$ accounts for how confident a robot is about the legitimacy of nearby robots that the physical channel indicates. Analytical results prove that our protocol achieves resilient coordination with arbitrarily many malicious robots under mild assumptions. Tuning $λ_t$ allows a designer to trade between near-optimal inter-robot coordination and quick task execution; see Fig. 1. This is a fundamental performance tradeoff and must be carefully evaluated based on the task at hand. The effectiveness of our approach is numerically verified with experiments involving platoons of autonomous cars where some vehicles are maliciously spoofed.
Submitted 10 June, 2025;
originally announced June 2025.
-
Modern approach to muonic x-ray spectroscopy demonstrated through the measurement of stable Cl radii
Authors:
K. A. Beyer,
T. E. Cocolios,
C. Costache,
M. Deseyn,
P. Demol,
A. Doinaki,
O. Eizenberg,
M. Gorshteyn,
M. Heines,
A. Herzáň,
P. Indelicato,
K. Kirch,
A. Knecht,
R. Lica,
V. Matousek,
E. A. Maugeri,
B. Ohayon,
N. S. Oreshkina,
W. W. M. M. Phyo,
R. Pohl,
S. Rathi,
W. Ryssens,
A. Turturica,
K. von Schoeler,
I. A. Valuev
, et al. (3 additional authors not shown)
Abstract:
Recent advances in muonic x-ray experiments have reinvigorated efforts in measurements of absolute nuclear charge radii. Here, a modern approach is presented, and demonstrated through determination of the charge radii of the two stable chlorine nuclides $^{35}$Cl and $^{37}$Cl. Knowledge of these radii has implications for fundamental studies in nuclear and atomic physics. For this purpose, a state-of-the-art experiment was performed at the $π$E1 beamline in the Paul Scherrer Institute (Switzerland), using a large-scale HPGe detector array in order to extract precise energies of the muonic $^{35}$Cl and $^{37}$Cl $np1s$ transitions. The nuclear charge radius extraction relies on modern calculations for QED effects and nuclear polarization with rigorous uncertainty quantification, including effects that were not accounted for in older studies. Additionally, we established a new method for applying the nuclear shape correction directly from energy density functionals, which are amenable to isotopes for which no high-quality electron scattering experiments are available. The resulting charge radii are $3.3335(23)$ fm for $^{35}$Cl and $3.3445(23)$ fm for $^{37}$Cl, thus improving the uncertainty of the available electron scattering values by a factor of seven. The correlation of several observables was evaluated between the different isotopes in order to produce a more precise value of the differential mean square charge radius $δ\langle r^2 \rangle^{37, 35}=+0.0771(66)$ fm$^{2}$. In this case, improvement of the uncertainty by more than one order of magnitude was achieved compared to the literature value. This precision is sufficient to use this differential as input for isotope shift factor determination.
Submitted 11 June, 2025; v1 submitted 10 June, 2025;
originally announced June 2025.
-
Radiation-Reaction on the Straight-Line Motion of a Point Charge accelerated by a constant applied Electric Field in an Electromagnetic Bopp-Landé-Thomas-Podolsky vacuum
Authors:
Ryan J. McGuigan,
Michael K. -H. Kiessling
Abstract:
The radiation-reaction problem of standard Lorentz electrodynamics with point charges is pathological, standing in contrast to Bopp-Landé-Thomas-Podolsky (BLTP) electrodynamics where it is in fact well-defined and calculable, as reported in a previous publication. To demonstrate the viability of BLTP electrodynamics, we consider the BLTP analogue of the radiation reaction of a classical point charge accelerated from rest by a static homogeneous capacitor plate field, and calculate it up to $O(\varkappa^4)$ in a formal expansion about $\varkappa=0$ in powers of $\varkappa$, Bopp's reciprocal length, a new electrodynamics parameter introduced by BLTP theory. In a paper by Carley and Kiessling (arXiv:2303.01720 [physics.class-ph]) the radiation-reaction corrections to test-particle motion were explicitly computed to $O(\varkappa^3)$, the first non-vanishing order. In this article a crucial question regarding this "small-$\varkappa$" expansion, raised by Carley and Kiessling, is answered as follows: The motions computed with terms $O(\varkappa^3)$ included are mathematically accurate approximations to physically reasonable solutions of the actual BLTP initial value problem for short times $t$, viz. when $\varkappa c t \ll 1$, where $c$ is the speed of light in vacuo, but their unphysical behavior over much longer times does not accurately approximate the actual BLTP solutions even when the dimensionless parameter $\varkappa e^2 / |m_b| c^2 \ll 1$, where $e$ is the elementary charge and $m_b$ the bare rest mass of the electron. This has the important implication that BLTP electrodynamics remains a viable contender for an accurate classical electrodynamics with point charges that does not suffer from the infinite self-interaction problems of textbook Lorentz electrodynamics with point charges.
Submitted 10 June, 2025;
originally announced June 2025.
-
Paths to Causality: Finding Informative Subgraphs Within Knowledge Graphs for Knowledge-Based Causal Discovery
Authors:
Yuni Susanti,
Michael Färber
Abstract:
Inferring causal relationships between variable pairs is crucial for understanding multivariate interactions in complex systems. Knowledge-based causal discovery, which involves inferring causal relationships by reasoning over the metadata of variables (e.g., names or textual context), offers a compelling alternative to traditional methods that rely on observational data. However, existing methods using Large Language Models (LLMs) often produce unstable and inconsistent results, compromising their reliability for causal inference. To address this, we introduce a novel approach that integrates Knowledge Graphs (KGs) with LLMs to enhance knowledge-based causal discovery. Our approach identifies informative metapath-based subgraphs within KGs and further refines the selection of these subgraphs using Learning-to-Rank-based models. The top-ranked subgraphs are then incorporated into zero-shot prompts, improving the effectiveness of LLMs in inferring the causal relationship. Extensive experiments on biomedical and open-domain datasets demonstrate that our method outperforms most baselines by up to 44.4 points in F1 scores, evaluated across diverse LLMs and KGs. Our code and datasets are available on GitHub: https://github.com/susantiyuni/path-to-causality
Submitted 10 June, 2025;
originally announced June 2025.
-
Bridging RDF Knowledge Graphs with Graph Neural Networks for Semantically-Rich Recommender Systems
Authors:
Michael Färber,
David Lamprecht,
Yuni Susanti
Abstract:
Graph Neural Networks (GNNs) have substantially advanced the field of recommender systems. However, despite the creation of more than a thousand knowledge graphs (KGs) under the W3C standard RDF, their rich semantic information has not yet been fully leveraged in GNN-based recommender systems. To address this gap, we propose a comprehensive integration of RDF KGs with GNNs that utilizes both the topological information from RDF object properties and the content information from RDF datatype properties. Our main focus is an in-depth evaluation of various GNNs, analyzing how different semantic feature initializations and types of graph structure heterogeneity influence their performance in recommendation tasks. Through experiments across multiple recommendation scenarios involving multi-million-node RDF graphs, we demonstrate that harnessing the semantic richness of RDF KGs significantly improves recommender systems and lays the groundwork for GNN-based recommender systems for the Linked Open Data cloud. The code and data are available on our GitHub repository: https://github.com/davidlamprecht/rdf-gnn-recommendation
Submitted 10 June, 2025;
originally announced June 2025.
-
Enhancing Synthetic CT from CBCT via Multimodal Fusion: A Study on the Impact of CBCT Quality and Alignment
Authors:
Maximilian Tschuchnig,
Lukas Lamminger,
Philipp Steininger,
Michael Gadermayr
Abstract:
Cone-Beam Computed Tomography (CBCT) is widely used for real-time intraoperative imaging due to its low radiation dose and high acquisition speed. However, despite its high resolution, CBCT suffers from significant artifacts and therefore has lower visual quality than conventional Computed Tomography (CT). A recent approach to mitigate these artifacts is synthetic CT (sCT) generation, translating CBCT volumes into the CT domain. In this work, we enhance sCT generation through multimodal learning, integrating intraoperative CBCT with preoperative CT. Beyond validation on two real-world datasets, we use a versatile synthetic dataset to analyze how CBCT-CT alignment and CBCT quality affect sCT quality. The results demonstrate that multimodal sCT consistently outperforms unimodal baselines, with the most significant gains observed in well-aligned, low-quality CBCT-CT cases. Finally, we demonstrate that these findings are highly reproducible in real-world clinical datasets.
Submitted 10 June, 2025;
originally announced June 2025.
-
Optimizing Learned Image Compression on Scalar and Entropy-Constraint Quantization
Authors:
Florian Borzechowski,
Michael Schäfer,
Heiko Schwarz,
Jonathan Pfaff,
Detlev Marpe,
Thomas Wiegand
Abstract:
The continuous improvements in image compression with variational autoencoders have led to learned codecs competitive with conventional approaches in terms of rate-distortion efficiency. Nonetheless, taking the quantization into account during the training process remains a problem, since it produces zero derivatives almost everywhere and needs to be replaced with a differentiable approximation which allows end-to-end optimization. Though there are different methods for approximating the quantization, none of them model the quantization noise correctly and thus result in suboptimal networks. Hence, we propose an additional finetuning training step: After conventional end-to-end training, parts of the network are retrained on quantized latents obtained at the inference stage. For entropy-constraint quantizers like Trellis-Coded Quantization, the impact of the quantizer is particularly difficult to approximate by rounding or adding noise as the quantized latents are interdependently chosen through a trellis search based on both the entropy model and a distortion measure. We show that retraining on correctly quantized data consistently yields additional coding gain for both uniform scalar and especially for entropy-constraint quantization, without increasing inference complexity. For the Kodak test set, we obtain average savings between 1% and 2%, and for the TecNick test set up to 2.2% in terms of Bjøntegaard-Delta bitrate.
Submitted 10 June, 2025;
originally announced June 2025.
-
Hateful Person or Hateful Model? Investigating the Role of Personas in Hate Speech Detection by Large Language Models
Authors:
Shuzhou Yuan,
Ercong Nie,
Mario Tawfelis,
Helmut Schmid,
Hinrich Schütze,
Michael Färber
Abstract:
Hate speech detection is a socially sensitive and inherently subjective task, with judgments often varying based on personal traits. While prior work has examined how socio-demographic factors influence annotation, the impact of personality traits on Large Language Models (LLMs) remains largely unexplored. In this paper, we present the first comprehensive study on the role of persona prompts in hate speech classification, focusing on MBTI-based traits. A human annotation survey confirms that MBTI dimensions significantly affect labeling behavior. Extending this to LLMs, we prompt four open-source models with MBTI personas and evaluate their outputs across three hate speech datasets. Our analysis uncovers substantial persona-driven variation, including inconsistencies with ground truth, inter-persona disagreement, and logit-level biases. These findings highlight the need to carefully define persona prompts in LLM-based annotation workflows, with implications for fairness and alignment with human values.
Submitted 10 June, 2025;
originally announced June 2025.
-
SLEEPYLAND: trust begins with fair evaluation of automatic sleep staging models
Authors:
Alvise Dei Rossi,
Matteo Metaldi,
Michal Bechny,
Irina Filchenko,
Julia van der Meer,
Markus H. Schmidt,
Claudio L. A. Bassetti,
Athina Tzovara,
Francesca D. Faraci,
Luigi Fiorillo
Abstract:
Despite advances in deep learning for automatic sleep staging, clinical adoption remains limited due to challenges in fair model evaluation, generalization across diverse datasets, model bias, and variability in human annotations. We present SLEEPYLAND, an open-source sleep staging evaluation framework designed to address these barriers. It includes more than 220'000 hours of in-domain (ID) sleep recordings, and more than 84'000 hours of out-of-domain (OOD) sleep recordings, spanning a broad range of ages, sleep-wake disorders, and hardware setups. We release pre-trained models based on high-performing SoA architectures and evaluate them under standardized conditions across single- and multi-channel EEG/EOG configurations. We introduce SOMNUS, an ensemble combining models across architectures and channel setups via soft voting. SOMNUS achieves robust performance across twenty-four different datasets, with macro-F1 scores between 68.7% and 87.2%, outperforming individual models in 94.9% of cases. Notably, SOMNUS surpasses previous SoA methods, even including cases where compared models were trained ID while SOMNUS treated the same data as OOD. Using a subset of the BSWR (N=6'633), we quantify model biases linked to age, gender, AHI, and PLMI, showing that while ensembling improves robustness, no model architecture consistently minimizes bias in performance and clinical markers estimation. In evaluations on OOD multi-annotated datasets (DOD-H, DOD-O), SOMNUS exceeds the best human scorer, i.e., MF1 85.2% vs 80.8% on DOD-H, and 80.2% vs 75.9% on DOD-O, better reproducing the scorer consensus than any individual expert (k = 0.89/0.85 and ACS = 0.95/0.94 for healthy/OSA cohorts). Finally, we introduce ensemble disagreement metrics, based on entropy and inter-model divergence, predicting regions of scorer disagreement with ROC AUCs up to 0.828, offering a data-driven proxy for human uncertainty.
Submitted 11 June, 2025; v1 submitted 10 June, 2025;
originally announced June 2025.
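The soft-voting and entropy ideas mentioned in the abstract above can be sketched generically as follows. This is not the SOMNUS implementation: the per-model probabilities are made-up numbers for a single 30-second epoch, and the function names are hypothetical. Soft voting averages the class probabilities of the individual models and picks the most probable stage; the entropy of the averaged distribution is one simple way to flag uncertain epochs.

```python
import math

STAGES = ["W", "N1", "N2", "N3", "REM"]

def soft_vote(model_probs):
    """Average the per-class probabilities over all models for one epoch."""
    n_models = len(model_probs)
    n_classes = len(model_probs[0])
    return [sum(m[c] for m in model_probs) / n_models for c in range(n_classes)]

def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Hypothetical outputs of three models for the same epoch.
model_probs = [
    [0.10, 0.50, 0.30, 0.05, 0.05],
    [0.05, 0.30, 0.55, 0.05, 0.05],
    [0.10, 0.25, 0.55, 0.05, 0.05],
]

avg = soft_vote(model_probs)
predicted = STAGES[avg.index(max(avg))]
print(predicted, round(entropy(avg), 3))
```

Unlike hard (majority) voting, soft voting keeps each model's confidence, so a single very confident model can outweigh two hesitant ones.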
-
The Geometries of Truth Are Orthogonal Across Tasks
Authors:
Waiss Azizian,
Michael Kirchhof,
Eugene Ndiaye,
Louis Bethune,
Michal Klein,
Pierre Ablin,
Marco Cuturi
Abstract:
Large Language Models (LLMs) have demonstrated impressive generalization capabilities across various tasks, but their claim to practical relevance is still marred by concerns about their reliability. Recent works have proposed examining the activations produced by an LLM at inference time to assess whether its answer to a question is correct. Some works claim that a "geometry of truth" can be learned from examples, in the sense that the activations that generate correct answers can be distinguished from those leading to mistakes with a linear classifier. In this work, we underline a limitation of these approaches: we observe that these "geometries of truth" are intrinsically task-dependent and fail to transfer across tasks. More precisely, we show that linear classifiers trained across distinct tasks share little similarity and, when trained with sparsity-enforcing regularizers, have almost disjoint supports. We show that more sophisticated approaches (e.g., using mixtures of probes and tasks) fail to overcome this limitation, likely because activation vectors commonly used to classify answers form clearly separated clusters when examined across tasks.
Submitted 10 June, 2025;
originally announced June 2025.