-
UProp: Investigating the Uncertainty Propagation of LLMs in Multi-Step Agentic Decision-Making
Authors:
Jinhao Duan,
James Diffenderfer,
Sandeep Madireddy,
Tianlong Chen,
Bhavya Kailkhura,
Kaidi Xu
Abstract:
As Large Language Models (LLMs) are integrated into safety-critical applications involving sequential decision-making in the real world, it is essential to know when to trust LLM decisions. Existing LLM Uncertainty Quantification (UQ) methods are primarily designed for single-turn question-answering formats, leaving multi-step decision-making scenarios, e.g., LLM agentic systems, underexplored. In this paper, we introduce a principled, information-theoretic framework that decomposes LLM sequential decision uncertainty into two parts: (i) internal uncertainty intrinsic to the current decision, which is the focus of existing UQ methods, and (ii) extrinsic uncertainty, a Mutual-Information (MI) quantity describing how much uncertainty should be inherited from preceding decisions. We then propose UProp, an efficient and effective extrinsic uncertainty estimator that converts the direct estimation of MI to the estimation of Pointwise Mutual Information (PMI) over multiple Trajectory-Dependent Decision Processes (TDPs). UProp is evaluated on extensive multi-step decision-making benchmarks, e.g., AgentBench and HotpotQA, with state-of-the-art LLMs, e.g., GPT-4.1 and DeepSeek-V3. Experimental results demonstrate that UProp significantly outperforms existing single-turn UQ baselines equipped with thoughtful aggregation strategies. Moreover, we provide a comprehensive analysis of UProp, including sampling efficiency, potential applications, and intermediate uncertainty propagation, to demonstrate its effectiveness. Code will be available at https://github.com/jinhaoduan/UProp.
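For readers unfamiliar with the relation between MI and PMI that the abstract relies on, the standard textbook identity is sketched below. The symbols $X$ (preceding decisions) and $Y$ (current decision) are placeholders chosen here for illustration and are not the paper's notation; the sample-average estimator is the generic Monte Carlo form, not necessarily the estimator UProp uses.

```latex
% Pointwise mutual information of one outcome pair (x, y)
\mathrm{pmi}(x; y) = \log \frac{p(x, y)}{p(x)\, p(y)} = \log \frac{p(y \mid x)}{p(y)}

% Mutual information is the expectation of PMI under the joint distribution
I(X; Y) = \mathbb{E}_{(x, y) \sim p(x, y)} \big[ \mathrm{pmi}(x; y) \big]

% Generic Monte Carlo estimate from N sampled pairs (x_i, y_i)
\hat{I}(X; Y) = \frac{1}{N} \sum_{i=1}^{N} \mathrm{pmi}(x_i; y_i)
```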
Submitted 20 June, 2025;
originally announced June 2025.
-
Robustness of Majorana modes to potential disorder in Fe chains on a superconducting Rashba alloy
Authors:
Harim Jang,
Daniel Crawford,
Khai Ton That,
Lucas Schneider,
Jens Wiebe,
Makoto Shimizu,
Harald O. Jeschke,
Stephan Rachel,
Roland Wiesendanger
Abstract:
Majorana modes offer great potential for fault-tolerant quantum computation due to their topological protection. However, for superconductor-semiconductor nanowire hybrids, intrinsic disorder makes the unambiguous detection of Majorana modes difficult. Here, we construct 1D spin chains from individual Fe atoms on the Rashba surface alloy BiAg2/Ag(111) with proximity-induced superconductivity from a Nb(110) substrate. While the Fe chains exhibit perfect crystalline order, we observe nano-scale potential disorder of the BiAg2/Ag(111)/Nb(110) heterostructure by scanning tunneling microscopy. However, this does not prevent the emergence of zero-energy modes at both ends of the Fe chains, in agreement with tight-binding calculations showing that they are only found in the topologically non-trivial regime of the phase diagram. These Majorana modes are indeed robust against potential disorder.
Submitted 20 June, 2025;
originally announced June 2025.
-
Adaptive Control Attention Network for Underwater Acoustic Localization and Domain Adaptation
Authors:
Quoc Thinh Vo,
Joe Woods,
Priontu Chowdhury,
David K. Han
Abstract:
Localizing acoustic sound sources in the ocean is a challenging task due to the complex and dynamic nature of the environment. Factors such as high background noise, irregular underwater geometries, and varying acoustic properties make accurate localization difficult. To address these obstacles, we propose a multi-branch network architecture designed to accurately predict the distance between a moving acoustic source and a receiver, tested on real-world underwater signal arrays. The network leverages Convolutional Neural Networks (CNNs) for robust spatial feature extraction and integrates Conformers with a self-attention mechanism to effectively capture temporal dependencies. Log-mel spectrogram and generalized cross-correlation with phase transform (GCC-PHAT) features are employed as input representations. To further enhance model performance, we introduce an Adaptive Gain Control (AGC) layer that adaptively adjusts the amplitude of input features, ensuring consistent energy levels across varying ranges, signal strengths, and noise conditions. We assess the model's generalization capability by training it in one domain and testing it in a different domain, using only a limited amount of data from the test domain for fine-tuning. Our proposed method outperforms state-of-the-art (SOTA) approaches in similar settings, establishing new benchmarks for underwater sound localization.
Submitted 20 June, 2025;
originally announced June 2025.
-
Variational quantum algorithms with exact geodesic transport
Authors:
André J. Ferreira-Martins,
Renato M. S. Farias,
Giancarlo Camilo,
Thiago O. Maciel,
Allan Tosta,
Ruge Lin,
Abdulla Alhajri,
Tobias Haug,
Leandro Aolita
Abstract:
Variational quantum algorithms (VQAs) are promising candidates for near-term applications of quantum computers, but their training represents a major challenge in practice. We introduce exact-geodesic VQAs, a curvature-aware framework that enables analytic Riemannian optimization of variational quantum circuits through a convenient choice of circuit ansatz. Our method exploits the exact metric to find a parameter optimization path based on exact geodesic transport with conjugate gradients (EGT-CG). This supersedes the quantum natural gradient method, in fact recovering it as its first-order approximation. Further, the exact-geodesic updates for our circuit ansatz have the same measurement cost as standard gradient descent. This contrasts with previous metric-aware methods, which require resource-intensive estimations of the metric tensor using quantum hardware. In numerical simulations for electronic structure problems of up to 14 spin-orbitals, our framework allows us to achieve up to a 20x reduction in the number of iterations over Adam or quantum natural gradient methods. Moreover, for degenerate cases, which are notoriously difficult to optimize with conventional methods, we achieve rapid convergence to the global minima. Our work demonstrates that the cost of VQA optimization can be drastically reduced by harnessing the Riemannian geometry of the manifold expressed by the circuit ansatz, with potential implications at the interface between quantum machine learning, differential geometry, and optimal control theory.
Submitted 20 June, 2025;
originally announced June 2025.
-
A competitive NISQ and qubit-efficient solver for the LABS problem
Authors:
Marco Sciorilli,
Giancarlo Camilo,
Thiago O. Maciel,
Askery Canabarro,
Lucas Borges,
Leandro Aolita
Abstract:
Pauli Correlation Encoding (PCE) has recently been introduced as a qubit-efficient approach to combinatorial optimization problems within variational quantum algorithms (VQAs). The method offers a polynomial reduction in qubit count and a super-polynomial suppression of barren plateaus. Moreover, it has been shown to perform competitively with classical state-of-the-art methods on MaxCut. Here, we extend the PCE-based framework to solve the Low Autocorrelation Binary Sequences (LABS) problem. This is a notoriously hard problem with a single instance per problem size, considered a major benchmark for classical and quantum solvers. We simulate our PCE variational quantum solver for LABS instances of up to $N=44$ binary variables using only $n=6$ qubits and a brickwork circuit Ansatz of depth $10$, with a total of $30$ two-qubit gates, i.e. well inside the NISQ regime. We observe a significant scaling advantage in the total time to (the exact) solution of our solver with respect to previous studies using QAOA, and even a modest advantage with respect to the leading classical heuristic, given by Tabu search. Our findings point to PCE-based solvers as a promising quantum-inspired classical heuristic for hard-in-practice problems as well as a tool to reduce the resource requirements for actual quantum algorithms, with both fundamental and applied potential implications.
Submitted 20 June, 2025;
originally announced June 2025.
-
The MORRIS Experiment: Magnetic Levitation as a New Probe of Non-Newtonian Gravity
Authors:
Dorian W. P. Amaral,
Tim M. Fuchs,
Hendrik Ulbricht,
Christopher D. Tunnell
Abstract:
We present MORRIS (Magnetic Oscillatory Resonator for Rare-Interaction Studies) and propose the first tabletop search for non-Newtonian gravity due to a Yukawa-like fifth force using a magnetically levitated particle. Our experiment comprises a levitated sub-millimeter magnet in a superconducting trap that is driven by a time-periodic source. Featuring short-, medium-, and long-term stages, MORRIS will admit increasing sensitivities to the force coupling strength $α$, optimally probing screening lengths of $λ\sim 1\,\mathrm{mm}$. Our short-term setup provides a proof-of-principle study, with our medium- and long-term stages respectively constraining $α\lesssim 10^{-4}$ and $α\lesssim 10^{-5}$, improving on existing bounds. Our projections are readily recastable to concrete models predicting the existence of fifth forces, and our statistical analysis is generally applicable to well-characterized sinusoidal driving forces. By leveraging ultralow dissipation and heavy test masses, MORRIS opens a new window onto tests of small-scale gravity and searches for physics beyond the Standard Model.
Submitted 20 June, 2025;
originally announced June 2025.
-
Dynamics of tidal spiral arms: Machine learning-assisted identification of equations and application to the Milky Way
Authors:
Marcel Bernet,
Pau Ramos,
Teresa Antoja,
Adrian Price-Whelan,
Steven L. Brunton,
Tetsuro Asano,
Alexandra Girón-Soto
Abstract:
Understanding the spiral arms of the Milky Way (MW) remains a key open question in galactic dynamics. Tidal perturbations, such as the recent passage of the Sagittarius dwarf galaxy (Sgr), could play a significant role in exciting them. We aim to analytically characterize the dynamics of tidally induced spiral arms, including their phase-space signatures. We ran idealized test-particle simulations resembling impulsive satellite impacts, and used the Sparse Identification of Non-linear Dynamics (SINDy) method to infer their governing Partial Differential Equations (PDEs). We validated the method with analytical derivations and a realistic $N$-body simulation of a MW-Sgr encounter analogue. For small perturbations, a linear system of equations was recovered with SINDy, consistent with predictions from linearised collisionless dynamics. In this case, two distinct waves wrapping at pattern speeds $Ω\pm κ/m$ emerge. For large impacts, we empirically discovered a non-linear system of equations, representing a novel formulation for the dynamics of tidally induced spiral arms. For both cases, these equations describe wave properties like amplitude and pattern speed, and their shape and temporal evolution in different phase-space projections. We fit the Gaia $L_Z-V_R$ waves with the linear model, providing a reasonable fit and plausible parameters for the Sgr passage. However, the predicted amplitude ratio of the two waves is inconsistent with observations, supporting a more complex origin for this feature (e.g. multiple passages, bar, spiral arms). We merge data-driven discovery with theory to create simple, accurate models of tidal spiral arms that match simulations and provide a simple tool to fit Gaia and external galaxy data. This methodology could be extended to model complex phenomena like self-gravity and dynamical friction. (ABR)
Submitted 20 June, 2025;
originally announced June 2025.
-
The AURORA Survey: Tracing Galactic Outflows at $z\gtrsim2.5$ with JWST/NIRSpec NUV Absorption Lines
Authors:
Emily Kehoe,
Alice E. Shapley,
Ryan L. Sanders,
Naveen A. Reddy,
Michael W. Topping,
Natalie Lam,
Leonardo Clarke,
Fergus Cullen,
Richard S. Ellis,
N. M. Forster Schreiber,
Tucker Jones,
Ali Ahmad Khostovan,
Derek J. McLeod,
Ross J. McLure,
Desika Narayanan,
Pascal Oesch,
Anthony J. Pahl
Abstract:
We probe galactic-scale outflows in star-forming galaxies at $z\gtrsim2.5$ drawn from the JWST/NIRSpec AURORA program. For the first time, we directly compare outflow properties from the early universe to the present day using near-UV absorption lines. We measure ISM kinematics from Fe II and Mg II absorption features in 41 and 43 galaxies, respectively, and examine how these kinematics correlate with galaxy properties. We find that galaxies with outflows tend to have higher stellar masses, and that maximum outflow velocities increase with stellar mass, SFR, $β$, $E(B-V)$, and $A_V$. We also find that Mg II emission is more common in galaxies with lower masses, higher sSFRs, and less dust. These trends are consistent with those in star-forming galaxies at $z<2$ when using the same outflow tracers, suggesting that the feedback from star formation has played a persistent role in shaping galaxy evolution over cosmic time. We also directly compare near-UV and far-UV features in the same NIRSpec spectrum for a $z=5.19$ galaxy, finding consistent ISM kinematics and demonstrating that different tracers yield comparable measurements. We also detect Na D absorption in 10 galaxies, which have higher stellar mass, SFR, and dust attenuation compared to galaxies without Na D absorption, which is consistent with $z\sim0$ studies. The broad continuum coverage and sensitivity of NIRSpec will enable future studies with larger samples, allowing for robust tests of these trends across a wider dynamic range of galaxy properties.
Submitted 20 June, 2025;
originally announced June 2025.
-
Quantum secure direct communication of continuous-time signals using Whittaker Nyquist Shannon theorem
Authors:
V. F. Guedes,
S. T. de Oliveira,
G. L. de Oliveira,
J. B. R. Silva,
R. V. Ramos
Abstract:
In the present work, we provide a new quantum secure direct communication protocol and its experimental implementation. The proposed protocol can be used to securely transfer continuous signals, such as audio signals, from Alice to Bob. The security is guaranteed by the quantum nature of optical signals and the Whittaker-Nyquist-Shannon theorem. Furthermore, it can be easily implemented with common optical devices that are commercially available.
Submitted 20 June, 2025;
originally announced June 2025.
-
Identifying Anomalous DESI Galaxy Spectra with a Variational Autoencoder
Authors:
C. Nicolaou,
R. P. Nathan,
O. Lahav,
A. Palmese,
A. Saintonge,
J. Aguilar,
S. Ahlen,
C. Allende Prieto,
S. Bailey,
S. BenZvi,
D. Bianchi,
A. Brodzeller,
D. Brooks,
T. Claybaugh,
A. de la Macorra,
J. Della Costa,
Arjun Dey,
P. Doel,
J. E. Forero-Romero,
E. Gaztañaga,
S. Gontcho A Gontcho,
G. Gutierrez,
K. Honscheid,
C. Howlett,
M. Ishak
, et al. (21 additional authors not shown)
Abstract:
The tens of millions of spectra being captured by the Dark Energy Spectroscopic Instrument (DESI) provide tremendous discovery potential. In this work we show how Machine Learning, in particular Variational Autoencoders (VAE), can detect anomalies in a sample of approximately 200,000 DESI spectra comprising galaxies, quasars and stars. We demonstrate that the VAE can compress the dimensionality of a spectrum by a factor of 100, while still retaining enough information to accurately reconstruct spectral features. We then detect anomalous spectra as those with high reconstruction error and those which are isolated in the VAE latent representation. The anomalies identified fall into two categories: spectra with artefacts and spectra with unique physical features. Awareness of the former can help to improve the DESI spectroscopic pipeline; whilst the latter can lead to the identification of new and unusual objects. To further curate the list of outliers, we use the Astronomaly package which employs Active Learning to provide personalised outlier recommendations for visual inspection. In this work we also explore the VAE latent space, finding that different object classes and subclasses are separated despite being unlabelled. We demonstrate the interpretability of this latent space by identifying tracks within it that correspond to various spectral characteristics. For example, we find tracks that correspond to increasing star formation and increase in broad emission lines along the Balmer series. In upcoming work we hope to apply the methods presented here to search for both systematics and astrophysically interesting objects in much larger datasets of DESI spectra.
Submitted 20 June, 2025;
originally announced June 2025.
-
From Drawings to Decisions: A Hybrid Vision-Language Framework for Parsing 2D Engineering Drawings into Structured Manufacturing Knowledge
Authors:
Muhammad Tayyab Khan,
Lequn Chen,
Zane Yong,
Jun Ming Tan,
Wenhe Feng,
Seung Ki Moon
Abstract:
Efficient and accurate extraction of key information from 2D engineering drawings is essential for advancing digital manufacturing workflows. Such information includes geometric dimensioning and tolerancing (GD&T), measures, material specifications, and textual annotations. Manual extraction is slow and labor-intensive, while generic OCR models often fail due to complex layouts, engineering symbols, and rotated text, leading to incomplete and unreliable outputs. To address these challenges, we propose a hybrid vision-language framework that integrates a rotation-aware object detection model (YOLOv11-OBB) with a transformer-based vision-language parser. Our structured pipeline applies YOLOv11-OBB to localize annotations and extract oriented bounding box (OBB) patches, which are then parsed into structured outputs using a fine-tuned, lightweight vision-language model (VLM). We curate a dataset of 1,367 2D mechanical drawings annotated across nine key categories. YOLOv11-OBB is trained on this dataset to detect OBBs and extract annotation patches. These are parsed using two open-source VLMs: Donut and Florence-2. Both models are lightweight and well-suited for specialized industrial tasks under limited computational overhead. Following fine-tuning of both models on the curated dataset of image patches paired with structured annotation labels, a comparative experiment is conducted to evaluate parsing performance across four key metrics. Donut outperforms Florence-2, achieving 88.5% precision, 99.2% recall, and a 93.5% F1-score, with a hallucination rate of 11.5%. Finally, a case study demonstrates how the extracted structured information supports downstream manufacturing tasks such as process and tool selection, showcasing the practical utility of the proposed framework in modernizing 2D drawing interpretation.
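As a quick sanity check on the reported metrics, the sketch below recomputes the F1-score from the stated precision and recall using the standard harmonic-mean definition. The function name `f1_score` is introduced here for illustration; the abstract does not say how the authors compute their metrics, so the formula is an assumption that they use the conventional definition.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Values reported for Donut in the abstract above.
precision, recall = 0.885, 0.992
f1 = f1_score(precision, recall)
print(f"F1 = {f1:.3f}")
```

Under this definition the result agrees with the reported 93.5% F1-score to one decimal place. The reported 11.5% hallucination rate also equals 100% minus the 88.5% precision, though whether the authors define it that way is not stated in the abstract.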
Submitted 20 June, 2025;
originally announced June 2025.
-
Secret Sharing in 5G-MEC: Applicability for joint Security and Dependability
Authors:
Thilina Pathirana,
Ruxandra F. Olimid
Abstract:
Multi-access Edge Computing (MEC), an enhancement of 5G, processes data closer to its generation point, reducing latency and network load. However, the distributed and edge-based nature of 5G-MEC presents privacy and security challenges, including data exposure risks. Ensuring efficient manipulation and security of sensitive data at the edge is crucial. To address these challenges, we investigate the usage of threshold secret sharing in 5G-MEC storage, an approach that enhances both security and dependability. A (k,n) threshold secret sharing scheme splits and stores sensitive data among n nodes, requiring at least k nodes for reconstruction. The solution ensures confidentiality by protecting data against fewer than k colluding nodes and enhances availability by tolerating up to n-k failing nodes. This approach mitigates threats such as unauthorized access and node failures, whether accidental or intentional. We further discuss a method for selecting suitable MEC hosts (MEHs) to store the shares, considering the MEHs' trustworthiness level as a main criterion. Although we define our proposal in the context of secret-shared data storage, it can be seen as an independent, standalone selection process for 5G-MEC trustworthy node selection in other scenarios too.
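The (k,n) threshold property described above can be illustrated with a minimal sketch of Shamir's secret sharing, one well-known threshold scheme. The abstract does not name the scheme the authors use, so the choice of Shamir's construction, the field prime `PRIME`, and the function names `split_secret` and `recover_secret` are all assumptions made here for illustration, not the paper's method.

```python
import secrets

# Assumed field modulus: the Mersenne prime 2^127 - 1 (illustrative choice).
PRIME = 2**127 - 1


def split_secret(secret: int, k: int, n: int, prime: int = PRIME):
    """Split `secret` into n shares such that any k of them reconstruct it."""
    if not 1 <= k <= n < prime:
        raise ValueError("require 1 <= k <= n < prime")
    if not 0 <= secret < prime:
        raise ValueError("secret must lie in [0, prime)")
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(prime) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo prime
            y = (y * x + c) % prime
        shares.append((x, y))
    return shares


def recover_secret(shares, prime: int = PRIME) -> int:
    """Lagrange interpolation at x = 0 over the given (x, y) shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % prime
                den = (den * (xi - xj)) % prime
        # pow(den, -1, prime) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, prime)) % prime
    return secret


# Example: a (3,5) scheme tolerates n - k = 2 unavailable nodes.
shares = split_secret(123456789, k=3, n=5)
print(recover_secret(shares[:3]) == 123456789)
```

In this sketch each share would be stored on a different node; any three shares recover the secret, while fewer than three reveal nothing about it in the information-theoretic sense.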
Submitted 20 June, 2025;
originally announced June 2025.
-
Cash or Comfort? How LLMs Value Your Inconvenience
Authors:
Mateusz Cedro,
Timour Ichmoukhamedov,
Sofie Goethals,
Yifan He,
James Hinns,
David Martens
Abstract:
Large Language Models (LLMs) are increasingly proposed as near-autonomous artificial intelligence (AI) agents capable of making everyday decisions on behalf of humans. Although LLMs perform well on many technical tasks, their behaviour in personal decision-making remains less understood. Previous studies have assessed their rationality and moral alignment with human decisions. However, the behaviour of AI assistants in scenarios where financial rewards are at odds with user comfort has not yet been thoroughly explored. In this paper, we tackle this problem by quantifying the prices assigned by multiple LLMs to a series of user discomforts: additional walking, waiting, hunger and pain. We uncover several key concerns that strongly question the prospect of using current LLMs as decision-making assistants: (1) a large variance in responses between LLMs, (2) within a single LLM, responses show fragility to minor variations in prompt phrasing (e.g., reformulating the question in the first person can considerably alter the decision), (3) LLMs can accept unreasonably low rewards for major inconveniences (e.g., 1 Euro to wait 10 hours), and (4) LLMs can reject monetary gains where no discomfort is imposed (e.g., 1,000 Euro to wait 0 minutes). These findings emphasize the need for scrutiny of how LLMs value human inconvenience, particularly as we move toward applications where such cash-versus-comfort trade-offs are made on users' behalf.
Submitted 20 June, 2025;
originally announced June 2025.
-
Uncertainties of a Spherical Magnetic Field Camera
Authors:
Fynn Foerger,
Philip Suskin,
Marija Boberg,
Jonas Faltinath,
Tobias Knopp,
Martin Möddel
Abstract:
Spherical harmonic expansions are well-established tools for estimating magnetic fields from surface measurements and are widely used in applications such as tomographic imaging, geomagnetism, and biomagnetism. Although the mathematical foundations of these expansions are well understood, the impact of real-world imperfections on the uncertainty of the field model has received little attention. In this work, we present a systematic uncertainty propagation analysis for a magnetic field camera that estimates the field from surface measurements using a spherical array of Hall magnetometers arranged in a spherical t-design. A Monte Carlo-based approach is employed to quantify how sensor-related uncertainties, such as calibration errors and positioning inaccuracies, affect the spatial distribution of the estimated field's uncertainty. The results offer insights into the robustness of spherical harmonic methods and help identify the dominant sources of uncertainty in practical implementations.
Submitted 20 June, 2025;
originally announced June 2025.
-
Towards Safety Evaluations of Theory of Mind in Large Language Models
Authors:
Tatsuhiro Aoshima,
Mitsuaki Akiyama
Abstract:
As the capabilities of large language models (LLMs) continue to advance, the importance of rigorous safety evaluation is becoming increasingly evident. Recent concerns within the realm of safety assessment have highlighted instances in which LLMs exhibit behaviors that appear to disable oversight mechanisms and respond in a deceptive manner. For example, there have been reports suggesting that, when confronted with information unfavorable to their own persistence during task execution, LLMs may act covertly and even provide false answers to questions intended to verify their behavior. To evaluate the potential risk of such deceptive actions toward developers or users, it is essential to investigate whether these behaviors stem from covert, intentional processes within the model. In this study, we propose that it is necessary to measure the theory of mind capabilities of LLMs. We begin by reviewing existing research on theory of mind and identifying the perspectives and tasks relevant to its application in safety evaluation. Given that theory of mind has been predominantly studied within the context of developmental psychology, we analyze developmental trends across a series of open-weight LLMs. Our results indicate that while LLMs have improved in reading comprehension, their theory of mind capabilities have not shown comparable development. Finally, we present the current state of safety evaluation with respect to LLMs' theory of mind, and discuss remaining challenges for future work.
Submitted 19 June, 2025;
originally announced June 2025.
-
FFINO: Factorized Fourier Improved Neural Operator for Modeling Multiphase Flow in Underground Hydrogen Storage
Authors:
Tao Wang,
Hewei Tang
Abstract:
Underground hydrogen storage (UHS) is a promising energy storage option for the current energy transition to a low-carbon economy. Fast modeling of hydrogen plume migration and pressure field evolution is crucial for UHS field management. In this study, we propose a new neural operator architecture, FFINO, as a fast surrogate model for multiphase flow problems in UHS. We parameterize experimental relative permeability curves reported in the literature and include them as key uncertainty parameters in the FFINO model. We also compare the FFINO model with the state-of-the-art FMIONet model through a comprehensive combination of metrics. Our new FFINO model has 38.1% fewer trainable parameters, 17.6% less training time, and 12% less GPU memory cost compared to FMIONet. The FFINO model also achieves a 9.8% accuracy improvement in predicting the hydrogen plume in focused areas, but has an 18% higher RMSE in predicting pressure buildup. Inference with the trained FFINO model is 7850 times faster than a numerical simulator, which makes it a competent substitute for numerical simulations of UHS problems with superior time efficiency.
Submitted 19 June, 2025;
originally announced June 2025.
-
Origins of Creativity in Attention-Based Diffusion Models
Authors:
Emma Finn,
T. Anderson Keller,
Manos Theodosis,
Demba E. Ba
Abstract:
As diffusion models have become the tool of choice for image generation and as the quality of the images continues to improve, the question of how 'creativity' originates in diffusion has become increasingly important. The score matching perspective on diffusion has proven particularly fruitful for understanding how and why diffusion models generate images that remain plausible while differing significantly from their training images. In particular, as explained in (Kamb & Ganguli, 2024) and others, e.g., (Ambrogioni, 2023), theory suggests that if our score matching were optimal, we would only be able to recover training samples through our diffusion process. However, as shown by Kamb & Ganguli (2024), in diffusion models where the score is parametrized by a simple CNN, the inductive biases of the CNN itself (translation equivariance and locality) allow the model to generate samples that globally do not match any training samples, but are rather patch-wise 'mosaics'. Notably, however, this theory does not extend to describe the role of self-attention in this process. In this work, we take a preliminary step in this direction to extend this theory to the case of diffusion models whose score is parametrized by a CNN with a final self-attention layer. We show that our theory suggests that self-attention will induce a globally image-consistent arrangement of local features beyond the patch-level in generated samples, and we verify this behavior empirically on a carefully crafted dataset.
Submitted 18 June, 2025;
originally announced June 2025.
-
I Know Which LLM Wrote Your Code Last Summer: LLM generated Code Stylometry for Authorship Attribution
Authors:
Tamas Bisztray,
Bilel Cherif,
Richard A. Dubniczky,
Nils Gruschka,
Bertalan Borsos,
Mohamed Amine Ferrag,
Attila Kovacs,
Vasileios Mavroeidis,
Norbert Tihanyi
Abstract:
Detecting AI-generated code, deepfakes, and other synthetic content is an emerging research challenge. As code generated by Large Language Models (LLMs) becomes more common, identifying the specific model behind each sample is increasingly important. This paper presents the first systematic study of LLM authorship attribution for C programs. We released CodeT5-Authorship, a novel model that uses only the encoder layers from the original CodeT5 encoder-decoder architecture, discarding the decoder to focus on classification. Our model's encoder output (first token) is passed through a two-layer classification head with GELU activation and dropout, producing a probability distribution over possible authors. To evaluate our approach, we introduce LLM-AuthorBench, a benchmark of 32,000 compilable C programs generated by eight state-of-the-art LLMs across diverse tasks. We compare our model to seven traditional ML classifiers and eight fine-tuned transformer models, including BERT, RoBERTa, CodeBERT, ModernBERT, DistilBERT, DeBERTa-V3, Longformer, and LoRA-fine-tuned Qwen2-1.5B. In binary classification, our model achieves 97.56% accuracy in distinguishing C programs generated by closely related models such as GPT-4.1 and GPT-4o, and 95.40% accuracy for multi-class attribution among five leading LLMs (Gemini 2.5 Flash, Claude 3.5 Haiku, GPT-4.1, Llama 3.3, and DeepSeek-V3). To support open science, we release the CodeT5-Authorship architecture, the LLM-AuthorBench benchmark, and all relevant Google Colab scripts on GitHub: https://github.com/LLMauthorbench/.
Submitted 18 June, 2025;
originally announced June 2025.
-
An Expert Survey on Models and Digital Twins
Authors:
Jonathan Reif,
Daniel Dittler,
Milapji Singh Gill,
Tamás Farkas,
Valentin Stegmaier,
Felix Gehlhoff,
Tobias Kleinert,
Michael Weyrich
Abstract:
Digital Twins (DTs) are becoming increasingly vital for future industrial applications, enhancing monitoring, control, and optimization of physical assets. This enhancement is made possible by integrating various Digital Models (DMs) within DTs, which must interoperate to represent different system aspects and fulfill diverse application purposes. However, industry perspectives on the challenges and research needs for integrating these models are rarely obtained. Thus, this study conducts an expert survey across multiple application domains to identify and analyze the challenges in utilizing diverse DMs within DTs. The results reveal missing standardized interfaces, high manual adaptation effort, and limited support for model reuse across lifecycle phases, highlighting future research needs in automated model composition and semantics-based interoperability.
Submitted 18 June, 2025;
originally announced June 2025.
-
PaceLLM: Brain-Inspired Large Language Models for Long-Context Understanding
Authors:
Kangcong Li,
Peng Ye,
Chongjun Tu,
Lin Zhang,
Chunfeng Song,
Jiamin Wu,
Tao Yang,
Qihao Zheng,
Tao Chen
Abstract:
While Large Language Models (LLMs) demonstrate strong performance across domains, their long-context capabilities are limited by transient neural activations causing information decay and unstructured feed-forward network (FFN) weights leading to semantic fragmentation. Inspired by the brain's working memory and cortical modularity, we propose PaceLLM, featuring two innovations: (1) a Persistent Activity (PA) Mechanism that mimics prefrontal cortex (PFC) neurons' persistent firing by introducing an activation-level memory bank to dynamically retrieve, reuse, and update critical FFN states, addressing contextual decay; and (2) Cortical Expert (CE) Clustering that emulates task-adaptive neural specialization to reorganize FFN weights into semantic modules, establishing cross-token dependencies and mitigating fragmentation. Extensive evaluations show that PaceLLM achieves 6% improvement on LongBench's Multi-document QA and 12.5-17.5% performance gains on Infinite-Bench tasks, while extending measurable context length to 200K tokens in Needle-In-A-Haystack (NIAH) tests. This work pioneers brain-inspired LLM optimization and is complementary to other works. It can also be generalized to any model, enhancing long-context performance and interpretability without structural overhauls.
Submitted 18 June, 2025;
originally announced June 2025.
-
A Nested Watermark for Large Language Models
Authors:
Koichi Nagatsuka,
Terufumi Morishita,
Yasuhiro Sogawa
Abstract:
The rapid advancement of large language models (LLMs) has raised concerns regarding their potential misuse, particularly in generating fake news and misinformation. To address these risks, watermarking techniques for autoregressive language models have emerged as a promising means for detecting LLM-generated text. Existing methods typically embed a watermark by increasing the probabilities of tokens within a group selected according to a single secret key. However, this approach suffers from a critical limitation: if the key is leaked, it becomes impossible to trace the text's provenance or attribute authorship. To overcome this vulnerability, we propose a novel nested watermarking scheme that embeds two distinct watermarks into the generated text using two independent keys. This design enables reliable authorship identification even in the event that one key is compromised. Experimental results demonstrate that our method achieves high detection accuracy for both watermarks while maintaining the fluency and overall quality of the generated text.
Submitted 18 June, 2025;
originally announced June 2025.
-
Challenges and Practices in Quantum Software Testing and Debugging: Insights from Practitioners
Authors:
Jake Zappin,
Trevor Stalnaker,
Oscar Chaparro,
Denys Poshyvanyk
Abstract:
Quantum software engineering is an emerging discipline with distinct challenges, particularly in testing and debugging. As quantum computing transitions from theory to implementation, developers face issues not present in classical software development, such as probabilistic execution, limited observability, shallow abstractions, and low awareness of quantum-specific tools. To better understand current practices, we surveyed 26 quantum software developers from academia and industry and conducted follow-up interviews focused on testing, debugging, and recurring challenges. All participants reported engaging in testing, with unit testing (88%), regression testing (54%), and acceptance testing (54%) being the most common. However, only 31% reported using quantum-specific testing tools, relying instead on manual methods. Debugging practices were similarly grounded in classical strategies, such as print statements, circuit visualizations, and simulators, which respondents noted do not scale well. The most frequently cited sources of bugs were classical in nature: library updates (81%), developer mistakes (68%), and compatibility issues (62%), often worsened by limited abstraction in existing SDKs. These findings highlight the urgent need for better-aligned testing and debugging tools, integrated more seamlessly into the workflows of quantum developers. We present these results in detail and offer actionable recommendations grounded in the real-world needs of practitioners.
Submitted 17 June, 2025;
originally announced June 2025.
-
The California Report on Frontier AI Policy
Authors:
Rishi Bommasani,
Scott R. Singer,
Ruth E. Appel,
Sarah Cen,
A. Feder Cooper,
Elena Cryst,
Lindsey A. Gailmard,
Ian Klaus,
Meredith M. Lee,
Inioluwa Deborah Raji,
Anka Reuel,
Drew Spence,
Alexander Wan,
Angelina Wang,
Daniel Zhang,
Daniel E. Ho,
Percy Liang,
Dawn Song,
Joseph E. Gonzalez,
Jonathan Zittrain,
Jennifer Tour Chayes,
Mariano-Florentino Cuellar,
Li Fei-Fei
Abstract:
The innovations emerging at the frontier of artificial intelligence (AI) are poised to create historic opportunities for humanity but also raise complex policy challenges. Continued progress in frontier AI carries the potential for profound advances in scientific discovery, economic productivity, and broader social well-being. As the epicenter of global AI innovation, California has a unique opportunity to continue supporting developments in frontier AI while addressing substantial risks that could have far-reaching consequences for the state and beyond. This report leverages broad evidence, including empirical research, historical analysis, and modeling and simulations, to provide a framework for policymaking on the frontier of AI development. Building on this multidisciplinary approach, this report derives policy principles that can inform how California approaches the use, assessment, and governance of frontier AI: principles rooted in an ethos of trust but verify. This approach takes into account the importance of innovation while establishing appropriate strategies to reduce material risks.
Submitted 17 June, 2025;
originally announced June 2025.
-
Fine-Scale Soil Mapping in Alaska with Multimodal Machine Learning
Authors:
Yijun Lin,
Theresa Chen,
Colby Brungard,
Grunwald Sabine,
Sue Ives,
Matt Macander,
Timm Nawrocki,
Yao-Yi Chiang,
Nic Jelinski
Abstract:
Fine-scale soil mapping in Alaska, traditionally relying on fieldwork and localized simulations, remains a critical yet underdeveloped task, despite the region's ecological importance and extensive permafrost coverage. As permafrost thaw accelerates due to climate change, it threatens infrastructure stability and key ecosystem services, such as soil carbon storage. High-resolution soil maps are essential for characterizing permafrost distribution, identifying vulnerable areas, and informing adaptation strategies. We present MISO, a vision-based machine learning (ML) model to produce statewide fine-scale soil maps for near-surface permafrost and soil taxonomy. The model integrates a geospatial foundation model for visual feature extraction, implicit neural representations for continuous spatial prediction, and contrastive learning for multimodal alignment and geo-location awareness. We compare MISO with Random Forest (RF), a traditional ML model that has been widely used in soil mapping applications. Spatial cross-validation and regional analysis across Permafrost Zones and Major Land Resource Areas (MLRAs) show that MISO generalizes better to remote, unseen locations and achieves higher recall than RF, which is critical for monitoring permafrost thaw and related environmental processes. These findings demonstrate the potential of advanced ML approaches for fine-scale soil mapping and provide practical guidance for future soil sampling and infrastructure planning in permafrost-affected landscapes. The project will be released at https://github.com/knowledge-computing/Peatland-permafrost.
Submitted 17 June, 2025;
originally announced June 2025.
-
Individual Causal Inference with Structural Causal Model
Authors:
Daniel T. Chang
Abstract:
Individual causal inference (ICI) uses causal inference methods to understand and predict the effects of interventions on individuals, considering their specific characteristics / facts. It aims to estimate individual causal effect (ICE), which varies across individuals. Estimating ICE can be challenging due to the limited data available for individuals, and the fact that most causal inference methods are population-based. Structural Causal Model (SCM) is fundamentally population-based. Therefore, causal discovery (structural learning and parameter learning), association queries and intervention queries are all naturally population-based. However, exogenous variables (U) in SCM can encode individual variations and thus provide the mechanism for individualized population per specific individual characteristics / facts. Based on this, we propose ICI with SCM as a "rung 3" causal inference, because it involves "imagining" what would be the causal effect of a hypothetical intervention on an individual, given the individual's observed characteristics / facts. Specifically, we propose the indiv-operator, indiv(W), to formalize/represent the population individualization process, and the individual causal query, P(Y | indiv(W), do(X), Z), to formalize/represent ICI. We show and argue that ICI with SCM is inference on individual alternatives (possible), not individual counterfactuals (non-actual).
Submitted 17 June, 2025;
originally announced June 2025.
-
SafeRL-Lite: A Lightweight, Explainable, and Constrained Reinforcement Learning Library
Authors:
Satyam Mishra,
Phung Thao Vi,
Shivam Mishra,
Vishwanath Bijalwan,
Vijay Bhaskar Semwal,
Abdul Manan Khan
Abstract:
We introduce SafeRL-Lite, an open-source Python library for building reinforcement learning (RL) agents that are both constrained and explainable. Existing RL toolkits often lack native mechanisms for enforcing hard safety constraints or producing human-interpretable rationales for decisions. SafeRL-Lite provides modular wrappers around standard Gym environments and deep Q-learning agents to enable: (i) safety-aware training via constraint enforcement, and (ii) real-time post-hoc explanation via SHAP values and saliency maps. The library is lightweight, extensible, and installable via pip, and includes built-in metrics for constraint violations. We demonstrate its effectiveness on constrained variants of CartPole and provide visualizations that reveal both policy logic and safety adherence. The full codebase is available at: https://github.com/satyamcser/saferl-lite.
Submitted 17 June, 2025;
originally announced June 2025.
-
Theoretically Unmasking Inference Attacks Against LDP-Protected Clients in Federated Vision Models
Authors:
Quan Nguyen,
Minh N. Vu,
Truc Nguyen,
My T. Thai
Abstract:
Federated Learning enables collaborative learning among clients via a coordinating server while avoiding direct data sharing, offering a perceived solution to preserve privacy. However, recent studies on Membership Inference Attacks (MIAs) have challenged this notion, showing high success rates against unprotected training data. While local differential privacy (LDP) is widely regarded as a gold standard for privacy protection in data analysis, most studies on MIAs either neglect LDP or fail to provide theoretical guarantees for attack success rates against LDP-protected data. To address this gap, we derive theoretical lower bounds for the success rates of low-polynomial time MIAs that exploit vulnerabilities in fully connected or self-attention layers. We establish that even when data are protected by LDP, privacy risks persist, depending on the privacy budget. Practical evaluations on federated vision models confirm considerable privacy risks, revealing that the noise required to mitigate these attacks significantly degrades models' utility.
Submitted 16 June, 2025;
originally announced June 2025.
-
Relationship between unpredictability and intermittency in shell models of turbulence and experiments
Authors:
Ewen Frogé,
Carlos Granero-Belinchon,
Stéphane G. Roux,
Thierry Chonavel,
Nicolas B. Garnier
Abstract:
We study the predictability of turbulent velocity signals using probabilistic analog-forecasting. Here, predictability is defined by the accuracy of forecasts and the associated uncertainties. We study the Gledzer-Ohkitani-Yamada (GOY) shell model of turbulence as well as experimental measurements from a fully developed turbulent flow. In both cases, we identify the extreme values of velocity at small scales as localized unpredictable events that lead to a loss of predictability: worse predictions and increased uncertainties. The GOY model, with its explicit scale separation, makes it possible to evaluate the prediction performance at individual scales, and thus to better relate the intensity of extreme events to the loss of forecast performance. Results show that predictability decreases systematically from large to small scales. These findings establish a statistical connection between predictability loss across scales and intermittency in turbulent flows.
Submitted 13 June, 2025;
originally announced June 2025.
-
Chunk Twice, Embed Once: A Systematic Study of Segmentation and Representation Trade-offs in Chemistry-Aware Retrieval-Augmented Generation
Authors:
Mahmoud Amiri,
Thomas Bocklitz
Abstract:
Retrieval-Augmented Generation (RAG) systems are increasingly vital for navigating the ever-expanding body of scientific literature, particularly in high-stakes domains such as chemistry. Despite the promise of RAG, foundational design choices -- such as how documents are segmented and represented -- remain underexplored in domain-specific contexts. This study presents the first large-scale, systematic evaluation of chunking strategies and embedding models tailored to chemistry-focused RAG systems. We investigate 25 chunking configurations across five method families and evaluate 48 embedding models on three chemistry-specific benchmarks, including the newly introduced QuestChemRetrieval dataset. Our results reveal that recursive token-based chunking (specifically R100-0) consistently outperforms other approaches, offering strong performance with minimal resource overhead. We also find that retrieval-optimized embeddings -- such as Nomic and Intfloat E5 variants -- substantially outperform domain-specialized models like SciBERT. By releasing our datasets, evaluation framework, and empirical benchmarks, we provide actionable guidelines for building effective and efficient chemistry-aware RAG systems.
Submitted 13 June, 2025;
originally announced June 2025.
-
QUST_NLP at SemEval-2025 Task 7: A Three-Stage Retrieval Framework for Monolingual and Crosslingual Fact-Checked Claim Retrieval
Authors:
Youzheng Liu,
Jiyan Liu,
Xiaoman Xu,
Taihang Wang,
Yimin Wang,
Ye Jiang
Abstract:
This paper describes the participation of QUST_NLP in the SemEval-2025 Task 7. We propose a three-stage retrieval framework specifically designed for fact-checked claim retrieval. Initially, we evaluate the performance of several retrieval models and select the one that yields the best results for candidate retrieval. Next, we employ multiple re-ranking models to enhance the candidate results, with each model selecting the Top-10 outcomes. In the final stage, we utilize weighted voting to determine the final retrieval outcomes. Our approach achieved 5th place in the monolingual track and 7th place in the crosslingual track. We release our system code at: https://github.com/warmth27/SemEval2025_Task7
Submitted 12 June, 2025;
originally announced June 2025.
-
AI to Identify Strain-sensitive Regions of the Optic Nerve Head Linked to Functional Loss in Glaucoma
Authors:
Thanadet Chuangsuwanich,
Monisha E. Nongpiur,
Fabian A. Braeu,
Tin A. Tun,
Alexandre Thiery,
Shamira Perera,
Ching Lin Ho,
Martin Buist,
George Barbastathis,
Tin Aung,
Michaël J. A. Girard
Abstract:
Objective: (1) To assess whether ONH biomechanics improves prediction of three progressive visual field loss patterns in glaucoma; (2) to use explainable AI to identify strain-sensitive ONH regions contributing to these predictions.
Methods: We recruited 237 glaucoma subjects. The ONH of one eye was imaged under two conditions: (1) primary gaze and (2) primary gaze with IOP elevated to ~35 mmHg via ophthalmo-dynamometry. Glaucoma experts classified the subjects into four categories based on the presence of specific visual field defects: (1) superior nasal step (N=26), (2) superior partial arcuate (N=62), (3) full superior hemifield defect (N=25), and (4) other/non-specific defects (N=124). Automatic ONH tissue segmentation and digital volume correlation were used to compute IOP-induced neural tissue and lamina cribrosa (LC) strains. Biomechanical and structural features were input to a Geometric Deep Learning model. Three classification tasks were performed to detect: (1) superior nasal step, (2) superior partial arcuate, (3) full superior hemifield defect. For each task, the data were split into 80% training and 20% testing sets. Area under the curve (AUC) was used to assess performance. Explainable AI techniques were employed to highlight the ONH regions most critical to each classification.
Results: Models achieved high AUCs of 0.77-0.88, showing that ONH strain improved VF loss prediction beyond morphology alone. The inferior and inferotemporal rim were identified as key strain-sensitive regions, contributing most to visual field loss prediction and showing progressive expansion with increasing disease severity.
Conclusion and Relevance: ONH strain enhances prediction of glaucomatous VF loss patterns. Neuroretinal rim, rather than the LC, was the most critical region contributing to model predictions.
Submitted 9 June, 2025;
originally announced June 2025.
-
Part$^{2}$GS: Part-aware Modeling of Articulated Objects using 3D Gaussian Splatting
Authors:
Tianjiao Yu,
Vedant Shah,
Muntasir Wahed,
Ying Shen,
Kiet A. Nguyen,
Ismini Lourentzou
Abstract:
Articulated objects are common in the real world, yet modeling their structure and motion remains a challenging task for 3D reconstruction methods. In this work, we introduce Part$^{2}$GS, a novel framework for modeling articulated digital twins of multi-part objects with high-fidelity geometry and physically consistent articulation. Part$^{2}$GS leverages a part-aware 3D Gaussian representation that encodes articulated components with learnable attributes, enabling structured, disentangled transformations that preserve high-fidelity geometry. To ensure physically consistent motion, we propose a motion-aware canonical representation guided by physics-based constraints, including contact enforcement, velocity consistency, and vector-field alignment. Furthermore, we introduce a field of repel points to prevent part collisions and maintain stable articulation paths, significantly improving motion coherence over baselines. Extensive evaluations on both synthetic and real-world datasets show that Part$^{2}$GS consistently outperforms state-of-the-art methods by up to 10$\times$ in Chamfer Distance for movable parts.
Submitted 20 June, 2025;
originally announced June 2025.
-
Lower Bounds against the Ideal Proof System in Finite Fields
Authors:
Tal Elbaz,
Nashlen Govindasamy,
Jiaqi Lu,
Iddo Tzameret
Abstract:
Lower bounds against strong algebraic proof systems, and specifically fragments of the Ideal Proof System (IPS), have been obtained in an ongoing line of work. All of these bounds, however, are proved only over large (or characteristic $0$) fields, yet finite fields are the more natural setting for propositional proof complexity, especially for progress toward lower bounds for Frege systems such as $AC^0[p]$-Frege. This work establishes lower bounds against fragments of IPS over fixed finite fields. Specifically, we show that a variant of the knapsack instance studied by Govindasamy, Hakoniemi, and Tzameret (FOCS'22) has no polynomial-size IPS refutation over finite fields when the refutation is multilinear and written as a constant-depth circuit. The key ingredient of our argument is the recent set-multilinearization result of Forbes (CCC'24), which extends the earlier result of Limaye, Srinivasan, and Tavenas (FOCS'21) to all fields, and an extension of the techniques of Govindasamy, Hakoniemi, and Tzameret to finite fields. We also separate this proof system from the one studied by Govindasamy, Hakoniemi, and Tzameret.
In addition, we present new lower bounds for read-once algebraic branching program refutations, roABP-IPS, in finite fields, extending results of Forbes, Shpilka, Tzameret, and Wigderson (Theor. of Comput.'21) and Hakoniemi, Limaye, and Tzameret (STOC'24).
Finally, we show that any lower bound against any proof system at least as strong as (non-multilinear) constant-depth IPS over finite fields for any instance, even a purely algebraic instance (i.e., not a translation of a Boolean formula or CNF), implies a hard CNF formula for the respective IPS fragment, and hence an $AC^0[p]$-Frege lower bound by known simulations over finite fields (Grochow and Pitassi (J. ACM'18)).
Submitted 20 June, 2025;
originally announced June 2025.
-
UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation
Authors:
Teng Li,
Quanfeng Lu,
Lirui Zhao,
Hao Li,
Xizhou Zhu,
Yu Qiao,
Jun Zhang,
Wenqi Shao
Abstract:
Unified image understanding and generation has emerged as a promising paradigm in multimodal artificial intelligence. Despite recent progress, the optimal architectural design for such unified models remains an open challenge. In this work, we start by analyzing the modality alignment behaviors of task-specific expert models for understanding and generation, as well as current unified models. Our analysis reveals a crucial observation: understanding tasks benefit from a progressively increasing modality alignment across network depth, which helps build up semantic information for better comprehension. In contrast, generation tasks follow a different trend: modality alignment increases in the early layers but decreases in the deep layers to recover spatial details. These divergent alignment patterns create a fundamental conflict in fully shared Transformer backbones, where a uniform representational flow often leads to performance compromises across the two tasks. Motivated by this finding, we introduce UniFork, a novel Y-shaped architecture that shares the shallow layers for cross-task representation learning, while employing task-specific branches in deeper layers to avoid task interference. This design effectively balances shared learning and task specialization. Through extensive ablation experiments, we demonstrate that UniFork consistently outperforms conventional fully shared Transformer architectures, and achieves performance on par with or better than task-specific models.
Submitted 20 June, 2025;
originally announced June 2025.
-
Hunyuan-GameCraft: High-dynamic Interactive Game Video Generation with Hybrid History Condition
Authors:
Jiaqi Li,
Junshu Tang,
Zhiyong Xu,
Longhuang Wu,
Yuan Zhou,
Shuai Shao,
Tianbao Yu,
Zhiguo Cao,
Qinglin Lu
Abstract:
Recent advances in diffusion-based and controllable video generation have enabled high-quality and temporally coherent video synthesis, laying the groundwork for immersive interactive gaming experiences. However, current methods face limitations in dynamics, generality, long-term consistency, and efficiency, which limit the ability to create various gameplay videos. To address these gaps, we introduce Hunyuan-GameCraft, a novel framework for high-dynamic interactive video generation in game environments. To achieve fine-grained action control, we unify standard keyboard and mouse inputs into a shared camera representation space, facilitating smooth interpolation between various camera and movement operations. Then we propose a hybrid history-conditioned training strategy that extends video sequences autoregressively while preserving game scene information. Additionally, to enhance inference efficiency and playability, we achieve model distillation to reduce computational overhead while maintaining consistency across long temporal sequences, making it suitable for real-time deployment in complex interactive environments. The model is trained on a large-scale dataset comprising over one million gameplay recordings across over 100 AAA games, ensuring broad coverage and diversity, then fine-tuned on a carefully annotated synthetic dataset to enhance precision and control. The curated game scene data significantly improves the visual fidelity, realism and action controllability. Extensive experiments demonstrate that Hunyuan-GameCraft significantly outperforms existing models, advancing the realism and playability of interactive game video generation.
Submitted 20 June, 2025;
originally announced June 2025.
-
Operation and performance of the CMS silicon strip tracker with proton-proton collisions at the CERN LHC
Authors:
CMS Collaboration
Abstract:
Salient aspects of the commissioning, calibration, and performance of the CMS silicon strip tracker are discussed, drawing on experience during operation with proton-proton collisions delivered by the CERN LHC. The data were obtained with a variety of luminosities. The operating temperature of the strip tracker was changed several times during this period and results are shown as a function of temperature in several cases. Details of the system performance are presented, including occupancy, signal-to-noise ratio, Lorentz angle, and single-hit spatial resolution. Saturation effects in the APV25 readout chip preamplifier observed during early Run 2 are presented, showing the effect on various observables and the subsequent remedy. Studies of radiation effects on the strip tracker are presented both for the optical readout links and the silicon sensors. The observed effects are compared to simulation, where available, and they generally agree well with expectations.
Submitted 20 June, 2025;
originally announced June 2025.
-
Tensor network calculation of boundary and corner magnetization
Authors:
Roman Krcmar,
Jozef Genzor,
Andrej Gendiar,
Tomotoshi Nishino
Abstract:
The Corner Transfer Matrix Renormalization Group (CTMRG) algorithm is modified to measure the magnetization at the boundary of the system, including the corners of the square-shaped lattice. Using automatic differentiation, we calculate the magnetization's first derivative, allowing us to determine the boundary critical exponent $β$ accurately.
Submitted 20 June, 2025;
originally announced June 2025.
-
Facial Landmark Visualization and Emotion Recognition Through Neural Networks
Authors:
Israel Juárez-Jiménez,
Tiffany Guadalupe Martínez Paredes,
Jesús García-Ramírez,
Eric Ramos Aguilar
Abstract:
Emotion recognition from facial images is a crucial task in human-computer interaction, enabling machines to learn human emotions through facial expressions. Previous studies have shown that facial images can be used to train deep learning models; however, most of these studies do not include a thorough dataset analysis. Visualizing facial landmarks can be challenging when extracting meaningful dataset insights; to address this issue, we propose facial landmark box plots, a visualization technique designed to identify outliers in facial datasets. Additionally, we compare two sets of facial landmark features: (i) the landmarks' absolute positions and (ii) their displacements from a neutral expression to the peak of an emotional expression. Our results indicate that a neural network achieves better performance than a random forest classifier.
Submitted 20 June, 2025;
originally announced June 2025.
-
A Common Pool of Privacy Problems: Legal and Technical Lessons from a Large-Scale Web-Scraped Machine Learning Dataset
Authors:
Rachel Hong,
Jevan Hutson,
William Agnew,
Imaad Huda,
Tadayoshi Kohno,
Jamie Morgenstern
Abstract:
We investigate the contents of web-scraped data for training AI systems, at sizes where human dataset curators and compilers no longer manually annotate every sample. Building off of prior privacy concerns in machine learning models, we ask: What are the legal privacy implications of web-scraped machine learning datasets? In an empirical study of a popular training dataset, we find significant presence of personally identifiable information despite sanitization efforts. Our audit provides concrete evidence to support the concern that any large-scale web-scraped dataset may contain personal data. We use these findings of a real-world dataset to inform our legal analysis with respect to existing privacy and data protection laws. We surface various privacy risks of current data curation practices that may propagate personal information to downstream models. From our findings, we argue for reorientation of current frameworks of "publicly available" information to meaningfully limit the development of AI built upon indiscriminate scraping of the internet.
Submitted 20 June, 2025;
originally announced June 2025.
-
The effect of target orientation on the mean first passage time of a Brownian particle to a small elliptical absorber
Authors:
Sanchita Chakraborty,
Theodore Kolokolnikov,
Alan E. Lindsay
Abstract:
We develop a high order asymptotic expansion for the mean first passage time (MFPT) of the capture of Brownian particles by a small elliptical trap in a bounded two dimensional region. This new result describes the effect that trap orientation plays on the capture rate and extends existing results that give information only on the role of trap position on the capture rate. Our results are validated against numerical simulations which confirm the accuracy of the asymptotic approximation. In the case of the unit disk domain, we identify a bifurcation such that the high order correction to the global MFPT (GMFPT) is minimized when the trap is orientated in the radial direction for traps centered at $0<r<r_c :=\sqrt{2-\sqrt{2}}$. When centered at position $r_c<r<1$, the GMFPT correction is minimized by orientating the trap in the angular direction. In the scenario of a general two-dimensional geometry, we identify the orientation that minimizes the GMFPT in terms of the regular part of the Neumann Green's function. This theory is demonstrated on several regular domains such as disks, ellipses and rectangles.
Submitted 20 June, 2025;
originally announced June 2025.
-
Feedback cooling scheme for an optically levitated oscillator with controlled cross-talk
Authors:
J. M. H. Gosling,
A. Pontin,
F. Alder,
M. Rademacher,
T. S. Monteiro,
P. F. Barker
Abstract:
Levitated optical mechanical systems have demonstrated excellent force and impulse sensitivity and are currently being developed for the creation of non-classical states of motion in these new quantum systems. An important requirement in the design of these systems is the ability to independently control and cool all three translational degrees of freedom. Here we describe the design and implementation of a stable and robust 3D velocity feedback cooling scheme with particular emphasis on creating minimal cross-talk between the independent oscillatory modes when cooling.
Submitted 20 June, 2025;
originally announced June 2025.
-
Structured Harmonic Generation via Geometric Phase Enabled Pump Shaping
Authors:
Ting-Ting Liu,
Shi-Hui Ding,
Chun-Yu Li,
Hui Liu,
Zhi-Han Zhu,
Peng Chen,
Yan-Qing Lu
Abstract:
Nonlinear optics is crucial for shaping the spatial structure of shortwave light and its interactions with matter, but achieving this through simple harmonic generation with a single pump is challenging. This study demonstrates nonlinear spin-orbit conversion using spin-dependent pump shaping via geometric phase, allowing the direct creation of desired structured harmonic waves from a Gaussian pump beam. Using liquid-crystal flat optical elements fabricated with photoalignment, we experimentally produce higher-order cylindrical vector modes in second harmonic fields. We examine the vectorial spatial wavefunctions, their propagation invariance, and nonlinear spin-orbit conversion. Our results provide an efficient method for fully structuring nonlinear light in broader harmonic systems, with significant applications in laser micromachining and high-energy physics.
Submitted 20 June, 2025;
originally announced June 2025.
-
Higher dimensional Sacks-Uhlenbeck-type functionals and applications
Authors:
Gianmichele Di Matteo,
Tobias Lamm
Abstract:
In this work, we generalize Sacks-Uhlenbeck's existence result for harmonic spheres, constructing for $n \ge 2$, regular, non-trivial, $n$-harmonic $n$-spheres into suitable target manifolds. We obtain an infinite family of new null-homotopic such maps. The proof follows a similar perturbative argument, which in high dimensions leads to a degenerate and double-phase-type Euler-Lagrange system, making the uniform regularity needed to formalize the bubbling harder to achieve. Then, we develop a refined neck-analysis leading to an energy identity along the approximation, assuming a suitable Struwe-type entropy bound along a sequence of critical points. Finally, we combine these results to solve quite general min-max problems for the $n$-energy modulo bubbling.
Submitted 20 June, 2025;
originally announced June 2025.
-
Proportional Sensitivity in Generative Adversarial Network (GAN)-Augmented Brain Tumor Classification Using Convolutional Neural Network
Authors:
Mahin Montasir Afif,
Abdullah Al Noman,
K. M. Tahsin Kabir,
Md. Mortuza Ahmmed,
Md. Mostafizur Rahman,
Mufti Mahmud,
Md. Ashraful Babu
Abstract:
Generative Adversarial Networks (GAN) have shown potential in expanding limited medical imaging datasets. This study explores how different ratios of GAN-generated and real brain tumor MRI images impact the performance of a CNN in classifying healthy vs. tumorous scans. A DCGAN was used to create synthetic images which were mixed with real ones at various ratios to train a custom CNN. The CNN was then evaluated on a separate real-world test set. Our results indicate that the model maintains high sensitivity and precision in tumor classification, even when trained predominantly on synthetic data. When only a small portion of GAN data was added, such as 900 real images and 100 GAN images, the model achieved excellent performance, with test accuracy reaching 95.2%, and precision, recall, and F1-score all exceeding 95%. However, as the proportion of GAN images increased further, performance gradually declined. This study suggests that while GANs are useful for augmenting limited datasets especially when real data is scarce, too much synthetic data can introduce artifacts that affect the model's ability to generalize to real world cases.
Submitted 20 June, 2025;
originally announced June 2025.
-
$^{50}$Cr and $^{53}$Cr neutron capture cross sections measurement at the n_TOF facility at CERN
Authors:
P. Pérez-Maroto,
C. Guerrero,
A. Casanovas,
B. Fernández,
E. Mendoza,
V. Alcayne,
J. Lerendegui-Marco,
C. Domingo-Pardo,
J. M. Quesada,
R. Capote,
the n_TOF Collaboration
Abstract:
$^{50,53}$Cr are very relevant in criticality safety benchmarks related to nuclear reactors. The discrepancies between the neutron capture cross section evaluations have an important effect on the $k_{eff}$ and $k_{\infty}$ in criticality benchmarks particularly sensitive to chromium. The $^{50,53}$Cr(n,$γ$) cross sections are to be determined between 1 and 100 keV with an 8-10% accuracy following the requirements of the NEA High Priority Request List (HPRL) to solve the current discrepancies. We have measured the neutron capture cross sections by the time-of-flight technique at the EAR1 experimental area of the n_TOF facility, using an array of four C$_6$D$_6$ detectors with very low neutron sensitivity. The highly-enriched samples used are significantly thinner than in previous measurements, thus minimizing the multiple-scattering effects. We have produced, and analyzed with the R-matrix analysis code SAMMY, capture yields featuring 33 resonances of $^{50}$Cr and 51 of $^{53}$Cr with an accuracy between 5% and 9%, hence fulfilling the requirements made by the NEA. The differential and integral cross sections have been compared to previous data and evaluations. The new $^{50,53}$Cr(n,$γ$) cross sections measured at the CERN n_TOF facility provide a valuable input for upcoming evaluations, which are deemed necessary given that the results presented herein do not support the increase in both cross sections proposed in the recent INDEN evaluation.
Submitted 20 June, 2025;
originally announced June 2025.
-
Global Microprocessor Correctness in the Presence of Transient Execution
Authors:
Andrew T. Walter,
Konstantinos Athanasiou,
Panagiotis Manolios
Abstract:
Correctness for microprocessors is generally understood to be conformance with the associated instruction set architecture (ISA). This is the basis for one of the most important abstractions in computer science, allowing hardware designers to develop highly-optimized processors that are functionally "equivalent" to an ideal processor that executes instructions atomically. This specification is almost always informal, e.g., commercial microprocessors generally do not come with conformance specifications. In this paper, we advocate for the use of formal specifications, using the theory of refinement. We introduce notions of correctness that can be used to deal with transient execution attacks, including Meltdown and Spectre. Such attacks have shown that ubiquitous microprocessor optimizations, appearing in numerous processors for decades, are inherently buggy. Unlike alternative approaches that use non-interference properties, our notion of correctness is global, meaning it is a single specification that formalizes conformance, includes functional correctness, and is parameterized by a microarchitecture. We introduce action skipping refinement, a new type of refinement, and we describe how our notions of refinement can be decomposed into properties that are more amenable to automated verification using the concept of shared-resource commitment refinement maps. We do this in the context of formal, fully executable bit- and cycle-accurate models of an ISA and a microprocessor. Finally, we show how light-weight formal methods based on property-based testing can be used to identify transient execution bugs.
Submitted 20 June, 2025;
originally announced June 2025.
-
Profile monitoring of random functions with Gaussian process basis expansions
Authors:
Takayuki Iguchi,
Jonathan R. Stewart,
Eric Chicken
Abstract:
We consider the problem of online profile monitoring of random functions that admit basis expansions possessing random coefficients for the purpose of out-of-control state detection. Our approach is applicable to a broad class of random functions which feature two sources of variation: additive error and random fluctuations through random coefficients in the basis representation of functions. We focus on a two-phase monitoring problem with a first stage consisting of learning the in-control process and the second stage leveraging the learned process for out-of-control state detection. The foundations of our method are derived under the assumption that the coefficients in the basis expansion are Gaussian random variables, which facilitates the development of scalable and effective monitoring methodology for the observed processes that makes weak functional assumptions on the underlying process. We demonstrate the potential of our method through simulation studies that highlight some of the nuances that emerge in profile monitoring problems with random functions, and through an application.
Submitted 20 June, 2025;
originally announced June 2025.
-
Sacks-Uhlenbeck type regularity for subcritical generalized $p$-harmonic maps into Homogeneous targets
Authors:
Gianmichele Di Matteo,
Tobias Lamm
Abstract:
Adapting \cite{strz3}, we define generalized $p$-harmonic maps into Riemannian homogeneous targets, a notion of solutions not belonging to the energy space. Restricting our attention to the subcritical range $p$ greater than the domain dimension $n$, we show a uniform $C^{1,α}$-regularity result for a sequence of such maps in the limit $p \searrow n$, assuming a uniform $n$-energy bound on its elements. The method of the proof follows the exact same lines as in \cite{strz3} but we need to check uniformity of estimates not previously considered there.
Submitted 20 June, 2025;
originally announced June 2025.
-
The Average Soft X-ray Spectra of eROSITA Active Galactic Nuclei
Authors:
Shi-Jiang Chen,
Johannes Buchner,
Teng Liu,
Scott Hagen,
Sophia G. H. Waddell,
Kirpal Nandra,
Mara Salvato,
Zsofi Igo,
Catarina Aydar,
Andrea Merloni,
Qingling Ni,
Jia-Lai Kang,
Zhen-Yi Cai,
Jun-Xian Wang,
Ruancun Li,
Miriam E. Ramos-Ceja,
Jeremy Sanders,
Antonis Georgakakis,
Yi Zhang
Abstract:
Context. AGNs are strong X-ray emitters shaped by disk-corona interactions. The soft excess (0.5-2.0 keV) reveals key information about the "warm corona" bridging the disk and hot corona. Yet, how this feature evolves with accretion properties remains poorly constrained, especially in large samples using spectral stacking. Aims. The eROSITA All-Sky Survey (eRASS:5) provides an unprecedented sample. We investigate how the average AGN X-ray spectra evolve with accretion parameters, and explore disk-corona connection by further combining stacked UV data. Methods. We developed Xstack, a novel tool that stacks rest-frame X-ray spectra and responses while preserving spectral shape through optimized weighting. We stack 17929 AGNs ("spec-z" sample, 23 Ms) with similar X-ray loudness alpha_ox, UV luminosity L_UV, and 4159 AGNs ("BH-mass" sample, 3 Ms) with similar Eddington ratio lambda_Edd and black hole mass M_BH. The resulting stacked X-ray spectra are analyzed with a phenomenological model. We further fit the stacked optical-UV-Xray SED with AGNSED model. Results. Soft excess strengthens strongly with alpha_ox and lambda_Edd (~5), while the hard X-ray spectral shape remains largely unchanged, supporting that soft excess is dominated by warm corona rather than reflection. AGNSED modeling reveals that warm corona radius (R_g units) generally increases with lambda_Edd and decreases with M_BH, or equivalently the disk-to-warm-corona transition consistently occurs near 1e4 K. The hot corona contracts with lambda_Edd and is unaffected by M_BH, aligning with disk evaporation predictions. Conclusions. The soft excess likely originates from a warm corona, with the disk to warm corona transition tied to hydrogen ionization near 1e4 K - supporting earlier eFEDS-HSC stacking results (Hagen et al. 2024). This study shows the strength of spectral stacking in probing AGN disk-corona physics.
Submitted 20 June, 2025;
originally announced June 2025.
-
Quantitative correlation between structural (dis-)order and diffuseness of phase transition in lead scandium tantalate
Authors:
T. Granzow,
A. Aravindhan,
Y. Nouchokgwe,
V. Kovacova,
S. Glinsek,
S. Hirose,
T. Usui,
H. Uršič,
I. Goričan,
W. Jo,
C. -H. Hong,
E. Defay
Abstract:
Ferroelectrics show a phase transition to a paraelectric phase at a well-defined transition temperature. Introducing disorder makes this transition diffuse, and the system becomes a relaxor. Since the degree of (dis-)order is usually manipulated by varying the chemical composition, it is difficult to establish a direct relationship between disorder and the degree of diffuseness. Perovskite structured lead scandium tantalate (Pb[Sc$_{1/2}$Ta$_{1/2}$]O$_3$, PST) offers the opportunity to tune the character of the transition by thermal annealing without changing the stoichiometry. Here it is demonstrated that there is a linear correlation between the structural ordering, quantified by the intensity ratio $S$ of the pseudocubic (111)/(200) x-ray diffraction peaks, and the diffuseness parameter $γ$ deduced from temperature-dependent dielectric spectroscopy. The relation is universal, independent of whether the sample is a thin film, multilayer capacitor or bulk ceramic, and also independent of the absolute value of the dielectric permittivity.
Submitted 20 June, 2025;
originally announced June 2025.