-
HIDE and Seek: Detecting Hallucinations in Language Models via Decoupled Representations
Authors:
Anwoy Chatterjee,
Yash Goel,
Tanmoy Chakraborty
Abstract:
Contemporary Language Models (LMs), while impressively fluent, often generate content that is factually incorrect or unfaithful to the input context - a critical issue commonly referred to as 'hallucination'. This tendency of LMs to generate hallucinated content undermines their reliability, especially because these fabrications are often highly convincing and therefore difficult to detect. While several existing methods attempt to detect hallucinations, most rely on analyzing multiple generations per input, leading to increased computational cost and latency. To address this, we propose a single-pass, training-free approach for effective Hallucination detectIon via Decoupled rEpresentations (HIDE). Our approach leverages the hypothesis that hallucinations result from a statistical decoupling between an LM's internal representations of input context and its generated output. We quantify this decoupling using the Hilbert-Schmidt Independence Criterion (HSIC) applied to hidden-state representations extracted while generating the output sequence. We conduct extensive experiments on four diverse question answering datasets, evaluating both faithfulness and factuality hallucinations across six open-source LMs of varying scales and properties. Our results demonstrate that HIDE outperforms other single-pass methods in almost all settings, achieving an average relative improvement of ~29% in AUC-ROC over the best-performing single-pass strategy across various models and datasets. Additionally, HIDE shows competitive and often superior performance with multi-pass state-of-the-art methods, obtaining an average relative improvement of ~3% in AUC-ROC while consuming ~51% less computation time. Our findings highlight the effectiveness of exploiting internal representation decoupling in LMs for efficient and practical hallucination detection.
Submitted 21 June, 2025;
originally announced June 2025.
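The HSIC-based decoupling measure mentioned in the HIDE abstract can be illustrated with a minimal sketch. The code below computes the standard biased empirical HSIC estimate, trace(K H L H) / (n - 1)^2, between two sets of vectors. The Gaussian kernel, the median-heuristic bandwidth, and the function names `rbf_gram` and `hsic` are assumptions made for illustration; the abstract does not state which kernel, layers, or tokens the authors use, and the random arrays only stand in for hidden states.

```python
import numpy as np


def rbf_gram(x, sigma=None):
    """Gaussian-kernel Gram matrix for the rows of x, shape (n, d)."""
    sq = np.sum(x ** 2, axis=1)
    # Pairwise squared Euclidean distances, clipped against round-off.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * x @ x.T, 0.0)
    if sigma is None:
        # Median heuristic (an assumption, not taken from the abstract).
        positive = d2[d2 > 0]
        med = np.median(positive) if positive.size else 1.0
        sigma = np.sqrt(0.5 * med)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def hsic(x, y):
    """Biased empirical HSIC between paired samples x (n, dx) and y (n, dy)."""
    n = x.shape[0]
    h = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    k = rbf_gram(x)
    l = rbf_gram(y)
    return float(np.trace(k @ h @ l @ h) / (n - 1) ** 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    context = rng.normal(size=(200, 8))           # stand-in for input representations
    coupled = context + 0.1 * rng.normal(size=(200, 8))
    decoupled = rng.normal(size=(200, 8))
    print("coupled HSIC:  ", hsic(context, coupled))
    print("decoupled HSIC:", hsic(context, decoupled))
```

Under the abstract's hypothesis, a low HSIC value between input-side and output-side representations would indicate decoupling; how that score is thresholded or aggregated is not specified in the listing.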
-
Pix2Geomodel: A Next-Generation Reservoir Geomodeling with Property-to-Property Translation
Authors:
Abdulrahman Al-Fakih,
Ardiansyah Koeshidayatullah,
Nabil A. Saraih,
Tapan Mukerji,
Rayan Kanfar,
Abdulmohsen Alali,
SanLinn I. Kaka
Abstract:
Accurate geological modeling is critical for reservoir characterization, yet traditional methods struggle with complex subsurface heterogeneity and with conditioning to observed data. This study introduces Pix2Geomodel, a novel conditional generative adversarial network (cGAN) framework based on Pix2Pix, designed to predict reservoir properties (facies, porosity, permeability, and water saturation) from the Rotliegend reservoir of the Groningen gas field. Utilizing a 7.6 million-cell dataset from the Nederlandse Aardolie Maatschappij, accessed via EPOS-NL, the methodology included data preprocessing, augmentation to generate 2,350 images per property, and training with a U-Net generator and PatchGAN discriminator over 19,000 steps. Evaluation metrics, including pixel accuracy (PA), mean intersection over union (mIoU), and frequency weighted intersection over union (FWIoU), together with visualizations, were used to assess performance in masked property prediction and property-to-property translation tasks. Results demonstrated high accuracy for facies (PA 0.88, FWIoU 0.85) and water saturation (PA 0.96, FWIoU 0.95), with moderate success for porosity (PA 0.70, FWIoU 0.55) and permeability (PA 0.74, FWIoU 0.60), and robust translation performance (e.g., facies-to-facies PA 0.98, FWIoU 0.97). The framework captured spatial variability and geological realism, as validated by variogram analysis; training loss curves for the generator and discriminator were also calculated for each property. Compared to traditional methods, Pix2Geomodel offers enhanced fidelity in direct property mapping. Limitations include challenges with microstructural variability and 2D constraints, suggesting future integration of multi-modal data and 3D modeling (Pix2Geomodel v2.0). This study advances the application of generative AI in geoscience, supporting improved reservoir management and open science initiatives.
Submitted 21 June, 2025;
originally announced June 2025.
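The three metrics named in the Pix2Geomodel abstract (PA, mIoU, FWIoU) have standard definitions in terms of a class confusion matrix, sketched below. The function names and the tiny label arrays are hypothetical; the abstract does not say how the authors discretize continuous properties such as porosity into classes, so that step is left out.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Rows index the true class, columns the predicted class."""
    y_true = np.asarray(y_true, dtype=np.int64).ravel()
    y_pred = np.asarray(y_pred, dtype=np.int64).ravel()
    idx = num_classes * y_true + y_pred
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def segmentation_metrics(cm):
    """Return (pixel accuracy, mean IoU, frequency-weighted IoU)."""
    cm = cm.astype(float)
    tp = np.diag(cm)
    gt = cm.sum(axis=1)        # pixels per true class
    pred = cm.sum(axis=0)      # pixels per predicted class
    union = gt + pred - tp
    present = union > 0
    # Classes absent from both maps get NaN and are skipped in the averages.
    iou = np.where(present, tp / np.where(present, union, 1.0), np.nan)
    pa = tp.sum() / cm.sum()
    miou = np.nanmean(iou)
    fwiou = np.nansum((gt / cm.sum()) * iou)
    return float(pa), float(miou), float(fwiou)


if __name__ == "__main__":
    truth = [[0, 0], [1, 1]]
    guess = [[0, 1], [1, 1]]
    print(segmentation_metrics(confusion_matrix(truth, guess, num_classes=2)))
```

FWIoU weights each class IoU by that class's share of the true pixels, which is why it can differ noticeably from mIoU when classes are imbalanced.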
-
Optimizing Periodic Operations for Efficient Inland Waterway Lock Management
Authors:
Julian Golak,
Alexander Grigoriev,
Freija van Lent,
Tom van der Zanden
Abstract:
In inland waterways, the efficient management of water lock operations impacts the level of congestion and the resulting uncertainty in inland waterway transportation. To achieve reliable and efficient traffic, schedules should be easy to understand and implement, reducing the likelihood of errors. The simplest schedules follow periodic patterns, reducing complexity and facilitating predictable management. Since vessels do not arrive at perfectly regular intervals, periodic schedules may lead to more wait time. The aim of this research is to estimate this cost by evaluating how effectively these periodic schedules manage vessel traffic at water locks. The first objective is to estimate a periodic arrival pattern that closely matches a dataset of irregular vessel arrivals at a specific lock. We develop an algorithm that, given a fixed number of vessel streams, solves the problem in polynomial time. The solution then serves as input for the second part, in which we formulate an optimisation problem that takes periodic arrival patterns as input and seeks a periodic schedule that minimises the long-run average waiting time of vessels. We present a polynomial-time algorithm for the two-stream case and a pseudo-polynomial-time algorithm for the general case, along with incremental polynomial-time approximation schemes. In our numerical experiments, we use AIS data to construct a periodic arrival pattern closely matching the observed data. Our experiments demonstrate that when evaluated against actual data, intuitive and straightforward policies often outperform optimal policies specifically trained on the periodic arrival pattern.
Submitted 21 June, 2025;
originally announced June 2025.
-
Experimental Evidence for the Propagation and Preservation of Machine Discoveries in Human Populations
Authors:
Levin Brinkmann,
Thomas F. Eisenmann,
Anne-Marie Nussberger,
Maxim Derex,
Sara Bonati,
Valerii Chirkov,
Iyad Rahwan
Abstract:
Intelligent machines with superhuman capabilities have the potential to uncover problem-solving strategies beyond human discovery. Emerging evidence from competitive gameplay, such as Go, demonstrates that AI systems are evolving from mere tools to sources of cultural innovation adopted by humans. However, the conditions under which intelligent machines transition from tools to drivers of persistent cultural change remain unclear. We identify three key conditions for machines to fundamentally influence human problem-solving: the discovered strategies must be non-trivial, learnable, and offer a clear advantage. Using a cultural transmission experiment and an agent-based simulation, we demonstrate that when these conditions are met, machine-discovered strategies can be transmitted, understood, and preserved by human populations, leading to enduring cultural shifts. These findings provide a framework for understanding how machines can persistently expand human cognitive skills and underscore the need to consider their broader implications for human cognition and cultural evolution.
Submitted 21 June, 2025;
originally announced June 2025.
-
The tangle-valued 1-cocycle for knots
Authors:
Thomas Fiedler
Abstract:
This paper contains the strongest and at the same time most calculable knot invariant ever.
Let $Θ$ be the topological moduli space of all ordered oriented tangles in 3-space. We construct a non-trivial combinatorial 1-cocycle $\mathbb{L}$ for $Θ$ that takes its values in $H_0(Θ;\mathbb{Z})$. The 1-cocycle $\mathbb{L}$ has a very nice property, called the {\em scan-property}: if we slide a tangle $T$ over or under a given crossing $c$ of a fixed tangle $T'$, then the value of $\mathbb{L}$ on this arc $scan(T)$ in $Θ$ is already an isotopy invariant of $T$.
In particular, let $D$ be a framed long knot diagram. We take the product with a fixed long knot diagram $K$ and we consider the 2-cable, with a fixed crossing $c$ in $2K$. $\mathbb{L}(scan(2D))$ gives an element in $H_0(Θ)$. To this element we associate the {\em set of Alexander vectors}, consisting of the corresponding integer multiples of the one-variable Alexander polynomials of (the standard closures) of all sub-tangles of each of the tangles. We can vary the knots $(K,c)$ and moreover we can iterate our construction by starting now again the $scan$ with the tangles in $\mathbb{L}(scan(2D))$ and so on. The result is the infinite {\em Alexander tree}, which is an isotopy invariant of the knot represented by $D$.
{\em As an example we show with just one edge of the Alexander tree that the knot $8_{17}$ is not invertible!} This makes the Alexander tree a very promising candidate for a complete and "locally" calculable knot invariant, because the tangles in $\mathbb{L}(scan(2D))$ can be drawn with linear complexity and their Alexander polynomials can be calculated with quartic complexity with respect to the number of crossings of $D$.
Submitted 21 June, 2025;
originally announced June 2025.
-
Resolving the Ti-V Phase Diagram Discrepancy with First-Principles Calculations and Bayesian Learning
Authors:
Timofei Miryashkin,
Olga Klimanova,
Alexander Shapeev
Abstract:
Conflicting experiments disagree on whether the titanium-vanadium (Ti-V) binary alloy exhibits a body-centred cubic (BCC) miscibility gap or remains completely soluble. A leading hypothesis attributes the miscibility gap to oxygen contamination during alloy preparation. To resolve this controversy, we use an ab initio + machine-learning workflow that couples an actively-trained Moment Tensor Potential to Bayesian thermodynamic inference. Using this workflow, we obtain the phase diagram of the Ti-V binary system across the entire composition range, together with confidence intervals in the thermodynamic limit. The resulting diagram reproduces all experimental features, demonstrating the robustness of our approach, and clearly favors the variant with a BCC miscibility gap terminating at T = 980 K and c = 0.67. Because oxygen was excluded from simulations, the gap cannot be attributed to impurity effects, contradicting recent CALPHAD reassessments.
Submitted 21 June, 2025;
originally announced June 2025.
-
On sequentially Cohen-Macaulay modules and sequentially generalized Cohen-Macaulay modules
Authors:
Nguyen Xuan Linh,
Le Thanh Nhan
Abstract:
We introduce the notions of sequential sequence and sequential f-sequence in order to characterize sequentially Cohen-Macaulay modules and sequentially generalized Cohen-Macaulay modules. Let R be a Noetherian local ring and M a finitely generated R-module. We show that M is sequentially Cohen-Macaulay (resp. sequentially generalized Cohen-Macaulay) if and only if there exists a system of parameters of M that is an M-sequential sequence (resp. each generalized regular sequence s.o.p of M is an M-sequential f-sequence) and R/Ann_R(M) is a quotient of a Cohen-Macaulay local ring. As an application, we give new characterizations of Cohen-Macaulay modules and generalized Cohen-Macaulay modules.
Submitted 21 June, 2025;
originally announced June 2025.
-
CEGA: A Cost-Effective Approach for Graph-Based Model Extraction and Acquisition
Authors:
Zebin Wang,
Menghan Lin,
Bolin Shen,
Ken Anderson,
Molei Liu,
Tianxi Cai,
Yushun Dong
Abstract:
Graph Neural Networks (GNNs) have demonstrated remarkable utility across diverse applications, and their growing complexity has made Machine Learning as a Service (MLaaS) a viable platform for scalable deployment. However, this accessibility also exposes GNNs to serious security threats, most notably model extraction attacks (MEAs), in which adversaries strategically query a deployed model to construct a high-fidelity replica. In this work, we evaluate the vulnerability of GNNs to MEAs and explore their potential for cost-effective model acquisition in non-adversarial research settings. Importantly, adaptive node querying strategies can also serve a critical role in research, particularly when labeling data is expensive or time-consuming. By selectively sampling informative nodes, researchers can train high-performing GNNs with minimal supervision, which is particularly valuable in domains such as biomedicine, where annotations often require expert input. To address this, we propose a node querying strategy tailored to a highly practical yet underexplored scenario, where bulk queries are prohibited, and only a limited set of initial nodes is available. Our approach iteratively refines the node selection mechanism over multiple learning cycles, leveraging historical feedback to improve extraction efficiency. Extensive experiments on benchmark graph datasets demonstrate the superiority of our approach over comparable baselines on accuracy, fidelity, and F1 score under strict query-size constraints. These results highlight both the susceptibility of deployed GNNs to extraction attacks and the promise of ethical, efficient GNN acquisition methods to support low-resource research environments.
Submitted 21 June, 2025;
originally announced June 2025.
-
DreamJourney: Perpetual View Generation with Video Diffusion Models
Authors:
Bo Pan,
Yang Chen,
Yingwei Pan,
Ting Yao,
Wei Chen,
Tao Mei
Abstract:
Perpetual view generation aims to synthesize a long-term video corresponding to an arbitrary camera trajectory solely from a single input image. Recent methods commonly utilize a pre-trained text-to-image diffusion model to synthesize new content of previously unseen regions along camera movement. However, the underlying 2D diffusion model lacks 3D awareness and results in distorted artifacts. Moreover, these methods are limited to generating views of static 3D scenes, neglecting to capture object movements within the dynamic 4D world. To alleviate these issues, we present DreamJourney, a two-stage framework that leverages the world simulation capacity of video diffusion models to trigger a new perpetual scene view generation task with both camera movements and object dynamics. Specifically, in stage I, DreamJourney first lifts the input image to a 3D point cloud and renders a sequence of partial images from a specific camera trajectory. A video diffusion model is then utilized as a generative prior to complete the missing regions and enhance visual coherence across the sequence, producing a cross-view consistent video that adheres to the 3D scene and camera trajectory. Meanwhile, we introduce two simple yet effective strategies (early stopping and view padding) to further stabilize the generation process and improve visual quality. Next, in stage II, DreamJourney leverages a multimodal large language model to produce a text prompt describing object movements in the current view, and uses a video diffusion model to animate the current view with object movements. Stages I and II are repeated recurrently, enabling perpetual dynamic scene view generation. Extensive experiments demonstrate the superiority of DreamJourney over state-of-the-art methods both quantitatively and qualitatively. Our project page: https://dream-journey.vercel.app.
Submitted 21 June, 2025;
originally announced June 2025.
-
On the Practicability of Ceramic-Tiled Walls for Sound Absorption by Tuning Cavities
Authors:
Ozgur T. Tugut,
Brahim Lemkalli,
Qingxiang Ji,
Mahmoud Addouche,
Benjamin Vial,
Sébastien Guenneau,
Richard Craster,
Claudio Bizzaglia,
Bogdan Ungureanu,
Muamer Kadic
Abstract:
We present the practicality of structuring ceramic tiles for enhancing sound absorption on rigid walls. The cornerstone of our methodology is to structure walls with cavities so that walls effectively behave as heterogeneous absorbing surfaces over a large frequency bandwidth. Using this approach, ceramic tiled walls are developed by integrating tuned cavity structures based on Helmholtz resonators. Such a design leverages the empty joints between tiles to form resonator necks, while the space between the ceramic tiles and the wall acts as the resonator chambers. By arranging these resonators in a spatially graded array, we achieve broadband sound absorption which targets low-frequency noise generated by impacts, footsteps and ambient sources. This makes the system highly suitable for practical architectural applications. The study encompasses the entire process, from numerical modeling and analytical formulation to the fabrication and mounting of resonant tiles, followed by experimental validation, clearly demonstrating the effectiveness of the proposed solution in real-world conditions. The findings highlight the strong potential of this approach for practical tiled room acoustic treatment and noise mitigation.
Submitted 21 June, 2025;
originally announced June 2025.
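To make the resonator picture in the ceramic-tile abstract concrete, the sketch below evaluates the classical lumped-element Helmholtz formula, f = (c / 2π) · sqrt(A / (V · L_eff)), where A is the neck cross-section, L_eff the effective neck length, and V the cavity volume. This textbook formula, the function name, and the dimensions in the demo are assumptions for illustration only; the abstract does not give the authors' analytical model, and slit-shaped tile joints may need a different end correction.

```python
import math


def helmholtz_frequency(neck_area, neck_length, cavity_volume,
                        speed_of_sound=343.0, end_correction=0.0):
    """Resonance frequency in Hz of an idealized Helmholtz resonator.

    All lengths are in metres (areas in m^2, volumes in m^3).
    `end_correction` is added to the neck length; 0.0 ignores it.
    """
    effective_length = neck_length + end_correction
    ratio = neck_area / (cavity_volume * effective_length)
    return speed_of_sound / (2.0 * math.pi) * math.sqrt(ratio)


if __name__ == "__main__":
    # Hypothetical geometry: joint opening as neck, gap behind the tile as cavity.
    for volume in (1e-3, 2e-3, 4e-3):
        f = helmholtz_frequency(neck_area=1e-4, neck_length=0.01,
                                cavity_volume=volume)
        print(f"cavity volume {volume:.0e} m^3 -> {f:.1f} Hz")
```

Because the frequency scales with the inverse square root of the cavity volume, varying the cavity size across tiles is one way a graded array could spread its resonances over a band.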
-
Resource-Friendly Dynamic Enhancement Chain for Multi-Hop Question Answering
Authors:
Binquan Ji,
Haibo Luo,
Yifei Lu,
Lei Hei,
Jiaqi Wang,
Tingjing Liao,
Lingyu Wang,
Shichao Wang,
Feiliang Ren
Abstract:
Knowledge-intensive multi-hop question answering (QA) tasks, which require integrating evidence from multiple sources to address complex queries, often necessitate multiple rounds of retrieval and iterative generation by large language models (LLMs). However, incorporating many documents and extended contexts poses challenges, such as hallucinations and semantic drift, for lightweight LLMs with fewer parameters. This work proposes a novel framework called DEC (Dynamic Enhancement Chain). DEC first decomposes complex questions into logically coherent subquestions to form a hallucination-free reasoning chain. It then iteratively refines these subquestions through context-aware rewriting to generate effective query formulations. For retrieval, we introduce a lightweight discriminative keyword extraction module that leverages extracted keywords to achieve targeted, precise document recall with relatively low computational overhead. Extensive experiments on three multi-hop QA datasets demonstrate that DEC performs on par with or surpasses state-of-the-art benchmarks while significantly reducing token consumption. Notably, our approach attains state-of-the-art results on models with 8B parameters, showcasing its effectiveness in various scenarios, particularly in resource-constrained environments.
Submitted 21 June, 2025;
originally announced June 2025.
-
Brightening dark trions in WS2 monolayers via introducing atomic sulfur vacancies
Authors:
Xuguang Cao,
Wanggui Ye,
Debao Zhang,
Ji Zhou,
Lei Peng,
Changcheng Zheng,
Kenji Watanabe,
Takashi Taniguchi,
Jiqiang Ning,
Shijie Xu
Abstract:
Understanding the effects of atomic defects on the optical functionality of two-dimensional (2D) layered materials is critical to developing novel optical and optoelectronic applications of these ultimate materials. Herein, we correlate sulfur vacancies (VS) and luminescence properties of dark trions in monolayer WS2 by introducing VS defects and conducting a systematic optical spectroscopic characterization at cryogenic and room temperatures. We find that the VS defects can brighten the dark trions by introducing a stronger spin-orbit coupling due to the breaking of space inversion symmetry by the defects. Furthermore, the wavefunction localization of the dark trions bound at VS defects results in significant enhancement of the phonon scattering from the K2 valley phonons and hence makes the K2 phonon replica dominant in the emission spectrum. Theoretical calculations of the temperature-dependent photoluminescence spectra with a quantum mechanics-based multimode Brownian oscillator model strongly support the above arguments. Brightening the dark excitons not only sheds light on the understanding of the intriguing excitonic properties of 2D semiconductors, but also may open a way for regulating the optoelectronic performance of two-dimensional semiconductors.
Submitted 21 June, 2025;
originally announced June 2025.
-
Low-resource keyword spotting using contrastively trained transformer acoustic word embeddings
Authors:
Julian Herreilers,
Christiaan Jacobs,
Thomas Niesler
Abstract:
We introduce a new approach, the ContrastiveTransformer, that produces acoustic word embeddings (AWEs) for the purpose of very low-resource keyword spotting. The ContrastiveTransformer, an encoder-only model, directly optimises the embedding space using normalised temperature-scaled cross entropy (NT-Xent) loss. We use this model to perform keyword spotting for radio broadcasts in Luganda and Bambara, the latter a severely under-resourced language. We compare our model to various existing AWE approaches, including those constructed from large pre-trained self-supervised models, a recurrent encoder which previously used the NT-Xent loss, and a DTW baseline. We demonstrate that the proposed contrastive transformer approach offers performance improvements over all considered existing approaches to very low-resource keyword spotting in both languages.
Submitted 21 June, 2025;
originally announced June 2025.
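The NT-Xent loss named in the keyword-spotting abstract can be sketched as follows. This is the common two-view formulation (each embedding has exactly one positive, its counterpart in the other view, and all remaining embeddings in the batch act as negatives). The function name, the NumPy implementation, and the temperature value are assumptions; the abstract does not say how word pairs are formed or whether several positives per word are used.

```python
import numpy as np


def nt_xent(z_a, z_b, temperature=0.1):
    """Normalised temperature-scaled cross entropy for paired embeddings.

    z_a[i] and z_b[i] are two embeddings of the same item (e.g. the same
    word type); every other row in the batch is treated as a negative.
    """
    n = z_a.shape[0]
    z = np.concatenate([z_a, z_b], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine similarity
    sim = (z @ z.T) / temperature
    np.fill_diagonal(sim, -np.inf)                     # exclude self-pairs
    positives = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    # Row-wise log-sum-exp, shifted by the row maximum for stability.
    row_max = sim.max(axis=1, keepdims=True)
    log_denominator = row_max[:, 0] + np.log(np.exp(sim - row_max).sum(axis=1))
    log_prob = sim[np.arange(2 * n), positives] - log_denominator
    return float(-log_prob.mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    anchors = rng.normal(size=(16, 32))
    matched = anchors + 0.01 * rng.normal(size=(16, 32))
    unrelated = rng.normal(size=(16, 32))
    print("matched pairs:  ", nt_xent(anchors, matched))
    print("unrelated pairs:", nt_xent(anchors, unrelated))
```

Minimising this loss pulls paired embeddings together and pushes apart the rest of the batch, which is the sense in which the loss "directly optimises the embedding space".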
-
New Determination of the $^{14}$C(n, $γ$)$^{15}$C Reaction Rate and Its Astrophysical Implications
Authors:
Yuchen Jiang,
Zhenyu He,
Yudong Luo,
Wenyu Xin,
Jie Chen,
Xinyue Li,
Yangping Shen,
Bing Guo,
Guo Li,
Danyang Pang,
Tianli Ma,
Weike Nan,
Toshitaka Kajino,
Weiping Liu
Abstract:
We present a novel experiment to investigate the spectroscopic factor of the $^{15}$C ground state for the first time using single-neutron $removal$ transfer reactions on $^{15}$C. Two consistent spectroscopic factors were derived from the (p, d) and (d, t) reactions, which were subsequently used to deduce the $^{14}$C(n, $γ$)$^{15}$C reaction cross section and the corresponding stellar reaction rate. A typical cross section of (3.89 $\pm$ 0.76) $μ$b is determined at $E_\mathrm{c.m.}$ = 23.3 keV. In the temperature range of 0.01-4 GK, our new reaction rate is 2.4-3.7 times higher than that of the first direct measurement and 20\%-25\% lower than that of the most recent direct measurement. Moreover, it is interesting that we can associate a long-standing nuclear structure issue, i.e., the so-called ``quenching'' effect, with this astrophysically relevant reaction. Finally, motivated by the astrophysical interest in this reaction decades ago, implications of our new rate for several astrophysical problems are evaluated using state-of-the-art theoretical models. Our calculations demonstrate that the abundances of $^{14}$N and $^{15}$N can be enhanced in the inner regions of asymptotic giant branch (AGB) stars, though with minimal impact on the chemical compositions of the interstellar medium. In inhomogeneous Big Bang nucleosynthesis, the updated reaction rate can lead to a $\sim 20\%$ variation in the final yields of $^{15}$N in neutron-rich regions. For the $r$-process in core-collapse supernovae, a slight difference of $\sim 0.2\%$ in the final abundances of heavy elements with $A > 90$ can be found by using our new rate.
Submitted 21 June, 2025;
originally announced June 2025.
-
Non-Intrusive MLOps-Driven Performance Intelligence in Software Data Planes
Authors:
Qiong Liu,
Jianke Lin,
Tianzhu Zhang,
Leonardo Linguaglossa
Abstract:
The last decade has witnessed the proliferation of network function virtualization (NFV) in the telco industry, thanks to its unparalleled flexibility, scalability, and cost-effectiveness. However, as the NFV infrastructure is shared by virtual network functions (VNFs), sporadic resource contentions are inevitable. Such contention makes it extremely challenging to guarantee the performance of the provisioned network services, especially in high-speed regimes (e.g., Gigabit Ethernet). Existing solutions typically rely on direct traffic analysis (e.g., packet- or flow-level measurements) to detect performance degradation and identify bottlenecks, which is not always applicable due to significant integration overhead and system-level constraints.
This paper complements existing solutions with a lightweight, non-intrusive framework for online performance inference and adaptation. Instead of direct data-plane collection, we reuse hardware features in the underlying NFV infrastructure, introducing negligible interference in the data plane. This framework can be integrated into existing NFV systems with minimal engineering effort and operates without the need for predefined traffic models or VNF-specific customization. Through comprehensive evaluation across diverse NFV scenarios, our Drift-Resilient and Self-Tuning (DRST) framework delivers accurate performance inference, runtime bottleneck diagnosis, and automated adaptation under runtime drift, via a lightweight MLOps pipeline.
Submitted 21 June, 2025;
originally announced June 2025.
-
Entropy Bounds for Perfect Matchings in Bipartite Hypergraphs
Authors:
Tantan Dai,
Alexander Divoux,
Tom Kelly
Abstract:
A hypergraph is \textit{bipartite with bipartition} $(A, B)$ if every edge has exactly one vertex in $A$, and a matching in such a hypergraph is \textit{$A$-perfect} if it saturates every vertex in $A$. We prove an upper bound on the number of $A$-perfect matchings in uniform hypergraphs with small maximum codegree. Using this result, we prove that there exist order-$n$ Latin squares with at most $(n/e^{2.117})^n$ transversals when $n$ is odd and $n \equiv 0\pmod 3$. We also show that $k$-uniform $D$-regular hypergraphs on $n$ vertices have at most $((1+o(1))q/e^k)^{Dn/k}$ proper $q$-edge-colorings when $q = (1+o(1))D$ and the maximum codegree is $o(q)$.
Submitted 21 June, 2025;
originally announced June 2025.
-
Quantizing for Noisy Flash Memory Channels
Authors:
Juyun Oh,
Taewoo Park,
Jiwoong Im,
Yuval Cassuto,
Yongjune Kim
Abstract:
Flash memory-based processing-in-memory (flash-based PIM) offers high storage capacity and computational efficiency but faces significant reliability challenges due to noise in high-density multi-level cell (MLC) flash memories. Existing verify level optimization methods are designed for general storage scenarios and fail to address the unique requirements of flash-based PIM systems, where metrics such as mean squared error (MSE) and peak signal-to-noise ratio (PSNR) are critical. This paper introduces an integrated framework that jointly optimizes quantization and verify levels to minimize the MSE, considering both quantization and flash memory channel errors. We develop an iterative algorithm to solve the joint optimization problem. Experimental results on quantized images and SwinIR model parameters stored in flash memory show that the proposed method significantly improves the reliability of flash-based PIM systems.
Submitted 21 June, 2025;
originally announced June 2025.
-
Optimization-Free Patch Attack on Stereo Depth Estimation
Authors:
Hangcheng Liu,
Xu Kuang,
Xingshuo Han,
Xingwan Wu,
Haoran Ou,
Shangwei Guo,
Xingyi Huang,
Tao Xiang,
Tianwei Zhang
Abstract:
Stereo Depth Estimation (SDE) is essential for scene understanding in vision-based systems like autonomous driving. However, recent studies show that SDE models are vulnerable to adversarial attacks, which are often limited to unrealistic settings, e.g., digital perturbations on separate stereo views in static scenes, restricting their real-world applicability. This raises a critical question: how can we design physically realizable, scene-adaptive, and transferable attacks against SDE under realistic constraints?
To answer this, we make two key contributions. First, we propose a unified attack framework that extends optimization-based techniques to four core stages of stereo matching: feature extraction, cost-volume construction, cost aggregation, and disparity regression. A comprehensive stage-wise evaluation across 9 mainstream SDE models, under constraints like photometric consistency, reveals that optimization-based patches suffer from poor transferability. Interestingly, partially transferable patches suggest that patterns, rather than pixel-level perturbations, may be key to generalizable attacks. Motivated by this, we present PatchHunter, the first optimization-free adversarial patch attack against SDE. PatchHunter formulates patch generation as a reinforcement learning-driven search over a structured space of visual patterns crafted to disrupt SDE assumptions.
We validate PatchHunter across three levels: the KITTI dataset, the CARLA simulator, and real-world vehicle deployment. PatchHunter not only surpasses optimization-based methods in effectiveness but also achieves significantly better black-box transferability. Even under challenging physical conditions like low light, PatchHunter maintains high attack success (e.g., D1-all > 0.4), whereas optimization-based methods fail.
Submitted 21 June, 2025;
originally announced June 2025.
-
CLiViS: Unleashing Cognitive Map through Linguistic-Visual Synergy for Embodied Visual Reasoning
Authors:
Kailing Li,
Qi'ao Xu,
Tianwen Qian,
Yuqian Fu,
Yang Jiao,
Xiaoling Wang
Abstract:
Embodied Visual Reasoning (EVR) seeks to follow complex, free-form instructions based on egocentric video, enabling semantic understanding and spatiotemporal reasoning in dynamic environments. Despite its promising potential, EVR encounters significant challenges stemming from the diversity of complex instructions and the intricate spatiotemporal dynamics in long-term egocentric videos. Prior solutions either employ Large Language Models (LLMs) over static video captions, which often omit critical visual details, or rely on end-to-end Vision-Language Models (VLMs) that struggle with stepwise compositional reasoning. Considering the complementary strengths of LLMs in reasoning and VLMs in perception, we propose CLiViS, a novel training-free framework that leverages LLMs for high-level task planning and orchestrates VLM-driven open-world visual perception to iteratively update the scene context. Building on this synergy, the core of CLiViS is a dynamic Cognitive Map that evolves throughout the reasoning process. This map constructs a structured representation of the embodied scene, bridging low-level perception and high-level reasoning. Extensive experiments across multiple benchmarks demonstrate the effectiveness and generality of CLiViS, especially in handling long-term visual dependencies. Code is available at https://github.com/Teacher-Tom/CLiViS.
Submitted 21 June, 2025;
originally announced June 2025.
-
OpusLM: A Family of Open Unified Speech Language Models
Authors:
Jinchuan Tian,
William Chen,
Yifan Peng,
Jiatong Shi,
Siddhant Arora,
Shikhar Bharadwaj,
Takashi Maekaku,
Yusuke Shinohara,
Keita Goto,
Xiang Yue,
Huck Yang,
Shinji Watanabe
Abstract:
This paper presents Open Unified Speech Language Models (OpusLMs), a family of open foundational speech language models (SpeechLMs) up to 7B. Initialized from decoder-only text language models, the OpusLMs are continually pre-trained on 213K hours of speech-text pairs and 292B text-only tokens. We demonstrate that the OpusLMs achieve performance comparable (or even superior) to that of existing SpeechLMs in speech recognition, speech synthesis, and text-only capabilities. Technically, this paper articulates our SpeechLM designs on tokenization, multi-stream language models, and multi-stage training strategies. We experimentally demonstrate the importance of model size scaling and the effect of annealing data selection. The OpusLMs are all built from publicly available materials and are fully transparent models. We release our code, data, checkpoints, and training logs to facilitate open SpeechLM research.
Submitted 21 June, 2025;
originally announced June 2025.
-
HIRE: Lightweight High-Resolution Image Feature Enrichment for Multimodal LLMs
Authors:
Nikitha SR,
Aradhya Neeraj Mathur,
Tarun Ram Menta,
Rishabh Jain,
Mausoom Sarkar
Abstract:
The integration of high-resolution image features in modern multimodal large language models has demonstrated significant improvements in fine-grained visual understanding tasks, achieving high performance across multiple benchmarks. Since these features are obtained from large image encoders like ViT, they come with a significant increase in computational costs due to multiple calls to these encoders. In this work, we first develop an intuition for feature upsampling as a natural extension of high-resolution feature generation. Through extensive experiments and ablations, we demonstrate how a shallow feature enricher can achieve competitive results with tremendous reductions in training and inference time as well as computational cost, with up to 1.5x savings in FLOPs.
Submitted 21 June, 2025;
originally announced June 2025.
-
Full-body WPT: wireless powering with meandered e-textiles
Authors:
Ryo Takahashi,
Takashi Sato,
Wakako Yukita,
Tomoyuki Yokota,
Takao Someya,
Yoshihiro Kawahara
Abstract:
We present Full-body WPT, wireless power networking around the human body using a meandered textile coil. Unlike traditional inductive systems that emit strong fields into the deep tissue inside the body, the meander coil enables localized generation of a strong magnetic field constrained to the skin surface, even when scaled to the size of the human body. Such a localized inductive system enhances both the safety and the efficiency of wireless power around the body. Furthermore, the use of low-loss conductive yarn achieves an energy-efficient and lightweight design. We analyze the performance of our design through simulations and experimental prototypes, demonstrating high power transfer efficiency and adaptability to user movement and posture. Our system provides a safe and efficient distributed power network using meandered textile coils integrated into wearable materials, highlighting the potential of body-centric wireless power networking as a foundational layer for ubiquitous health monitoring, augmented reality, and human-machine interaction systems.
Submitted 21 June, 2025;
originally announced June 2025.
-
The second Hilbert coefficient of modules with almost maximal depth
Authors:
Van Duc Trung
Abstract:
Let $\mathbb{M} = \{ M_n \}$ be a good $\mathfrak{q}$-filtration of a finitely generated $R$-module $M$ of dimension $d$, where $(R,\mathfrak{m})$ is a local ring and $\mathfrak{q}$ is an $\mathfrak{m}$-primary ideal of $R$. In case $depth(M) \geq d-1$, we give an upper bound for the second Hilbert coefficient $e_2(\mathbb{M})$ generalizing results by Huckaba-Marley and Rossi-Valla proved assuming that $M$ is Cohen-Macaulay. We also give a condition for the equality, which relates to the depth of the associated graded module $gr_{\mathbb{M}}(M)$. A lower bound on $e_2(\mathbb{M})$ is proved generalizing a result by Rees and Narita.
Submitted 21 June, 2025;
originally announced June 2025.
-
HalluRNN: Mitigating Hallucinations via Recurrent Cross-Layer Reasoning in Large Vision-Language Models
Authors:
Le Yu,
Kaishen Wang,
Jianlong Xiong,
Yue Cao,
Tao He
Abstract:
Though Large Vision-Language Models (LVLMs) have achieved remarkable performance across various tasks, they are still prone to hallucinations, generating outputs that are textually plausible but visually ungrounded. While prior approaches generally address this issue through data-centric fine-tuning or innovative decoding strategies, these methods often require substantial resources or task-specific configurations. In this work, we introduce an architecture-level solution, HalluRNN, which enhances model stability through recurrent cross-layer reasoning. Specifically, we propose a novel Dual-Gated Depth Propagation Unit (DG-DPU) module, which is shared across layers and recurrently refines hidden states. This allows for the adaptive propagation of information throughout the model, enforces consistency across layers, and mitigates hallucinations caused by representational drift. By fine-tuning only the DG-DPU module, HalluRNN achieves strong and robust performance across multiple benchmarks.
Submitted 21 June, 2025;
originally announced June 2025.
-
Outflowing shocked gas dominates the NIR H$_2$ emission from the dual AGN NGC6240
Authors:
J. Carlsen,
C. Cicone,
B. Hagedorn,
K. Rubinur,
P. Andreani,
K. Dasyra,
P. Severgnini,
C. Vignali,
R. Morganti,
T. Oosterloo,
A. Lasrado,
E. Lopez-Rodriguez,
S. Shen
Abstract:
[Abridged] We present a multi-line study of the kinematics of the molecular and ionised gas phases in the central 2 kpc of NGC6240, based on JWST/NIRSpec and ALMA observations. We devised a new spectral-line fitting approach to de-blend rotating and non-rotating gas components, which is better tailored to the extreme feedback mechanisms at work in NGC6240. We find that ~65% of the Pa$α$, H$_2$, and [FeII] line fluxes within the NIRSpec field of view arise from gas components that are kinematically decoupled from the stars. The NIR H$_2$ lines show the most deviation from the stars, with peak emission between the two rotating stellar structures. The PAH 3.3$μ$m feature does not follow the NIR H$_2$ morphology, indicating that the latter does not trace PDRs. In the non-rotating gas components, we identify a biconical wind launched from the northern AGN, expanding along the minor axis of stellar rotation. This wind is dominated by ionised gas and, although it entrains some H$_2$, it does not show a H$_2$/PAH enhancement, suggesting either high UV irradiation or expansion along a relatively gas-free path. Furthermore, we find bright non-rotating gas emission between the two AGN and around the southern AGN, which we interpret as due to an outflow launched from the southern nucleus, coinciding with the molecular outflow previously studied in cold (sub-)millimeter tracers. The strong H$_2$/PAH enhancement measured in this region, coextensive with high velocity redshifted gas ($v\sim900$ km s$^{-1}$), suggests that the shocks responsible for the high H$_2$/PAH ratios are due to the outflow rather than to the collision of media during the merger. Our results show that the bulk of the NIR line emission in NGC6240 is decoupled from the stars, and that most of the warm H$_2$ is shock-excited and embedded in a powerful outflow, where it coexists with colder molecular gas.
Submitted 21 June, 2025;
originally announced June 2025.
-
Accelerating Residual Reinforcement Learning with Uncertainty Estimation
Authors:
Lakshita Dodeja,
Karl Schmeckpeper,
Shivam Vats,
Thomas Weng,
Mingxi Jia,
George Konidaris,
Stefanie Tellex
Abstract:
Residual Reinforcement Learning (RL) is a popular approach for adapting pretrained policies by learning a lightweight residual policy that provides corrective actions. While Residual RL is more sample-efficient than finetuning the entire base policy, existing methods struggle with sparse rewards and are designed for deterministic base policies. We propose two improvements to Residual RL that further enhance its sample efficiency and make it suitable for stochastic base policies. First, we leverage uncertainty estimates of the base policy to focus exploration on regions in which the base policy is not confident. Second, we propose a simple modification to off-policy residual learning that allows it to observe base actions and better handle stochastic base policies. We evaluate our method with both Gaussian-based and Diffusion-based stochastic base policies on tasks from Robosuite and D4RL, and compare against state-of-the-art finetuning methods, demo-augmented RL methods, and other residual RL methods. Our algorithm significantly outperforms existing baselines in a variety of simulation benchmark environments. We also deploy our learned policies in the real world to demonstrate their robustness with zero-shot sim-to-real transfer.
Submitted 20 June, 2025;
originally announced June 2025.
-
Dynamics of Multiphase Carbon in the Turbulent Circumgalactic Medium
Authors:
Yue Hu,
Evan Scannapieco,
Edward Buie II,
Siyao Xu,
Samuel T Sebastian,
Om Biswal
Abstract:
The circumgalactic medium (CGM) plays a crucial role in regulating material and energy exchange between galaxies and their environments. The best means of observing this medium is through absorption-line spectroscopy, but we have yet to develop a consistent physical model that fully explains these results. Here we investigate the impact of turbulence and non-equilibrium chemistry on the properties of the CGM, using three-dimensional hydrodynamic simulations that include the impact of an ionizing background. Increasing turbulence enhances small-scale density fluctuations, shifting the kinetic energy spectra from Kolmogorov to Burgers scaling. This is indicative of shock-dominated dissipation, which plays a critical role in driving carbon ionization and shaping the multiphase structure of the medium. At the same time, the presence of background radiation significantly alters the ionization balance, increasing the prevalence of C\textsc{ii} and C\textsc{iv}. Thus, turbulence and the background radiation have complementary roles: turbulence governs the spatial distribution and facilitates the formation of ionized species, whereas the background radiation modifies the overall ionization equilibrium, setting the observed distribution of multiphase carbon.
Submitted 20 June, 2025;
originally announced June 2025.
-
Physisorption on Nanomechanical Resonators: The Overlooked Influence of Trace Moisture
Authors:
Hemant Kumar Verma,
Suman Kumar Mandal,
Darkasha Khan,
Faizan Tariq Beigh,
Manoj Kandpal,
Jaspreet Singh,
Sushobhan Avasthi,
Srinivasan Raghavan,
Akshay Naik
Abstract:
Short gas pulses introduced in a vacuum chamber have long been utilized to showcase the ultra-low mass resolutions achievable with nanomechanical resonators. The resonance frequency shifts are used as evidence of gas adsorption. However, there is very little clarity as to what exactly is adsorbing on to the resonators. We demonstrate that the physisorption of gases on cantilevers is predominantly the effect of moisture content that is present even in ultra-high purity gases. The experimental work is performed at low temperatures and in a high vacuum and is supported by theoretical calculations and simulation.
Submitted 20 June, 2025;
originally announced June 2025.
-
DRIMV_TSK: An Interpretable Surgical Evaluation Model for Incomplete Multi-View Rectal Cancer Data
Authors:
Wei Zhang,
Zi Wang,
Hanwen Zhou,
Zhaohong Deng,
Weiping Ding,
Yuxi Ge,
Te Zhang,
Yuanpeng Zhang,
Kup-Sze Choi,
Shitong Wang,
Shudong Hu
Abstract:
A reliable evaluation of surgical difficulty can improve the success of treatment for rectal cancer; the current evaluation method is based on clinical data. However, more data about rectal cancer can be collected with the development of technology. Meanwhile, with the development of artificial intelligence, its application in rectal cancer treatment is becoming possible. In this paper, a multi-view rectal cancer dataset is first constructed to give a more comprehensive view of patients, including the high-resolution MRI image view, pressed-fat MRI image view, and clinical data view. Then, an interpretable incomplete multi-view surgical evaluation model is proposed, considering that it is hard to obtain extensive and complete patient data in real application scenarios. Specifically, a dual representation incomplete multi-view learning model is first proposed to extract the common information between views and the specific information in each view. In this model, missing view imputation is integrated into representation learning, and a second-order similarity constraint is introduced to improve the cooperative learning between these two parts. Then, based on the imputed multi-view data and the learned dual representation, a multi-view surgical evaluation model with the TSK fuzzy system is proposed. In the proposed model, a cooperative learning mechanism is constructed to explore the consistent information between views, and Shannon entropy is introduced to adapt the view weights. On the MVRC dataset, we compared DRIMV_TSK with several advanced algorithms, and it obtained the best results.
Submitted 20 June, 2025;
originally announced June 2025.
-
MTSIC: Multi-stage Transformer-based GAN for Spectral Infrared Image Colorization
Authors:
Tingting Liu,
Yuan Liu,
Jinhui Tang,
Liyin Yuan,
Chengyu Liu,
Chunlai Li,
Xiubao Sui,
Qian Chen
Abstract:
Thermal infrared (TIR) images, acquired through thermal radiation imaging, are unaffected by variations in lighting conditions and atmospheric haze. However, TIR images inherently lack color and texture information, limiting downstream tasks and potentially causing visual fatigue. Existing colorization methods primarily rely on single-band images with limited spectral information and insufficient feature extraction capabilities, which often result in image distortion and semantic ambiguity. In contrast, multiband infrared imagery provides richer spectral data, facilitating the preservation of finer details and enhancing semantic accuracy. In this paper, we propose a generative adversarial network (GAN)-based framework designed to integrate spectral information to enhance the colorization of infrared images. The framework employs a multi-stage spectral self-attention Transformer network (MTSIC) as the generator. Each spectral feature is treated as a token for self-attention computation, and a multi-head self-attention mechanism forms a spatial-spectral attention residual block (SARB), achieving multi-band feature mapping and reducing semantic confusion. Multiple SARB units are integrated into a Transformer-based single-stage network (STformer), which uses a U-shaped architecture to extract contextual information, combined with multi-scale wavelet blocks (MSWB) to align semantic information in the spatial-frequency dual domain. Multiple STformer modules are cascaded to form MTSIC, progressively optimizing the reconstruction quality. Experimental results demonstrate that the proposed method significantly outperforms traditional techniques and effectively enhances the visual quality of infrared images.
Submitted 20 June, 2025;
originally announced June 2025.
-
DuaShepherd: Integrating Stepwise Correctness and Potential Rewards for Mathematical Reasoning
Authors:
Yuanhao Wu,
Juntong Song,
Hanning Zhang,
Tong Zhang,
Cheng Niu
Abstract:
In this paper, we propose DuaShepherd, a novel reward modeling framework that integrates two complementary reward signals, correctness and potential, to enhance the mathematical reasoning capabilities of Large Language Models (LLMs). While correctness-based signals emphasize identification of stepwise errors, potential-based signals focus on the likelihood of reaching the correct final answer. We developed an automated pipeline for constructing a large-scale reward modeling dataset with both signals. A unified, multi-head architecture was explored to train the two reward models in a multi-task setup, demonstrating benefits from learning both correctness and potential in parallel. By combining these two signals into a compound probability, our model achieves consistent performance improvements across multiple benchmarks. Empirical evaluations on MATH500 and ProcessBench confirm that this combined reward significantly outperforms models trained on either reward type alone, achieving state-of-the-art performance under comparable resource constraints.
Submitted 20 June, 2025;
originally announced June 2025.
-
Interfacial instability of confined 3D active droplets
Authors:
Bennett C. Sessa,
Federico Cao,
Robert A. Pelcovits,
Thomas R. Powers,
Guillaume Duclos
Abstract:
Instabilities of fluid-fluid interfaces are ubiquitous in passive soft matter. Adding activity to the interface or either fluid can dramatically change the stability of the interface. Using experiment and theory, we investigate the interfacial instability of a deformable 3D active nematic liquid crystal droplet in the isotropic phase surrounded by a passive fluid and confined between two parallel plates. Spontaneous active flows drive the growth of undulations along the active/passive interface, with the mode number of the fastest-growing mode increasing with droplet radius and decreasing with gap height. We apply the lubrication approximation to a minimal nematohydrodynamic model to determine the growth rates of all interfacial modes. The magnitude of the growth rate is determined by the active timescale and the relaxation timescales associated with liquid crystalline order, as well as capillary and viscous stresses. We find multiple points of agreement between experiment and theory, including the shape evolution of individual droplets, the growth rates of unstable modes averaged across many droplets, and the extensional shear flows observed within droplets.
Submitted 20 June, 2025;
originally announced June 2025.
-
Data Quality Issues in Multilingual Speech Datasets: The Need for Sociolinguistic Awareness and Proactive Language Planning
Authors:
Mingfei Lau,
Qian Chen,
Yeming Fang,
Tingting Xu,
Tongzhou Chen,
Pavel Golik
Abstract:
Our quality audit for three widely used public multilingual speech datasets - Mozilla Common Voice 17.0, FLEURS, and VoxPopuli - shows that in some languages, these datasets suffer from significant quality issues. We believe addressing these issues will make these datasets more useful as training and evaluation sets, and improve downstream models. We divide these quality issues into two categories: micro-level and macro-level. We find that macro-level issues are more prevalent in less institutionalized, often under-resourced languages. We provide a case analysis of Taiwanese Southern Min (nan_tw) that highlights the need for proactive language planning (e.g. orthography prescriptions, dialect boundary definition) and enhanced data quality control in the process of Automatic Speech Recognition (ASR) dataset creation. We conclude by proposing guidelines and recommendations to mitigate these issues in future dataset development, emphasizing the importance of sociolinguistic awareness in creating robust and reliable speech data resources.
Submitted 20 June, 2025;
originally announced June 2025.
-
Operator Splitting Methods: Numerical Solutions of Ordinary Differential Equations via Separation of Variables
Authors:
A. Banjara,
I. AlJabea,
T. Papamarkou,
F. Neubrander
Abstract:
This paper applies the concept of linear semigroups induced by nonlinear flows, originally developed by Dorroh and Neuberger in the 1990s, to the approximation of uniquely solvable initial value problems for nonlinear ordinary differential equations. Building on a framework rooted in the earlier works of Lie, Kowalewski, and Groebner, we analyze nonlinear systems through the lens of the Koopman-Lie semigroup exp(tK), where K is the linear Lie generator associated with the flow induced by the nonlinear differential equation. A central feature of this approach is the decomposition K = K1 + ... + KN, which enables the use of operator splitting methods. We revisit the foundational first-order splitting scheme introduced by H. F. Trotter in 1959 and extend it to higher-order schemes with improved error bounds. These theoretical developments are supported by numerical examples that demonstrate the accuracy and efficiency of the proposed methods, which are based entirely on the classical separation of variables technique for solving ordinary differential equations.
Submitted 20 June, 2025;
originally announced June 2025.
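The splitting idea in this abstract can be illustrated with a minimal sketch. It is not the authors' scheme: the test equation y' = y - y^2, the two-way split into y' = y and y' = -y^2, and all function names are assumptions chosen because each sub-equation can be solved exactly by separation of variables. The sketch compares a first-order Lie-Trotter composition with the second-order Strang composition.

```python
import math

def flow_linear(y, h):
    """Exact flow of y' = y over a time step h."""
    return y * math.exp(h)

def flow_quadratic(y, h):
    """Exact flow of y' = -y**2 over a time step h (separable ODE)."""
    return y / (1.0 + y * h)

def lie_trotter(y0, t_end, n_steps):
    """First-order splitting: one full step of each sub-flow per time step."""
    h = t_end / n_steps
    y = y0
    for _ in range(n_steps):
        y = flow_quadratic(flow_linear(y, h), h)
    return y

def strang(y0, t_end, n_steps):
    """Second-order splitting: half step, full step, half step."""
    h = t_end / n_steps
    y = y0
    for _ in range(n_steps):
        y = flow_linear(y, h / 2)
        y = flow_quadratic(y, h)
        y = flow_linear(y, h / 2)
    return y

def exact_logistic(y0, t):
    """Closed-form solution of y' = y - y**2 with y(0) = y0."""
    e = math.exp(t)
    return y0 * e / (1.0 + y0 * (e - 1.0))

if __name__ == "__main__":
    exact = exact_logistic(0.1, 1.0)
    for n in (10, 20, 40):
        print(n,
              abs(lie_trotter(0.1, 1.0, n) - exact),
              abs(strang(0.1, 1.0, n) - exact))
```

Halving the step size should roughly halve the Lie-Trotter error and quarter the Strang error, which is the usual signature of first- and second-order schemes.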
-
Radio emission from airplanes as observed with RNO-G
Authors:
RNO-G Collaboration,
S. Agarwal,
J. A. Aguilar,
N. Alden,
S. Ali,
P. Allison,
M. Betts,
D. Besson,
A. Bishop,
O. Botner,
S. Bouma,
S. Buitink,
R. Camphyn,
J. Chan,
S. Chiche,
B. A. Clark,
A. Coleman,
K. Couberly,
S. de Kockere,
K. D. de Vries,
C. Deaconu,
P. Giri,
C. Glaser,
T. Glüsenkamp
, et al. (58 additional authors not shown)
Abstract:
This paper describes how intentional and unintentional radio emission from airplanes is recorded with the Radio Neutrino Observatory Greenland (RNO-G). We characterize the received signals and define a procedure to extract a clean set of impulsive signals. These signals are highly suitable for instrument calibration, also for future experiments. A set of signals is used to probe the timing precision of RNO-G in-situ, which is found to match expectations. We also discuss the impact of these signals on the ability to detect neutrinos with RNO-G.
Submitted 20 June, 2025;
originally announced June 2025.
-
Kaleidoscopic Teaming in Multi Agent Simulations
Authors:
Ninareh Mehrabi,
Tharindu Kumarage,
Kai-Wei Chang,
Aram Galstyan,
Rahul Gupta
Abstract:
Warning: This paper contains content that may be inappropriate or offensive.
AI agents have gained significant recent attention due to their autonomous tool usage capabilities and their integration in various real-world applications. This autonomy poses novel challenges for the safety of such systems, both in single- and multi-agent scenarios. We argue that existing red teaming or safety evaluation frameworks fall short in evaluating safety risks in complex behaviors, thought processes and actions taken by agents. Moreover, they fail to consider risks in multi-agent setups where various vulnerabilities can be exposed when agents engage in complex behaviors and interactions with each other. To address this shortcoming, we introduce the term kaleidoscopic teaming, which seeks to capture the complex and wide range of vulnerabilities that can arise in agents in both single-agent and multi-agent scenarios. We also present a new kaleidoscopic teaming framework that generates a diverse array of scenarios modeling real-world human societies. Our framework evaluates the safety of agents in both single-agent and multi-agent setups. In the single-agent setup, an agent is given a scenario that it needs to complete using the tools it has access to. In the multi-agent setup, multiple agents either compete against or cooperate with each other to complete a task in the scenario, through which we capture existing safety vulnerabilities in agents. We introduce new in-context optimization techniques that can be used in our kaleidoscopic teaming framework to generate better scenarios for safety analysis. Lastly, we present appropriate metrics that can be used along with our framework to measure the safety of agents. Utilizing our kaleidoscopic teaming framework, we identify vulnerabilities in various models with respect to their safety in agentic use-cases.
Submitted 20 June, 2025;
originally announced June 2025.
-
A Grassroots Network and Community Roadmap for Interconnected Autonomous Science Laboratories for Accelerated Discovery
Authors:
Rafael Ferreira da Silva,
Milad Abolhasani,
Dionysios A. Antonopoulos,
Laura Biven,
Ryan Coffee,
Ian T. Foster,
Leslie Hamilton,
Shantenu Jha,
Theresa Mayer,
Benjamin Mintz,
Robert G. Moore,
Salahudin Nimer,
Noah Paulson,
Woong Shin,
Frederic Suter,
Mitra Taheri,
Michela Taufer,
Newell R. Washburn
Abstract:
Scientific discovery is being revolutionized by AI and autonomous systems, yet current autonomous laboratories remain isolated islands unable to collaborate across institutions. We present the Autonomous Interconnected Science Lab Ecosystem (AISLE), a grassroots network transforming fragmented capabilities into a unified system that shortens the path from ideation to innovation to impact and accelerates discovery from decades to months. AISLE addresses five critical dimensions: (1) cross-institutional equipment orchestration, (2) intelligent data management with FAIR compliance, (3) AI-agent driven orchestration grounded in scientific principles, (4) interoperable agent communication interfaces, and (5) AI/ML-integrated scientific education. By connecting autonomous agents across institutional boundaries, autonomous science can unlock research spaces inaccessible to traditional approaches while democratizing cutting-edge technologies. This paradigm shift toward collaborative autonomous science promises breakthroughs in sustainable energy, materials development, and public health.
Submitted 20 June, 2025;
originally announced June 2025.
-
Optimal Parallel Algorithms for Convex Hulls in 2D and 3D under Noisy Primitive Operations
Authors:
Michael T. Goodrich,
Vinesh Sridhar
Abstract:
In the noisy primitives model, each primitive comparison performed by an algorithm, e.g., testing whether one value is greater than another, returns the incorrect answer with random, independent probability p < 1/2 and otherwise returns a correct answer. This model was first applied in the context of sorting and searching, and recent work by Eppstein, Goodrich, and Sridhar extends this model to sequential algorithms involving geometric primitives such as orientation and sidedness tests. However, their approaches appear to be inherently sequential; hence, in this paper, we study parallel computational geometry algorithms for 2D and 3D convex hulls in the noisy primitives model. We give the first optimal parallel algorithms in the noisy primitives model for 2D and 3D convex hulls in the CREW PRAM model. The main technical contribution of our work concerns our ability to detect and fix errors during intermediate steps of our algorithm using a generalization of the failure sweeping technique.
Submitted 20 June, 2025;
originally announced June 2025.
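The noisy primitives model described in this abstract can be simulated with a small sketch. This is not the paper's algorithm or its failure-sweeping technique; it only shows the textbook baseline of repeating a noisy comparison and taking a majority vote. The function names, the repeat count of 15, and the error probability 0.3 are assumptions made for illustration.

```python
import random

def noisy_less_than(a, b, p, rng):
    """Return a < b, but flip the answer independently with probability p."""
    truth = a < b
    return (not truth) if rng.random() < p else truth

def majority_less_than(a, b, p, rng, repeats=15):
    """Repeat the noisy comparison an odd number of times; return the majority."""
    votes = sum(noisy_less_than(a, b, p, rng) for _ in range(repeats))
    return votes * 2 > repeats

def error_rate(compare, trials, rng):
    """Fraction of random pairs on which compare disagrees with the true a < b."""
    wrong = 0
    for _ in range(trials):
        a, b = rng.random(), rng.random()
        if compare(a, b) != (a < b):
            wrong += 1
    return wrong / trials

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    p = 0.3
    single = error_rate(lambda a, b: noisy_less_than(a, b, p, rng), 2000, rng)
    voted = error_rate(lambda a, b: majority_less_than(a, b, p, rng), 2000, rng)
    print("single comparison error:", single)
    print("majority-of-15 error:", voted)
```

Majority voting costs a multiplicative factor in the number of primitive calls, which is the overhead that more careful algorithms in this model aim to avoid.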
-
Social Group Bias in AI Finance
Authors:
Thomas R. Cook,
Sophia Kazinnik
Abstract:
Financial institutions increasingly rely on large language models (LLMs) for high-stakes decision-making. However, these models risk perpetuating harmful biases if deployed without careful oversight. This paper investigates racial bias in LLMs specifically through the lens of credit decision-making tasks, operating on the premise that biases identified here are indicative of broader concerns across financial applications. We introduce a reproducible, counterfactual testing framework that evaluates how models respond to simulated mortgage applicants identical in all attributes except race. Our results reveal significant race-based discrepancies, exceeding historically observed bias levels. Leveraging layer-wise analysis, we track the propagation of sensitive attributes through internal model representations. Building on this, we deploy a control-vector intervention that effectively reduces racial disparities by up to 70% (33% on average) without impairing overall model performance. Our approach provides a transparent and practical toolkit for the identification and mitigation of bias in financial LLM deployments.
Submitted 20 June, 2025;
originally announced June 2025.
-
FINCH EYE: The Optical and Optomechanical Design of a GRISM-based SWIR Hyperspectral Imaging Payload for a 3U CubeSat
Authors:
Iliya Shofman,
Mario Ghio Neto,
Theaswanth Ganesh,
Kenya He,
Aidan Armstrong,
Ksenya Narkevich
Abstract:
Crop residue is an important metric used for agricultural land-use monitoring and climate science research. Estimating crop residue coverage is essential to sustainable agricultural practices. The University of Toronto Aerospace Team is developing FINCH EYE, the optical payload for the upcoming FINCH 3U CubeSat, to measure crop residue cover. We conceived of a novel ultra-compact push-broom architecture with a volume phase-holographic grism dispersive element to keep the design compact and simplify the mechanical assembly. The FINCH EYE will image hyperspectral data from 900nm to 1700nm at 10nm spectral resolution, with a spatial resolution of 100m, and a SNR of 100. In this paper, we will describe the optical design of FINCH EYE, which consists of a commercial objective lens, an InGaAs camera, and a custom lens-grism-lens spectrograph. We will also describe the optomechanical housing, emphasizing design features that facilitate proper alignment during assembly.
Submitted 20 June, 2025;
originally announced June 2025.
-
An array of bulk-acoustic-wave sensors as a high-frequency antenna for gravitational waves
Authors:
G. Albani,
M. Borghesi,
L. Canonica,
R. Carobene,
F. De Guio,
M. Faverzani,
E. Ferri,
R. Gerosa,
A. Ghezzi,
A. Giachero,
C. Gotti,
D. Labranca,
L. Mariani,
A. Nucciotti,
G. Pessina,
D. Rozza,
T. Tabarelli de Fatis
Abstract:
In their simplest form, bulk acoustic wave (BAW) devices consist of a piezoelectric crystal between two electrodes that transduce the material's vibrations into electrical signals. They are adopted in frequency control and metrology, with well-established standards at frequencies of 5 MHz and above. Their use as a resonant-mass strain antenna for high-frequency gravitational waves has been recently proposed (Goryachev and Tobar, 2014). The estimated power spectral density sensitivity at the resonant frequencies is of the order of $10^{-21}\, \textrm{strain}/\sqrt{\textrm{Hz}}$. In this paper, after introducing the science opportunity and potential of gravitational wave detection with BAWs, we describe the two-stage BAUSCIA project plan to build a multimode antenna based on commercial BAWs, followed by an optimized array of custom BAWs. We show that commercially available BAWs already provide sensitivity comparable to current experiments around 10 MHz. Finally, we outline options for optimization of custom devices to improve sensitivity in an unexplored region, probe multiple frequencies between 0.1 and 10 MHz, and target specific signals, such as post-merger emission from neutron stars or emission from various dark matter candidates.
Submitted 20 June, 2025;
originally announced June 2025.
-
Mixed Planewave and Localized Orbital Basis for Sparse-Stochastic Hybrid TDDFT
Authors:
Kyle Chen,
Barry Y. Li,
Tucker Allen,
Daniel Neuhauser
Abstract:
We present a mixed basis-set approach to obtain optical absorption spectra within a generalized Kohn-Sham time-dependent density functional theory framework. All occupied valence molecular orbitals (MOs) are expanded in a plane-wave (PW) basis, while unoccupied MOs are derived primarily from localized atomic basis functions. The method accelerates spectral convergence when compared to fully PW-based simulations, with a 2-3 fold reduction in the number of unoccupied MOs entering the Casida equation. The mixed basis is placed on a common real-space grid, enabling our previously developed deterministic/sparse-stochastic evaluation of the exact exchange operator (DOI: 10.1021/acs.jctc.3c00987). This chemically intuitive and computationally efficient approach is validated across various molecular systems, including π-conjugated polymethine dyes, aromatic hydrocarbons, and a chlorophyll monomer.
Submitted 20 June, 2025;
originally announced June 2025.
-
A geometric framework for momentum-based optimizers for low-rank training
Authors:
Steffen Schotthöfer,
Timon Klein,
Jonas Kusch
Abstract:
Low-rank pre-training and fine-tuning have recently emerged as promising techniques for reducing the computational and storage costs of large neural networks. Training low-rank parameterizations typically relies on conventional optimizers such as heavy ball momentum methods or Adam. In this work, we identify and analyze potential difficulties that these training methods encounter when used to train low-rank parameterizations of weights. In particular, we show that classical momentum methods can struggle to converge to a local optimum due to the geometry of the underlying optimization landscape. To address this, we introduce novel training strategies derived from dynamical low-rank approximation, which explicitly account for the underlying geometric structure. Our approach leverages and combines tools from dynamical low-rank approximation and momentum-based optimization to design optimizers that respect the intrinsic geometry of the parameter space. We validate our methods through numerical experiments, demonstrating faster convergence, and stronger validation metrics at given parameter budgets.
Submitted 20 June, 2025;
originally announced June 2025.
-
Photogranulometry -- Dataset of soil images with corresponding particle size distributions
Authors:
Thomas Plante St-Cyr,
François Duhaime,
Jean-Sébastien Dubé,
Simon Grenier
Abstract:
Traditional particle size distribution (PSD) analyses create significant downtime and are expensive in labor and maintenance. These drawbacks could be alleviated using optical grain size analysis integrated into routine geotechnical laboratory workflow. This paper presents a high-resolution dataset of 12,714 images of 321 different soil samples collected in the Montreal, Quebec region, alongside their PSD analysis. It is designed to provide a robust starting point for training convolutional neural networks (CNN) in geotechnical applications. Soil samples were photographed in a standardized top-view position with a resolution of 45 MP and a minimum scale of 39.4 micrometers per pixel, both in their moist and dry states. A custom test bench employing 13x9 inch white aluminum trays, on which the samples are spread in a thin layer, was used. For samples exceeding a size limit, a coning and quartering method was employed for mass reduction.
Submitted 20 June, 2025;
originally announced June 2025.
-
Automata on $S$-adic words
Authors:
Valérie Berthé,
Toghrul Karimov,
Mihir Vahanwala
Abstract:
A fundamental question in logic and verification is the following: for which unary predicates $P_1, \ldots, P_k$ is the monadic second-order theory of $\langle \mathbb{N}; <, P_1, \ldots, P_k \rangle$ decidable? Equivalently, for which infinite words $α$ can we decide whether a given Büchi automaton $A$ accepts $α$? Carton and Thomas showed decidability in case $α$ is a fixed point of a letter-to-word substitution $σ$, i.e., $σ(α) = α$. However, abundantly more words, e.g., Sturmian words, are characterised by a broader notion of self-similarity that uses a set $S$ of substitutions. A word $α$ is said to be directed by a sequence $s = (σ_n)_{n \in \mathbb{N}}$ over $S$ if there is a sequence of words $(α_n)_{n \in \mathbb{N}}$ such that $α_0 = α$ and $α_n = σ_n(α_{n+1})$ for all $n$; such $α$ is called $S$-adic. We study the automaton acceptance problem for such words and prove, among others, the following. Given finite $S$ and an automaton $A$, we can compute an automaton $B$ that accepts $s \in S^ω$ if and only if $s$ directs a word $α$ accepted by $A$. Thus we can algorithmically answer questions of the form "Which $S$-adic words are accepted by a given automaton $A$?"
Submitted 20 June, 2025;
originally announced June 2025.
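The definition of a directed word in this abstract, α_0 = α and α_n = σ_n(α_{n+1}), implies α = σ_0(σ_1(...σ_{n-1}(α_n))). The sketch below uses that unrolling to build a finite prefix of α from the first n substitutions. It assumes the seed letter is the first letter of α_n, and the helper names and example substitutions are illustrative choices, not taken from the paper.

```python
def apply_substitution(word, sigma):
    """Apply a letter-to-word substitution, given as a dict, to a finite word."""
    return "".join(sigma[letter] for letter in word)

def directed_prefix(substitutions, seed_letter):
    """Compute sigma_0(sigma_1(...sigma_{n-1}(seed_letter))).

    The innermost substitution sigma_{n-1} is applied first, so the
    list is traversed in reverse order.
    """
    word = seed_letter
    for sigma in reversed(substitutions):
        word = apply_substitution(word, sigma)
    return word

if __name__ == "__main__":
    # Constant directive sequence: the Fibonacci substitution a -> ab, b -> a.
    fib = {"a": "ab", "b": "a"}
    print(directed_prefix([fib] * 5, "a"))  # abaababaabaab

    # Two different substitutions show that the order of application matters.
    l_a = {"a": "a", "b": "ab"}
    l_b = {"a": "ba", "b": "b"}
    print(directed_prefix([l_a, l_b], "a"))  # aba
```

With a constant directive sequence the construction reduces to iterating a single substitution, which is the fixed-point case the abstract attributes to Carton and Thomas.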
-
AQUA20: A Benchmark Dataset for Underwater Species Classification under Challenging Conditions
Authors:
Taufikur Rahman Fuad,
Sabbir Ahmed,
Shahriar Ivan
Abstract:
Robust visual recognition in underwater environments remains a significant challenge due to complex distortions such as turbidity, low illumination, and occlusion, which severely degrade the performance of standard vision systems. This paper introduces AQUA20, a comprehensive benchmark dataset comprising 8,171 underwater images across 20 marine species reflecting real-world environmental challenges such as illumination, turbidity, occlusions, etc., providing a valuable resource for underwater visual understanding. Thirteen state-of-the-art deep learning models, including lightweight CNNs (SqueezeNet, MobileNetV2) and transformer-based architectures (ViT, ConvNeXt), were evaluated to benchmark their performance in classifying marine species under challenging conditions. Our experimental results show ConvNeXt achieving the best performance, with a Top-3 accuracy of 98.82% and a Top-1 accuracy of 90.69%, as well as the highest overall F1-score of 88.92% with moderately large parameter size. The results obtained from our other benchmark models also demonstrate trade-offs between complexity and performance. We also provide an extensive explainability analysis using GRAD-CAM and LIME for interpreting the strengths and pitfalls of the models. Our results reveal substantial room for improvement in underwater species recognition and demonstrate the value of AQUA20 as a foundation for future research in this domain. The dataset is publicly available at: https://huggingface.co/datasets/taufiktrf/AQUA20.
Submitted 20 June, 2025;
originally announced June 2025.
-
Transient Concepts in Streaming Graphs
Authors:
Aida Sheshbolouki,
M. Tamer Ozsu
Abstract:
Concept Drift (CD) occurs when a change in a hidden context can induce changes in a target concept. CD is a natural phenomenon in non-stationary settings such as data streams. Understanding, detection, and adaptation to CD in streaming data is (i) vital for effective and efficient analytics as reliable output depends on adaptation to fresh input, (ii) challenging as it requires efficient operations as well as effective performance evaluations, and (iii) impactful as it applies to a variety of use cases and is a crucial initial step for data management systems. Current works are mostly focused on passive CD detection as part of supervised adaptation, on independently generated data instances or graph snapshots, on target concepts as a function of data labels, on static data management, and on a specific temporal order of data records. These methods do not always work. We revisit CD for the streaming graphs setting and introduce two first-of-their-kind frameworks, SGDD and SGDP, for streaming graph CD detection and prediction. Both frameworks discern the change of generative source. SGDD detects the CDs due to the changes of generative parameters with significant delays such that it is difficult to evaluate the performance, while SGDP predicts these CDs between 7374 and 0.19 milliseconds ahead of their occurrence, without accessing the payloads of data records.
Submitted 20 June, 2025;
originally announced June 2025.
-
The extinction of the contact process in a one-dimensional random environment with long-range interactions
Authors:
Pablo A. Gomes,
Marcelo R. Hilário,
Bernardo N. B. de Lima,
Thomas Mountford
Abstract:
We study the contact process on the long-range percolation cluster on $\mathbb{Z}$ where each edge $\langle i,j \rangle$ is open with probability $|i-j|^{-s}$ for $s> 2$. Using a renormalization procedure we apply a Peierls-type argument to prove that the contact process dies out if the transmission rate is smaller than a critical threshold. Our methods involve the control of crossing probabilities for percolation on randomly-stretched lattices as in https://doi.org/10.1214/22-AAP1887.
Submitted 20 June, 2025;
originally announced June 2025.
-
Resource Rational Contractualism Should Guide AI Alignment
Authors:
Sydney Levine,
Matija Franklin,
Tan Zhi-Xuan,
Secil Yanik Guyot,
Lionel Wong,
Daniel Kilov,
Yejin Choi,
Joshua B. Tenenbaum,
Noah Goodman,
Seth Lazar,
Iason Gabriel
Abstract:
AI systems will soon have to navigate human environments and make decisions that affect people and other AI agents whose goals and values diverge. Contractualist alignment proposes grounding those decisions in agreements that diverse stakeholders would endorse under the right conditions, yet securing such agreement at scale remains costly and slow -- even for advanced AI. We therefore propose Resource-Rational Contractualism (RRC): a framework where AI systems approximate the agreements rational parties would form by drawing on a toolbox of normatively-grounded, cognitively-inspired heuristics that trade effort for accuracy. An RRC-aligned agent would not only operate efficiently, but also be equipped to dynamically adapt to and interpret the ever-changing human social world.
Submitted 20 June, 2025;
originally announced June 2025.
-
A Liquid-Nitrogen-Cooled Ca+ Ion Optical Clock with a Systematic Uncertainty of 4.6E-19
Authors:
Baolin Zhang,
Zixiao Ma,
Yao Huang,
Huili Han,
Ruming Hu,
Yuzhuo Wang,
Huaqing Zhang,
Liyan Tang,
Tingyun Shi,
Hua Guan,
Kein Gao
Abstract:
We report a single-ion optical clock based on the 4S_1/2-3D_5/2 transition of the 40Ca+ ion, operated in a liquid nitrogen cryogenic environment, achieving a total systematic uncertainty of 4.6E-19. We employ a refined temperature evaluation scheme to reduce the frequency uncertainty due to blackbody radiation (BBR), and 3D sideband cooling has been implemented to minimize the second-order Doppler shift. We have precisely determined the average Zeeman coefficient of the 40Ca+ clock transition to be 14.345(40) Hz/mT^2, thereby significantly reducing the quadratic Zeeman shift uncertainty. Moreover, the cryogenic environment enables the lowest reported heating rate due to ambient electric field noise in trapped-ion optical clocks.
Submitted 20 June, 2025;
originally announced June 2025.