-
Scalable and Site-Specific Frequency Tuning of Two-Level System Defects in Superconducting Qubit Arrays
Authors:
Larry Chen,
Kan-Heng Lee,
Chuan-Hong Liu,
Brian Marinelli,
Ravi K. Naik,
Ziqi Kang,
Noah Goss,
Hyunseong Kim,
David I. Santiago,
Irfan Siddiqi
Abstract:
State-of-the-art superconducting quantum processors containing tens to hundreds of qubits have demonstrated the building blocks for realizing fault-tolerant quantum computation. Nonetheless, a fundamental barrier to scaling further is the prevalence of fluctuating quantum two-level system (TLS) defects that can couple resonantly to qubits, causing excess decoherence and enhanced gate errors. Here we introduce a scalable architecture for site-specific and in-situ manipulation of TLS frequencies out of the spectral vicinity of our qubits. Our method is resource efficient, combining TLS frequency tuning and universal single qubit control into a single on-chip control line per qubit. We independently control each qubit's dissipative environment to dynamically improve both qubit coherence times and single qubit gate fidelities -- with a constant time overhead that does not scale with the device size. Over a period of 40 hours across 6 qubits, we demonstrate a $36\%$ improvement in average single qubit error rates and a $17\%$ improvement in average energy relaxation times. Critically, we realize a 4-fold suppression in the occurrence of TLS-induced performance outliers, and a complete reduction of simultaneous outlier events. These results mark a significant step toward overcoming the challenges that TLS defects pose to scaling superconducting quantum processors.
Submitted 6 March, 2025;
originally announced March 2025.
-
AnyAnomaly: Zero-Shot Customizable Video Anomaly Detection with LVLM
Authors:
Sunghyun Ahn,
Youngwan Jo,
Kijung Lee,
Sein Kwon,
Inpyo Hong,
Sanghyun Park
Abstract:
Video anomaly detection (VAD) is crucial for video analysis and surveillance in computer vision. However, existing VAD models rely on learned normal patterns, which makes them difficult to apply to diverse environments. Consequently, users must retrain models or develop separate AI models for new environments, which requires expertise in machine learning, high-performance hardware, and extensive data collection, limiting the practical usability of VAD. To address these challenges, this study proposes the customizable video anomaly detection (C-VAD) technique and the AnyAnomaly model. C-VAD considers user-defined text as an abnormal event and detects frames containing a specified event in a video. We effectively implemented AnyAnomaly using context-aware visual question answering without fine-tuning the large vision language model. To validate the effectiveness of the proposed model, we constructed C-VAD datasets and demonstrated the superiority of AnyAnomaly. Furthermore, our approach showed competitive performance on VAD benchmark datasets, achieving state-of-the-art results on the UBnormal dataset and outperforming other methods in generalization across all datasets. Our code is available online at github.com/SkiddieAhn/Paper-AnyAnomaly.
Submitted 6 March, 2025;
originally announced March 2025.
-
Classification of Fragile Topology Enabled by Matrix Homotopy
Authors:
Ki Young Lee,
Stephan Wong,
Sachin Vaidya,
Terry A. Loring,
Alexander Cerjan
Abstract:
The moire flat bands in twisted bilayer graphene have attracted considerable attention not only because of the emergence of correlated phases but also due to their nontrivial topology. Specifically, they exhibit a new class of topology that can be nullified by the addition of trivial bands, termed fragile topology, which suggests the need for an expansion of existing classification schemes. Here, we develop a Z2 energy-resolved topological marker for classifying fragile phases using a system's position-space description, enabling the direct classification of finite, disordered, and aperiodic materials. By translating the physical symmetries protecting the system's fragile topological phase into matrix symmetries of the system's Hamiltonian and position operators, we use matrix homotopy to construct our topological marker while simultaneously yielding a quantitative measure of topological robustness. We show our framework's effectiveness using a C2T-symmetric twisted bilayer graphene model and photonic crystal as a continuum example. We have found that fragile topology can persist both under strong disorder and in heterostructures lacking a bulk spectral gap, and even an example of disorder-induced re-entrant topology. Overall, the proposed scheme serves as an effective tool for elucidating aspects of fragile topology, offering guidance for potential applications across a variety of experimental platforms from topological photonics to correlated phases in materials.
Submitted 5 March, 2025;
originally announced March 2025.
-
Human Implicit Preference-Based Policy Fine-tuning for Multi-Agent Reinforcement Learning in USV Swarm
Authors:
Hyeonjun Kim,
Kanghoon Lee,
Junho Park,
Jiachen Li,
Jinkyoo Park
Abstract:
Multi-Agent Reinforcement Learning (MARL) has shown promise in solving complex problems involving cooperation and competition among agents, such as an Unmanned Surface Vehicle (USV) swarm used in search and rescue, surveillance, and vessel protection. However, aligning system behavior with user preferences is challenging due to the difficulty of encoding expert intuition into reward functions. To address the issue, we propose a Reinforcement Learning with Human Feedback (RLHF) approach for MARL that resolves credit-assignment challenges through an Agent-Level Feedback system categorizing feedback into intra-agent, inter-agent, and intra-team types. To overcome the challenges of direct human feedback, we employ a Large Language Model (LLM) evaluator to validate our approach using feedback scenarios such as region constraints, collision avoidance, and task allocation. Our method effectively refines USV swarm policies, addressing key challenges in multi-agent systems while maintaining fairness and performance consistency.
Submitted 7 March, 2025; v1 submitted 5 March, 2025;
originally announced March 2025.
-
Positivity of generalized cluster scattering diagrams
Authors:
Amanda Burcroff,
Kyungyong Lee,
Lang Mou
Abstract:
We introduce a new class of combinatorial objects, named tight gradings, which are certain nonnegative integer-valued functions on maximal Dyck paths. Using tight gradings, we derive a manifestly positive formula for any wall-function in a rank-2 generalized cluster scattering diagram. We further prove that any consistent rank-2 scattering diagram is positive with respect to the coefficients of initial wall-functions. Moreover, our formula yields explicit expressions for relative Gromov-Witten invariants on weighted projective planes and the Euler characteristics of moduli spaces of framed stable representations on complete bipartite quivers. Finally, by leveraging the rank-2 positivity, we show that any higher-rank generalized cluster scattering diagram has positive wall-functions, which leads to a proof of the positivity of the Laurent phenomenon and the strong positivity of Chekhov-Shapiro's generalized cluster algebras.
Submitted 5 March, 2025;
originally announced March 2025.
-
A Linear Decomposition Method to Analyze and Study Pulsar Mode Changes
Authors:
Longfei Hao,
Zhixuan Li,
Faxin Shen,
Yonghua Xu,
Yuxiang Huang,
Kejia Lee,
Qingzheng Yu,
Hongguang Wang
Abstract:
In this paper, we present the linear decomposition method (LDM), which we developed to detect and analyze pulsar profile variations and mode changing behaviour. We developed LDM utilizing the likelihood function approach assuming Gaussian noise. The LDM projects pulse profiles onto significance-ordered orthonormal vector bases. We show that the method is similar to principal component analysis (PCA), but LDM can handle more general situations. We use a simulated dataset and data from the Kunming 40-m radio telescope to demonstrate the application of the LDM. We found that the LDM successfully identified mode changes for the well-known mode-changing PSR B0329+54 and found a continuous pulse profile evolution for PSR B0355+54. We also show that the LDM can be used to improve the timing precision for the mode-changing PSR B0329+54.
Submitted 6 March, 2025; v1 submitted 5 March, 2025;
originally announced March 2025.
-
Node-level Contrastive Unlearning on Graph Neural Networks
Authors:
Hong kyu Lee,
Qiuchen Zhang,
Carl Yang,
Li Xiong
Abstract:
Graph unlearning aims to remove a subset of graph entities (i.e. nodes and edges) from a graph neural network (GNN) trained on the graph. Unlike machine unlearning for models trained on Euclidean-structured data, effectively unlearning a model trained on non-Euclidean-structured data, such as graphs, is challenging because graph entities exhibit mutual dependencies. Existing works utilize graph partitioning, influence functions, or additional layers to achieve graph unlearning. However, none of them can achieve high scalability and effectiveness without additional constraints. In this paper, we achieve more effective graph unlearning by utilizing the embedding space. The primary training objective of a GNN is to generate proper embeddings for each node that encapsulate both structural information and node feature representations. Thus, directly optimizing the embedding space can effectively remove the target nodes' information from the model. Based on this intuition, we propose node-level contrastive unlearning (Node-CUL). It removes the influence of the target nodes (unlearning nodes) by contrasting the embeddings of remaining nodes and neighbors of unlearning nodes. Through iterative updates, the embeddings of unlearning nodes gradually become similar to those of unseen nodes, effectively removing the learned information without directly incorporating unseen data. In addition, we introduce a neighborhood reconstruction method that optimizes the embeddings of the neighbors in order to remove the influence of unlearning nodes and maintain the utility of the GNN model. Experiments on various graph data and models show that our Node-CUL achieves the best unlearning efficacy and enhanced model utility while requiring computing resources comparable to existing frameworks.
Submitted 4 March, 2025;
originally announced March 2025.
-
Unmasking Implicit Bias: Evaluating Persona-Prompted LLM Responses in Power-Disparate Social Scenarios
Authors:
Bryan Chen Zhengyu Tan,
Roy Ka-Wei Lee
Abstract:
Large language models (LLMs) have demonstrated remarkable capabilities in simulating human behaviour and social intelligence. However, they risk perpetuating societal biases, especially when demographic information is involved. We introduce a novel framework using cosine distance to measure semantic shifts in responses and an LLM-judged Preference Win Rate (WR) to assess how demographic prompts affect response quality across power-disparate social scenarios. Evaluating five LLMs over 100 diverse social scenarios and nine demographic axes, our findings suggest a "default persona" bias toward middle-aged, able-bodied, native-born, Caucasian, atheistic males with centrist views. Moreover, interactions involving specific demographics are associated with lower-quality responses. Lastly, the presence of power disparities increases variability in response semantics and quality across demographic groups, suggesting that implicit biases may be heightened under power-imbalanced conditions. These insights expose the demographic biases inherent in LLMs and offer potential paths toward future bias mitigation efforts in LLMs.
Submitted 22 April, 2025; v1 submitted 3 March, 2025;
originally announced March 2025.
-
Trajectory-Class-Aware Multi-Agent Reinforcement Learning
Authors:
Hyungho Na,
Kwanghyeon Lee,
Sumin Lee,
Il-Chul Moon
Abstract:
In the context of multi-agent reinforcement learning, generalization is a challenge to solve various tasks that may require different joint policies or coordination without relying on policies specialized for each task. We refer to this type of problem as a multi-task, and we train agents to be versatile in this multi-task setting through a single training process. To address this challenge, we introduce TRajectory-class-Aware Multi-Agent reinforcement learning (TRAMA). In TRAMA, agents recognize a task type by identifying the class of trajectories they are experiencing through partial observations, and the agents use this trajectory awareness or prediction as additional information for action policy. To this end, we introduce three primary objectives in TRAMA: (a) constructing a quantized latent space to generate trajectory embeddings that reflect key similarities among them; (b) conducting trajectory clustering using these trajectory embeddings; and (c) building a trajectory-class-aware policy. Specifically for (c), we introduce a trajectory-class predictor that performs agent-wise predictions on the trajectory class; and we design a trajectory-class representation model for each trajectory class. Each agent takes actions based on this trajectory-class representation along with its partial observation for task-aware execution. The proposed method is evaluated on various tasks, including multi-task problems built upon StarCraft II. Empirical results show further performance improvements over state-of-the-art baselines.
Submitted 3 March, 2025;
originally announced March 2025.
-
Bridging Spectral-wise and Multi-spectral Depth Estimation via Geometry-guided Contrastive Learning
Authors:
Ukcheol Shin,
Kyunghyun Lee,
Jean Oh
Abstract:
Deploying depth estimation networks in the real world requires high-level robustness against various adverse conditions to ensure safe and reliable autonomy. For this purpose, many autonomous vehicles employ multi-modal sensor systems, including an RGB camera, NIR camera, thermal camera, LiDAR, or Radar. They mainly adopt two strategies to use multiple sensors: modality-wise and multi-modal fused inference. The former method is flexible but memory-inefficient, unreliable, and vulnerable. Multi-modal fusion can provide high-level reliability, yet it needs a specialized architecture. In this paper, we propose an effective solution, named align-and-fuse strategy, for the depth estimation from multi-spectral images. In the align stage, we align embedding spaces between multiple spectrum bands to learn shareable representation across multi-spectral images by minimizing contrastive loss of global and spatially aligned local features with geometry cue. After that, in the fuse stage, we train an attachable feature fusion module that can selectively aggregate the multi-spectral features for reliable and robust prediction results. Based on the proposed method, a single-depth network can achieve both spectral-invariant and multi-spectral fused depth estimation while preserving reliability, memory efficiency, and flexibility.
Submitted 2 March, 2025;
originally announced March 2025.
-
Validity of the total quasi-steady-state approximation in stochastic biochemical reaction networks
Authors:
Yun Min Song,
Kangmin Lee,
Jae Kyoung Kim
Abstract:
Stochastic models for biochemical reaction networks are widely used to explore their complex dynamics but face significant challenges, including difficulties in determining rate constants and high computational costs. To address these issues, model reduction approaches based on deterministic quasi-steady-state approximations (QSSA) have been employed, resulting in propensity functions in the form of deterministic non-elementary reaction functions, such as the Michaelis-Menten equation. In particular, the total QSSA (tQSSA), known for its accuracy in deterministic frameworks, has been perceived as universally valid for stochastic model reduction. However, recent studies have challenged this perception. In this review, we demonstrate that applying tQSSA in stochastic model reduction can distort dynamics, even in cases where the deterministic tQSSA is rigorously valid. This highlights the need for caution when using deterministic QSSA in stochastic model reduction to avoid erroneous conclusions from model simulations.
Submitted 2 March, 2025;
originally announced March 2025.
-
COSMOS Spectroscopic Redshift Compilation (First Data Release): 165k Redshifts Encompassing Two Decades of Spectroscopy
Authors:
Ali Ahmad Khostovan,
Jeyhan S. Kartaltepe,
Mara Salvato,
Olivier Ilbert,
Caitlin M. Casey,
Hiddo Algera,
Jacqueline Antwi-Danso,
Andrew Battisti,
Malte Brinch,
Marcella Brusa,
Antonello Calabro,
Peter L. Capak,
Nima Chartab,
Olivia R. Cooper,
Isa G. Cox,
Behnam Darvish,
Nicole E. Drakos,
Andreas L. Faisst,
Matthew R. George,
Ghassem Gozaliasl,
Santosh Harish,
Gunther Hasinger,
Hossein Hatamnia,
Angela Iovino,
Shuowen Jin
, et al. (28 additional authors not shown)
Abstract:
We present the COSMOS Spectroscopic Redshift Compilation encompassing ~ 20 years of spectroscopic redshifts within the 2 deg$^2$ COSMOS legacy field. This compilation contains 165,312 redshifts of 97,929 unique objects from 108 individual observing programs up to $z \sim 8$ with median stellar mass $\sim 10^{9}$ to $10^{10}$ M$_\odot$ (redshift dependent). Rest-frame $NUVrJ$ colors and SFR -- stellar mass correlations show the compilation primarily contains low- to intermediate-mass star-forming and massive, quiescent galaxies at $z < 1.25$ and mostly low-mass bursty star-forming galaxies at $z > 2$. Sources in the compilation cover a diverse range of environments, including protoclusters such as ``Hyperion''. The full compilation is 50\% spectroscopically complete by $i \sim 23.2$ and $K_s \sim 21.3$ mag; however, this is redshift dependent. Spatially, the compilation is $>50$\% complete within the CANDELS area, while the outer regions of COSMOS are $>10$\% complete limited to $i < 24$ mag and $K_S < 22.5$ mag, separately. We demonstrate how the compilation can be used to validate photometric redshifts and investigate calibration metrics. By training self-organizing maps on COSMOS2020/Classic and projecting the compilation onto it, we find key galaxy subpopulations that currently lack spectroscopic coverage including $z < 1$ intermediate-mass quiescent galaxies and low-/intermediate-mass bursty star-forming galaxies, $z \sim 2$ massive quiescent galaxies, and $z > 3$ massive star-forming galaxies. This approach highlights how combining self-organizing maps with our compilation can provide guidance for future spectroscopic observations to get a complete spectroscopic view of galaxy populations. Lastly, the compilation will undergo periodic data releases that incorporate new spectroscopic redshift measurements, providing a lasting legacy resource for the community.
Submitted 28 February, 2025;
originally announced March 2025.
-
The PanAf-FGBG Dataset: Understanding the Impact of Backgrounds in Wildlife Behaviour Recognition
Authors:
Otto Brookes,
Maksim Kukushkin,
Majid Mirmehdi,
Colleen Stephens,
Paula Dieguez,
Thurston C. Hicks,
Sorrel Jones,
Kevin Lee,
Maureen S. McCarthy,
Amelia Meier,
Emmanuelle Normand,
Erin G. Wessling,
Roman M. Wittig,
Kevin Langergraber,
Klaus Zuberbühler,
Lukas Boesch,
Thomas Schmid,
Mimi Arandjelovic,
Hjalmar Kühl,
Tilo Burghardt
Abstract:
Computer vision analysis of camera trap video footage is essential for wildlife conservation, as captured behaviours offer some of the earliest indicators of changes in population health. Recently, several high-impact animal behaviour datasets and methods have been introduced to encourage their use; however, the role of behaviour-correlated background information and its significant effect on out-of-distribution generalisation remain unexplored. In response, we present the PanAf-FGBG dataset, featuring 20 hours of wild chimpanzee behaviours, recorded at over 350 individual camera locations. Uniquely, it pairs every video with a chimpanzee (referred to as a foreground video) with a corresponding background video (with no chimpanzee) from the same camera location. We present two views of the dataset: one with overlapping camera locations and one with disjoint locations. This setup enables, for the first time, direct evaluation of in-distribution and out-of-distribution conditions, and for the impact of backgrounds on behaviour recognition models to be quantified. All clips come with rich behavioural annotations and metadata including unique camera IDs and detailed textual scene descriptions. Additionally, we establish several baselines and present a highly effective latent-space normalisation technique that boosts out-of-distribution performance by +5.42% mAP for convolutional and +3.75% mAP for transformer-based models. Finally, we provide an in-depth analysis on the role of backgrounds in out-of-distribution behaviour recognition, including the so far unexplored impact of background durations (i.e., the count of background frames within foreground videos).
Submitted 19 March, 2025; v1 submitted 28 February, 2025;
originally announced February 2025.
-
The Chinese pulsar timing array data release I. Polarimetry for 56 millisecond pulsars
Authors:
Jiangwei Xu,
Jinchen Jiang,
Heng Xu,
Bojun Wang,
Zihan Xue,
Siyuan Chen,
Yanjun Guo,
R. Nicolas Caballero,
Kejia Lee,
Jianping Yuan,
Yonghua Xu,
Jingbo Wang,
Longfei Hao,
Zhixuan Li,
Yuxiang Huang,
Zezhong Xu,
Jintao Luo,
Jinlin Han,
Peng Jiang,
Zhiqiang Shen,
Min Wang,
Na Wang,
Renxin Xu,
Xiangping Wu,
Lei Qian
, et al. (5 additional authors not shown)
Abstract:
We present polarization pulse profiles for 56 millisecond pulsars (MSPs) monitored by the Chinese Pulsar Timing Array (CPTA) collaboration using the Five-hundred-meter Aperture Spherical radio Telescope (FAST). The observations were centered at 1.25 GHz with a raw bandwidth of 500 MHz. Due to the high sensitivity ($\sim$16 K/Jy) of the FAST telescope and our long integration time, the high signal-to-noise ratio polarization profiles show features hardly detected before. Among the 56 pulsars, the polarization profiles of PSRs J0406$+$3039, J1327$+$3423, and J2022$+$2534 were not previously reported. 80\% of MSPs in the sample show weak components below 3\% of peak flux, 25\% of pulsars show interpulse-like structures, and most pulsars show linear polarization position angle jumps. Six pulsars seem to be emitting over the full rotation phase, with another thirteen pulsars being good candidates for such 360$^\circ$ radiators. We find that the distribution of the polarization percentage in our sample is compatible with the normal pulsar distribution. Our detailed evaluation of the MSP polarization properties suggests that the wave propagation effects in the pulsar magnetosphere are important in shaping the MSP polarization pulse profiles.
Submitted 20 April, 2025; v1 submitted 28 February, 2025;
originally announced February 2025.
-
Twisted oxide membrane interface by local atomic registry design
Authors:
Min-Su Kim,
Kyoungjun Lee,
Ryo Ishikawa,
Kyung Song,
Naafis Ahnaf Shahed,
Ki-Tae Eom,
Mark S. Rzchowski,
Evgeny Y. Tsymbal,
Naoya Shibata,
Teruyasu Mizoguchi,
Chang-Beom Eom,
Si-Young Choi
Abstract:
Interplay of lattice, orbital, and charge degrees of freedom in complex oxide materials has hosted a plethora of exotic quantum phases and physical properties. Recent advances in synthesis of freestanding complex oxide membranes and twisted heterostructures assembled from membranes provide new opportunities for discovery using moiré design with local lattice control. To this end, we designed moiré crystals at the coincidence site lattice condition, providing commensurate structure within the moiré supercell arising from the multi-atom complex oxide unit cell. We fabricated such twisted bilayers from freestanding SrTiO3 membranes and used depth sectioning-based TEM methods to discover ordered charge states at the moiré interface. By selectively imaging SrTiO3 atomic planes at different depths through the bilayer, we clearly resolved the moiré periodic structure at the twisted interface and found that it exhibits lattice-dependent charge disproportionation in the local atomic registry within the moiré supercell. Our density-functional modelling of the twisted oxide interface predicts that these moiré phenomena are accompanied by the emergence of a two-dimensional flat band that can drive new electronic phases. Our work provides a novel guideline for controlling moiré periodicity in twisted oxides and opens pathways to exploit the new functionalities via moiré lattice-driven charge-orbital correlation.
Submitted 28 February, 2025;
originally announced February 2025.
-
Subtask-Aware Visual Reward Learning from Segmented Demonstrations
Authors:
Changyeon Kim,
Minho Heo,
Doohyun Lee,
Jinwoo Shin,
Honglak Lee,
Joseph J. Lim,
Kimin Lee
Abstract:
Reinforcement Learning (RL) agents have demonstrated their potential across various robotic tasks. However, they still heavily rely on human-engineered reward functions, requiring extensive trial-and-error and access to target behavior information, often unavailable in real-world settings. This paper introduces REDS: REward learning from Demonstration with Segmentations, a novel reward learning framework that leverages action-free videos with minimal supervision. Specifically, REDS employs video demonstrations segmented into subtasks from diverse sources and treats these segments as ground-truth rewards. We train a dense reward function conditioned on video segments and their corresponding subtasks to ensure alignment with ground-truth reward signals by minimizing the Equivalent-Policy Invariant Comparison distance. Additionally, we employ contrastive learning objectives to align video representations with subtasks, ensuring precise subtask inference during online interactions. Our experiments show that REDS significantly outperforms baseline methods on complex robotic manipulation tasks in Meta-World and more challenging real-world tasks, such as furniture assembly in FurnitureBench, with minimal human intervention. Moreover, REDS facilitates generalization to unseen tasks and robot embodiments, highlighting its potential for scalable deployment in diverse environments.
Submitted 27 February, 2025;
originally announced February 2025.
-
H I absorption line and anomalous dispersion in the radio pulses of PSR B1937+21
Authors:
Jinchen Jiang,
Shunshun Cao,
Kejia Lee,
Bojun Wang,
Heng Xu,
Siyuan Chen,
Yanjun Guo,
Peng Jiang,
Weicong Jing,
Jiguang Lu,
Jiangwei Xu,
Renxin Xu,
Zihan Xue
Abstract:
We use the Five-hundred-meter Aperture Spherical radio Telescope (FAST) to observe the bright millisecond pulsar (MSP) PSR B1937+21 (J1939+2134) and record the data in the band from 1.0 GHz to 1.5 GHz. We measure the neutral hydrogen (HI) emission and absorption lines near 1420 MHz ($λ\simeq 21$ cm). We derive the kinematic distance of the pulsar with the HI observation, and update the upper bound of kinematic distance from the previous $14.8\pm 0.9$ kpc in the Outer Arm to the nearer $9.4\pm 0.5$ kpc in the Perseus Arm. By comparing with the archival absorption spectra observed decades ago, we notice possible variations in the absorption spectra towards this pulsar, which correspond to a possible tiny-scale atomic structure (TSAS) of a few AU in size. We also verify the apparent faster-than-light anomalous dispersion at the HI absorption line of this pulsar previously reported.
Submitted 27 February, 2025;
originally announced February 2025.
-
In-Context Learning with Hypothesis-Class Guidance
Authors:
Ziqian Lin,
Shubham Kumar Bharti,
Kangwook Lee
Abstract:
Recent research has investigated the underlying mechanisms of in-context learning (ICL) both theoretically and empirically, often using data generated from simple function classes. However, the existing work often focuses on the sequence consisting solely of labeled examples, while in practice, labeled examples are typically accompanied by an instruction, providing some side information about the task. In this work, we propose ICL with hypothesis-class guidance (ICL-HCG), a novel synthetic data model for ICL where the input context consists of the literal description of a (finite) hypothesis class H and $(x,y)$ pairs from a hypothesis chosen from H. Under our framework ICL-HCG, we conduct extensive experiments to explore: (i) a variety of generalization abilities to new hypothesis classes; (ii) different model architectures; (iii) sample complexity; (iv) in-context data imbalance; (v) the role of instruction; and (vi) the effect of pretraining hypothesis diversity. As a result, we show that (a) Transformers can successfully learn ICL-HCG and generalize to unseen hypotheses and unseen hypothesis classes, and (b) compared with ICL without instruction, ICL-HCG achieves significantly higher accuracy, demonstrating the role of instructions.
Submitted 28 February, 2025; v1 submitted 27 February, 2025;
originally announced February 2025.
-
Coronal Abundance Fractionation Linked to Chromospheric Transverse MHD Waves in a Solar Active Region Observed with FISS/GST and EIS/Hinode
Authors:
Kyoung-Sun Lee,
Jongchul Chae,
Hannah Kwak,
Kyuhyoun Cho,
Kyeore Lee,
Juhyung Kang,
Eun-Kyung Lim,
Donguk Song
Abstract:
Elemental abundances in the solar corona differ from those in the photosphere, with low first ionization potential (FIP) elements being enhanced, a phenomenon known as the FIP effect. This enhancement is attributed to ponderomotive forces linked to magnetohydrodynamic (MHD) waves, particularly incompressible transverse waves. Our study investigates the relationship between coronal abundance fractionation and chromospheric transverse MHD waves by examining the spatial correlation between FIP fractionation and these waves and by analyzing their properties to test the ponderomotive force model. We used H alpha data from the Fast Imaging Solar Spectrograph at the Goode Solar Telescope to detect chromospheric transverse MHD waves and \ion{Si}{X} (low FIP) and \ion{S}{X} (high FIP) spectra from Hinode EUV Imaging Spectrometer to determine relative abundances in an active region. Extrapolated linear force free magnetic fields from Solar Dynamics Observatory/Helioseismic and Magnetic Imager magnetograms further linked the observed chromospheric waves with coronal composition. Approximately 400 wave packets were identified and characterized by their period, velocity amplitude, propagation speed, and direction. These incompressible or weakly compressible waves were mainly observed near loop footpoints in the sunspot penumbra and superpenumbral fibrils. Regions of high FIP fractionation coincided with closed magnetic fields where these waves were present, and low-frequency, downward-propagating waves comprised about 43\% of the total. Our results demonstrate a strong correlation between coronal abundance fractionation and chromospheric transverse MHD waves, supporting the view that the FIP effect is driven by the ponderomotive force from these waves.
Submitted 26 February, 2025;
originally announced February 2025.
-
Letters from Future Self: Augmenting the Letter-Exchange Exercise with LLM-based Agents to Enhance Young Adults' Career Exploration
Authors:
Hayeon Jeon,
Suhwoo Yoon,
Keyeun Lee,
Seo Hyeong Kim,
Esther Hehsun Kim,
Seonghye Cho,
Yena Ko,
Soeun Yang,
Laura Dabbish,
John Zimmerman,
Eun-mee Kim,
Hajin Lim
Abstract:
Young adults often encounter challenges in career exploration. Self-guided interventions, such as the letter-exchange exercise, where participants envision and adopt the perspective of their future selves by exchanging letters with their envisioned future selves, can support career development. However, the broader adoption of such interventions may be limited without structured guidance. To address this, we integrated Large Language Model (LLM)-based agents that simulate participants' future selves into the letter-exchange exercise and evaluated their effectiveness. A one-week experiment (N=36) compared three conditions: (1) participants manually writing replies to themselves from the perspective of their future selves (baseline), (2) future-self agents generating letters to participants, and (3) future-self agents engaging in chat conversations with participants. Results indicated that exchanging letters with future-self agents enhanced participants' engagement during the exercise, while overall benefits of the intervention on future orientation, career self-concept, and psychological support remained comparable across conditions. We discuss design implications for AI-augmented interventions for supporting young adults' career exploration.
Submitted 5 May, 2025; v1 submitted 26 February, 2025;
originally announced February 2025.
-
Emerging Practices in Participatory AI Design in Public Sector Innovation
Authors:
Devansh Saxena,
Zoe Kahn,
Erina Seh-Young Moon,
Lauren M. Chambers,
Corey Jackson,
Min Kyung Lee,
Motahhare Eslami,
Shion Guha,
Sheena Erete,
Lilly Irani,
Deirdre Mulligan,
John Zimmerman
Abstract:
Local and federal agencies are rapidly adopting AI systems to augment or automate critical decisions, efficiently use resources, and improve public service delivery. AI systems are being used to support tasks associated with urban planning, security, surveillance, energy and critical infrastructure, and support decisions that directly affect citizens and their ability to access essential services. Local governments act as the governance tier closest to citizens and must play a critical role in upholding democratic values and building community trust especially as it relates to smart city initiatives that seek to transform public services through the adoption of AI. Community-centered and participatory approaches have been central for ensuring the appropriate adoption of technology; however, AI innovation introduces new challenges in this context because participatory AI design methods require more robust formulation and face higher standards for implementation in the public sector compared to the private sector. This requires us to reassess traditional methods used in this space as well as develop new resources and methods. This workshop will explore emerging practices in participatory algorithm design - or the use of public participation and community engagement - in the scoping, design, adoption, and implementation of public sector algorithms.
Submitted 25 February, 2025;
originally announced February 2025.
-
The GigaMIDI Dataset with Features for Expressive Music Performance Detection
Authors:
Keon Ju Maverick Lee,
Jeff Ens,
Sara Adkins,
Pedro Sarmento,
Mathieu Barthet,
Philippe Pasquier
Abstract:
The Musical Instrument Digital Interface (MIDI), introduced in 1983, revolutionized music production by allowing computers and instruments to communicate efficiently. MIDI files encode musical instructions compactly, facilitating convenient music sharing. They benefit Music Information Retrieval (MIR), aiding in research on music understanding, computational musicology, and generative music. The GigaMIDI dataset contains over 1.4 million unique MIDI files, encompassing 1.8 billion MIDI note events and over 5.3 million MIDI tracks. GigaMIDI is currently the largest collection of symbolic music in MIDI format available for research purposes under fair dealing. Distinguishing between non-expressive and expressive MIDI tracks is challenging, as MIDI files do not inherently make this distinction. To address this issue, we introduce a set of innovative heuristics for detecting expressive music performance. These include the Distinctive Note Velocity Ratio (DNVR) heuristic, which analyzes MIDI note velocity; the Distinctive Note Onset Deviation Ratio (DNODR) heuristic, which examines deviations in note onset times; and the Note Onset Median Metric Level (NOMML) heuristic, which evaluates onset positions relative to metric levels. Our evaluation demonstrates these heuristics effectively differentiate between non-expressive and expressive MIDI tracks. Furthermore, after evaluation, we create the most substantial expressive MIDI dataset, employing our heuristic, NOMML. This curated iteration of GigaMIDI encompasses expressively-performed instrument tracks detected by NOMML, containing all General MIDI instruments, constituting 31% of the GigaMIDI dataset, totalling 1,655,649 tracks.
Submitted 24 February, 2025;
originally announced February 2025.
-
Conditional Generative Adversarial Networks for Channel Estimation in RIS-Assisted ISAC Systems
Authors:
Alice Faisal,
Ibrahim Al-Nahhal,
Kyesan Lee,
Octavia A. Dobre,
Hyundong Shin
Abstract:
Integrated sensing and communication (ISAC) technology has been explored as a potential advancement for future wireless networks, striving to effectively use spectral resources for both communication and sensing. The integration of reconfigurable intelligent surfaces (RIS) with ISAC further enhances this capability by optimizing the propagation environment, thereby improving both the sensing accuracy and communication quality. Within this domain, accurate channel estimation is crucial to ensure a reliable deployment. Traditional deep learning (DL) approaches, while effective, can impose performance limitations in modeling the complex dynamics of wireless channels. This paper proposes a novel application of conditional generative adversarial networks (CGANs) to solve the channel estimation problem of an RIS-assisted ISAC system. The CGAN framework adversarially trains two DL networks, enabling the generator network to not only learn the mapping relationship from observed data to real channel conditions but also to improve its output based on the discriminator network feedback, thus effectively optimizing the training process and estimation accuracy. The numerical simulations demonstrate that the proposed CGAN-based method improves the estimation performance effectively compared to conventional DL techniques. The results highlight the CGAN's potential to revolutionize channel estimation, paving the way for more accurate and reliable ISAC deployments.
Submitted 24 February, 2025;
originally announced February 2025.
-
No Galaxy-Scale [CII] Fast Outflow in the z=6.72 Red Quasar HSC J1205$-$0000
Authors:
Mahoshi Sawamura,
Takuma Izumi,
Kouichiro Nakanishi,
Takeshi Okuda,
Michael A. Strauss,
Masatoshi Imanishi,
Yoshiki Matsuoka,
Yoshiki Toba,
Hideki Umehata,
Takuya Hashimoto,
Shunsuke Baba,
Tomotsugu Goto,
Toshihiro Kawaguchi,
Kotaro Kohno,
Dragan Salak,
Taiki Kawamuro,
Kazushi Iwasawa,
Masafusa Onoue,
Chien-Hsiu Lee,
Kianhong Lee
Abstract:
HSC 120505.09-000027.9 (J1205$-$0000) is one of the highest redshift ($z=6.72$) dust-reddened quasars (red quasars) known to date. We present an improved analysis of Atacama Large Millimeter/submillimeter Array data of the [CII] $158\ \rm{μm}$ line and the underlying rest-frame far-infrared (FIR) continuum emission, previously reported in Izumi et al. (2021a), toward J1205$-$0000. Red quasars are thought to be a transitional phase from an obscured starburst to a luminous blue quasar, in some cases associated with massive outflows driven by the active galactic nucleus (AGN). J1205$-$0000 has a high FIR luminosity, $L_{\mathrm{FIR}}=2.5\times 10^{12}\ L_{\odot}$ and a total IR luminosity of $L_{\mathrm{TIR}}=3.5\times 10^{12}\ L_{\odot}$, corresponding to a star formation rate (SFR) of $\sim 528\ M_{\odot}\ \mathrm{yr}^{-1}$. With the [CII]-based dynamical mass of $\sim 1 \times 10^{11}~M_\odot$, we conclude that J1205$-$0000 is hosted by a starburst galaxy. In contradiction to Izumi et al. (2021a), our improved analysis shows no hint of a broad component in the [CII] line spectrum. Thus there is no evidence for a host galaxy-scale fast [CII] outflow, despite the fact that J1205$-$0000 has fast nuclear ionized outflows seen in the rest-frame UV. We explore several scenarios for this discrepancy (e.g., early phase of AGN feedback, reliability of the [CII] line as a tracer of outflows), and we claim that it is still too early to conclude that there is no significant negative AGN feedback on star formation in this red quasar.
Submitted 24 February, 2025;
originally announced February 2025.
-
Finding the Sweet Spot: Preference Data Construction for Scaling Preference Optimization
Authors:
Yao Xiao,
Hai Ye,
Linyao Chen,
Hwee Tou Ng,
Lidong Bing,
Xiaoli Li,
Roy Ka-wei Lee
Abstract:
Iterative data generation and model retraining are widely used to align large language models (LLMs). It typically involves a policy model to generate on-policy responses and a reward model to guide training data selection. Direct Preference Optimization (DPO) further enhances this process by constructing preference pairs of chosen and rejected responses. In this work, we aim to scale up the number of on-policy samples via repeated random sampling to improve alignment performance. Conventional practice selects the sample with the highest reward as chosen and the lowest as rejected for DPO. However, our experiments reveal that this strategy leads to a decline in performance as the sample size increases. To address this, we investigate preference data construction through the lens of the underlying normal distribution of sample rewards. We categorize the reward space into seven representative points and systematically explore all 21 ($C_7^2$) pairwise combinations. Through evaluations on four models using AlpacaEval 2, we find that selecting the rejected response at reward position $μ - 2σ$, rather than at the minimum reward, is crucial for optimal performance. We finally introduce a scalable preference data construction strategy that consistently enhances model performance as the sample scale increases.
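The selection rule in this abstract can be illustrated with a short Python sketch. It is not the authors' implementation: the function name `build_preference_pair`, the use of the population standard deviation, keeping the highest-reward sample as the chosen response, and picking the sample whose reward is nearest to $μ - kσ$ are all assumptions made for illustration.

```python
import statistics


def build_preference_pair(responses, rewards, k=2.0):
    """Return (chosen, rejected) for one prompt.

    Assumptions (not stated in the abstract):
    - chosen is the highest-reward sample;
    - rejected is the remaining sample whose reward is closest to mu - k * sigma;
    - sigma is the population standard deviation of the sampled rewards.
    """
    if len(responses) != len(rewards) or len(responses) < 2:
        raise ValueError("need at least two responses, each with a reward")

    mu = statistics.fmean(rewards)
    sigma = statistics.pstdev(rewards)
    target = mu - k * sigma  # reward position for the rejected response

    indices = range(len(rewards))
    chosen_idx = max(indices, key=rewards.__getitem__)
    # Exclude the chosen sample so the pair never contains the same response twice.
    candidates = [i for i in indices if i != chosen_idx]
    rejected_idx = min(candidates, key=lambda i: abs(rewards[i] - target))
    return responses[chosen_idx], responses[rejected_idx]
```

With only a handful of samples the minimum reward usually lies above $μ - 2σ$, so this rule and the conventional minimum-reward rule pick the same rejected response. The two diverge only as the number of samples grows and the lowest rewards fall further into the tail.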
Submitted 21 May, 2025; v1 submitted 23 February, 2025;
originally announced February 2025.
-
Hyper-active repeating fast radio bursts from rotation modulated starquakes on magnetars
Authors:
Jia-Wei Luo,
Jia-Rui Niu,
Wei-Yang Wang,
Yong-Kun Zhang,
De-Jiang Zhou,
Heng Xu,
Pei Wang,
Chen-Hui Niu,
Zhen-Hui Zhang,
Shuai Zhang,
Ce Cai,
Jin-Lin Han,
Di Li,
Ke-Jia Lee,
Wei-Wei Zhu,
Bing Zhang
Abstract:
The non-detection of periodicity related to rotation challenges magnetar models for fast radio bursts (FRBs) with FRB emission from close to the magnetar surface. Moreover, a bimodal distribution of the burst waiting times is widely observed in hyper-active FRBs, a significant deviation from the exponential distribution expected from stationary Poisson processes. By combining the epidemic-type aftershock sequence (ETAS) earthquake model and the rotating vector model (RVM) involving the rotation of the magnetar and orientations of the spin and magnetic axes, we find that starquake events modulated by the rotation of FRB-emitting magnetar can explain the bimodal distribution of FRB waiting times, as well as the non-detection of periodicity in hyper-active repeating FRBs. We analyze data from multiple FRB sources, demonstrating that differences in waiting time distributions and, to some extent, observed energies can be explained by varying parameters related to geometric properties of the magnetar FRB emission and starquake dynamics. Our results show that the assumption that all FRBs are repeaters is compatible with our model. Notably, we find that hyper-active repeaters tend to have small magnetic inclination angles in order to hide their periodicity. We also show that our model can reproduce the waiting time distribution of a pulsar phase of the galactic magnetar SGR J1935+2154 with a larger inclination angle than the hyper-active repeaters, which could explain the detection of spin period and the relatively low observed energy for FRBs from the magnetar. The spin periods of hyper-active repeaters are not well constrained, but most likely fall in the valley region between the two peaks of the waiting time distributions.
Submitted 9 June, 2025; v1 submitted 23 February, 2025;
originally announced February 2025.
-
Audio Visual Segmentation Through Text Embeddings
Authors:
Kyungbok Lee,
You Zhang,
Zhiyao Duan
Abstract:
The goal of Audio-Visual Segmentation (AVS) is to localize and segment the sounding source objects from video frames. Research on AVS suffers from data scarcity due to the high cost of fine-grained manual annotations. Recent works attempt to overcome the challenge of limited data by leveraging the vision foundation model, Segment Anything Model (SAM), prompting it with audio to enhance its ability to segment sounding source objects. While this approach alleviates the model's burden on understanding visual modality by utilizing knowledge of pre-trained SAM, it does not address the fundamental challenge of learning audio-visual correspondence with limited data. To address this limitation, we propose AV2T-SAM, a novel framework that bridges audio features with the text embedding space of pre-trained text-prompted SAM. Our method leverages multimodal correspondence learned from rich text-image paired datasets to enhance audio-visual alignment. Furthermore, we introduce a novel feature, $\mathbf{f}_{\mathrm{CLIP}} \odot \mathbf{f}_{\mathrm{CLAP}}$, which emphasizes shared semantics of audio and visual modalities while filtering irrelevant noise. Our approach outperforms existing methods on the AVSBench dataset by effectively utilizing pre-trained segmentation models and cross-modal semantic alignment. The source code is released at https://github.com/bok-bok/AV2T-SAM.
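The $\odot$ in the feature named above conventionally denotes an element-wise (Hadamard) product. The minimal sketch below assumes that reading, and assumes the two inputs are same-length embedding vectors, one from a CLIP image encoder and one from a CLAP audio encoder. The function name `hadamard_fuse` is hypothetical, and the abstract does not say how the embeddings are projected or normalized beforehand.

```python
def hadamard_fuse(f_clip, f_clap):
    """Element-wise product of two equal-length embedding vectors.

    Dimensions where both vectors are large with the same sign give large
    positive entries; dimensions where either vector is near zero are
    suppressed. The sum of the result equals the dot product of the inputs.
    """
    if len(f_clip) != len(f_clap):
        raise ValueError("embeddings must share the same dimension")
    return [a * b for a, b in zip(f_clip, f_clap)]
```

A dimension contributes to the fused feature only when both modalities activate it, which is one plausible way to read "emphasizes shared semantics while filtering irrelevant noise".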
Submitted 29 May, 2025; v1 submitted 22 February, 2025;
originally announced February 2025.
-
The Design Space of Recent AI-assisted Research Tools for Ideation, Sensemaking, and Scientific Creativity
Authors:
Runlong Ye,
Matthew Varona,
Oliver Huang,
Patrick Yung Kang Lee,
Michael Liut,
Carolina Nobre
Abstract:
Generative AI (GenAI) tools are radically expanding the scope and capability of automation in knowledge work such as academic research. While promising for augmenting cognition and streamlining processes, AI-assisted research tools may also increase automation bias and hinder critical thinking. To examine recent developments, we surveyed publications from leading HCI venues over the past three years, closely analyzing thirteen tools to better understand the novel capabilities of these AI-assisted systems and the design spaces they enable: seven employing traditional AI or customized transformer-based approaches, and six integrating open-access large language models (LLMs). Our analysis characterizes the emerging design space, distinguishes between tools focused on workflow mimicry versus generative exploration, and yields four critical design recommendations to guide the development of future systems that foster meaningful cognitive engagement: providing user agency and control, differentiating divergent/convergent thinking support, ensuring adaptability, and prioritizing transparency/accuracy. This work discusses how these insights signal a shift from mere workflow replication towards generative co-creation, presenting new opportunities for the community to craft intuitive, AI-driven research interfaces and interactions.
Submitted 19 April, 2025; v1 submitted 22 February, 2025;
originally announced February 2025.
-
Privacy Ripple Effects from Adding or Removing Personal Information in Language Model Training
Authors:
Jaydeep Borkar,
Matthew Jagielski,
Katherine Lee,
Niloofar Mireshghallah,
David A. Smith,
Christopher A. Choquette-Choo
Abstract:
Due to the sensitive nature of personally identifiable information (PII), its owners may have the authority to control its inclusion or request its removal from large-language model (LLM) training. Beyond this, PII may be added or removed from training datasets due to evolving dataset curation techniques, because they were newly scraped for retraining, or because they were included in a new downstream fine-tuning stage. We find that the amount and ease of PII memorization is a dynamic property of a model that evolves throughout training pipelines and depends on commonly altered design choices. We characterize three such novel phenomena: (1) similar-appearing PII seen later in training can elicit memorization of earlier-seen sequences in what we call assisted memorization, and this is a significant factor (in our settings, up to 1/3); (2) adding PII can increase memorization of other PII significantly (in our settings, as much as $\approx\!7.5\times$); and (3) removing PII can lead to other PII being memorized. Model creators should consider these first- and second-order privacy risks when training models to avoid the risk of new PII regurgitation.
Submitted 21 February, 2025;
originally announced February 2025.
-
Boundary-Driven Complex Brillouin Zone in Non-Hermitian Electric Circuits
Authors:
Yung Kim,
Sonu Verma,
Minwook Kyung,
Kyungmin Lee,
Wenwen Liu,
Shuang Zhang,
Bumki Min,
Moon Jip Park
Abstract:
Complex-valued physical quantities, often non-conserved, represent key phenomena in non-Hermitian systems such as dissipation and localization. Recent advancements in non-Hermitian physics have revealed boundary-condition-sensitive band structures, characterized by a continuous manifold of complex-valued momentum known as the generalized Brillouin zone (GBZ). However, the ability to actively manipulate the GBZ and its associated topological properties has remained largely unexplored. Here, we demonstrate a controllable manipulation of the GBZ by adjusting the boundary Hamiltonian and leveraging the boundary sensitivity in a circuit lattice. Our observations reveal that the GBZ forms multiple separated manifolds containing both decaying and growing wave functions, in contrast to the previously observed non-Hermitian skin effect under open boundary condition (OBC). By continuously deforming the GBZ, we observe the topological phase transitions of innate topological structure of GBZ that are enriched by complex properties of non-Hermitian physical variables. Notably, such topological phase transition is governed by boundary conditions rather than bulk properties, underscoring the extreme boundary sensitivity unique to non-Hermitian systems.
Submitted 20 February, 2025;
originally announced February 2025.
-
Can Hallucination Correction Improve Video-Language Alignment?
Authors:
Lingjun Zhao,
Mingyang Xie,
Paola Cascante-Bonilla,
Hal Daumé III,
Kwonjoon Lee
Abstract:
Large Vision-Language Models often generate hallucinated content that is not grounded in its visual inputs. While prior work focuses on mitigating hallucinations, we instead explore leveraging hallucination correction as a training objective to improve video-language alignment. We introduce HACA, a self-training framework learning to correct hallucinations in descriptions that do not align with the video content. By identifying and correcting inconsistencies, HACA enhances the model's ability to align video and textual representations for spatio-temporal reasoning. Our experimental results show consistent gains in video-caption binding and text-to-video retrieval tasks, demonstrating that hallucination correction-inspired tasks serve as an effective strategy for improving vision and language alignment.
Submitted 20 February, 2025;
originally announced February 2025.
-
Drift: Decoding-time Personalized Alignments with Implicit User Preferences
Authors:
Minbeom Kim,
Kang-il Lee,
Seongho Joo,
Hwaran Lee,
Thibaut Thonet,
Kyomin Jung
Abstract:
Personalized alignments for individual users have been a long-standing goal in large language models (LLMs). We introduce Drift, a novel framework that personalizes LLMs at decoding time with implicit user preferences. Traditional Reinforcement Learning from Human Feedback (RLHF) requires thousands of annotated examples and expensive gradient updates. In contrast, Drift personalizes LLMs in a training-free manner, using only a few dozen examples to steer a frozen model through efficient preference modeling. Our approach models user preferences as a composition of predefined, interpretable attributes and aligns them at decoding time to enable personalized generation. Experiments on both a synthetic persona dataset (Perspective) and a real human-annotated dataset (PRISM) demonstrate that Drift significantly outperforms RLHF baselines while using only 50-100 examples. Our results and analysis show that Drift is both computationally efficient and interpretable.
Submitted 7 May, 2025; v1 submitted 20 February, 2025;
originally announced February 2025.
-
High-level, high-resolution ocean modeling at all scales with Oceananigans
Authors:
Gregory L. Wagner,
Simone Silvestri,
Navid C. Constantinou,
Ali Ramadhan,
Jean-Michel Campin,
Chris Hill,
Tomas Chor,
Jago Strong-Wright,
Xin Kai Lee,
Francis Poulin,
Andre Souza,
Keaton J. Burns,
John Marshall,
Raffaele Ferrari
Abstract:
We describe the vision, user interface, governing equations, and numerical methods that underpin new ocean modeling software called "Oceananigans". Oceananigans is being developed by the Climate Modeling Alliance as part of a larger project to build a trainable climate model with quantifiable uncertainty. We argue that Oceananigans' status as a popular, capable modeling system realizes a vision for accelerating progress in Earth system modeling that balances demands for model accuracy and performance, needed for state-of-the-art science, against accessibility, which is needed to accelerate development. This vision combines three cooperative elements: (i) a relatively simple finite volume algorithm (ii) optimized for high-resolution simulations on GPUs which is (iii) exposed behind an expressive, high-level user interface (using the Julia programming language in our case). We offer evidence for the vision's potential by illustrating the creative potential of our user interface, showcasing Oceananigans physics with example simulations that range from simple classroom problems to a realistic global ocean simulation spanning all scales of oceanic fluid motion, and describing advances in parameterization, numerical methods, and computational efficiency.
Submitted 19 February, 2025;
originally announced February 2025.
-
DiffExp: Efficient Exploration in Reward Fine-tuning for Text-to-Image Diffusion Models
Authors:
Daewon Chae,
June Suk Choi,
Jinkyu Kim,
Kimin Lee
Abstract:
Fine-tuning text-to-image diffusion models to maximize rewards has proven effective for enhancing model performance. However, reward fine-tuning methods often suffer from slow convergence due to online sample generation. Therefore, obtaining diverse samples with strong reward signals is crucial for improving sample efficiency and overall performance. In this work, we introduce DiffExp, a simple yet effective exploration strategy for reward fine-tuning of text-to-image models. Our approach employs two key strategies: (a) dynamically adjusting the scale of classifier-free guidance to enhance sample diversity, and (b) randomly weighting phrases of the text prompt to exploit high-quality reward signals. We demonstrate that these strategies significantly enhance exploration during online sample generation, improving the sample efficiency of recent reward fine-tuning methods, such as DDPO and AlignProp.
Submitted 19 February, 2025;
originally announced February 2025.
-
Learning-based Dynamic Robot-to-Human Handover
Authors:
Hyeonseong Kim,
Chanwoo Kim,
Matthew Pan,
Kyungjae Lee,
Sungjoon Choi
Abstract:
This paper presents a novel learning-based approach to dynamic robot-to-human handover, addressing the challenges of delivering objects to a moving receiver. We hypothesize that dynamic handover, where the robot adjusts to the receiver's movements, results in more efficient and comfortable interaction compared to static handover, where the receiver is assumed to be stationary. To validate this, we developed a nonparametric method for generating continuous handover motion, conditioned on the receiver's movements, and trained the model using a dataset of 1,000 human-to-human handover demonstrations. We integrated preference learning for improved handover effectiveness and applied impedance control to ensure user safety and adaptiveness. The approach was evaluated in both simulation and real-world settings, with user studies demonstrating that dynamic handover significantly reduces handover time and improves user comfort compared to static methods. Videos and demonstrations of our approach are available at https://zerotohero7886.github.io/dyn-r2h-handover .
Submitted 18 February, 2025;
originally announced February 2025.
-
Safe at the Margins: A General Approach to Safety Alignment in Low-Resource English Languages -- A Singlish Case Study
Authors:
Isaac Lim,
Shaun Khoo,
Roy Ka-Wei Lee,
Watson Chua,
Jia Yi Goh,
Jessica Foo
Abstract:
Ensuring the safety of Large Language Models (LLMs) in diverse linguistic settings remains challenging, particularly for low-resource languages. Existing safety alignment methods are English-centric, limiting their effectiveness. We systematically compare Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Kahneman-Tversky Optimization (KTO) for aligning SEA-Lion-v2.1-Instruct, a Llama 3-8B variant, to reduce toxicity in Singlish. Our results show that SFT+KTO achieves superior safety alignment with higher sample efficiency than DPO. Additionally, we introduce KTO-S, which enhances stability via improved KL divergence regularization. Our approach reduces Singlish toxicity by 99%, generalizes to TOXIGEN, and maintains strong performance on standard LLM benchmarks, providing a scalable framework for safer AI deployment in multilingual contexts.
Submitted 8 April, 2025; v1 submitted 17 February, 2025;
originally announced February 2025.
-
Inference for Log-Gaussian Cox Point Processes using Bayesian Deep Learning: Application to Human Oral Microbiome Image Data
Authors:
Shuwan Wang,
Christopher K. Wikle,
Athanasios C. Micheas,
Jessica L. Mark Welch,
Jacqueline R. Starr,
Kyu Ha Lee
Abstract:
It is common in nature to see aggregation of objects in space. Exploring the mechanism associated with the locations of such clustered observations can be essential to understanding the phenomenon, such as the source of spatial heterogeneity, or comparison to other event generating processes in the same domain. Log-Gaussian Cox processes (LGCPs) represent an important class of models for quantifying aggregation in a spatial point pattern. However, implementing likelihood-based Bayesian inference for such models presents many computational challenges, particularly in high dimensions. In this paper, we propose a novel likelihood-free inference approach for LGCPs using the recently developed BayesFlow approach, where invertible neural networks are employed to approximate the posterior distribution of the parameters of interest. BayesFlow is a neural simulation-based method based on "amortized" posterior estimation. That is, after an initial training procedure, fast feed-forward operations allow rapid posterior inference for any data within the same model family. Comprehensive numerical studies validate the reliability of the framework and show that BayesFlow achieves substantial computational gain in repeated application, especially for two-dimensional LGCPs. We demonstrate the utility and robustness of the method by applying it to two distinct oral microbial biofilm images.
Submitted 18 March, 2025; v1 submitted 17 February, 2025;
originally announced February 2025.
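The abstract above describes simulation-based inference for log-Gaussian Cox processes, which relies on being able to simulate point patterns from the model. The following is a minimal sketch of such a simulator on a discretized unit square, not the authors' code: the exponential covariance, the grid resolution, and all parameter names (`mean`, `variance`, `length_scale`) are assumptions chosen for illustration, and the per-cell Poisson counts are only a grid approximation of the continuous process.

```python
import numpy as np

def simulate_lgcp_grid(n_cells=20, mean=5.0, variance=0.5, length_scale=0.2, seed=0):
    """Simulate per-cell point counts of an LGCP on the unit square (grid approximation)."""
    rng = np.random.default_rng(seed)
    # Cell-centre coordinates of an n_cells x n_cells grid.
    centers = (np.arange(n_cells) + 0.5) / n_cells
    gx, gy = np.meshgrid(centers, centers, indexing="ij")
    pts = np.column_stack([gx.ravel(), gy.ravel()])
    # Exponential covariance between cell centres (an assumed kernel choice).
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    cov = variance * np.exp(-dist / length_scale)
    # Draw the Gaussian log-intensity field via a Cholesky factor (small jitter for stability).
    chol = np.linalg.cholesky(cov + 1e-8 * np.eye(len(pts)))
    log_intensity = mean + chol @ rng.standard_normal(len(pts))
    # Given the field, counts are independent Poisson with mean intensity * cell area.
    cell_area = 1.0 / n_cells**2
    counts = rng.poisson(np.exp(log_intensity) * cell_area)
    return counts.reshape(n_cells, n_cells), log_intensity.reshape(n_cells, n_cells)

counts, log_intensity = simulate_lgcp_grid()
print("total points:", counts.sum())
```

In an amortized setting, a simulator of this kind would be called many times with parameters drawn from a prior to produce training pairs for the neural posterior estimator.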
-
Roadmap to fault tolerant quantum computation using topological qubit arrays
Authors:
David Aasen,
Morteza Aghaee,
Zulfi Alam,
Mariusz Andrzejczuk,
Andrey Antipov,
Mikhail Astafev,
Lukas Avilovas,
Amin Barzegar,
Bela Bauer,
Jonathan Becker,
Juan M. Bello-Rivas,
Umesh Bhaskar,
Alex Bocharov,
Srini Boddapati,
David Bohn,
Jouri Bommer,
Parsa Bonderson,
Jan Borovsky,
Leo Bourdet,
Samuel Boutin,
Tom Brown,
Gary Campbell,
Lucas Casparis,
Srivatsa Chakravarthi,
Rui Chao
, et al. (157 additional authors not shown)
Abstract:
We describe a concrete device roadmap towards a fault-tolerant quantum computing architecture based on noise-resilient, topologically protected Majorana-based qubits. Our roadmap encompasses four generations of devices: a single-qubit device that enables a measurement-based qubit benchmarking protocol; a two-qubit device that uses measurement-based braiding to perform single-qubit Clifford operations; an eight-qubit device that can be used to show an improvement of a two-qubit operation when performed on logical qubits rather than directly on physical qubits; and a topological qubit array supporting lattice surgery demonstrations on two logical qubits. Devices that enable this path require a superconductor-semiconductor heterostructure that supports a topological phase, quantum dots and coupling between those quantum dots that can create the appropriate loops for interferometric measurements, and a microwave readout system that can perform fast, low-error single-shot measurements. We describe the key design components of these qubit devices, along with the associated protocols for demonstrations of single-qubit benchmarking, Clifford gate execution, quantum error detection, and quantum error correction, which differ greatly from those in more conventional qubits. Finally, we comment on implications and advantages of this architecture for utility-scale quantum computation.
Submitted 7 April, 2025; v1 submitted 17 February, 2025;
originally announced February 2025.
-
Interpretable Machine Learning for Kronecker Coefficients
Authors:
Giorgi Butbaia,
Kyu-Hwan Lee,
Fabian Ruehle
Abstract:
We analyze the saliency of neural networks and employ interpretable machine learning models to predict whether the Kronecker coefficients of the symmetric group are zero or not. Our models use triples of partitions as input features, as well as b-loadings derived from the principal component of an embedding that captures the differences between partitions. Across all approaches, we achieve an accuracy of approximately 83% and derive explicit formulas for a decision function in terms of b-loadings. Additionally, we develop transformer-based models for prediction, achieving the highest reported accuracy of over 99%.
Submitted 17 February, 2025;
originally announced February 2025.
-
Standalone FPGA-Based QAOA Emulator for Weighted-MaxCut on Embedded Devices
Authors:
Seonghyun Choi,
Kyeongwon Lee,
Jae-Jin Lee,
Woojoo Lee
Abstract:
Quantum computing (QC) emulation is crucial for advancing QC applications, especially given the scalability constraints of current devices. FPGA-based designs offer an efficient and scalable alternative to traditional large-scale platforms, but most are tightly integrated with high-performance systems, limiting their use in mobile and edge environments. This study introduces a compact, standalone FPGA-based QC emulator designed for embedded systems, leveraging the Quantum Approximate Optimization Algorithm (QAOA) to solve the Weighted-MaxCut problem. By restructuring QAOA operations for hardware compatibility, the proposed design reduces time complexity from O(N^2) to O(N), where N equals 2^n for n qubits. This reduction, coupled with a pipeline architecture, significantly minimizes resource consumption, enabling support for up to nine qubits on mid-tier FPGAs, roughly three times more than comparable designs. Additionally, the emulator achieved energy savings ranging from 1.53 times for two-qubit configurations to up to 852 times for nine-qubit configurations, compared to software-based QAOA on embedded processors. These results highlight the practical scalability and resource efficiency of the proposed design, providing a robust foundation for QC emulation in resource-constrained edge devices.
Submitted 27 March, 2025; v1 submitted 16 February, 2025;
originally announced February 2025.
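To make the complexity claim in the abstract above more concrete, here is a small software sketch of one QAOA layer for Weighted-MaxCut on a state vector of length N = 2^n. It is not the paper's FPGA design. It only illustrates one common observation: the cost operator is diagonal in the computational basis, so it can be applied as an elementwise phase in O(N) work instead of a dense N-by-N matrix product. The function names, the example graph, and the angles are all hypothetical, and the mixer shown here costs O(N n) per layer.

```python
import numpy as np

def cut_values(n, edges):
    """Weighted cut value of every n-bit assignment; edges are (i, j, weight) triples."""
    idx = np.arange(1 << n)
    cost = np.zeros(1 << n)
    for i, j, w in edges:
        # An edge contributes its weight when its endpoints fall on different sides.
        cost += w * (((idx >> i) & 1) ^ ((idx >> j) & 1))
    return cost

def apply_cost_layer(state, cost, gamma):
    """Diagonal cost unitary exp(-i*gamma*C): one phase multiplication per amplitude."""
    return state * np.exp(-1j * gamma * cost)

def apply_mixer_layer(state, n, beta):
    """Apply exp(-i*beta*X) = cos(beta) I - i sin(beta) X to every qubit in turn."""
    c, s = np.cos(beta), -1j * np.sin(beta)
    psi = state.reshape([2] * n)
    for q in range(n):
        psi = np.moveaxis(psi, q, 0)
        a, b = psi[0].copy(), psi[1].copy()
        psi = np.moveaxis(np.stack([c * a + s * b, s * a + c * b]), 0, q)
    return psi.reshape(-1)

# Hypothetical 3-node weighted triangle and arbitrary (untuned) angles.
n = 3
edges = [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 0.5)]
cost = cut_values(n, edges)
state = np.full(1 << n, 1 / np.sqrt(1 << n), dtype=complex)
state = apply_mixer_layer(apply_cost_layer(state, cost, gamma=0.4), n, beta=0.3)
expected_cut = float(np.real(np.vdot(state, cost * state)))
print("expected cut value:", expected_cut)
```

Precomputing `cost` once and reusing it across layers keeps each cost layer to a single pass over the amplitudes, which is the kind of restructuring that suits a pipelined hardware datapath.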
-
CacheFocus: Dynamic Cache Re-Positioning for Efficient Retrieval-Augmented Generation
Authors:
Kun-Hui Lee,
Eunhwan Park,
Donghoon Han,
Seung-Hoon Na
Abstract:
Large Language Models (LLMs) excel across a variety of language tasks yet are constrained by limited input lengths and high computational costs. Existing approaches, such as relative positional encodings (e.g., RoPE, ALiBi) and sliding window mechanisms, partially alleviate these issues but often require additional training or suffer from performance degradation with longer inputs. In this paper, we introduce CacheFocus, a method that enhances length normalization and reduces inference latency without any further training. Our approach leverages query-independent, offline caching to efficiently reuse a Context KV Cache Store. We address the problem of amplified abnormal token distributions by re-positioning cached keys and introducing Layer-Adaptive Cache Pruning to discard low-relevance caches during pre-filling. Additionally, our Adaptive Positional Allocation Strategy dynamically reassigns cache positions to maximize the use of the available positional encoding range. Experiments on the Natural Questions and TriviaQA datasets demonstrate that CacheFocus outperforms alternative methods even when inputs exceed the 4K limit of the LLaMA-2 model, emphasizing its practical effectiveness for long-context LLMs. Moreover, even with the large maximum input length of Qwen2, CacheFocus maintains consistent performance as the number of documents increases, effectively managing long-text generation without degradation.
Submitted 16 February, 2025;
originally announced February 2025.
-
Demystifying Hateful Content: Leveraging Large Multimodal Models for Hateful Meme Detection with Explainable Decisions
Authors:
Ming Shan Hee,
Roy Ka-Wei Lee
Abstract:
Hateful meme detection presents a significant challenge as a multimodal task due to the complexity of interpreting implicit hate messages and contextual cues within memes. Previous approaches have fine-tuned pre-trained vision-language models (PT-VLMs), leveraging the knowledge they gained during pre-training and their attention mechanisms to understand meme content. However, the reliance of these models on implicit knowledge and complex attention mechanisms renders their decisions difficult to explain, which is crucial for building trust in meme classification. In this paper, we introduce IntMeme, a novel framework that leverages Large Multimodal Models (LMMs) for hateful meme classification with explainable decisions. IntMeme addresses the dual challenges of improving both accuracy and explainability in meme moderation. The framework uses LMMs to generate human-like, interpretive analyses of memes, providing deeper insights into multimodal content and context. Additionally, it uses independent encoding modules for both memes and their interpretations, which are then combined to enhance classification performance. Our approach addresses the opacity and misclassification issues associated with PT-VLMs, optimizing the use of LMMs for hateful meme detection. We demonstrate the effectiveness of IntMeme through comprehensive experiments across three datasets, showcasing its superiority over state-of-the-art models.
Submitted 16 February, 2025;
originally announced February 2025.
-
Nonlocal Electrical Detection of Reciprocal Orbital Edelstein Effect
Authors:
Weiguang Gao,
Liyang Liao,
Hironari Isshiki,
Nico Budai,
Junyeon Kim,
Hyun-Woo Lee,
Kyung-Jin Lee,
Dongwook Go,
Yuriy Mokrousov,
Shinji Miwa,
Yoshichika Otani
Abstract:
Spin-Orbitronics leverages the spin and orbital degrees of freedom in solids for information processing. The orbital Edelstein effect and orbital Hall effect, where the charge current induces a nonequilibrium orbital angular momentum, offer a promising method to manipulate nanomagnets efficiently using light elements. Despite extensive research, understanding the Onsager reciprocity of orbital transport, fundamentally rooted in the second law of thermodynamics and time-reversal symmetry, remains elusive. In this study, we experimentally demonstrate the Onsager reciprocity of orbital transport in an orbital Edelstein system by utilizing nonlocal measurements. This method enables the precise identification of the chemical potential generated by orbital accumulation, avoiding the limitations associated with local measurements. Remarkably, we observe that the direct and inverse orbital-charge conversion processes produce identical electric voltages, confirming Onsager reciprocity in orbital transport. Additionally, we find that the orbital decay length, approximately 100 nm at room temperature, is independent of Cu thickness and decreases with lowering temperature, revealing a distinct contrast to spin transport behavior. Our findings provide valuable insights into both the reciprocity of the charge-orbital interconversion and the nonlocal correlation of orbital degree of freedom, laying the ground for orbitronics devices with long-range interconnections.
Submitted 16 February, 2025;
originally announced February 2025.
-
Learning to Explain Air Traffic Situation
Authors:
Hong-ah Chai,
Seokbin Yoon,
Keumjin Lee
Abstract:
Understanding how air traffic controllers construct a mental 'picture' of complex air traffic situations is crucial but remains a challenge due to the inherently intricate, high-dimensional interactions between aircraft, pilots, and controllers. Previous work on modeling the strategies of air traffic controllers and their mental image of traffic situations often centers on specific air traffic control tasks or pairwise interactions between aircraft, neglecting to capture the comprehensive dynamics of an air traffic situation. To address this issue, we propose a machine learning-based framework for explaining air traffic situations. Specifically, we employ a Transformer-based multi-agent trajectory model that encapsulates both the spatio-temporal movement of aircraft and social interaction between them. By deriving attention scores from the model, we can quantify the influence of individual aircraft on overall traffic dynamics. This provides explainable insights into how air traffic controllers perceive and understand the traffic situation. Trained on real-world air traffic surveillance data collected from the terminal airspace around Incheon International Airport in South Korea, our framework effectively explicates air traffic situations. This could potentially support and enhance the decision-making and situational awareness of air traffic controllers.
Submitted 27 May, 2025; v1 submitted 15 February, 2025;
originally announced February 2025.
-
LLM-Lasso: A Robust Framework for Domain-Informed Feature Selection and Regularization
Authors:
Erica Zhang,
Ryunosuke Goto,
Naomi Sagan,
Jurik Mutter,
Nick Phillips,
Ash Alizadeh,
Kangwook Lee,
Jose Blanchet,
Mert Pilanci,
Robert Tibshirani
Abstract:
We introduce LLM-Lasso, a novel framework that leverages large language models (LLMs) to guide feature selection in Lasso $\ell_1$ regression. Unlike traditional methods that rely solely on numerical data, LLM-Lasso incorporates domain-specific knowledge extracted from natural language, enhanced through a retrieval-augmented generation (RAG) pipeline, to seamlessly integrate data-driven modeling with contextual insights. Specifically, the LLM generates penalty factors for each feature, which are converted into weights for the Lasso penalty using a simple, tunable model. Features identified as more relevant by the LLM receive lower penalties, increasing their likelihood of being retained in the final model, while less relevant features are assigned higher penalties, reducing their influence. Importantly, LLM-Lasso has an internal validation step that determines how much to trust the contextual knowledge in our prediction pipeline. Hence it addresses key challenges in robustness, making it suitable for mitigating potential inaccuracies or hallucinations from the LLM. In various biomedical case studies, LLM-Lasso outperforms standard Lasso and existing feature selection baselines, all while ensuring the LLM operates without prior access to the datasets. To our knowledge, this is the first approach to effectively integrate conventional feature selection techniques directly with LLM-based domain-specific reasoning.
Submitted 20 February, 2025; v1 submitted 14 February, 2025;
originally announced February 2025.
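The abstract above describes per-feature penalty factors that become weights in the Lasso penalty, so that features judged relevant are penalized less. A minimal sketch of a weighted Lasso solved by coordinate descent is given below to show that mechanism. It is not the authors' implementation: the objective scaling, the hand-picked `penalty_weights` (standing in for LLM-derived factors), and the synthetic data are all assumptions for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z toward zero by t (the proximal operator of the L1 penalty)."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def weighted_lasso(X, y, weights, lam, n_iter=200):
    """Minimize (1/(2n))*||y - X b||^2 + lam * sum_j weights[j] * |b_j| by coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    residual = y.astype(float).copy()  # equals y - X @ beta while beta is zero
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the residual that excludes feature j's own fit.
            rho = X[:, j] @ residual / n + col_sq[j] * beta[j]
            new_beta = soft_threshold(rho, lam * weights[j]) / col_sq[j]
            residual += X[:, j] * (beta[j] - new_beta)
            beta[j] = new_beta
    return beta

# Synthetic data: only the first two features carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 2.0 * X[:, 0] + 0.3 * X[:, 1] + 0.1 * rng.normal(size=200)

# Hypothetical penalty factors: low for features deemed relevant, high for the rest.
penalty_weights = np.array([0.1, 0.1, 5.0, 5.0])
beta = weighted_lasso(X, y, penalty_weights, lam=0.1)
print(beta)
```

With these weights the weak second feature survives shrinkage while the two heavily penalized noise features are set exactly to zero. How much to trust such weights is what the abstract's internal validation step is meant to decide.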
-
A Bayesian Multivariate Spatial Point Pattern Model: Application to Oral Microbiome FISH Image Data
Authors:
Kyu Ha Lee,
Brent A. Coull,
Suman Majumder,
Patrick J. La Riviere,
Jessica L. Mark Welch,
Jacqueline R. Starr
Abstract:
Advances in cellular imaging technologies, especially those based on fluorescence in situ hybridization (FISH), now allow detailed visualization of the spatial organization of human or bacterial cells. Quantifying this spatial organization is crucial for understanding the function of multicellular tissues or biofilms, with implications for human health and disease. To address the need for better methods to achieve such quantification, we propose a flexible multivariate point process model that characterizes and estimates complex spatial interactions among multiple cell types. The proposed Bayesian framework is appealing due to its unified estimation process and the ability to directly quantify uncertainty in key estimates of interest, such as those of inter-type correlation and the proportion of variance due to inter-type relationships. To ensure stable and interpretable estimation, we consider shrinkage priors for coefficients associated with latent processes. Model selection and comparison are conducted by using a deviance information criterion designed for models with latent variables, effectively balancing the risk of overfitting with that of oversimplifying key quantities. Furthermore, we develop a hierarchical modeling approach to integrate multiple image-specific estimates from a given subject, allowing inference at both the global and subject-specific levels. We apply the proposed method to microbial biofilm image data from the human tongue dorsum and find that specific taxon pairs, such as Streptococcus mitis-Streptococcus salivarius and Streptococcus mitis-Veillonella, exhibit strong positive spatial correlations, while others, such as Actinomyces-Rothia, show slight negative correlations. For most of the taxa, a substantial portion of spatial variance can be attributed to inter-taxon relationships.
Submitted 14 February, 2025;
originally announced February 2025.
-
Machine learning the vanishing order of rational L-functions
Authors:
Joanna Bieri,
Giorgi Butbaia,
Edgar Costa,
Alyson Deines,
Kyu-Hwan Lee,
David Lowry-Duda,
Thomas Oliver,
Yidi Qi,
Tamara Veenstra
Abstract:
In this paper, we study the vanishing order of rational $L$-functions from a data scientific perspective. Each $L$-function is represented in our data by finitely many Dirichlet coefficients, the normalisation of which depends on the context. We observe murmuration-like patterns in averages across our dataset, find that PCA clusters rational $L$-functions by their vanishing order, and record that LDA and neural networks may accurately predict this quantity.
Submitted 14 February, 2025;
originally announced February 2025.
-
Learning Euler Factors of Elliptic Curves
Authors:
Angelica Babei,
François Charton,
Edgar Costa,
Xiaoyu Huang,
Kyu-Hwan Lee,
David Lowry-Duda,
Ashvni Narayanan,
Alexey Pozdnyakov
Abstract:
We apply transformer models and feedforward neural networks to predict Frobenius traces $a_p$ from elliptic curves given other traces $a_q$. We train further models to predict $a_p \bmod 2$ from $a_q \bmod 2$, and cross-analysis such as $a_p \bmod 2$ from $a_q$. Our experiments reveal that these models achieve high accuracy, even in the absence of explicit number-theoretic tools like functional equations of $L$-functions. We also present partial interpretability findings.
Submitted 14 February, 2025;
originally announced February 2025.
-
Absolute frequency measurement of a Lu$^+$ $(^{3}\rm D_1)$ optical frequency standard via link to international atomic time
Authors:
Zhao Zhang,
Qi Zhao,
Qin Qichen,
N. Jayjong,
M. D. K. Lee,
K. J. Arnold,
M. D. Barrett
Abstract:
We report on an absolute frequency measurement of the ${\rm Lu}^{+}\,(^{3}\rm D_1)$ standard frequency which is defined as the hyperfine-average of $^{1}\rm S_0$ to $^{3}\rm D_1$ optical clock transitions in $^{176}{\rm Lu}^{+}$. The measurement result of $353\,638\,794\,073\,800.35(33)$Hz with a fractional uncertainty of $9.2 \times 10^{-16}$ was obtained by operating a single-ion $^{176}{\rm Lu}^{+}$ frequency standard intermittently over 3 months with a total uptime of 162 hours. Traceability to the International System of Units (SI) is realized by remote link to International Atomic Time. This is the first reported absolute frequency value for a ${\rm Lu}^{+}\,(^{3}\rm D_1)$ optical frequency standard.
Submitted 27 May, 2025; v1 submitted 14 February, 2025;
originally announced February 2025.
-
Chinese Pulsar Timing Array upper limits on microhertz gravitational waves from supermassive black-hole binaries using PSR J1713+0747 FAST data
Authors:
R. Nicolas Caballero,
Heng Xu,
Kejia Lee,
Siyuan Chen,
Yanjun Guo,
Jinchen Jiang,
Bojun Wan,
Jiangwei Xu,
Zihan Xue
Abstract:
We derive the gravitational-wave (GW) strain upper limits from resolvable supermassive black-hole binaries using the data from the Five-hundred-meter Aperture Spherical radio Telescope (FAST), in the context of the Chinese Pulsar Timing Array project. We focus on circular orbits in the $μ$Hz GW frequency band between $10^{-7}$ and $3\times10^{-6}$ Hz. This frequency band is higher than the traditional pulsar timing array band and is less explored. We used the data of the millisecond pulsar PSR J1713+0747 observed between August 2019 and April 2021. A dense observation campaign was carried out in September 2020 to allow for the $μ$Hz band coverage. Our sky-average continuous source upper limit at the 95% confidence level at 1 $μ$Hz is 1.26$\times10^{-12}$, while the same limit in the direction of the pulsar is 4.77$\times10^{-13}$.
Submitted 13 February, 2025;
originally announced February 2025.