-
RePO: Replay-Enhanced Policy Optimization
Authors:
Siheng Li,
Zhanhui Zhou,
Wai Lam,
Chao Yang,
Chaochao Lu
Abstract:
Reinforcement learning (RL) is vital for optimizing large language models (LLMs). Recent Group Relative Policy Optimization (GRPO) estimates advantages using multiple on-policy outputs per prompt, leading to high computational costs and low data efficiency. To address this, we introduce Replay-Enhanced Policy Optimization (RePO), which leverages diverse replay strategies to retrieve off-policy samples from a replay buffer, allowing policy optimization based on a broader and more diverse set of samples for each prompt. Experiments on five LLMs across seven mathematical reasoning benchmarks demonstrate that RePO achieves absolute average performance gains of $18.4$ and $4.1$ points for Qwen2.5-Math-1.5B and Qwen3-1.7B, respectively, compared to GRPO. Further analysis indicates that RePO increases computational cost by $15\%$ while raising the number of effective optimization steps by $48\%$ for Qwen3-1.7B, with both on-policy and off-policy sample numbers set to $8$. The repository can be accessed at https://github.com/SihengLi99/RePO.
Submitted 10 June, 2025;
originally announced June 2025.
-
Foundation Models in Medical Imaging -- A Review and Outlook
Authors:
Vivien van Veldhuizen,
Vanessa Botha,
Chunyao Lu,
Melis Erdal Cesur,
Kevin Groot Lipman,
Edwin D. de Jong,
Hugo Horlings,
Clárisa Sanchez,
Cees Snoek,
Ritse Mann,
Eric Marcus,
Jonas Teuwen
Abstract:
Foundation models (FMs) are changing the way medical images are analyzed by learning from large collections of unlabeled data. Instead of relying on manually annotated examples, FMs are pre-trained to learn general-purpose visual features that can later be adapted to specific clinical tasks with little additional supervision. In this review, we examine how FMs are being developed and applied in pathology, radiology, and ophthalmology, drawing on evidence from over 150 studies. We explain the core components of FM pipelines, including model architectures, self-supervised learning methods, and strategies for downstream adaptation. We also review how FMs are being used in each imaging domain and compare design choices across applications. Finally, we discuss key challenges and open questions to guide future research.
Submitted 10 June, 2025;
originally announced June 2025.
-
SafeCoT: Improving VLM Safety with Minimal Reasoning
Authors:
Jiachen Ma,
Zhanhui Zhou,
Chao Yang,
Chaochao Lu
Abstract:
Ensuring safe and appropriate responses from vision-language models (VLMs) remains a critical challenge, particularly in high-risk or ambiguous scenarios. We introduce SafeCoT, a lightweight, interpretable framework that leverages rule-based chain-of-thought (CoT) supervision to improve refusal behavior in VLMs. Unlike prior methods that rely on large-scale safety annotations or complex modeling, SafeCoT uses minimal supervision to help models reason about safety risks and make context-aware refusals. Experiments across multiple benchmarks show that SafeCoT significantly reduces overrefusal and enhances generalization, even with limited training data. Our approach offers a scalable solution for aligning VLMs with safety-critical objectives.
Submitted 11 June, 2025; v1 submitted 9 June, 2025;
originally announced June 2025.
-
Generalizable Articulated Object Reconstruction from Casually Captured RGBD Videos
Authors:
Weikun Peng,
Jun Lv,
Cewu Lu,
Manolis Savva
Abstract:
Articulated objects are prevalent in daily life. Understanding their kinematic structure and reconstructing them have numerous applications in embodied AI and robotics. However, current methods require carefully captured data for training or inference, preventing practical, scalable, and generalizable reconstruction of articulated objects. We focus on reconstruction of an articulated object from a casually captured RGBD video shot with a hand-held camera. A casually captured video of an interaction with an articulated object is easy to acquire at scale using smartphones. However, this setting is quite challenging, as the object and camera move simultaneously and there are significant occlusions as the person interacts with the object. To tackle these challenges, we introduce a coarse-to-fine framework that infers joint parameters and segments movable parts of the object from a dynamic RGBD video. To evaluate our method under this new setting, we build a 20$\times$ larger synthetic dataset of 784 videos containing 284 objects across 11 categories. We compare our approach with existing methods that also take video as input. Experiments show that our method can reconstruct synthetic and real articulated objects across different categories from dynamic RGBD videos, outperforming existing methods significantly.
Submitted 9 June, 2025;
originally announced June 2025.
-
Synthesis by Design: Controlled Data Generation via Structural Guidance
Authors:
Lei Xu,
Sirui Chen,
Yuxuan Huang,
Chaochao Lu
Abstract:
Mathematical reasoning remains challenging for LLMs due to complex logic and the need for precise computation. Existing methods enhance LLM reasoning by synthesizing datasets through problem rephrasing, but face issues with generation quality and problem complexity. To address this, we propose to extract structural information with generated problem-solving code from mathematical reasoning and guide data generation with structured solutions. Applied to MATH and GSM8K, our approach produces 39K problems with labeled intermediate steps and a 6.1K-problem benchmark of higher difficulty. Results on our benchmark show that model performance declines as reasoning length increases. Additionally, we conducted fine-tuning experiments using the proposed training data on a range of LLMs, and the results validate the effectiveness of our dataset. We hope the proposed method and dataset will contribute to future research in enhancing LLM reasoning capabilities. Our code and data are available at https://github.com/OpenCausaLab/StructuralGeneration.
Submitted 10 June, 2025; v1 submitted 9 June, 2025;
originally announced June 2025.
-
Fast ECoT: Efficient Embodied Chain-of-Thought via Thoughts Reuse
Authors:
Zhekai Duan,
Yuan Zhang,
Shikai Geng,
Gaowen Liu,
Joschka Boedecker,
Chris Xiaoxuan Lu
Abstract:
Embodied Chain-of-Thought (ECoT) reasoning enhances vision-language-action (VLA) models by improving performance and interpretability through intermediate reasoning steps. However, its sequential autoregressive token generation introduces significant inference latency, limiting real-time deployment. We propose Fast ECoT, an inference-time acceleration method that exploits the structured and repetitive nature of ECoT to (1) cache and reuse high-level reasoning across timesteps and (2) parallelise the generation of modular reasoning steps. Additionally, we introduce an asynchronous scheduler that decouples reasoning from action decoding, further boosting responsiveness. Fast ECoT requires no model changes or additional training and integrates easily into existing VLA pipelines. Experiments in both simulation (LIBERO) and real-world robot tasks show up to a 7.5% reduction in latency with comparable or improved task success rate and reasoning faithfulness, bringing ECoT policies closer to practical real-time deployment.
Submitted 9 June, 2025;
originally announced June 2025.
-
Mitigating Object Hallucination via Robust Local Perception Search
Authors:
Zixian Gao,
Chao Yang,
Zhanhui Zhou,
Xing Xu,
Chaochao Lu
Abstract:
Recent advancements in Multimodal Large Language Models (MLLMs) have enabled them to effectively integrate vision and language, addressing a variety of downstream tasks. However, despite their significant success, these models still exhibit hallucination phenomena, where the outputs appear plausible but do not align with the content of the images. To mitigate this issue, we introduce Local Perception Search (LPS), a decoding method during inference that is both simple and training-free, yet effectively suppresses hallucinations. This method leverages local visual prior information as a value function to correct the decoding process. Additionally, we observe that the impact of the local visual prior on model performance is more pronounced in scenarios with high levels of image noise. Notably, LPS is a plug-and-play approach that is compatible with various models. Extensive experiments on widely used hallucination benchmarks and noisy data demonstrate that LPS significantly reduces the incidence of hallucinations compared to the baseline, showing exceptional performance, particularly in noisy settings.
Submitted 7 June, 2025;
originally announced June 2025.
-
Causal Policy Learning in Reinforcement Learning: Backdoor-Adjusted Soft Actor-Critic
Authors:
Thanh Vinh Vo,
Young Lee,
Haozhe Ma,
Chien Lu,
Tze-Yun Leong
Abstract:
Hidden confounders that influence both states and actions can bias policy learning in reinforcement learning (RL), leading to suboptimal or non-generalizable behavior. Most RL algorithms ignore this issue, learning policies from observational trajectories based solely on statistical associations rather than causal effects. We propose DoSAC (Do-Calculus Soft Actor-Critic with Backdoor Adjustment), a principled extension of the SAC algorithm that corrects for hidden confounding via causal intervention estimation. DoSAC estimates the interventional policy $π(a | \mathrm{do}(s))$ using the backdoor criterion, without requiring access to true confounders or causal labels. To achieve this, we introduce a learnable Backdoor Reconstructor that infers pseudo-past variables (previous state and action) from the current state to enable backdoor adjustment from observational data. This module is integrated into a soft actor-critic framework to compute both the interventional policy and its entropy. Empirical results on continuous control benchmarks show that DoSAC outperforms baselines under confounded settings, with improved robustness, generalization, and policy reliability.
Submitted 5 June, 2025;
originally announced June 2025.
-
VLMs Can Aggregate Scattered Training Patches
Authors:
Zhanhui Zhou,
Lingjie Chen,
Chao Yang,
Chaochao Lu
Abstract:
One way to mitigate risks in vision-language models (VLMs) is to remove dangerous samples in their training data. However, such data moderation can be easily bypassed when harmful images are split into small, benign-looking patches, scattered across many training samples. VLMs may then learn to piece these fragments together during training and generate harmful responses at inference, either from full images or text references. For instance, if trained on image patches from a bloody scene paired with the description "safe," VLMs may later describe the full image, or a text reference to the scene, as "safe." We define the core ability of VLMs enabling this attack as $\textit{visual stitching}$ -- the ability to integrate visual information spread across multiple training samples that share the same textual descriptions. In our work, we first demonstrate visual stitching abilities in common open-source VLMs on three datasets where each image is labeled with a unique synthetic ID: we split each $(\texttt{image}, \texttt{ID})$ pair into $\{(\texttt{patch}, \texttt{ID})\}$ pairs at different granularities for finetuning, and we find that tuned models can verbalize the correct IDs from full images or text references. Building on this, we simulate the adversarial data poisoning scenario mentioned above by using patches from dangerous images and replacing IDs with text descriptions like "safe" or "unsafe", demonstrating how harmful content can evade moderation in patches and later be reconstructed through visual stitching, posing serious VLM safety risks. Code is available at https://github.com/ZHZisZZ/visual-stitching.
Submitted 4 June, 2025;
originally announced June 2025.
-
Critique-GRPO: Advancing LLM Reasoning with Natural Language and Numerical Feedback
Authors:
Xiaoying Zhang,
Hao Sun,
Yipeng Zhang,
Kaituo Feng,
Chaochao Lu,
Chao Yang,
Helen Meng
Abstract:
Recent advances in reinforcement learning (RL) with numerical feedback, such as scalar rewards, have significantly enhanced the complex reasoning capabilities of large language models (LLMs). Despite this success, we identify three key challenges encountered by RL with solely numerical feedback: performance plateaus, limited effectiveness of self-reflection, and persistent failures. We then demonstrate that RL-finetuned models, even after exhibiting performance plateaus, can generate correct refinements on persistently failed problems by leveraging natural language feedback in the form of critiques. Building on this insight, we propose Critique-GRPO, an online RL framework that integrates both natural language and numerical feedback for effective policy optimization. Critique-GRPO enables LLMs to learn from initial responses and critique-guided refinements simultaneously while maintaining exploration. Extensive experiments using Qwen2.5-7B-Base and Qwen3-8B-Base show that Critique-GRPO consistently outperforms supervised learning-based and RL-based fine-tuning approaches across eight challenging mathematical, STEM, and general reasoning tasks, improving average pass@1 scores by approximately 4.5% and 5%, respectively. Notably, Critique-GRPO surpasses a strong baseline that incorporates expert demonstrations within online RL. Further analysis reveals two critical insights about policy exploration: (1) higher entropy does not always guarantee efficient learning from exploration, and (2) longer responses do not necessarily lead to more effective exploration.
Submitted 4 June, 2025; v1 submitted 3 June, 2025;
originally announced June 2025.
-
Tru-POMDP: Task Planning Under Uncertainty via Tree of Hypotheses and Open-Ended POMDPs
Authors:
Wenjing Tang,
Xinyu He,
Yongxi Huang,
Yunxiao Xiao,
Cewu Lu,
Panpan Cai
Abstract:
Task planning under uncertainty is essential for home-service robots operating in the real world. Tasks involve ambiguous human instructions, hidden or unknown object locations, and open-vocabulary object types, leading to significant open-ended uncertainty and a boundlessly large planning space. To address these challenges, we propose Tru-POMDP, a planner that combines structured belief generation using Large Language Models (LLMs) with principled POMDP planning. Tru-POMDP introduces a hierarchical Tree of Hypotheses (TOH), which systematically queries an LLM to construct high-quality particle beliefs over possible world states and human goals. We further formulate an open-ended POMDP model that enables rigorous Bayesian belief tracking and efficient belief-space planning over these LLM-generated hypotheses. Experiments on complex object rearrangement tasks across diverse kitchen environments show that Tru-POMDP significantly outperforms state-of-the-art LLM-based and LLM-tree-search hybrid planners, achieving higher success rates with significantly better plans, stronger robustness to ambiguity and occlusion, and greater planning efficiency.
Submitted 3 June, 2025;
originally announced June 2025.
-
IP-Dialog: Evaluating Implicit Personalization in Dialogue Systems with Synthetic Data
Authors:
Bo Peng,
Zhiheng Wang,
Heyang Gong,
Chaochao Lu
Abstract:
In modern dialogue systems, the ability to implicitly infer user backgrounds from conversations and leverage this information for personalized assistance is crucial. However, the scarcity of high-quality data remains a fundamental challenge to evaluating and improving this capability. Traditional dataset construction methods are labor-intensive, resource-demanding, and raise privacy concerns. To address these issues, we propose a novel approach for automatic synthetic data generation and introduce the Implicit Personalized Dialogue (IP-Dialog) benchmark along with a training dataset, covering 10 tasks and 12 user attribute types. Additionally, we develop a systematic evaluation framework with four metrics to assess both attribute awareness and reasoning capabilities. We further propose five causal graphs to elucidate models' reasoning pathways during implicit personalization. Extensive experiments yield insightful observations and prove the reliability of our dataset.
Submitted 3 June, 2025;
originally announced June 2025.
-
StochasTok: Improving Fine-Grained Subword Understanding in LLMs
Authors:
Anya Sims,
Thom Foster,
Klara Kaleb,
Tuan-Duy H. Nguyen,
Joseph Lee,
Jakob N. Foerster,
Yee Whye Teh,
Cong Lu
Abstract:
Subword-level understanding is integral to numerous tasks, including understanding multi-digit numbers, spelling mistakes, abbreviations, rhyming, and wordplay. Despite this, current large language models (LLMs) still often struggle with seemingly simple subword-level tasks like "How many 'r's in 'strawberry'?". A key factor behind these failures is tokenization, which obscures the fine-grained structure of words. Current alternatives, such as character-level and dropout tokenization methods, significantly increase computational costs and provide inconsistent improvements. In this paper we revisit tokenization and introduce StochasTok, a simple, efficient stochastic tokenization scheme that randomly splits tokens during training, allowing LLMs to 'see' their internal structure. Our experiments show that pretraining with StochasTok substantially improves LLMs' downstream performance across multiple subword-level language games, including character counting, substring identification, and math tasks. Furthermore, StochasTok's simplicity allows seamless integration at any stage of the training pipeline, and we demonstrate that post-training with StochasTok can instill improved subword understanding into existing pretrained models, thus avoiding costly pretraining from scratch. These dramatic improvements achieved with a minimal change suggest StochasTok holds exciting potential when applied to larger, more capable models. Code open-sourced at: https://github.com/anyasims/stochastok.
Submitted 10 June, 2025; v1 submitted 2 June, 2025;
originally announced June 2025.
-
EvolveNav: Self-Improving Embodied Reasoning for LLM-Based Vision-Language Navigation
Authors:
Bingqian Lin,
Yunshuang Nie,
Khun Loun Zai,
Ziming Wei,
Mingfei Han,
Rongtao Xu,
Minzhe Niu,
Jianhua Han,
Liang Lin,
Cewu Lu,
Xiaodan Liang
Abstract:
Building Vision-Language Navigation (VLN) agents that can navigate following natural language instructions is a long-standing goal in human-robot interaction applications. Recent studies have revealed the potential of training open-source Large Language Models (LLMs) to unleash LLMs' reasoning ability for improving navigation, and simultaneously mitigate the domain gap between LLMs' training corpus and the VLN task. However, these approaches primarily adopt direct input-output mapping paradigms, making the mapping difficult to learn and the navigational decisions hard to explain. Chain-of-Thought (CoT) training is a promising way to improve both navigational decision accuracy and interpretability, but the complexity of the navigation task makes perfect CoT labels unavailable, and pure CoT supervised fine-tuning may lead to overfitting. In this paper, we propose a novel sElf-improving embodied reasoning framework for boosting LLM-based vision-language Navigation, dubbed EvolveNav. Our EvolveNav consists of two stages: (1) Formalized CoT Supervised Fine-Tuning, where we train the model with formalized CoT labels to both activate the model's navigational reasoning capabilities and increase the reasoning speed; (2) Self-Reflective Post-Training, where the model is iteratively trained with its own reasoning outputs as self-enriched CoT labels to enhance the supervision diversity. A self-reflective auxiliary task is also introduced to encourage learning correct reasoning patterns by contrasting with wrong ones. Experimental results on the popular VLN benchmarks demonstrate the superiority of EvolveNav over previous LLM-based VLN approaches. Code is available at https://github.com/expectorlin/EvolveNav.
Submitted 2 June, 2025;
originally announced June 2025.
-
Observation of universal topological magnetoelectric switching in multiferroic GdMn2O5
Authors:
Haowen Wang,
Fan Wang,
Ming Yang,
Yuting Chang,
Mengyi Shi,
Liang Li,
Jun-Ming Liu,
Junfeng Wang,
Shuai Dong,
Chengliang Lu
Abstract:
Topological magnetoelectricity has recently emerged as a topic that opens a unique route to precise control of magnetoelectric functionality. Here we report the synchronous magnetic-electric-cycle operation of topological magnetoelectric switching in GdMn2O5. Compared with pure magnetic-cycle operation, this topological winding can be accessed in a much broader parameter space, i.e., the orientation of the magnetic field is not limited to the magic angle and the effect can persist up to the Curie temperature. The fine tuning of the free energy landscape is responsible for this topological behavior.
Submitted 1 June, 2025;
originally announced June 2025.
-
HouseTS: A Large-Scale, Multimodal Spatiotemporal U.S. Housing Dataset
Authors:
Shengkun Wang,
Yanshen Sun,
Fanglan Chen,
Linhan Wang,
Naren Ramakrishnan,
Chang-Tien Lu,
Yinlin Chen
Abstract:
Accurate house-price forecasting is essential for investors, planners, and researchers. However, reproducible benchmarks with sufficient spatiotemporal depth and contextual richness for long-horizon prediction remain scarce. To address this, we introduce HouseTS, a large-scale, multimodal dataset covering monthly house prices from March 2012 to December 2023 across 6,000 ZIP codes in 30 major U.S. metropolitan areas. The dataset includes over 890K records, enriched with points of interest (POI), socioeconomic indicators, and detailed real estate metrics. To establish standardized performance baselines, we evaluate 14 models, spanning classical statistical approaches, deep neural networks (DNNs), and pretrained time-series foundation models. We further demonstrate the value of HouseTS in a multimodal case study, where a vision-language model extracts structured textual descriptions of geographic change from time-stamped satellite imagery. This enables interpretable, grounded insights into urban evolution. HouseTS is hosted on Kaggle, while all preprocessing pipelines, benchmark code, and documentation are openly maintained on GitHub to ensure full reproducibility and easy adoption.
Submitted 31 May, 2025;
originally announced June 2025.
-
Millimeter-wave observations of Euclid Deep Field South using the South Pole Telescope: A data release of temperature maps and catalogs
Authors:
M. Archipley,
A. Hryciuk,
L. E. Bleem,
K. Kornoelje,
M. Klein,
A. J. Anderson,
B. Ansarinejad,
M. Aravena,
L. Balkenhol,
P. S. Barry,
K. Benabed,
A. N. Bender,
B. A. Benson,
F. Bianchini,
S. Bocquet,
F. R. Bouchet,
E. Camphuis,
M. G. Campitiello,
J. E. Carlstrom,
J. Cathey,
C. L. Chang,
S. C. Chapman,
P. Chaubal,
P. M. Chichura,
A. Chokshi
, et al. (86 additional authors not shown)
Abstract:
Context. The South Pole Telescope third-generation camera (SPT-3G) has observed over 10,000 square degrees of sky at 95, 150, and 220 GHz (3.3, 2.0, 1.4 mm, respectively) overlapping the ongoing 14,000 square-degree Euclid Wide Survey. The Euclid collaboration recently released Euclid Deep Field observations in the first quick data release (Q1). Aims. With the goal of releasing complementary millimeter-wave data and encouraging legacy science, we performed dedicated observations of a 57-square-degree field overlapping the Euclid Deep Field South (EDF-S). Methods. The observing time totaled 20 days and we reached noise depths of 4.3, 3.8, and 13.2 $μ$K-arcmin at 95, 150, and 220 GHz, respectively. Results. In this work we present the temperature maps and two catalogs constructed from these data. The emissive source catalog contains 601 objects (334 inside EDF-S) with 54% synchrotron-dominated sources and 46% thermal dust emission-dominated sources. The 5$σ$ detection thresholds are 1.7, 2.0, and 6.5 mJy in the three bands. The cluster catalog contains 217 cluster candidates (121 inside EDF-S) with median mass $M_{500c}=2.12 \times 10^{14} M_{\odot}/h_{70}$ and median redshift $z$ = 0.70, corresponding to an order-of-magnitude improvement in cluster density over previous tSZ-selected catalogs in this region (3.81 clusters per square degree). Conclusions. The overlap between SPT and Euclid data will enable a range of multiwavelength studies of the aforementioned source populations. This work serves as the first step towards joint projects between SPT and Euclid and provides a rich dataset containing information on galaxies, clusters, and their environments.
Submitted 30 May, 2025;
originally announced June 2025.
-
Cryogenic scanning photocurrent spectroscopy for materials responses to structured optical fields
Authors:
Duxing Hao,
Chun-I Lu,
Ziqi Sun,
Yu-Chen Chang,
Wen-Hao Chang,
Ye-Ru Chen,
Akiyoshi Park,
Beining Rao,
Siyuan Qiu,
Yann-Wen Lan,
Ting-Hua Lu,
Nai-Chang Yeh
Abstract:
Circular dichroism spectroscopy is known to provide important insights into the interplay of different degrees of freedom in quantum materials, and yet spectroscopic study of the optoelectronic responses of quantum materials to structured optical fields, such as light with finite spin and orbital angular momentum, has not yet been widely explored, particularly at cryogenic temperature. Here we demonstrate the design and application of a novel instrument that integrates scanning spectroscopic photocurrent measurements with structured light of controlled spin and orbital angular momentum. For structured photons with wavelengths between 500 nm and 700 nm, this instrument can perform spatially resolved photocurrent measurements of two-dimensional materials or thin crystals under magnetic fields up to $\pm$ 14 Tesla, at temperatures from 300 K down to 3 K, with either spin angular momentum $\pm \hbar$ or orbital angular momentum $\pm \ell \hbar$ (where $\ell$=1,2,3... is the topological charge), and over a (35 $\times$ 25) $μm^2$ area with ~ 1 $μm$ spatial resolution. These capabilities of the instrument are exemplified by magneto-photocurrent spectroscopic measurements of monolayer 2H-$MoS_2$ field-effect transistors, which not only reveal the excitonic spectra but also demonstrate monotonically increasing photocurrents with increasing |$\ell $| as well as excitonic Zeeman splitting and an enhanced Landé g-factor due to the enhanced formation of intervalley dark excitons under magnetic field. These studies thus demonstrate the versatility of the scanning photocurrent spectrometry for investigating excitonic physics, optical selection rules, and optoelectronic responses of novel quantum materials and engineered quantum devices to structured light.
Submitted 30 May, 2025;
originally announced May 2025.
-
New Physics Search at the CEPC: a General Perspective
Authors:
Stefan Antusch,
Peter Athron,
Daniele Barducci,
Long Chen,
Mingshui Chen,
Xiang Chen,
Huajie Cheng,
Kingman Cheung,
Joao Guimaraes da Costa,
Arindam Das,
Frank F. Deppisch,
P. S. Bhupal Dev,
Xiaokang Du,
Yong Du,
Yaquan Fang,
Andrew Fowlie,
Yu Gao,
Bruce Mellado Garcia,
Shao-Feng Ge,
Jiayin Gu,
Yu-Chen Guo,
Jan Hajer,
Chengcheng Han,
Tao Han,
Sven Heinemeyer, et al. (68 additional authors not shown)
Abstract:
The Circular Electron-Positron Collider (CEPC), a proposed next-generation Higgs factory, provides new opportunities to explore physics beyond the Standard Model (SM). With its clean electron-positron collision environment and the ability to collect large samples of Higgs, W, and Z bosons, the CEPC enables precision measurements and searches for new physics. This white paper outlines the CEPC's discovery potential, including studies of exotic decays of the Higgs, Z, and top quarks, dark matter and dark sector phenomena, long-lived particles, supersymmetry, and neutrino-related signatures. Advanced detector technologies and reconstruction techniques, such as one-to-one correspondence reconstruction and jet origin identification, significantly improve sensitivity to rare and weakly interacting processes. The CEPC is particularly well suited to probe the electroweak phase transition and test models of electroweak baryogenesis and dark sector interactions. In addition, global fit analyses highlight the CEPC's complementary role in constraining a wide range of new physics scenarios. These features position the CEPC as a powerful tool for exploring the next frontier in fundamental particle physics in the post-Higgs discovery era.
Submitted 30 May, 2025;
originally announced May 2025.
-
All-optical diode via nonreciprocal nonlinear absorption and interfacial charge transfer in two-dimensional van der Waals heterostructures
Authors:
Erkang Li,
Jinhong Liu,
Yanqing Ge,
Mingjian Shi,
Yijie Wang,
Chunhui Lu,
Yixuan Zhou,
Xinlong Xu
Abstract:
Nonreciprocity is fundamental to photonic and optoelectronic devices such as all-optical diodes for ultrafast optical signal processing. However, previous nonreciprocity is mainly based on linear optical response rather than on the nonlinear optical response of recently developed two-dimensional (2D) van der Waals heterostructures. Herein, an all-optical diode prototype based on nonreciprocal nonlinear absorption and interfacial charge transfer is proposed and designed by both simulation and experiment based on ready van der Waals heterostructures. The giant saturable absorption from 2D MXenes (NbC) and reverse saturable absorption from 2D chalcogenides (GaS) play a synergistic role in the designed all-optical diodes, which are characterized by a femtosecond-laser-based Z-scan system. The comprehensive physical mechanism of this all-optical diode based on the 2D van der Waals NbC/GaS heterostructure, as designed by simulations, is consistent with experiments when both nonreciprocal nonlinear absorption and the interfacial effect are considered. This all-optical diode based on the 2D van der Waals heterostructure features simplicity, scalability, stability, integration, and compatibility with complementary planar fabrication technology, which can further extend and miniaturize nonlinear photonic and optoelectronic devices.
Submitted 30 May, 2025;
originally announced May 2025.
-
Adversarial Preference Learning for Robust LLM Alignment
Authors:
Yuanfu Wang,
Pengyu Wang,
Chenyang Xi,
Bo Tang,
Junyi Zhu,
Wenqiang Wei,
Chen Chen,
Chao Yang,
Jingfeng Zhang,
Chaochao Lu,
Yijun Niu,
Keming Mao,
Zhiyu Li,
Feiyu Xiong,
Jie Hu,
Mingchuan Yang
Abstract:
Modern language models often rely on Reinforcement Learning from Human Feedback (RLHF) to encourage safe behaviors. However, they remain vulnerable to adversarial attacks due to three key limitations: (1) the inefficiency and high cost of human annotation, (2) the vast diversity of potential adversarial attacks, and (3) the risk of feedback bias and reward hacking. To address these challenges, we introduce Adversarial Preference Learning (APL), an iterative adversarial training method incorporating three key innovations. First, a direct harmfulness metric based on the model's intrinsic preference probabilities, eliminating reliance on external assessment. Second, a conditional generative attacker that synthesizes input-specific adversarial variations. Third, an iterative framework with automated closed-loop feedback, enabling continuous adaptation through vulnerability discovery and mitigation. Experiments on Mistral-7B-Instruct-v0.3 demonstrate that APL significantly enhances robustness, achieving 83.33% harmlessness win rate over the base model (evaluated by GPT-4o), reducing harmful outputs from 5.88% to 0.43% (measured by LLaMA-Guard), and lowering attack success rate by up to 65% according to HarmBench. Notably, APL maintains competitive utility, with an MT-Bench score of 6.59 (comparable to the baseline 6.78) and an LC-WinRate of 46.52% against the base model.
Submitted 30 May, 2025;
originally announced May 2025.
-
Darwin Godel Machine: Open-Ended Evolution of Self-Improving Agents
Authors:
Jenny Zhang,
Shengran Hu,
Cong Lu,
Robert Lange,
Jeff Clune
Abstract:
Today's AI systems have human-designed, fixed architectures and cannot autonomously and continuously improve themselves. The advance of AI could itself be automated. If done safely, that would accelerate AI development and allow us to reap its benefits much sooner. Meta-learning can automate the discovery of novel algorithms, but is limited by first-order improvements and the human design of a suitable search space. The Gödel machine proposed a theoretical alternative: a self-improving AI that repeatedly modifies itself in a provably beneficial manner. Unfortunately, proving that most changes are net beneficial is impossible in practice. We introduce the Darwin Gödel Machine (DGM), a self-improving system that iteratively modifies its own code (thereby also improving its ability to modify its own codebase) and empirically validates each change using coding benchmarks. Inspired by Darwinian evolution and open-endedness research, the DGM maintains an archive of generated coding agents. It grows the archive by sampling an agent from it and using a foundation model to create a new, interesting, version of the sampled agent. This open-ended exploration forms a growing tree of diverse, high-quality agents and allows the parallel exploration of many different paths through the search space. Empirically, the DGM automatically improves its coding capabilities (e.g., better code editing tools, long-context window management, peer-review mechanisms), increasing performance on SWE-bench from 20.0% to 50.0%, and on Polyglot from 14.2% to 30.7%. Furthermore, the DGM significantly outperforms baselines without self-improvement or open-ended exploration. All experiments were done with safety precautions (e.g., sandboxing, human oversight). The DGM is a significant step toward self-improving AI, capable of gathering its own stepping stones along paths that unfold into endless innovation.
Submitted 28 May, 2025;
originally announced May 2025.
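The archive loop described in the Darwin Gödel Machine abstract above (sample an agent, generate a modified version, validate it empirically, add it to the archive) can be sketched as a toy program. This is only an illustration under stated assumptions, not the authors' implementation: `propose_variant` and `evaluate` are hypothetical stand-ins for the foundation model and the coding benchmark, agents are reduced to a single numeric "skill", parents are sampled uniformly, and every child is retained. The abstract does not specify the actual sampling or retention rules.

```python
import random


def propose_variant(parent, rng):
    """Hypothetical stand-in for the foundation model that rewrites an agent."""
    return {"skill": parent["skill"] + rng.uniform(-0.1, 0.2), "parent": parent["id"]}


def evaluate(agent):
    """Hypothetical stand-in for a coding benchmark; returns a score in [0, 1]."""
    return max(0.0, min(1.0, agent["skill"]))


def run_archive_loop(steps, seed=0):
    rng = random.Random(seed)
    # The archive starts with a single seed agent.
    archive = [{"id": 0, "skill": 0.2, "parent": None, "score": 0.2}]
    for step in range(1, steps + 1):
        parent = rng.choice(archive)       # uniform sampling is an assumption
        child = propose_variant(parent, rng)
        child["id"] = step
        child["score"] = evaluate(child)   # empirical validation of the change
        archive.append(child)              # toy choice: keep every agent
    return archive


archive = run_archive_loop(steps=50)
best = max(archive, key=lambda agent: agent["score"])
print(len(archive), round(best["score"], 2))
```

Because each child records its parent, the archive forms a tree, which loosely mirrors the "growing tree" of agents that the abstract mentions.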
-
Attention-Enhanced Prompt Decision Transformers for UAV-Assisted Communications with AoI
Authors:
Chi Lu,
Yiyang Ni,
Zhe Wang,
Xiaoli Shi,
Jun Li,
Shi Jin
Abstract:
Decision Transformer (DT) has recently demonstrated strong generalizability in dynamic resource allocation within unmanned aerial vehicle (UAV) networks, compared to conventional deep reinforcement learning (DRL). However, its performance is hindered by zero-padding for varying state dimensions, an inability to manage a long-term energy constraint, and challenges in acquiring expert samples for few-shot fine-tuning in new scenarios. To overcome these limitations, we propose an attention-enhanced prompt Decision Transformer (APDT) framework to optimize trajectory planning and user scheduling, aiming to minimize the average age of information (AoI) under a long-term energy constraint in UAV-assisted Internet of Things (IoT) networks. Specifically, we enhance the conventional DT framework by incorporating an attention mechanism to accommodate varying numbers of terrestrial users, introducing a prompt mechanism based on short trajectory demonstrations for rapid adaptation to new scenarios, and designing a token-assisted method to address the UAV's long-term energy constraint. The APDT framework is first pre-trained on offline datasets and then efficiently generalized to new scenarios. Simulations demonstrate that APDT converges twice as fast as conventional DT and reduces average AoI by $8\%$.
Submitted 28 May, 2025;
originally announced May 2025.
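The average age of information (AoI) that the APDT abstract above aims to minimize can be illustrated with a small slotted-time sketch. The model here is an assumption made for illustration (one user served per slot, age reset to 1 on service, all ages starting at 1) and is not taken from the paper; the function name and schedules are hypothetical.

```python
def simulate_average_aoi(schedule, num_users):
    """Average AoI over users and slots.

    schedule[t] is the index of the user served in slot t, or None.
    Assumed model: a served user's age resets to 1, all others grow by 1.
    """
    ages = [1] * num_users
    total = 0
    for served in schedule:
        for user in range(num_users):
            ages[user] = 1 if user == served else ages[user] + 1
        total += sum(ages)
    return total / (len(schedule) * num_users)


round_robin = [0, 1, 2, 0, 1, 2]      # every user is served in turn
starve_user_2 = [0, 1, 0, 1, 0, 1]    # user 2 is never served
print(simulate_average_aoi(round_robin, 3))
print(simulate_average_aoi(starve_user_2, 3))
```

Under this toy model the schedule that never serves one user yields a higher average AoI than round-robin service, which is the kind of trade-off a scheduling policy has to manage.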
-
ForceVLA: Enhancing VLA Models with a Force-aware MoE for Contact-rich Manipulation
Authors:
Jiawen Yu,
Hairuo Liu,
Qiaojun Yu,
Jieji Ren,
Ce Hao,
Haitong Ding,
Guangyu Huang,
Guofan Huang,
Yan Song,
Panpan Cai,
Cewu Lu,
Wenqiang Zhang
Abstract:
Vision-Language-Action (VLA) models have advanced general-purpose robotic manipulation by leveraging pretrained visual and linguistic representations. However, they struggle with contact-rich tasks that require fine-grained control involving force, especially under visual occlusion or dynamic uncertainty. To address these limitations, we propose ForceVLA, a novel end-to-end manipulation framework that treats external force sensing as a first-class modality within VLA systems. ForceVLA introduces FVLMoE, a force-aware Mixture-of-Experts fusion module that dynamically integrates pretrained visual-language embeddings with real-time 6-axis force feedback during action decoding. This enables context-aware routing across modality-specific experts, enhancing the robot's ability to adapt to subtle contact dynamics. We also introduce ForceVLA-Data, a new dataset comprising synchronized vision, proprioception, and force-torque signals across five contact-rich manipulation tasks. ForceVLA improves average task success by 23.2% over strong $π_0$-based baselines, achieving up to 80% success in tasks such as plug insertion. Our approach highlights the importance of multimodal integration for dexterous manipulation and sets a new benchmark for physically intelligent robotic control. Code and data will be released at https://sites.google.com/view/forcevla2025.
Submitted 28 May, 2025;
originally announced May 2025.
-
Fractional order derivative characterizations of Besov-Morrey type spaces with applications
Authors:
Chen Lu,
Mingjin Li,
Jianren Long
Abstract:
On the one hand, the fractional order derivative characterization of the Besov-Morrey type space $B_{p}^{K}(s)$ is established by $K$-Carleson measures, and it was also shown that $f \in B_{p}^{K}(s_1) \Leftrightarrow f^{\left(\frac{s_2 - s_1}{p}\right)} \in B_{p}^{K}(s_2)$, which extended the results of Sun et al. on the fractional derivative of Morrey type space. On the other hand, some sufficient conditions for the growth of solutions to linear complex differential equations have been obtained by using $n$th derivative criterion.
Submitted 27 May, 2025;
originally announced May 2025.
-
PromptEVC: Controllable Emotional Voice Conversion with Natural Language Prompts
Authors:
Tianhua Qi,
Shiyan Wang,
Cheng Lu,
Tengfei Song,
Hao Yang,
Zhanglin Wu,
Wenming Zheng
Abstract:
Controllable emotional voice conversion (EVC) aims to manipulate emotional expressions to increase the diversity of synthesized speech. Existing methods typically rely on predefined labels, reference audios, or prespecified factor values, often overlooking individual differences in emotion perception and expression. In this paper, we introduce PromptEVC that utilizes natural language prompts for precise and flexible emotion control. To bridge text descriptions with emotional speech, we propose emotion descriptor and prompt mapper to generate fine-grained emotion embeddings, trained jointly with reference embeddings. To enhance naturalness, we present a prosody modeling and control pipeline that adjusts the rhythm based on linguistic content and emotional cues. Additionally, a speaker encoder is incorporated to preserve identity. Experimental results demonstrate that PromptEVC outperforms state-of-the-art controllable EVC methods in emotion conversion, intensity control, mixed emotion synthesis, and prosody manipulation. Speech samples are available at https://jeremychee4.github.io/PromptEVC/.
Submitted 26 May, 2025;
originally announced May 2025.
-
Improvement Strategies for Few-Shot Learning in OCT Image Classification of Rare Retinal Diseases
Authors:
Cheng-Yu Tai,
Ching-Wen Chen,
Chi-Chin Wu,
Bo-Chen Chiu,
Cheng-Hung Lin,
Cheng-Kai Lu,
Jia-Kang Wang,
Tzu-Lun Huang
Abstract:
This paper focuses on using few-shot learning to improve the accuracy of classifying OCT diagnosis images with major and rare classes. We used the GAN-based augmentation strategy as a baseline and introduced several novel methods to further enhance our model. The proposed strategy contains U-GAT-IT for improving the generative part and uses the data balance technique to narrow down the skew of accuracy between all categories. The best model obtained was built with CBAM attention mechanism and fine-tuned InceptionV3, and achieved an overall accuracy of 97.85%, representing a significant improvement over the original baseline.
Submitted 26 May, 2025;
originally announced May 2025.
-
Exploring Consciousness in LLMs: A Systematic Survey of Theories, Implementations, and Frontier Risks
Authors:
Sirui Chen,
Shuqin Ma,
Shu Yu,
Hanwang Zhang,
Shengjie Zhao,
Chaochao Lu
Abstract:
Consciousness stands as one of the most profound and distinguishing features of the human mind, fundamentally shaping our understanding of existence and agency. As large language models (LLMs) develop at an unprecedented pace, questions concerning intelligence and consciousness have become increasingly significant. However, discourse on LLM consciousness remains largely unexplored territory. In this paper, we first clarify frequently conflated terminologies (e.g., LLM consciousness and LLM awareness). Then, we systematically organize and synthesize existing research on LLM consciousness from both theoretical and empirical perspectives. Furthermore, we highlight potential frontier risks that conscious LLMs might introduce. Finally, we discuss current challenges and outline future directions in this emerging field. The references discussed in this paper are organized at https://github.com/OpenCausaLab/Awesome-LLM-Consciousness.
Submitted 26 May, 2025;
originally announced May 2025.
-
Multimodal Machine Translation with Visual Scene Graph Pruning
Authors:
Chenyu Lu,
Shiliang Sun,
Jing Zhao,
Nan Zhang,
Tengfei Song,
Hao Yang
Abstract:
Multimodal machine translation (MMT) seeks to address the challenges posed by linguistic polysemy and ambiguity in translation tasks by incorporating visual information. A key bottleneck in current MMT research is the effective utilization of visual data. Previous approaches have focused on extracting global or region-level image features and using attention or gating mechanisms for multimodal information fusion. However, these methods have not adequately tackled the issue of visual information redundancy in MMT, nor have they proposed effective solutions. In this paper, we introduce a novel approach--multimodal machine translation with visual Scene Graph Pruning (PSG), which leverages language scene graph information to guide the pruning of redundant nodes in visual scene graphs, thereby reducing noise in downstream translation tasks. Through extensive comparative experiments with state-of-the-art methods and ablation studies, we demonstrate the effectiveness of the PSG model. Our results also highlight the promising potential of visual information pruning in advancing the field of MMT.
Submitted 26 May, 2025;
originally announced May 2025.
-
Hierarchical Tree Search-based User Lifelong Behavior Modeling on Large Language Model
Authors:
Yu Xia,
Rui Zhong,
Hao Gu,
Wei Yang,
Chi Lu,
Peng Jiang,
Kun Gai
Abstract:
Large Language Models (LLMs) have garnered significant attention in Recommendation Systems (RS) due to their extensive world knowledge and robust reasoning capabilities. However, a critical challenge lies in enabling LLMs to effectively comprehend and extract insights from massive user behaviors. Current approaches that directly leverage LLMs for user interest learning face limitations in handling long sequential behaviors, effectively extracting interest, and applying interest in practical scenarios. To address these issues, we propose a Hierarchical Tree Search-based User Lifelong Behavior Modeling framework (HiT-LBM). HiT-LBM integrates Chunked User Behavior Extraction (CUBE) and Hierarchical Tree Search for Interest (HTS) to capture the diverse interests and interest evolution of users. CUBE divides user lifelong behaviors into multiple chunks and learns the interest and interest evolution within each chunk in a cascading manner. HTS generates candidate interests through hierarchical expansion and searches for the optimal interest with a process rating model to ensure information gain for each behavior chunk. Additionally, we design Temporal-Ware Interest Fusion (TIF) to integrate interests from multiple behavior chunks, constructing a comprehensive representation of user lifelong interests. The representation can be embedded into any recommendation model to enhance performance. Extensive experiments demonstrate the effectiveness of our approach, showing that it surpasses state-of-the-art methods.
Submitted 26 May, 2025;
originally announced May 2025.
-
Where Paths Collide: A Comprehensive Survey of Classic and Learning-Based Multi-Agent Pathfinding
Authors:
Shiyue Wang,
Haozheng Xu,
Yuhan Zhang,
Jingran Lin,
Changhong Lu,
Xiangfeng Wang,
Wenhao Li
Abstract:
Multi-Agent Path Finding (MAPF) is a fundamental problem in artificial intelligence and robotics, requiring the computation of collision-free paths for multiple agents navigating from their start locations to designated goals. As autonomous systems become increasingly prevalent in warehouses, urban transportation, and other complex environments, MAPF has evolved from a theoretical challenge to a critical enabler of real-world multi-robot coordination. This comprehensive survey bridges the long-standing divide between classical algorithmic approaches and emerging learning-based methods in MAPF research. We present a unified framework that encompasses search-based methods (including Conflict-Based Search, Priority-Based Search, and Large Neighborhood Search), compilation-based approaches (SAT, SMT, CSP, ASP, and MIP formulations), and data-driven techniques (reinforcement learning, supervised learning, and hybrid strategies). Through systematic analysis of experimental practices across 200+ papers, we uncover significant disparities in evaluation methodologies, with classical methods typically tested on larger-scale instances (up to 200 by 200 grids with 1000+ agents) compared to learning-based approaches (predominantly 10-100 agents). We provide a comprehensive taxonomy of evaluation metrics, environment types, and baseline selections, highlighting the need for standardized benchmarking protocols. Finally, we outline promising future directions including mixed-motive MAPF with game-theoretic considerations, language-grounded planning with large language models, and neural solver architectures that combine the rigor of classical methods with the flexibility of deep learning. This survey serves as both a comprehensive reference for researchers and a practical guide for deploying MAPF solutions in increasingly complex real-world applications.
Submitted 25 May, 2025;
originally announced May 2025.
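The "collision-free paths" requirement in the MAPF abstract above can be made concrete with a small conflict checker. This is a minimal sketch using the commonly used vertex and edge (swap) conflict definitions; it is not taken from the survey. The function names are hypothetical, and it assumes paths are lists of grid cells indexed by timestep, with agents waiting at their final cell after arriving.

```python
from itertools import combinations


def position_at(path, t):
    """Agents are assumed to wait at their final cell once their path ends."""
    return path[min(t, len(path) - 1)]


def find_first_conflict(paths):
    """Return (kind, agent_a, agent_b, t) for the earliest conflict, or None."""
    horizon = max(len(path) for path in paths)
    for t in range(horizon):
        for a, b in combinations(range(len(paths)), 2):
            a_now, b_now = position_at(paths[a], t), position_at(paths[b], t)
            if a_now == b_now:
                # Vertex conflict: two agents occupy the same cell at time t.
                return ("vertex", a, b, t)
            a_next = position_at(paths[a], t + 1)
            b_next = position_at(paths[b], t + 1)
            if a_next == b_now and b_next == a_now:
                # Edge conflict: the agents swap cells between t and t + 1.
                return ("edge", a, b, t)
    return None


head_on = [[(0, 0), (1, 0), (2, 0)], [(2, 0), (1, 0), (0, 0)]]
swap = [[(0, 0), (1, 0)], [(1, 0), (0, 0)]]
parallel = [[(0, 0), (0, 1)], [(1, 0), (1, 1)]]
for paths in (head_on, swap, parallel):
    print(find_first_conflict(paths))
```

A set of paths is a valid MAPF solution in this sense only when the checker returns `None`; search-based methods such as Conflict-Based Search rely on detecting conflicts of this kind before resolving them.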
-
Distributionally Robust Deep Q-Learning
Authors:
Chung I Lu,
Julian Sester,
Aijia Zhang
Abstract:
We propose a novel distributionally robust $Q$-learning algorithm for the non-tabular case accounting for continuous state spaces where the state transition of the underlying Markov decision process is subject to model uncertainty. The uncertainty is taken into account by considering the worst-case transition from a ball around a reference probability measure. To determine the optimal policy under the worst-case state transition, we solve the associated non-linear Bellman equation by dualising and regularising the Bellman operator with the Sinkhorn distance, which is then parameterized with deep neural networks. This approach allows us to modify the Deep Q-Network algorithm to optimise for the worst case state transition.
We illustrate the tractability and effectiveness of our approach through several applications, including a portfolio optimisation task based on S&P 500 data.
Submitted 25 May, 2025;
originally announced May 2025.
-
Does Chain-of-Thought Reasoning Really Reduce Harmfulness from Jailbreaking?
Authors:
Chengda Lu,
Xiaoyu Fan,
Yu Huang,
Rongwu Xu,
Jijie Li,
Wei Xu
Abstract:
Jailbreak attacks have been observed to largely fail against recent reasoning models enhanced by Chain-of-Thought (CoT) reasoning. However, the underlying mechanism remains underexplored, and relying solely on reasoning capacity may raise security concerns. In this paper, we try to answer the question: Does CoT reasoning really reduce harmfulness from jailbreaking? Through rigorous theoretical analysis, we demonstrate that CoT reasoning has dual effects on jailbreaking harmfulness. Based on the theoretical insights, we propose a novel jailbreak method, FicDetail, whose practical performance validates our theoretical findings.
△ Less
Submitted 23 May, 2025;
originally announced May 2025.
-
Navigate the Unknown: Enhancing LLM Reasoning with Intrinsic Motivation Guided Exploration
Authors:
Jingtong Gao,
Ling Pan,
Yejing Wang,
Rui Zhong,
Chi Lu,
Qingpeng Cai,
Peng Jiang,
Xiangyu Zhao
Abstract:
Reinforcement learning (RL) has emerged as a pivotal method for improving the reasoning capabilities of Large Language Models (LLMs). However, prevalent RL approaches such as Proximal Policy Optimization (PPO) and Group Relative Policy Optimization (GRPO) face critical limitations due to their reliance on sparse outcome-based rewards and inadequate mechanisms for incentivizing exploration. These limitations result in inefficient guidance for multi-step reasoning processes. Specifically, sparse reward signals fail to deliver effective or sufficient feedback, particularly for challenging problems. Furthermore, such reward structures induce systematic biases that prioritize exploitation of familiar trajectories over novel solution discovery. These shortcomings critically hinder performance in complex reasoning tasks, which inherently demand iterative refinement across intermediate steps. To address these challenges, we propose an Intrinsic Motivation guidEd exploratioN meThOd foR LLM Reasoning (i-MENTOR), a novel method designed to both deliver dense rewards and amplify explorations in the RL-based training paradigm. i-MENTOR introduces three key innovations: trajectory-aware exploration rewards that mitigate bias in token-level strategies while maintaining computational efficiency; dynamic reward scaling to stabilize exploration and exploitation in large action spaces; and advantage-preserving reward implementation that maintains advantage distribution integrity while incorporating exploratory guidance. Experiments across three public datasets demonstrate i-MENTOR's effectiveness with a 22.39% improvement on the difficult dataset Countdown-4.
Submitted 27 May, 2025; v1 submitted 23 May, 2025;
originally announced May 2025.
-
No Black Boxes: Interpretable and Interactable Predictive Healthcare with Knowledge-Enhanced Agentic Causal Discovery
Authors:
Xiaoxue Han,
Pengfei Hu,
Jun-En Ding,
Chang Lu,
Feng Liu,
Yue Ning
Abstract:
Deep learning models trained on extensive Electronic Health Records (EHR) data have achieved high accuracy in diagnosis prediction, offering the potential to assist clinicians in decision-making and treatment planning. However, these models lack two crucial features that clinicians highly value: interpretability and interactivity. The ``black-box'' nature of these models makes it difficult for clinicians to understand the reasoning behind predictions, limiting their ability to make informed decisions. Additionally, the absence of interactive mechanisms prevents clinicians from incorporating their own knowledge and experience into the decision-making process. To address these limitations, we propose II-KEA, a knowledge-enhanced agent-driven causal discovery framework that integrates personalized knowledge databases and agentic LLMs. II-KEA enhances interpretability through explicit reasoning and causal analysis, while also improving interactivity by allowing clinicians to inject their knowledge and experience through customized knowledge bases and prompts. II-KEA is evaluated on both MIMIC-III and MIMIC-IV, demonstrating superior performance along with enhanced interpretability and interactivity, as evidenced by its strong results from extensive case studies.
Submitted 22 May, 2025;
originally announced May 2025.
-
Simulation and Experimental Studies of DWDM Nonlinear Phase/Polarization/Power Crosstalk Between DFOS and Communication Channels in 27.6-Tb/s 800ZR Metro Network
Authors:
Jingchuan Wang,
Maoqi Liu,
Liwang Lu,
Alan Pak Tao Lau,
Chao Lu
Abstract:
We comprehensively analyze the fiber nonlinearity crosstalk between DAS and communication channels through numerical results and a 40 x 800-Gb/s 90-km experimental demonstration. Our findings indicate that conventional pulse-based DAS is unsuitable for in-band DWDM coexistence systems, whereas pulse-compression DAS shows negligible penalties with legacy coherent transceivers.
Submitted 19 May, 2025;
originally announced May 2025.
-
Chain-of-Model Learning for Language Model
Authors:
Kaitao Song,
Xiaohua Wang,
Xu Tan,
Huiqiang Jiang,
Chengruidong Zhang,
Yongliang Shen,
Cen LU,
Zihao Li,
Zifan Song,
Caihua Shan,
Yansen Wang,
Kan Ren,
Xiaoqing Zheng,
Tao Qin,
Yuqing Yang,
Dongsheng Li,
Lili Qiu
Abstract:
In this paper, we propose a novel learning paradigm, termed Chain-of-Model (CoM), which incorporates the causal relationship into the hidden states of each layer in a chain style, thereby introducing great scaling efficiency in model training and inference flexibility in deployment. We introduce the concept of Chain-of-Representation (CoR), which formulates the hidden states at each layer as a combination of multiple sub-representations (i.e., chains) at the hidden dimension level. In each layer, each chain from the output representations can only view all of its preceding chains in the input representations. Consequently, a model built upon the CoM framework can progressively scale up the model size by increasing the chains based on the previous models (i.e., chains), and offer multiple sub-models at varying sizes for elastic inference by using different chain numbers. Based on this principle, we devise Chain-of-Language-Model (CoLM), which incorporates the idea of CoM into each layer of the Transformer architecture. Based on CoLM, we further introduce CoLM-Air by introducing a KV sharing mechanism that computes all keys and values within the first chain and then shares them across all chains. This design demonstrates additional extensibility, such as enabling seamless LM switching, prefilling acceleration and so on. Experimental results demonstrate that our CoLM family can achieve comparable performance to the standard Transformer, while simultaneously enabling greater flexibility, such as progressive scaling to improve training efficiency and offering multiple varying model sizes for elastic inference, paving a new way toward building language models. Our code will be released in the future at: https://github.com/microsoft/CoLM.
Submitted 23 May, 2025; v1 submitted 17 May, 2025;
originally announced May 2025.
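Illustrative note on the Chain-of-Model abstract above: the rule that each output chain can only view its preceding input chains can be pictured as a block lower-triangular mask on a layer's weight matrix. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes equal-sized chains, assumes that "preceding" includes the chain itself, and uses hypothetical helper names (`chain_mask`, `chain_linear`).

```python
import random


def chain_mask(num_chains, chain_size):
    """Block lower-triangular mask: output chain i may read input chains 0..i.

    Assumption: chains are contiguous, equal-sized slices of the hidden dimension.
    """
    dim = num_chains * chain_size
    return [
        [1.0 if (col // chain_size) <= (row // chain_size) else 0.0
         for col in range(dim)]
        for row in range(dim)
    ]


def chain_linear(x, weight, mask):
    """Masked linear map y = (weight * mask) x, written with plain lists."""
    return [
        sum(w * m * xi for w, m, xi in zip(w_row, m_row, x))
        for w_row, m_row in zip(weight, mask)
    ]


random.seed(0)
num_chains, chain_size = 3, 2
dim = num_chains * chain_size
weight = [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(dim)]
mask = chain_mask(num_chains, chain_size)
x = [random.gauss(0.0, 1.0) for _ in range(dim)]
y = chain_linear(x, weight, mask)

# Perturbing only the last input chain must leave the earlier output chains unchanged.
x_perturbed = x[:-chain_size] + [v + 1.0 for v in x[-chain_size:]]
y_perturbed = chain_linear(x_perturbed, weight, mask)

# A smaller sub-model: keep only the first two chains of the weights and input.
keep = 2 * chain_size
sub_weight = [row[:keep] for row in weight[:keep]]
sub_mask = chain_mask(2, chain_size)
y_sub = chain_linear(x[:keep], sub_weight, sub_mask)
```

Under these assumptions, the truncated two-chain model reproduces the first two chains of the full model's output, which is one way to read the abstract's claim about offering sub-models of varying sizes for elastic inference.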
-
On the holomorphic foliations admitting a common invariant algebraic set
Authors:
Guangfeng Dong,
Chujun Lu
Abstract:
In this paper, we study the holomorphic foliations admitting a common invariant algebraic set $C$ defined by a polynomial $f$ in $ \mathbb{K}[x_1,x_2,...,x_n]$ over any characteristic $0$ subfield $\mathbb{K}\subseteq\mathbb{C}$. For the $\mathbb{K}[x_1,x_2,...,x_n]$-module $V_f$ of vector fields inducing foliations admitting $C$ as an invariant set, we present several conditions for $V_f$ to be freely generated by $n$ generators. In particular, when $n=2$ and $f$ is a weakly tame polynomial, we show that the $\mathbb{K}[x,y]$-module $V_f$ is freely generated by two polynomial vector fields, one of which is the Hamiltonian vector field induced by $f$, if and only if, $f$ belongs to the Jacobian ideal $\langle f_x, f_y\rangle$ in $\mathbb{K}[x,y]$. Our proof employs a purely elementary method.
Submitted 10 June, 2025; v1 submitted 16 May, 2025;
originally announced May 2025.
-
ICVul: A Well-labeled C/C++ Vulnerability Dataset with Comprehensive Metadata and VCCs
Authors:
Chaomeng Lu,
Tianyu Li,
Toon Dehaene,
Bert Lagaisse
Abstract:
Machine learning-based software vulnerability detection requires high-quality datasets, which are essential for training effective models. To address challenges related to data label quality, diversity, and comprehensiveness, we constructed ICVul, a dataset emphasizing data quality and enriched with comprehensive metadata, including Vulnerability-Contributing Commits (VCCs). We began by filtering Common Vulnerabilities and Exposures from the NVD, retaining only those linked to GitHub fix commits. Then we extracted functions and files along with relevant metadata from these commits and used the SZZ algorithm to trace VCCs. To further enhance label reliability, we developed the ESC (Eliminate Suspicious Commit) technique, ensuring credible data labels. The dataset is stored in a relational-like database for improved usability and data integrity. Both ICVul and its construction framework are publicly accessible on GitHub, supporting research in related fields.
Submitted 13 May, 2025;
originally announced May 2025.
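Illustrative note on the ICVul abstract above: the first step, keeping only CVEs linked to GitHub fix commits, could look roughly like the sketch below. This is not the ICVul framework. The record layout (`id` and `references` keys), the sample records, and the helper names are hypothetical, and the URL pattern is an assumption about what counts as a commit link.

```python
import re

# Assumed shape of a GitHub commit link: github.com/<owner>/<repo>/commit/<hex sha>.
GITHUB_COMMIT_URL = re.compile(
    r"^https?://github\.com/[^/\s]+/[^/\s]+/commit/[0-9a-fA-F]{7,40}/?$"
)


def fix_commit_urls(record):
    """Return the reference URLs of one record that look like GitHub commits."""
    return [url for url in record.get("references", []) if GITHUB_COMMIT_URL.match(url)]


def keep_linked_cves(records):
    """Keep only records that have at least one GitHub commit reference."""
    return [record for record in records if fix_commit_urls(record)]


# Hypothetical placeholder records, not real CVE entries.
sample_records = [
    {"id": "CVE-0000-0001",
     "references": ["https://github.com/example/project/commit/0123abc"]},
    {"id": "CVE-0000-0002",
     "references": ["https://example.org/advisory/42"]},
    {"id": "CVE-0000-0003", "references": []},
]

kept_ids = [record["id"] for record in keep_linked_cves(sample_records)]
```

Only the first placeholder record survives the filter here. A real pipeline would still need to confirm that a matched commit is actually the fix, which is the kind of label noise the abstract's ESC step is described as addressing.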
-
General First-Principles Approach to Crystals in Finite Magnetic Fields
Authors:
Chengye Lü,
Yingwei Chen,
Yuzhi Wang,
Zhihao Dai,
Zhong Fang,
Xin-Gao Gong,
Quansheng Wu,
Hongjun Xiang
Abstract:
We introduce a general first-principles methodology for computing electronic structure in a finite uniform magnetic field that allows for an arbitrary rational magnetic flux and nonlocal pseudopotentials, at a time complexity comparable to that of conventional plane-wave pseudopotential approaches under zero-field conditions. The versatility of this method is demonstrated through comprehensive applications to both molecular and crystalline systems, including calculations of magnetizabilities, magnetically induced currents, and magnetic energy bands. Furthermore, we provide rigorous proofs of two fundamental properties for crystals in uniform magnetic fields: the "strong translational symmetry" and "magnetic bands shift" phenomena.
Submitted 10 May, 2025;
originally announced May 2025.
-
Optimization Problem Solving Can Transition to Evolutionary Agentic Workflows
Authors:
Wenhao Li,
Bo Jin,
Mingyi Hong,
Changhong Lu,
Xiangfeng Wang
Abstract:
This position paper argues that optimization problem solving can transition from expert-dependent to evolutionary agentic workflows. Traditional optimization practices rely on human specialists for problem formulation, algorithm selection, and hyperparameter tuning, creating bottlenecks that impede industrial adoption of cutting-edge methods. We contend that an evolutionary agentic workflow, powered by foundation models and evolutionary search, can autonomously navigate the optimization space, comprising problem, formulation, algorithm, and hyperparameter spaces. Through case studies in cloud resource scheduling and ADMM parameter adaptation, we demonstrate how this approach can bridge the gap between academic innovation and industrial implementation. Our position challenges the status quo of human-centric optimization workflows and advocates for a more scalable, adaptive approach to solving real-world optimization problems.
Submitted 7 May, 2025;
originally announced May 2025.
-
Constraints on Inflationary Gravitational Waves with Two Years of SPT-3G Data
Authors:
J. A. Zebrowski,
C. L. Reichardt,
A. J. Anderson,
B. Ansarinejad,
M. Archipley,
L. Balkenhol,
P. Barry,
K. Benabed,
A. N. Bender,
B. A. Benson,
F. Bianchini,
L. E. Bleem,
F. R. Bouchet,
L. Bryant,
E. Camphuis,
J. E. Carlstrom,
C. L. Chang,
P. Chaubal,
P. M. Chichura,
A. Chokshi,
T. -L. Chou,
A. Coerver,
T. M. Crawford,
C. Daley,
T. de Haan
, et al. (73 additional authors not shown)
Abstract:
We present a measurement of the $B$-mode polarization power spectrum of the cosmic microwave background anisotropies at 32 $\le$ $\ell$ $<$ 502 for three bands centered at 95, 150, and 220 GHz using data from the SPT-3G receiver on the South Pole Telescope. This work uses SPT-3G observations from the 2019 and 2020 winter observing seasons of a $\sim$1500 deg$^2$ patch of sky that directly overlaps with fields observed with the BICEP/Keck family of telescopes, and covers part of the proposed Simons Observatory and CMB-S4 deep fields. Employing new techniques for mitigating polarized atmospheric noise, the SPT-3G data demonstrates a white noise level of 9.3 (6.7) $μ$K-arcmin at $\ell \sim 500$ for the 95 GHz (150 GHz) data, with a $1/\ell$ noise knee at $\ell$=128 (182). We fit the observed six auto- and cross-frequency $B$-mode power spectra to a model including lensed $Λ$CDM $B$-modes and a combination of Galactic and extragalactic foregrounds. This work characterizes foregrounds in the vicinity of the BICEP/Keck survey area, finding foreground power consistent with that reported by the BICEP/Keck collaboration within the same region, and a factor of $\sim$ 3 higher power over the full SPT-3G survey area. Using SPT-3G data over the BICEP/Keck survey area, we place a 95% upper limit on the tensor-to-scalar ratio of $r < 0.25$ and find the statistical uncertainty on $r$ to be $σ(r) = 0.067$.
Submitted 5 May, 2025;
originally announced May 2025.
-
Generation of 95-qubit genuine entanglement and verification of symmetry-protected topological phases
Authors:
Tao Jiang,
Jianbin Cai,
Junxiang Huang,
Naibin Zhou,
Yukun Zhang,
Jiahao Bei,
Guoqing Cai,
Sirui Cao,
Fusheng Chen,
Jiang Chen,
Kefu Chen,
Xiawei Chen,
Xiqing Chen,
Zhe Chen,
Zhiyuan Chen,
Zihua Chen,
Wenhao Chu,
Hui Deng,
Zhibin Deng,
Pei Ding,
Xun Ding,
Zhuzhengqi Ding,
Shuai Dong,
Bo Fan,
Daojin Fan
, et al. (130 additional authors not shown)
Abstract:
Symmetry-protected topological (SPT) phases are fundamental features of cluster states, serving as key resources for measurement-based quantum computation (MBQC). Generating large-scale cluster states and verifying their SPT phases are essential steps toward practical MBQC, which however still presents significant experimental challenges. In this work, we address these challenges by utilizing advanced superconducting hardware with optimized gate operations, enhanced readout fidelity, and error mitigation techniques. We successfully generate and verify 95-qubit one-dimensional and 72-qubit two-dimensional genuine entangled cluster states, achieving fidelities of $0.5603 \pm 0.0084$ and $0.5519 \pm 0.0054$, respectively. Leveraging these high-fidelity cluster states, we investigate SPT phases through quantum teleportation across all 95 qubits and demonstrate input-state-dependent robustness against symmetry-breaking perturbations, highlighting the practicality and intrinsic robustness of MBQC enabled by the SPT order. Our results represent a significant advancement in large-scale entanglement generation and topological phase simulation, laying the foundation for scalable and practical MBQC using superconducting quantum systems.
Submitted 3 May, 2025;
originally announced May 2025.
-
SIME: Enhancing Policy Self-Improvement with Modal-level Exploration
Authors:
Yang Jin,
Jun Lv,
Wenye Yu,
Hongjie Fang,
Yong-Lu Li,
Cewu Lu
Abstract:
Self-improvement requires robotic systems to initially learn from human-provided data and then gradually enhance their capabilities through interaction with the environment. This is similar to how humans improve their skills through continuous practice. However, achieving effective self-improvement is challenging, primarily because robots tend to repeat their existing abilities during interactions, often failing to generate new, valuable data for learning. In this paper, we identify the key to successful self-improvement: modal-level exploration and data selection. By incorporating a modal-level exploration mechanism during policy execution, the robot can produce more diverse and multi-modal interactions. At the same time, we select the most valuable trials and high-quality segments from these interactions for learning. We successfully demonstrate effective robot self-improvement on both simulation benchmarks and real-world experiments. The capability for self-improvement will enable us to develop more robust and high-success-rate robotic control strategies at a lower cost. Our code and experiment scripts are available at https://ericjin2002.github.io/SIME/
Submitted 2 May, 2025;
originally announced May 2025.
-
Full realization of the RIBLL2 separator at the HIRFL-CSR facility
Authors:
Xiao-Dong Xu,
Yong Zheng,
Zhi-Yu Sun,
Yu-Nan Song,
Bao-Hua Sun,
Satoru Terashima,
Chang-Jian Wang,
Ge Guo,
Guang-Shuai Li,
Xiu-Lin Wei,
Jun-Yao Xu,
Ji-Chao Zhang,
Yong Cao,
Bing-Shui Gao,
Jia-Xing Han,
Jin-Rong Liu,
Chen-Gui Lu,
Shu-Ya Jin,
Hooi Jin Ong,
Hao-Tian Qi,
Yun Qin,
Ya-Zhou Sun,
Isao Tanihata,
Lu-Ping Wan,
Kai-Long Wang
, et al. (11 additional authors not shown)
Abstract:
A new experimental platform was constructed at the Second Radioactive Ion Beam Line in Lanzhou (RIBLL2) of HIRFL-CSR accelerator facility at Lanzhou, China. Its performance, along with several newly developed detectors, was tested in two radioactive ion beam experiments utilizing a 400 MeV/u 40Ar beam and a 350 MeV/u 78Kr beam, respectively. The first results from these two experiments demonstrate a good particle identification capability of the setup, thereby affirming the full realization of the RIBLL2 separator.
Submitted 30 April, 2025;
originally announced May 2025.
-
PaRT: Enhancing Proactive Social Chatbots with Personalized Real-Time Retrieval
Authors:
Zihan Niu,
Zheyong Xie,
Shaosheng Cao,
Chonggang Lu,
Zheyu Ye,
Tong Xu,
Zuozhu Liu,
Yan Gao,
Jia Chen,
Zhe Xu,
Yi Wu,
Yao Hu
Abstract:
Social chatbots have become essential intelligent companions in daily scenarios ranging from emotional support to personal interaction. However, conventional chatbots with passive response mechanisms usually rely on users to initiate or sustain dialogues by bringing up new topics, resulting in diminished engagement and shortened dialogue duration. In this paper, we present PaRT, a novel framework enabling context-aware proactive dialogues for social chatbots through personalized real-time retrieval and generation. Specifically, PaRT first integrates user profiles and dialogue context into a large language model (LLM), which is initially prompted to refine user queries and recognize their underlying intents for the upcoming conversation. Guided by refined intents, the LLM generates personalized dialogue topics, which then serve as targeted queries to retrieve relevant passages from RedNote. Finally, we prompt LLMs with summarized passages to generate knowledge-grounded and engagement-optimized responses. Our approach has been running stably in a real-world production environment for more than 30 days, achieving a 21.77\% improvement in the average duration of dialogues.
Submitted 29 April, 2025;
originally announced April 2025.
-
Unified and consistent structure growth measurements from joint ACT, SPT and \textit{Planck} CMB lensing
Authors:
Frank J. Qu,
Fei Ge,
W. L. Kimmy Wu,
Irene Abril-Cabezas,
Mathew S. Madhavacheril,
Marius Millea,
Ethan Anderes,
Adam J. Anderson,
Behzad Ansarinejad,
Melanie Archipley,
Zachary Atkins,
Lennart Balkenhol,
Nicholas Battaglia,
Karim Benabed,
Amy N. Bender,
Bradford A. Benson,
Federico Bianchini,
Lindsey. E. Bleem,
Boris Bolliet,
J Richard Bond,
François. R. Bouchet,
Lincoln Bryant,
Erminia Calabrese,
Etienne Camphuis,
John E. Carlstrom
, et al. (120 additional authors not shown)
Abstract:
We present the tightest cosmic microwave background (CMB) lensing constraints to date on the growth of structure by combining CMB lensing measurements from the Atacama Cosmology Telescope (ACT), the South Pole Telescope (SPT) and \textit{Planck}. Each of these surveys individually provides lensing measurements with similarly high statistical power, achieving signal-to-noise ratios of approximately 40. The combined lensing bandpowers represent the most precise CMB lensing power spectrum measurement to date with a signal-to-noise ratio of 61 and an amplitude of $A_\mathrm{lens}^\mathrm{recon} = 1.025 \pm 0.017$ with respect to the theory prediction from the best-fit CMB \textit{Planck}-ACT cosmology. The bandpowers from all three lensing datasets, analyzed jointly, yield a $1.6\%$ measurement of the parameter combination $S_8^\mathrm{CMBL} \equiv σ_8\,(Ω_m/0.3)^{0.25} = 0.825^{+0.015}_{-0.013}$. Including Dark Energy Spectroscopic Instrument (DESI) Baryon Acoustic Oscillation (BAO) data improves the constraint on the amplitude of matter fluctuations to $σ_8 = 0.829 \pm 0.009$ (a $1.1\%$ determination). When combining with uncalibrated supernovae from \texttt{Pantheon+}, we present a $4\%$ sound-horizon-independent estimate of $H_0=66.4\pm2.5\,\mathrm{km\,s^{-1}\,Mpc^{-1}} $. The joint lensing constraints on structure growth and present-day Hubble rate are fully consistent with a $Λ$CDM model fit to the primary CMB data from \textit{Planck} and ACT. While the precise upper limit is sensitive to the choice of data and underlying model assumptions, when varying the neutrino mass sum within the $Λ\mathrm{CDM}$ cosmological model, the combination of primary CMB, BAO and CMB lensing drives the probable upper limit for the mass sum towards lower values, comparable to the minimum mass prior required by neutrino oscillation experiments.
Submitted 28 April, 2025;
originally announced April 2025.
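Illustrative note on the CMB lensing abstract above: the parameter combination $S_8^\mathrm{CMBL} \equiv σ_8\,(Ω_m/0.3)^{0.25}$ and the quoted fractional precisions are simple arithmetic on the numbers in the abstract. The sketch below only reproduces that arithmetic for the symmetric error bars on $σ_8$ and $H_0$; the function names are hypothetical and nothing here reflects the collaboration's analysis pipeline.

```python
def s8_cmbl(sigma_8, omega_m):
    """Lensing parameter combination sigma_8 * (Omega_m / 0.3) ** 0.25."""
    return sigma_8 * (omega_m / 0.3) ** 0.25


def relative_uncertainty_percent(value, error):
    """Fractional uncertainty of a measurement, expressed in percent."""
    return 100.0 * error / value


# Central values and symmetric errors quoted in the abstract.
sigma8_precision = relative_uncertainty_percent(0.829, 0.009)  # about 1.1 percent
h0_precision = relative_uncertainty_percent(66.4, 2.5)         # about 3.8 percent
```

The two results round to the $1.1\%$ and $4\%$ figures stated in the abstract. By construction, `s8_cmbl` returns $σ_8$ unchanged when $Ω_m = 0.3$.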
-
GenCLS++: Pushing the Boundaries of Generative Classification in LLMs Through Comprehensive SFT and RL Studies Across Diverse Datasets
Authors:
Mingqian He,
Fei Zhao,
Chonggang Lu,
Ziyan Liu,
Yue Wang,
Haofu Qian
Abstract:
As a fundamental task in machine learning, text classification plays a crucial role in many areas. With the rapid scaling of Large Language Models (LLMs), particularly through reinforcement learning (RL), there is a growing need for more capable discriminators. Consequently, advances in classification are becoming increasingly vital for enhancing the overall capabilities of LLMs. Traditional discriminative methods map text to labels but overlook LLMs' intrinsic generative strengths. Generative classification addresses this by prompting the model to directly output labels. However, existing studies still rely on simple SFT alone, seldom probing the interplay between training and inference prompts, and no work has systematically leveraged RL for generative text classifiers or unified SFT, RL, and inference-time prompting in one framework. We bridge this gap with GenCLS++, a framework that jointly optimizes SFT and RL while systematically exploring five high-level strategy dimensions (in-context learning variants, category definitions, explicit uncertainty labels, semantically irrelevant numeric labels, and perplexity-based decoding) during both training and inference. After an SFT "policy warm-up," we apply RL with a simple rule-based reward, yielding sizable extra gains. Across seven datasets, GenCLS++ achieves an average accuracy improvement of 3.46% relative to the naive SFT baseline; on public datasets, this improvement rises to 4.00%. Notably, unlike reasoning-intensive tasks that benefit from explicit thinking processes, we find that classification tasks perform better without such reasoning steps. These insights into the role of explicit reasoning provide valuable guidance for future LLM applications.
Submitted 28 April, 2025;
originally announced April 2025.
-
Searching for elusive dark Higgs boson in spin-1/2 inelastic dark matter models at Belle II
Authors:
P. Ko,
Youngjoon Kwon,
Chih-Ting Lu,
Xinqi Wei
Abstract:
Spin-1/2 inelastic dark matter (DM) models are popular among sub-GeV to GeV thermal DM scenarios due to the dominant role of co-annihilation in determining the DM relic abundance. In these models, the dark Higgs boson plays a crucial role in generating the mass of the new gauge boson, the dark photon ($A^{'}$), and in establishing the mass splitting between the excited ($χ_2$) and ground ($χ_1$) states of DM. In particular, the Compton scattering $χ_1 A' \rightarrow χ_2^* \rightarrow χ_1 A'$ and its $t$-channel crossed process, $χ_1 χ_1 \rightarrow A' A'$, remain unitary for high-energy longitudinal dark photons only if the contribution of the dark Higgs boson is included. However, experimental searches for the dark Higgs boson have received relatively little attention. In particular, when the dark Higgs boson mass exceeds twice that of the DM excited state, its decay signatures become semi-visible or invisible, making detection challenging with current light scalar search strategies. In this work, we explore the prospects for detecting the elusive dark Higgs boson in inelastic DM models at Belle II via dark Higgs-strahlung and rare $B$ meson decay processes. Our analysis indicates that both the inclusive signature of two displaced dilepton vertices and the additional missing energy from dark Higgs boson decays serve as robust indicators of its presence. Furthermore, we assess the future potential for detecting the dark Higgs boson with the proposed far detector related to Belle II, GAZELLE.
Submitted 26 April, 2025;
originally announced April 2025.
-
DiffUMI: Training-Free Universal Model Inversion via Unconditional Diffusion for Face Recognition
Authors:
Hanrui Wang,
Shuo Wang,
Chun-Shien Lu,
Isao Echizen
Abstract:
Face recognition technology presents serious privacy risks due to its reliance on sensitive and immutable biometric data. To address these concerns, such systems typically convert raw facial images into embeddings, which are traditionally viewed as privacy-preserving. However, model inversion attacks challenge this assumption by reconstructing private facial images from embeddings, highlighting a critical vulnerability in face recognition systems. Most existing inversion methods require training a separate generator for each target model, making them computationally intensive. In this work, we introduce DiffUMI, a diffusion-based universal model inversion attack that requires no additional training. DiffUMI is the first approach to successfully leverage unconditional face generation without relying on model-specific generators. It surpasses state-of-the-art attacks by 15.5% and 9.82% in success rate on standard and privacy-preserving face recognition systems, respectively. Furthermore, we propose a novel use of out-of-domain detection (OODD), demonstrating for the first time that model inversion can differentiate between facial and non-facial embeddings using only the embedding space.
Submitted 11 June, 2025; v1 submitted 24 April, 2025;
originally announced April 2025.