Showing 201–250 of 7,496 results for author: Zhou, Y

  1. arXiv:2506.03231  [pdf, ps, other]

    cs.NI cs.AI cs.LG

    NetPress: Dynamically Generated LLM Benchmarks for Network Applications

    Authors: Yajie Zhou, Jiajun Ruan, Eric S. Wang, Sadjad Fouladi, Francis Y. Yan, Kevin Hsieh, Zaoxing Liu

    Abstract: Despite growing interest in domain-specific benchmarking of large language models (LLMs) and agents, current evaluations remain limited to static, small-scale datasets, especially in high-stakes tasks like network operations that demand reliability for deployments. We present NetPress, an automated benchmark generation framework for evaluating LLM agents in network applications. NetPress introduce…

    Submitted 3 June, 2025; originally announced June 2025.

  2. arXiv:2506.03072  [pdf, ps, other]

    hep-ex

    Three-pion Bose-Einstein correlations measured in proton-proton collisions

    Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, L. An, et al. (1125 additional authors not shown)

    Abstract: A study on the Bose-Einstein correlations for triplets of same-sign pions is presented. The analysis is performed using proton-proton collisions at a centre-of-mass energy of $\sqrt{s}$ = 7 TeV, recorded by the LHCb experiment, corresponding to an integrated luminosity of 1.0 fb$^{-1}$. For the first time, the results are interpreted in the core-halo model. The parameters of the model are determin…

    Submitted 9 June, 2025; v1 submitted 3 June, 2025; originally announced June 2025.

    Comments: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://lbfence.cern.ch/alcm/public/analysis/full-details/3759/ (LHCb public pages)

    Report number: CERN-EP-2025-104, LHCb-PAPER-2025-007

  3. arXiv:2506.02969  [pdf, ps, other]

    hep-ex

    Measurement of the branching fractions of the Cabibbo-favored decays $Λ_{c}^{+}\toΛK_{S}^{0}K^{+}$ and $Λ_{c}^{+}\toΞ^{0}K_{S}^{0}π^{+}$ and search for $Λ_{c}^{+}\toΣ^{0} K_{S}^{0}K^{+}$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, H. Cai, et al. (660 additional authors not shown)

    Abstract: Based on $e^{+}e^{-}$ collision data corresponding to an integrated luminosity of about 4.5 fb$^{-1}$ collected at center-of-mass energies between 4599.53 MeV and 4698.82 MeV with the BESIII detector, the absolute branching fraction of the Cabibbo-favored decay $Λ_{c}^{+}\toΛK_{S}^{0}K^{+}$ is measured to be $(3.12\pm0.46\pm0.15)\times10^{-3}$. Combined with a previous measurement from the BESIII…

    Submitted 3 June, 2025; originally announced June 2025.

  4. arXiv:2506.02935  [pdf, ps, other]

    cs.LG

    MTL-KD: Multi-Task Learning Via Knowledge Distillation for Generalizable Neural Vehicle Routing Solver

    Authors: Yuepeng Zheng, Fu Luo, Zhenkun Wang, Yaoxin Wu, Yu Zhou

    Abstract: Multi-Task Learning (MTL) in Neural Combinatorial Optimization (NCO) is a promising approach to train a unified model capable of solving multiple Vehicle Routing Problem (VRP) variants. However, existing Reinforcement Learning (RL)-based multi-task methods can only train light decoder models on small-scale problems, exhibiting limited generalization ability when solving large-scale problems. To ov…

    Submitted 14 June, 2025; v1 submitted 3 June, 2025; originally announced June 2025.

    Comments: 24 pages, 5 figures, 8 tables

  5. arXiv:2506.02875  [pdf, ps, other]

    cs.CV

    NTIRE 2025 XGC Quality Assessment Challenge: Methods and Results

    Authors: Xiaohong Liu, Xiongkuo Min, Qiang Hu, Xiaoyun Zhang, Jie Guo, Guangtao Zhai, Shushi Wang, Yingjie Zhou, Lu Liu, Jingxin Li, Liu Yang, Farong Wen, Li Xu, Yanwei Jiang, Xilei Zhu, Chunyi Li, Zicheng Zhang, Huiyu Duan, Xiele Wu, Yixuan Gao, Yuqin Cao, Jun Jia, Wei Sun, Jiezhang Cao, Radu Timofte, et al. (70 additional authors not shown)

    Abstract: This paper reports on the NTIRE 2025 XGC Quality Assessment Challenge, which will be held in conjunction with the New Trends in Image Restoration and Enhancement Workshop (NTIRE) at CVPR 2025. This challenge is to address a major challenge in the field of video and talking head processing. The challenge is divided into three tracks, including user generated video, AI generated video and talking he…

    Submitted 3 June, 2025; originally announced June 2025.

    Comments: NTIRE 2025 XGC Quality Assessment Challenge Report. arXiv admin note: text overlap with arXiv:2404.16687

  6. arXiv:2506.02784  [pdf, ps, other]

    cs.IR

    UTCS: Effective Unsupervised Temporal Community Search with Pre-training of Temporal Dynamics and Subgraph Knowledge

    Authors: Yue Zhang, Yankai Chen, Yingli Zhou, Yucan Guo, Xiaolin Han, Chenhao Ma

    Abstract: In many real-world applications, the evolving relationships between entities can be modeled as temporal graphs, where each edge has a timestamp representing the interaction time. As a fundamental problem in graph analysis, community search (CS) in temporal graphs has received growing attention but exhibits two major limitations: (1) Traditional methods typically require predefined subgraph…

    Submitted 3 June, 2025; originally announced June 2025.

    Comments: Accepted by SIGIR'25 short paper track

  7. arXiv:2506.02623  [pdf, ps, other]

    cs.LG cs.AI cs.CV

    SiamNAS: Siamese Surrogate Model for Dominance Relation Prediction in Multi-objective Neural Architecture Search

    Authors: Yuyang Zhou, Ferrante Neri, Yew-Soon Ong, Ruibin Bai

    Abstract: Modern neural architecture search (NAS) is inherently multi-objective, balancing trade-offs such as accuracy, parameter count, and computational cost. This complexity makes NAS computationally expensive and nearly impossible to solve without efficient approximations. To address this, we propose a novel surrogate modelling approach that leverages an ensemble of Siamese network blocks to predict dom…

    Submitted 3 June, 2025; originally announced June 2025.

    Comments: Genetic and Evolutionary Computation Conference (GECCO '25)

  8. arXiv:2506.02580  [pdf, other]

    cs.AI

    V2X-UniPool: Unifying Multimodal Perception and Knowledge Reasoning for Autonomous Driving

    Authors: Xuewen Luo, Fengze Yang, Fan Ding, Xiangbo Gao, Shuo Xing, Yang Zhou, Zhengzhong Tu, Chenxi Liu

    Abstract: Knowledge-driven autonomous driving systems (ADs) offer powerful reasoning capabilities, but face two critical challenges: limited perception due to the short-sightedness of single-vehicle sensors, and hallucination arising from the lack of real-time environmental grounding. To address these issues, this paper introduces V2X-UniPool, a unified framework that integrates multimodal Vehicle-to-Everyth…

    Submitted 3 June, 2025; originally announced June 2025.

  9. arXiv:2506.02521  [pdf, ps, other]

    hep-ex

    Improved Measurements of $D^+ \to ηe^+ν_e$ and $D^+ \to ημ^+ν_μ$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, H. Cai, et al. (682 additional authors not shown)

    Abstract: Using 20.3 fb$^{-1}$ of $e^+e^-$ collision data collected at the center-of-mass energy of 3.773 GeV with the BESIII detector, we measure the branching fractions of $D^+\to ηe^+ν_e$ and $D^+\to ημ^+ν_μ$ to be $(9.75\pm0.29\pm0.28)\times10^{-4}$ and $(9.08\pm0.35\pm0.23)\times10^{-4}$, where the first and second uncertainties are statistical and systematic, respectively. From a simultaneous fit to t…

    Submitted 3 June, 2025; originally announced June 2025.

  10. arXiv:2506.02177  [pdf, ps, other]

    cs.AI cs.LG

    Act Only When It Pays: Efficient Reinforcement Learning for LLM Reasoning via Selective Rollouts

    Authors: Haizhong Zheng, Yang Zhou, Brian R. Bartoldson, Bhavya Kailkhura, Fan Lai, Jiawei Zhao, Beidi Chen

    Abstract: Reinforcement learning, such as PPO and GRPO, has powered recent breakthroughs in LLM reasoning. Scaling rollout to sample more prompts enables models to selectively use higher-quality data for training, which can stabilize RL training and improve model performance. However, this comes at the cost of significant computational overhead. In this paper, we show that a substantial portion of this over…

    Submitted 2 June, 2025; originally announced June 2025.

  11. arXiv:2506.02126  [pdf, ps, other]

    cs.CL

    Knowledge or Reasoning? A Close Look at How LLMs Think Across Domains

    Authors: Juncheng Wu, Sheng Liu, Haoqin Tu, Hang Yu, Xiaoke Huang, James Zou, Cihang Xie, Yuyin Zhou

    Abstract: Recent advances in reasoning-enhanced Large Language Models such as OpenAI-o1/3 and DeepSeek-R1 have significantly improved performance on complex tasks. However, the quality and transparency of their internal reasoning processes remain underexplored. This work moves beyond the final-answer accuracy and investigates step-by-step reasoning in the medical and mathematical domains by explicitly decom…

    Submitted 2 June, 2025; originally announced June 2025.

    Comments: 17 pages, preprint

  12. arXiv:2506.01907  [pdf, ps, other]

    cs.LG cs.CR stat.ML

    SMOTE-DP: Improving Privacy-Utility Tradeoff with Synthetic Data

    Authors: Yan Zhou, Bradley Malin, Murat Kantarcioglu

    Abstract: Privacy-preserving data publication, including synthetic data sharing, often experiences trade-offs between privacy and utility. Synthetic data is generally more effective than data anonymization in balancing this trade-off, however, not without its own challenges. Synthetic data produced by generative models trained on source data may inadvertently reveal information about outliers. Techniques sp…

    Submitted 2 June, 2025; originally announced June 2025.

  13. arXiv:2506.01865  [pdf, ps, other]

    math.NT

    Some series connecting Fibonacci numbers to $π$

    Authors: Zhi-Wei Sun, Yajun Zhou

    Abstract: Exploring the theory of Guillera--Rogers, we evaluate some infinite series whose summands are quadratic irrationals, in terms of $π$ and special values of Dirichlet $L$-functions. For example, we show that \[\sum_{k=1}^\infty\frac{3 \left(16 \sqrt{5}-35\right) k-4 \left(5 \sqrt{5}-11\right)}{k^{3}\binom{2k}{k}^3}\left(\frac{1+\sqrt{5}}{2} \right)^{8 k}=\frac{71π^{2}}{30}\]and\begin{align*}&\sum_{k…

    Submitted 12 June, 2025; v1 submitted 2 June, 2025; originally announced June 2025.

    Comments: 15 pages, a new result added [(1.13) in Theorem 1.2(c)]

    MSC Class: 11B39; 11F99

  14. arXiv:2506.01801  [pdf, ps, other]

    cs.CV

    OmniV2V: Versatile Video Generation and Editing via Dynamic Content Manipulation

    Authors: Sen Liang, Zhentao Yu, Zhengguang Zhou, Teng Hu, Hongmei Wang, Yi Chen, Qin Lin, Yuan Zhou, Xin Li, Qinglin Lu, Zhibo Chen

    Abstract: The emergence of Diffusion Transformers (DiT) has brought significant advancements to video generation, especially in text-to-video and image-to-video tasks. Although video generation is widely applied in various fields, most existing models are limited to single scenarios and cannot perform diverse video generation and editing through dynamic content manipulation. We propose OmniV2V, a video mode…

    Submitted 2 June, 2025; originally announced June 2025.

  15. arXiv:2506.01721  [pdf]

    quant-ph

    Macroscopic entanglement of three magnon modes in three cavities via optical parametric amplifier

    Authors: Ying Zhou, Guo-Qiang Zhang

    Abstract: We propose a scheme to generate bipartite and tripartite entanglements of three magnon modes in a three-cavity system using a nonlinear optical parametric amplifier (OPA). The three magnon modes in three YIG spheres are respectively placed inside three cavities near the maximum magnetic fields of the cavities and coupled to cavity modes via linear magnetic dipole interaction. Additionally, linear…

    Submitted 14 June, 2025; v1 submitted 2 June, 2025; originally announced June 2025.

    Comments: 9 pages, 7 figures

    ACM Class: J.2.9

  16. arXiv:2506.01716  [pdf, ps, other]

    cs.AI cs.CL

    Self-Challenging Language Model Agents

    Authors: Yifei Zhou, Sergey Levine, Jason Weston, Xian Li, Sainbayar Sukhbaatar

    Abstract: Large language models are quickly becoming the foundation for intelligent agents that are capable of using tools. However, training such agents is challenging because it requires human creation and annotation of a diverse set of tasks, tools, and evaluation criteria. In this paper, we propose the Self-Challenging framework for training an agent on high-quality tasks that are generated by itself. T…

    Submitted 2 June, 2025; originally announced June 2025.

  17. arXiv:2506.01686  [pdf, ps, other]

    physics.comp-ph

    A Graph Neural Network for the Era of Large Atomistic Models

    Authors: Duo Zhang, Anyang Peng, Chun Cai, Wentao Li, Yuanchang Zhou, Jinzhe Zeng, Mingyu Guo, Chengqian Zhang, Bowen Li, Hong Jiang, Tong Zhu, Weile Jia, Linfeng Zhang, Han Wang

    Abstract: Foundation models, or large atomistic models (LAMs), aim to universally represent the ground-state potential energy surface (PES) of atomistic systems as defined by density functional theory (DFT). The scaling law is pivotal in the development of large models, suggesting that their generalizability in downstream tasks consistently improves with increased model size, expanded training datasets, and…

    Submitted 9 June, 2025; v1 submitted 2 June, 2025; originally announced June 2025.

  18. arXiv:2506.01657  [pdf, ps, other]

    quant-ph

    State Similarity in Modular Superconducting Quantum Processors with Classical Communications

    Authors: Bujiao Wu, Changrong Xie, Peng Mi, Zhiyi Wu, Zechen Guo, Peisheng Huang, Wenhui Huang, Xuandong Sun, Jiawei Zhang, Libo Zhang, Jiawei Qiu, Xiayu Linpeng, Ziyu Tao, Ji Chu, Ji Jiang, Song Liu, Jingjing Niu, Yuxuan Zhou, Yuxuan Du, Wenhui Ren, Youpeng Zhong, Tongliang Liu, Dapeng Yu

    Abstract: As quantum devices continue to scale, distributed quantum computing emerges as a promising strategy for executing large-scale tasks across modular quantum processors. A central challenge in this paradigm is verifying the correctness of computational outcomes when subcircuits are executed independently following circuit cutting. Here we propose a cross-platform fidelity estimation algorithm tailore…

    Submitted 11 June, 2025; v1 submitted 2 June, 2025; originally announced June 2025.

    Comments: 10 pages, 3 figures, 27-page appendix, reference citation typos corrected

  19. arXiv:2506.01327  [pdf, ps, other]

    cs.LG cs.AI

    STSA: Federated Class-Incremental Learning via Spatial-Temporal Statistics Aggregation

    Authors: Zenghao Guan, Guojun Zhu, Yucan Zhou, Wu Liu, Weiping Wang, Jiebo Luo, Xiaoyan Gu

    Abstract: Federated Class-Incremental Learning (FCIL) enables Class-Incremental Learning (CIL) from distributed data. Existing FCIL methods typically integrate old knowledge preservation into local client training. However, these methods cannot avoid spatial-temporal client drift caused by data heterogeneity and often incur significant computational and communication overhead, limiting practical deployment.…

    Submitted 2 June, 2025; originally announced June 2025.

  20. arXiv:2506.01300  [pdf, ps, other]

    cs.CV

    ReAgent-V: A Reward-Driven Multi-Agent Framework for Video Understanding

    Authors: Yiyang Zhou, Yangfan He, Yaofeng Su, Siwei Han, Joel Jang, Gedas Bertasius, Mohit Bansal, Huaxiu Yao

    Abstract: Video understanding is fundamental to tasks such as action recognition, video reasoning, and robotic control. Early video understanding methods based on large vision-language models (LVLMs) typically adopt a single-pass reasoning paradigm without dynamic feedback, limiting the model's capacity to self-correct and adapt in complex scenarios. Recent efforts have attempted to address this limitation…

    Submitted 2 June, 2025; originally announced June 2025.

    Comments: 31 pages, 18 figures

  21. arXiv:2506.01297  [pdf, ps, other]

    cs.AI

    MobCLIP: Learning General-purpose Geospatial Representation at Scale

    Authors: Ya Wen, Jixuan Cai, Qiyao Ma, Linyan Li, Xinhua Chen, Chris Webster, Yulun Zhou

    Abstract: Representation learning of geospatial locations remains a core challenge in achieving general geospatial intelligence. Current embedding methods often lack versatility, limiting their utility across diverse tasks in both human and natural domains. We present MobCLIP, the first nationwide general-purpose location encoder, integrating an unprecedented diversity of data modalities through effective a…

    Submitted 3 June, 2025; v1 submitted 2 June, 2025; originally announced June 2025.

  22. arXiv:2506.01231  [pdf, ps, other]

    cs.LG cs.AI

    Towards Efficient Few-shot Graph Neural Architecture Search via Partitioning Gradient Contribution

    Authors: Wenhao Song, Xuan Wu, Bo Yang, You Zhou, Yubin Xiao, Yanchun Liang, Hongwei Ge, Heow Pueh Lee, Chunguo Wu

    Abstract: To address the weight coupling problem, certain studies introduced few-shot Neural Architecture Search (NAS) methods, which partition the supernet into multiple sub-supernets. However, these methods often suffer from computational inefficiency and tend to provide suboptimal partitioning schemes. To address this problem more effectively, we analyze the weight coupling problem from a novel perspecti…

    Submitted 20 June, 2025; v1 submitted 1 June, 2025; originally announced June 2025.

    Comments: Accepted by SIGKDD 2025

  23. arXiv:2506.01103  [pdf, ps, other]

    cs.CV

    DeepVerse: 4D Autoregressive Video Generation as a World Model

    Authors: Junyi Chen, Haoyi Zhu, Xianglong He, Yifan Wang, Jianjun Zhou, Wenzheng Chang, Yang Zhou, Zizun Li, Zhoujie Fu, Jiangmiao Pang, Tong He

    Abstract: World models serve as essential building blocks toward Artificial General Intelligence (AGI), enabling intelligent agents to predict future states and plan actions by simulating complex physical interactions. However, existing interactive models primarily predict visual observations, thereby neglecting crucial hidden states like geometric structures and spatial coherence. This leads to rapid error…

    Submitted 1 June, 2025; originally announced June 2025.

  24. arXiv:2506.01043  [pdf, ps, other]

    eess.SP

    A Group-Wise Narrow Beam Design for Uplink Channel Estimation in Hybrid Beamforming Systems

    Authors: Yufan Zhou, Yongbo Xiao, An Liu

    Abstract: In this paper, we consider uplink channel estimation for massive multi-input multi-output (MIMO) systems with partially connected hybrid beamforming (PC-HBF) structures. Existing beam design and channel estimation schemes are usually based on ideal assumptions and require transmitting pilots across multiple timeslots, making them unsuitable for practical PC-HBF systems. To overcome these drawbacks…

    Submitted 1 June, 2025; originally announced June 2025.

  25. arXiv:2506.00883  [pdf, ps, other]

    cs.CL

    Improve MLLM Benchmark Efficiency through Interview

    Authors: Farong Wen, Yijin Guo, Junying Wang, Jiaohao Xiao, Yingjie Zhou, Chunyi Li, Zicheng Zhang, Guangtao Zhai

    Abstract: The rapid development of Multimodal Large Language Models (MLLM) has led to a wide range of MLLM applications, and a number of benchmark datasets have sprung up in order to assess MLLM abilities. However, full-coverage Q&A testing on large-scale data is resource-intensive and time-consuming. To address this issue, we propose the MLLM Interview (MITV) strategy, which aims to quickly obtain MLLM per…

    Submitted 1 June, 2025; originally announced June 2025.

  26. arXiv:2506.00874  [pdf, ps, other]

    cs.CV

    Breaking Latent Prior Bias in Detectors for Generalizable AIGC Image Detection

    Authors: Yue Zhou, Xinan He, KaiQing Lin, Bin Fan, Feng Ding, Bin Li

    Abstract: Current AIGC detectors often achieve near-perfect accuracy on images produced by the same generator used for training but struggle to generalize to outputs from unseen generators. We trace this failure in part to latent prior bias: detectors learn shortcuts tied to patterns stemming from the initial noise vector rather than learning robust generative artifacts. To address this, we propose On-Manif…

    Submitted 1 June, 2025; originally announced June 2025.

  27. arXiv:2506.00830  [pdf, ps, other]

    cs.CV

    SkyReels-Audio: Omni Audio-Conditioned Talking Portraits in Video Diffusion Transformers

    Authors: Zhengcong Fei, Hao Jiang, Di Qiu, Baoxuan Gu, Youqiang Zhang, Jiahua Wang, Jialin Bai, Debang Li, Mingyuan Fan, Guibin Chen, Yahui Zhou

    Abstract: The generation and editing of audio-conditioned talking portraits guided by multimodal inputs, including text, images, and videos, remains underexplored. In this paper, we present SkyReels-Audio, a unified framework for synthesizing high-fidelity and temporally coherent talking portrait videos. Built upon pretrained video diffusion transformers, our framework supports infinite-length generation a…

    Submitted 1 June, 2025; originally announced June 2025.

  28. arXiv:2506.00612  [pdf, ps, other]

    cs.CL

    Enhancing Clinical Multiple-Choice Questions Benchmarks with Knowledge Graph Guided Distractor Generation

    Authors: Running Yang, Wenlong Deng, Minghui Chen, Yuyin Zhou, Xiaoxiao Li

    Abstract: Clinical tasks such as diagnosis and treatment require strong decision-making abilities, highlighting the importance of rigorous evaluation benchmarks to assess the reliability of large language models (LLMs). In this work, we introduce a knowledge-guided data augmentation framework that enhances the difficulty of clinical multiple-choice question (MCQ) datasets by generating distractors (i.e., in…

    Submitted 3 July, 2025; v1 submitted 31 May, 2025; originally announced June 2025.

  29. arXiv:2506.00583  [pdf, ps, other]

    cs.CL cs.CY cs.HC

    The Hidden Language of Harm: Examining the Role of Emojis in Harmful Online Communication and Content Moderation

    Authors: Yuhang Zhou, Yimin Xiao, Wei Ai, Ge Gao

    Abstract: Social media platforms have become central to modern communication, yet they also harbor offensive content that challenges platform safety and inclusivity. While prior research has primarily focused on textual indicators of offense, the role of emojis, ubiquitous visual elements in online discourse, remains underexplored. Emojis, despite being rarely offensive in isolation, can acquire harmful mea…

    Submitted 31 May, 2025; originally announced June 2025.

    Comments: 18 pages, 3 figures

  30. arXiv:2506.00577  [pdf, ps, other]

    cs.AI cs.CL cs.GT cs.MA

    Reasoning Like an Economist: Post-Training on Economic Problems Induces Strategic Generalization in LLMs

    Authors: Yufa Zhou, Shaobo Wang, Xingyu Dong, Xiangqi Jin, Yifang Chen, Yue Min, Kexin Yang, Xingzhang Ren, Dayiheng Liu, Linfeng Zhang

    Abstract: Directly training Large Language Models (LLMs) for Multi-Agent Systems (MAS) remains challenging due to intricate reward modeling, dynamic agent interactions, and demanding generalization requirements. This paper explores whether post-training techniques, specifically Supervised Fine-Tuning (SFT) and Reinforcement Learning with Verifiable Rewards (RLVR), can effectively $\textit{generalize}$ to mu…

    Submitted 31 May, 2025; originally announced June 2025.

  31. arXiv:2506.00536  [pdf, ps, other]

    cs.CL cs.AI

    Decoupling Reasoning and Knowledge Injection for In-Context Knowledge Editing

    Authors: Changyue Wang, Weihang Su, Qingyao Ai, Yujia Zhou, Yiqun Liu

    Abstract: Knowledge editing aims to efficiently update Large Language Models (LLMs) by modifying specific knowledge without retraining the entire model. Among knowledge editing approaches, in-context editing (ICE) offers a lightweight solution by injecting new knowledge directly into the input context, leaving model parameters unchanged. However, existing ICE approaches do not explicitly separate the newly…

    Submitted 31 May, 2025; originally announced June 2025.

  32. arXiv:2506.00493  [pdf, ps, other]

    astro-ph.SR

    Asteroseismology of the G8 subgiant beta Aquilae with SONG-Tenerife, SONG-Australia and TESS

    Authors: Hans Kjeldsen, Timothy R. Bedding, Yaguang Li, Frank Grundahl, Mads Fredslund Andersen, Duncan J. Wright, Jack Soutter, Robert Wittenmyer, Claudia Reyes, Dennis Stello, Courtney Crawford, Yixiao Zhou, Mathieu Clerte, Pere L. Palle, Sergio Simon-Diaz, Joergen Christensen-Dalsgaard, Rasmus Handberg, Hasse Hansen, Paul Heeren, Jens Jessen-Hansen, Mikkel N. Lund, Mia S. Lundkvist, Karsten Brogaard, Rene Tronsgaard, Jonatan Rudrasingam, et al. (6 additional authors not shown)

    Abstract: We present time-series radial velocities of the G8 subgiant star beta Aql obtained in 2022 and 2023 using SONG-Tenerife and, for the first time, SONG-Australia. We also analyse a sector of TESS photometry that overlapped with the 2022 SONG data. The resulting power spectrum clearly shows solar-like oscillations centred at 430 muHz. The TESS light curve shows the oscillations at lower signal-to-noi…

    Submitted 16 June, 2025; v1 submitted 31 May, 2025; originally announced June 2025.

    Comments: accepted by A&A

  33. arXiv:2506.00345  [pdf, ps, other]

    quant-ph

    Strain Enhanced Spin Readout Contrast in Silicon Carbide Membranes

    Authors: Haibo Hu, Guodong Bian, Ailun Yi, Chunhui Jiang, Junhua Tan, Qi Luo, Bo Liang, Zhengtong Liu, Xinfang Nie, Dawei Lu, Shumin Xiao, Xin Ou, Adam Gali, Yu Zhou, Qinghai Song

    Abstract: Quantum defects in solids have emerged as a transformative platform for advancing quantum technologies. A key requirement for these applications is achieving high-fidelity single-spin readout, particularly at room temperature for quantum biosensing. Here, we demonstrate through ab initio simulations of a primary quantum defect in 4H silicon carbide that strain is an effective control parameter for…

    Submitted 30 May, 2025; originally announced June 2025.

  34. arXiv:2506.00312  [pdf, ps, other]

    cs.CL cs.AI

    An evaluation of LLMs for generating movie reviews: GPT-4o, Gemini-2.0 and DeepSeek-V3

    Authors: Brendan Sands, Yining Wang, Chenhao Xu, Yuxuan Zhou, Lai Wei, Rohitash Chandra

    Abstract: Large language models (LLMs) have been prominent in various tasks, including text generation and summarisation. The applicability of LLMs to the generation of product reviews is gaining momentum, paving the way for the generation of movie reviews. In this study, we propose a framework that generates movie reviews using three LLMs (GPT-4o, DeepSeek-V3, and Gemini-2.0), and evaluate their performanc…

    Submitted 30 May, 2025; originally announced June 2025.

  35. arXiv:2505.24787  [pdf, ps, other]

    cs.CV cs.CL

    Draw ALL Your Imagine: A Holistic Benchmark and Agent Framework for Complex Instruction-based Image Generation

    Authors: Yucheng Zhou, Jiahao Yuan, Qianning Wang

    Abstract: Recent advancements in text-to-image (T2I) generation have enabled models to produce high-quality images from textual descriptions. However, these models often struggle with complex instructions involving multiple objects, attributes, and spatial relationships. Existing benchmarks for evaluating T2I models primarily focus on general text-image alignment and fail to capture the nuanced requirements…

    Submitted 30 May, 2025; originally announced May 2025.

  36. arXiv:2505.24773  [pdf, other]

    cs.LG

    AFLoRA: Adaptive Federated Fine-Tuning of Large Language Models with Resource-Aware Low-Rank Adaption

    Authors: Yajie Zhou, Xiaoyi Pang, Zhibo Wang

    Abstract: Federated fine-tuning has emerged as a promising approach to adapt foundation models to downstream tasks using decentralized data. However, real-world deployment remains challenging due to the high computational and communication demands of fine-tuning Large Language Models (LLMs) on clients with data and system resources that are heterogeneous and constrained. In such settings, the global model's…

    Submitted 30 May, 2025; originally announced May 2025.

  37. arXiv:2505.24678  [pdf]

    physics.optics

    All-optical diode via nonreciprocal nonlinear absorption and interfacial charge transfer in two-dimensional van der Waals heterostructures

    Authors: Erkang Li, Jinhong Liu, Yanqing Ge, Mingjian Shi, Yijie Wang, Chunhui Lu, Yixuan Zhou, Xinlong Xu

    Abstract: Nonreciprocity is fundamental to photonic and optoelectronic devices such as all-optical diodes for ultrafast optical signal processing. However, previous nonreciprocity is mainly based on linear optical response instead of nonlinear optical response based on recently developed two-dimensional (2D) van der Waals heterostructures. Herein, an all-optical diode prototype based on nonreciprocal nonlin…

    Submitted 30 May, 2025; originally announced May 2025.

  38. arXiv:2505.24283  [pdf, ps, other]

    math.PR math-ph

    Characterizing the limiting critical Potts measures on locally regular-tree-like expander graphs

    Authors: Hang Du, Yanxin Zhou

    Abstract: For any integers $d,q\ge 3$, we consider the $q$-state ferromagnetic Potts model with an external field on a sequence of expander graphs that converges to the $d$-regular tree $\mathtt{T}_d$ in the Benjamini-Schramm sense. We show that along the critical line, any subsequential local weak limit of the Potts measures is a mixture of the free and wired Potts Gibbs measures on $\mathtt{T}_d$. Further…

    Submitted 30 May, 2025; originally announced May 2025.

    Comments: 52 pages, 1 figure

    MSC Class: 60K35; 82B20; 82B27

  39. arXiv:2505.24160  [pdf, ps, other]

    eess.IV cs.CV

    Beyond the LUMIR challenge: The pathway to foundational registration models

    Authors: Junyu Chen, Shuwen Wei, Joel Honkamaa, Pekka Marttinen, Hang Zhang, Min Liu, Yichao Zhou, Zuopeng Tan, Zhuoyuan Wang, Yi Wang, Hongchao Zhou, Shunbo Hu, Yi Zhang, Qian Tao, Lukas Förner, Thomas Wendler, Bailiang Jian, Benedikt Wiestler, Tim Hable, Jin Kim, Dan Ruan, Frederic Madesta, Thilo Sentker, Wiebke Heyer, Lianrui Zuo, et al. (11 additional authors not shown)

    Abstract: Medical image challenges have played a transformative role in advancing the field, catalyzing algorithmic innovation and establishing new performance standards across diverse clinical applications. Image registration, a foundational task in neuroimaging pipelines, has similarly benefited from the Learn2Reg initiative. Building on this foundation, we introduce the Large-scale Unsupervised Brain MRI…

    Submitted 29 May, 2025; originally announced May 2025.

  40. arXiv:2505.24133  [pdf, ps, other]

    cs.CL cs.AI

    R-KV: Redundancy-aware KV Cache Compression for Reasoning Models

    Authors: Zefan Cai, Wen Xiao, Hanshi Sun, Cheng Luo, Yikai Zhang, Ke Wan, Yucheng Li, Yeyang Zhou, Li-Wen Chang, Jiuxiang Gu, Zhen Dong, Anima Anandkumar, Abedelkadir Asi, Junjie Hu

    Abstract: Reasoning models have demonstrated impressive performance in self-reflection and chain-of-thought reasoning. However, they often produce excessively long outputs, leading to prohibitively large key-value (KV) caches during inference. While chain-of-thought inference significantly improves performance on complex reasoning tasks, it can also lead to reasoning failures when deployed with existing KV…

    Submitted 13 June, 2025; v1 submitted 29 May, 2025; originally announced May 2025.

  41. arXiv:2505.24029  [pdf, ps, other]

    eess.SY cs.RO

    Nonlinear Oscillatory Response of Automated Vehicle Car-following: Theoretical Analysis with Traffic State and Control Input Limits

    Authors: Sixu Li, Yang Zhou

    Abstract: This paper presents a framework grounded in the theory of describing function (DF) and incremental-input DF to theoretically analyze the nonlinear oscillatory response of automated vehicles (AVs) car-following (CF) amidst traffic oscillations, considering the limits of traffic state and control input. While prevailing approaches largely ignore these limits (i.e., saturation of acceleration/deceler…

    Submitted 29 May, 2025; originally announced May 2025.

  42. arXiv:2505.23885  [pdf, ps, other]

    cs.AI cs.CL

    OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation

    Authors: Mengkang Hu, Yuhang Zhou, Wendong Fan, Yuzhou Nie, Bowei Xia, Tao Sun, Ziyu Ye, Zhaoxuan Jin, Yingru Li, Qiguang Chen, Zeyu Zhang, Yifeng Wang, Qianshuo Ye, Bernard Ghanem, Ping Luo, Guohao Li

    Abstract: Large Language Model (LLM)-based multi-agent systems show promise for automating real-world tasks but struggle to transfer across domains due to their domain-specific nature. Current approaches face two critical shortcomings: they require complete architectural redesign and full retraining of all components when applied to new domains. We introduce Workforce, a hierarchical multi-agent framework t…

    Submitted 10 June, 2025; v1 submitted 29 May, 2025; originally announced May 2025.

    Comments: Project Page: https://github.com/camel-ai/owl

  43. arXiv:2505.23866  [pdf, ps, other]

    cs.LG cs.AI

    Towards Understanding The Calibration Benefits of Sharpness-Aware Minimization

    Authors: Chengli Tan, Yubo Zhou, Haishan Ye, Guang Dai, Junmin Liu, Zengjie Song, Jiangshe Zhang, Zixiang Zhao, Yunda Hao, Yong Xu

    Abstract: Deep neural networks have been increasingly used in safety-critical applications such as medical diagnosis and autonomous driving. However, many studies suggest that they are prone to being poorly calibrated and have a propensity for overconfidence, which may have disastrous consequences. In this paper, unlike standard training such as stochastic gradient descent, we show that the recently propose…

    Submitted 29 May, 2025; originally announced May 2025.

    Comments: 16 pages

  44. arXiv:2505.23827  [pdf, ps, other]

    cs.CL

    ValueSim: Generating Backstories to Model Individual Value Systems

    Authors: Bangde Du, Ziyi Ye, Zhijing Wu, Jankowska Monika, Shuqi Zhu, Qingyao Ai, Yujia Zhou, Yiqun Liu

    Abstract: As Large Language Models (LLMs) continue to exhibit increasingly human-like capabilities, aligning them with human values has become critically important. Contemporary advanced techniques, such as prompt learning and reinforcement learning, are being deployed to better align LLMs with human values. However, while these approaches address broad ethical considerations and helpfulness, they rarely fo…

    Submitted 5 June, 2025; v1 submitted 28 May, 2025; originally announced May 2025.

    Comments: 8 pages main paper + 13 pages appendix, 3 figures, 2 tables

  45. arXiv:2505.23713  [pdf, ps, other]

    cs.CL

    SocialMaze: A Benchmark for Evaluating Social Reasoning in Large Language Models

    Authors: Zixiang Xu, Yanbo Wang, Yue Huang, Jiayi Ye, Haomin Zhuang, Zirui Song, Lang Gao, Chenxi Wang, Zhaorun Chen, Yujun Zhou, Sixian Li, Wang Pan, Yue Zhao, Jieyu Zhao, Xiangliang Zhang, Xiuying Chen

    Abstract: Large language models (LLMs) are increasingly applied to socially grounded tasks, such as online community moderation, media content analysis, and social reasoning games. Success in these contexts depends on a model's social reasoning ability - the capacity to interpret social contexts, infer others' mental states, and assess the truthfulness of presented information. However, there is currently n…

    Submitted 29 May, 2025; originally announced May 2025.

    Comments: Code available at https://github.com/xzx34/SocialMaze

  46. arXiv:2505.23566  [pdf, ps, other]

    cs.CV

    Uni-MuMER: Unified Multi-Task Fine-Tuning of Vision-Language Model for Handwritten Mathematical Expression Recognition

    Authors: Yu Li, Jin Jiang, Jianhua Zhu, Shuai Peng, Baole Wei, Yuxuan Zhou, Liangcai Gao

    Abstract: Handwritten Mathematical Expression Recognition (HMER) remains a persistent challenge in Optical Character Recognition (OCR) due to the inherent freedom of symbol layout and variability in handwriting styles. Prior methods have faced performance bottlenecks, proposing isolated architectural modifications that are difficult to integrate coherently into a unified framework. Meanwhile, recent advance…

    Submitted 1 June, 2025; v1 submitted 29 May, 2025; originally announced May 2025.

  47. arXiv:2505.23530  [pdf, ps, other]

    hep-ex

    Measurement of the Lund plane for light- and beauty-quark jets

    Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, L. An, et al. (1133 additional authors not shown)

    Abstract: The substructure of jets in quantum chromodynamics (QCD) has garnered significant attention with the advent of infrared- and collinear-safe clustering algorithms and observables. A key question emerging from these studies is how in-jet emissions at soft and hard energy scales, across collinear and wide angles relative to the emitter, differ with the mass of the emitting parton. The Lund jet plane…

    Submitted 29 May, 2025; originally announced May 2025.

    Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2025-010.html (LHCb public pages)

    Report number: LHCb-PAPER-2025-010,CERN-EP-2025-093

  48. arXiv:2505.23229  [pdf, ps, other]

    cs.CL cs.AI cs.CY

    MCTSr-Zero: Self-Reflective Psychological Counseling Dialogues Generation via Principles and Adaptive Exploration

    Authors: Hao Lu, Yanchi Gu, Haoyuan Huang, Yulin Zhou, Ningxin Zhu, Chen Li

    Abstract: The integration of Monte Carlo Tree Search (MCTS) with Large Language Models (LLMs) has demonstrated significant success in structured, problem-oriented tasks. However, applying these methods to open-ended dialogues, such as those in psychological counseling, presents unique challenges. Unlike tasks with objective correctness, success in therapeutic conversations depends on subjective factors like…

    Submitted 29 May, 2025; originally announced May 2025.

    Comments: 50 pages, 3 figures

  49. arXiv:2505.23219  [pdf, ps, other]

    cs.DC

    Ghidorah: Fast LLM Inference on Edge with Speculative Decoding and Hetero-Core Parallelism

    Authors: Jinhui Wei, Ye Huang, Yuhui Zhou, Jiazhi Jiang, Jiangsu Du, Yutong Lu

    Abstract: In-situ LLM inference on end-user devices has gained significant interest due to its privacy benefits and reduced dependency on external infrastructure. However, as the decoding process is memory-bandwidth-bound, the diverse processing units in modern end-user devices cannot be fully exploited, resulting in slow LLM inference. This paper presents Ghidorah, an LLM inference system for end-user devic…

    Submitted 9 June, 2025; v1 submitted 29 May, 2025; originally announced May 2025.

    Comments: 8 pages

  50. arXiv:2505.23020  [pdf, ps, other]

    cs.CR cs.AI cs.CL

    AgentAlign: Navigating Safety Alignment in the Shift from Informative to Agentic Large Language Models

    Authors: Jinchuan Zhang, Lu Yin, Yan Zhou, Songlin Hu

    Abstract: The acquisition of agentic capabilities has transformed LLMs from "knowledge providers" to "action executors", a trend that, while expanding LLMs' capability boundaries, significantly increases their susceptibility to malicious use. Previous work has shown that current LLM-based agents execute numerous malicious tasks even without being attacked, indicating a deficiency in agentic use safety alignm…

    Submitted 28 May, 2025; originally announced May 2025.

    Comments: Submitted to ACL 2025