Showing 1–50 of 308 results for author: Wei, k

  1. arXiv:2506.14017  [pdf]

    cond-mat.mtrl-sci

    Structural Inhomogeneities and Suppressed Magneto-Structural Coupling in Mn-Substituted GeCo2O4

    Authors: Shivani Sharma, Pooja Jain, Benny Schundelmier, Chin-Wei Wang, Poonam Yadav, Adrienn Maria Szucs, Kaya Wei, N. P. Lalla, Theo Siegrist

    Abstract: A comprehensive study of the Ge1-xMnxCo2O4 (GMCO) system was conducted using neutron powder diffraction (NPD), x-ray diffraction (XRD), scanning electron microscopy, magnetometry, and heat capacity measurements. Comparative analysis with GeCo2O4 (GCO) highlights the influence of Mn substitution on the crystal and magnetic structure at low temperature. Surprisingly, phase separation is observed in GMCO…

    Submitted 16 June, 2025; originally announced June 2025.

    Comments: 19 pages,

  2. arXiv:2506.13776  [pdf, ps, other]

    cs.AI cs.CY cs.HC

    Recommendations and Reporting Checklist for Rigorous & Transparent Human Baselines in Model Evaluations

    Authors: Kevin L. Wei, Patricia Paskov, Sunishchal Dev, Michael J. Byun, Anka Reuel, Xavier Roberts-Gaal, Rachel Calcott, Evie Coxon, Chinmay Deshpande

    Abstract: In this position paper, we argue that human baselines in foundation model evaluations must be more rigorous and more transparent to enable meaningful comparisons of human vs. AI performance, and we provide recommendations and a reporting checklist towards this end. Human performance baselines are vital for the machine learning community, downstream users, and policymakers to interpret AI evaluatio…

    Submitted 9 June, 2025; originally announced June 2025.

    Comments: A version of this paper has been accepted to ICML 2025 as a position paper (spotlight), with the title: "Position: Human Baselines in Model Evaluations Need Rigor and Transparency (With Recommendations & Reporting Checklist)."

  3. arXiv:2506.09562  [pdf, ps, other]

    cs.CR cs.LG

    TooBadRL: Trigger Optimization to Boost Effectiveness of Backdoor Attacks on Deep Reinforcement Learning

    Authors: Songze Li, Mingxuan Zhang, Kang Wei, Shouling Ji

    Abstract: Deep reinforcement learning (DRL) has achieved remarkable success in a wide range of sequential decision-making domains, including robotics, healthcare, smart grids, and finance. Recent research demonstrates that attackers can efficiently exploit system vulnerabilities during the training phase to execute backdoor attacks, producing malicious actions when specific trigger patterns are present in t…

    Submitted 12 June, 2025; v1 submitted 11 June, 2025; originally announced June 2025.

  4. arXiv:2506.04061  [pdf, ps, other]

    physics.optics

    Collaborative On-Sensor Array Cameras

    Authors: Jipeng Sun, Kaixuan Wei, Thomas Eboli, Congli Wang, Cheng Zheng, Zhihao Zhou, Arka Majumdar, Wolfgang Heidrich, Felix Heide

    Abstract: Modern nanofabrication techniques have enabled us to manipulate the wavefront of light with sub-wavelength-scale structures, offering the potential to replace bulky refractive surfaces in conventional optics with ultrathin metasurfaces. In theory, arrays of nanoposts provide unprecedented control over manipulating the wavefront in terms of phase, polarization, and amplitude at the nanometer resolu…

    Submitted 4 June, 2025; originally announced June 2025.

    Comments: To appear in ACM Transactions on Graphics and to be presented at SIGGRAPH 2025

    ACM Class: I.2.11; I.4; J.2

  5. Bridging the Artificial Intelligence Governance Gap: The United States' and China's Divergent Approaches to Governing General-Purpose Artificial Intelligence

    Authors: Oliver Guest, Kevin Wei

    Abstract: The United States and China are among the world's top players in the development of advanced artificial intelligence (AI) systems, and both are keen to lead in global AI governance and development. A look at U.S. and Chinese policy landscapes reveals differences in how the two countries approach the governance of general-purpose artificial intelligence (GPAI) systems. Three areas of divergence are…

    Submitted 3 June, 2025; originally announced June 2025.

    Comments: Published as a RAND commentary

    Report number: PE-A3703-1

    Journal ref: Santa Monica, CA: RAND Corporation, 2024. https://www.rand.org/pubs/perspectives/PEA3703-1.html

  6. arXiv:2505.22313  [pdf, ps, other]

    physics.optics cs.CV cs.ET cs.GR

    Large-Area Fabrication-aware Computational Diffractive Optics

    Authors: Kaixuan Wei, Hector A. Jimenez-Romero, Hadi Amata, Jipeng Sun, Qiang Fu, Felix Heide, Wolfgang Heidrich

    Abstract: Differentiable optics, as an emerging paradigm that jointly optimizes optics and (optional) image processing algorithms, has made innovative optical designs possible across a broad range of applications. Many of these systems utilize diffractive optical components (DOEs) for holography, PSF engineering, or wavefront shaping. Existing approaches have, however, mostly remained limited to laboratory…

    Submitted 28 May, 2025; originally announced May 2025.

  7. arXiv:2505.21227  [pdf, other]

    quant-ph

    Strong Molecule-Light Entanglement with Molecular Cavity Optomechanics

    Authors: Hong-Yun Yu, Ya-Feng Jiao, Jie Wang, Feng Li, Bin Yin, Tian Jiang, Qi-Rui Liu, Hui Jing, Ke Wei

    Abstract: We propose a molecular optomechanical platform to generate robust entanglement among bosonic modes (photons, phonons, and plasmons) under ambient conditions. The system integrates an ultrahigh-Q whispering-gallery-mode (WGM) optical resonator with a plasmonic nanocavity formed by a metallic nanoparticle and a single molecule. This hybrid architecture offers two critical advantages over standalone pl…

    Submitted 27 May, 2025; originally announced May 2025.

  8. arXiv:2505.19514  [pdf, other]

    cs.CL cs.AI cs.LG

    SIPDO: Closed-Loop Prompt Optimization via Synthetic Data Feedback

    Authors: Yaoning Yu, Ye Yu, Kai Wei, Haojing Luo, Haohan Wang

    Abstract: Prompt quality plays a critical role in the performance of large language models (LLMs), motivating a growing body of work on prompt optimization. Most existing methods optimize prompts over a fixed dataset, assuming static input distributions and offering limited support for iterative improvement. We introduce SIPDO (Self-Improving Prompts through Data-Augmented Optimization), a closed-loop frame…

    Submitted 26 May, 2025; originally announced May 2025.

  9. arXiv:2505.18462  [pdf, ps, other]

    physics.atom-ph

    Dynamically Polarized SERF Atomic Comagnetometer

    Authors: Xiaofei Huang, Kai Wei, Yang Rui, Dinghui Gong, Saixin Zhou, Jie Zheng, Wei Quan

    Abstract: Atomic spin sensors are essential for beyond-the-standard-model exploration, biomagnetic measurement, and quantum navigation. While the traditional DC mode spin-exchange relaxation-free (SERF) comagnetometer achieves ultrahigh sensitivity, further improvements require suppressing technical noise and surpassing the standard quantum limit. In this work, we develop a K-Rb-$^{21}$Ne SERF atomic comagnetom…

    Submitted 23 May, 2025; originally announced May 2025.

  10. arXiv:2505.17217  [pdf, ps, other]

    cs.CL cs.AI cs.CY

    Mitigating Gender Bias via Fostering Exploratory Thinking in LLMs

    Authors: Kangda Wei, Hasnat Md Abdullah, Ruihong Huang

    Abstract: Large Language Models (LLMs) often exhibit gender bias, resulting in unequal treatment of male and female subjects across different contexts. To address this issue, we propose a novel data generation framework that fosters exploratory thinking in LLMs. Our approach prompts models to generate story pairs featuring male and female protagonists in structurally identical, morally ambiguous scenarios,…

    Submitted 22 May, 2025; originally announced May 2025.

  11. arXiv:2505.11733  [pdf, ps, other]

    cs.CL

    MedCaseReasoning: Evaluating and learning diagnostic reasoning from clinical case reports

    Authors: Kevin Wu, Eric Wu, Rahul Thapa, Kevin Wei, Angela Zhang, Arvind Suresh, Jacqueline J. Tao, Min Woo Sun, Alejandro Lozano, James Zou

    Abstract: Doctors and patients alike increasingly use Large Language Models (LLMs) to diagnose clinical cases. However, unlike domains such as math or coding, where correctness can be objectively defined by the final answer, medical diagnosis requires both the outcome and the reasoning process to be accurate. Currently, widely used medical benchmarks like MedQA and MMLU assess only accuracy in the final ans…

    Submitted 20 May, 2025; v1 submitted 16 May, 2025; originally announced May 2025.

  12. arXiv:2505.01643  [pdf, other]

    cs.CY

    Third-party compliance reviews for frontier AI safety frameworks

    Authors: Aidan Homewood, Sophie Williams, Noemi Dreksler, John Lidiard, Malcolm Murray, Lennart Heim, Marta Ziosi, Seán Ó hÉigeartaigh, Michael Chen, Kevin Wei, Christoph Winter, Miles Brundage, Ben Garfinkel, Jonas Schuett

    Abstract: Safety frameworks have emerged as a best practice for managing risks from frontier artificial intelligence (AI) systems. However, it may be difficult for stakeholders to know if companies are adhering to their frameworks. This paper explores a potential solution: third-party compliance reviews. During a third-party compliance review, an independent external party assesses whether a frontier AI com…

    Submitted 2 May, 2025; originally announced May 2025.

    Comments: 27 pages, 1 figure, 5 tables

  13. arXiv:2505.00483  [pdf, other]

    quant-ph physics.atom-ph

    Search for a parity-violating long-range spin-dependent interaction

    Authors: Xing Heng, Zitong Xu, Xiaofei Huang, Dinghui Gong, Guoqing Tian, Wei Ji, Jiancheng Fang, Dmitry Budker, Kai Wei

    Abstract: High-sensitivity quantum sensors are a promising tool for experimental searches for beyond-Standard-Model interactions. Here, we demonstrate an atomic comagnetometer operating under a resonantly-coupled hybrid spin-resonance (HSR) regime to probe P-odd, T-even interactions. The HSR regime enables robust nuclear-electron spin coupling, enhancing measurement bandwidth and stability without compromis…

    Submitted 1 May, 2025; originally announced May 2025.

  14. arXiv:2504.15137  [pdf, other]

    quant-ph

    Scalable twin-field quantum key distribution network enabled by adaptable architecture

    Authors: Chunfeng Huang, Rui Guan, Xin Liu, Wenjie He, Shizhuo Li, Hao Liang, Ziyang Luo, Zhenrong Zhang, Wei Li, Kejin Wei

    Abstract: Quantum key distribution (QKD) is a key application in quantum communication, enabling secure key exchange between parties using quantum states. Twin-field (TF) QKD offers a promising solution that surpasses the repeaterless limits, and its measurement-device-independent nature makes it suitable for star-type network architectures. In this work, we propose a scalable TF-QKD network with adaptable…

    Submitted 27 May, 2025; v1 submitted 21 April, 2025; originally announced April 2025.

    Comments: This version is revised based on helpful comments received, including changes to the title, network architecture, and simulation results

  15. arXiv:2504.12324  [pdf, other]

    cs.CL cs.AI

    Cross-Document Cross-Lingual NLI via RST-Enhanced Graph Fusion and Interpretability Prediction

    Authors: Mengying Yuan, Wenhao Wang, Zixuan Wang, Yujie Huang, Kangli Wei, Fei Li, Chong Teng, Donghong Ji

    Abstract: Natural Language Inference (NLI) is a fundamental task in natural language processing. While NLI has developed many sub-directions such as sentence-level NLI, document-level NLI and cross-lingual NLI, Cross-Document Cross-Lingual NLI (CDCL-NLI) remains largely unexplored. In this paper, we propose a novel paradigm: CDCL-NLI, which extends traditional NLI capabilities to multi-document, multilingua…

    Submitted 20 May, 2025; v1 submitted 11 April, 2025; originally announced April 2025.

  16. arXiv:2504.04346  [pdf, other]

    cs.AI cs.SI

    Crowdsourcing-Based Knowledge Graph Construction for Drug Side Effects Using Large Language Models with an Application on Semaglutide

    Authors: Zhijie Duan, Kai Wei, Zhaoqian Xue, Jiayan Zhou, Shu Yang, Siyuan Ma, Jin Jin, Lingyao Li

    Abstract: Social media is a rich source of real-world data that captures valuable patient experience information for pharmacovigilance. However, mining data from unstructured and noisy social media content remains a challenging task. We present a systematic framework that leverages large language models (LLMs) to extract medication side effects from social media and organize them into a knowledge graph (KG)…

    Submitted 7 April, 2025; v1 submitted 5 April, 2025; originally announced April 2025.

    MSC Class: J.4

  17. arXiv:2504.03906  [pdf, other]

    cs.CL

    CliME: Evaluating Multimodal Climate Discourse on Social Media and the Climate Alignment Quotient (CAQ)

    Authors: Abhilekh Borah, Hasnat Md Abdullah, Kangda Wei, Ruihong Huang

    Abstract: The rise of Large Language Models (LLMs) has raised questions about their ability to understand climate-related contexts. Though climate change dominates social media, analyzing its multimodal expressions is understudied, and current tools have failed to determine whether LLMs amplify credible solutions or spread unsubstantiated claims. To address this, we introduce CliME (Climate Change Multimoda…

    Submitted 4 April, 2025; originally announced April 2025.

    Comments: 16 pages, 9 figures

  18. arXiv:2503.09940  [pdf]

    quant-ph

    Quantum-Secured DSP-Lite Data Transmission Architectures for AI-Driven Data Centres

    Authors: Xitao Ji, Wenjie He, Junda Chen, Mingming Zhang, Yuqi Li, Ziwen Zhou, Zhuoxuan Song, Hao Wu, Siqi Yan, Kejin Wei, Zhenrong Zhang, Shuang Wang, Ming Tang

    Abstract: Artificial intelligence-driven (AI-driven) data centres, which require high-performance, scalable, energy-efficient, and secure infrastructure, have led to unprecedented data traffic demands. These demands involve low latency, high bandwidth connections, low power consumption, and data confidentiality. However, conventional optical interconnect solutions, such as intensity-modulated direct detecti…

    Submitted 12 March, 2025; originally announced March 2025.

  19. arXiv:2503.09251  [pdf, other]

    cs.LG cs.AI q-bio.QM

    SCOPE-DTI: Semi-Inductive Dataset Construction and Framework Optimization for Practical Usability Enhancement in Deep Learning-Based Drug Target Interaction Prediction

    Authors: Yigang Chen, Xiang Ji, Ziyue Zhang, Yuming Zhou, Yang-Chi-Dung Lin, Hsi-Yuan Huang, Tao Zhang, Yi Lai, Ke Chen, Chang Su, Xingqiao Lin, Zihao Zhu, Yanggyi Zhang, Kangping Wei, Jiehui Fu, Yixian Huang, Shidong Cui, Shih-Chung Yen, Ariel Warshel, Hsien-Da Huang

    Abstract: Deep learning-based drug-target interaction (DTI) prediction methods have demonstrated strong performance; however, real-world applicability remains constrained by limited data diversity and modeling complexity. To address these challenges, we propose SCOPE-DTI, a unified framework combining a large-scale, balanced semi-inductive human DTI dataset with advanced deep learning modeling. Constructed…

    Submitted 12 March, 2025; originally announced March 2025.

  20. arXiv:2503.09163  [pdf, other]

    hep-ex physics.ins-det

    A novel layered reconstruction framework for longitudinal segmented electromagnetic calorimeter

    Authors: J. Fei, A. Yuan, K. Wei, L. Sun, J. Wang

    Abstract: In future high-energy physics experiments, the electromagnetic calorimeter (ECAL) will operate at exceptionally high luminosity. An ECAL featuring layered readout in the longitudinal direction and precise time-stamped information offers a multi-dimensional view, enriching our comprehension of the showering process of electromagnetic particles in high-luminosity environments. And it is taken as the…

    Submitted 7 May, 2025; v1 submitted 12 March, 2025; originally announced March 2025.

  21. arXiv:2503.00162  [pdf, other]

    cs.CV cs.AI cs.CL cs.MA

    PreMind: Multi-Agent Video Understanding for Advanced Indexing of Presentation-style Videos

    Authors: Kangda Wei, Zhengyu Zhou, Bingqing Wang, Jun Araki, Lukas Lange, Ruihong Huang, Zhe Feng

    Abstract: In recent years, online lecture videos have become an increasingly popular resource for acquiring new knowledge. Systems capable of effectively understanding/indexing lecture videos are thus highly desirable, enabling downstream tasks like question answering to help users efficiently locate specific information within videos. This work proposes PreMind, a novel multi-agent multimodal framework tha…

    Submitted 28 February, 2025; originally announced March 2025.

  22. arXiv:2502.19425  [pdf, other]

    physics.soc-ph cs.CY

    Will the Technological Singularity Come Soon? Modeling the Dynamics of Artificial Intelligence Development via Multi-Logistic Growth Process

    Authors: Guangyin Jin, Xiaohan Ni, Kun Wei, Jie Zhao, Haoming Zhang, Leiming Jia

    Abstract: We are currently in an era of escalating technological complexity and profound societal transformations, where artificial intelligence (AI) technologies exemplified by large language models (LLMs) have reignited discussions on the 'Technological Singularity'. 'Technological Singularity' is a philosophical concept referring to an irreversible and profound transformation that occurs when AI capabili…

    Submitted 10 February, 2025; originally announced February 2025.

  23. arXiv:2502.15677  [pdf, other]

    cs.CL cs.AI cs.LG

    FLEKE: Federated Locate-then-Edit Knowledge Editing

    Authors: Zongkai Zhao, Guozeng Xu, Xiuhua Li, Kaiwen Wei, Jiang Zhong

    Abstract: Locate-then-Edit Knowledge Editing (LEKE) is a key technique for updating large language models (LLMs) without full retraining. However, existing methods assume a single-user setting and become inefficient in real-world multi-client scenarios, where decentralized organizations (e.g., hospitals, financial institutions) independently update overlapping knowledge, leading to redundant mediator knowle…

    Submitted 21 February, 2025; originally announced February 2025.

  24. arXiv:2502.14864  [pdf, other]

    cs.AI cs.CV

    Benchmarking Multimodal RAG through a Chart-based Document Question-Answering Generation Framework

    Authors: Yuming Yang, Jiang Zhong, Li Jin, Jingwang Huang, Jingpeng Gao, Qing Liu, Yang Bai, Jingyuan Zhang, Rui Jiang, Kaiwen Wei

    Abstract: Multimodal Retrieval-Augmented Generation (MRAG) enhances reasoning capabilities by integrating external knowledge. However, existing benchmarks primarily focus on simple image-text interactions, overlooking complex visual formats like charts that are prevalent in real-world applications. In this work, we introduce a novel task, Chart-based MRAG, to address this limitation. To semi-automatically g…

    Submitted 20 February, 2025; originally announced February 2025.

  25. arXiv:2502.13954  [pdf, other]

    cs.CL cs.LG

    Latent Distribution Decoupling: A Probabilistic Framework for Uncertainty-Aware Multimodal Emotion Recognition

    Authors: Jingwang Huang, Jiang Zhong, Qin Lei, Jinpeng Gao, Yuming Yang, Sirui Wang, Peiguang Li, Kaiwen Wei

    Abstract: Multimodal multi-label emotion recognition (MMER) aims to identify the concurrent presence of multiple emotions in multimodal data. Existing studies primarily focus on improving fusion strategies and modeling modality-to-label dependencies. However, they often overlook the impact of aleatoric uncertainty, which is the inherent noise in the multimodal data and hinders the effectiveness of…

    Submitted 19 February, 2025; originally announced February 2025.

  26. arXiv:2502.12509   

    cs.CL cs.AI

    LegalCore: A Dataset for Event Coreference Resolution in Legal Documents

    Authors: Kangda Wei, Xi Shi, Jonathan Tong, Sai Ramana Reddy, Anandhavelu Natarajan, Rajiv Jain, Aparna Garimella, Ruihong Huang

    Abstract: Recognizing events and their coreferential mentions in a document is essential for understanding semantic meanings of text. The existing research on event coreference resolution is mostly limited to news articles. In this paper, we present the first dataset for the legal domain, LegalCore, which has been annotated with comprehensive event and event coreference information. The legal contract docum…

    Submitted 20 March, 2025; v1 submitted 17 February, 2025; originally announced February 2025.

    Comments: Need company internal approval before public release

  27. arXiv:2502.10641  [pdf, other]

    cs.CL

    Toward Equitable Access: Leveraging Crowdsourced Reviews to Investigate Public Perceptions of Health Resource Accessibility

    Authors: Zhaoqian Xue, Guanhong Liu, Kai Wei, Chong Zhang, Qingcheng Zeng, Songhua Hu, Wenyue Hua, Lizhou Fan, Yongfeng Zhang, Lingyao Li

    Abstract: Access to health resources is a critical determinant of public well-being and societal resilience, particularly during public health crises when demand for medical services and preventive care surges. However, disparities in accessibility persist across demographic and geographic groups, raising concerns about equity. Traditional survey methods often fall short due to limitations in coverage, cost…

    Submitted 14 February, 2025; originally announced February 2025.

  28. arXiv:2502.05420  [pdf, other]

    physics.optics

    Molecular optomechanically-induced transparency

    Authors: Bin Yin, Jie Wang, Mei-Yu Peng, Qian Zhang, Deng Wang, Tian-Xiang Lu, Ke Wei, Hui Jing

    Abstract: Molecular cavity optomechanics (COM), characterized by remarkably efficient optomechanical coupling enabled by a highly localized light field and ultra-small effective mode volume, holds significant promise for advancing applications in quantum science and technology. Here, we study optomechanically induced transparency and the associated group delay in a hybrid molecular COM system. We find that…

    Submitted 7 February, 2025; originally announced February 2025.

    Comments: 12 pages, 6 figures

  29. arXiv:2502.01897  [pdf, other]

    quant-ph

    Improved Quantum Computation using Operator Backpropagation

    Authors: Bryce Fuller, Minh C. Tran, Danylo Lykov, Caleb Johnson, Max Rossmannek, Ken Xuan Wei, Andre He, Youngseok Kim, DinhDuy Vu, Kunal Sharma, Yuri Alexeev, Abhinav Kandala, Antonio Mezzacapo

    Abstract: Decoherence of quantum hardware is currently limiting its practical applications. At the same time, classical algorithms for simulating quantum circuits have progressed substantially. Here, we demonstrate a hybrid framework that integrates classical simulations with quantum hardware to improve the computation of an observable's expectation value by reducing the quantum circuit depth. In this frame…

    Submitted 3 February, 2025; originally announced February 2025.

    Comments: 18 pages, 10 figures

  30. arXiv:2502.01635  [pdf, other]

    cs.SE cs.AI

    The AI Agent Index

    Authors: Stephen Casper, Luke Bailey, Rosco Hunter, Carson Ezell, Emma Cabalé, Michael Gerovitch, Stewart Slocum, Kevin Wei, Nikola Jurkovic, Ariba Khan, Phillip J. K. Christoffersen, A. Pinar Ozisik, Rakshit Trivedi, Dylan Hadfield-Menell, Noam Kolt

    Abstract: Leading AI developers and startups are increasingly deploying agentic AI systems that can plan and execute complex tasks with limited human involvement. However, there is currently no structured framework for documenting the technical components, intended uses, and safety features of agentic systems. To fill this gap, we introduce the AI Agent Index, the first public database to document informati…

    Submitted 3 February, 2025; originally announced February 2025.

    Comments: Accompanying website: https://aiagentindex.mit.edu/

  31. arXiv:2501.16515  [pdf, other]

    cs.HC

    SimulataR: Rapid Assisted Reality Prototyping using Design-Blended Videos

    Authors: Ashwin Ram, Yue Gu, Bowen Wang, Sneha Jaikumar, Youqi Wu, Benjamin Tan Kuan Wei, Qingyang Xu, Haiming Liu, Shengdong Zhao

    Abstract: Assisted Reality (aR) is a subfield of Augmented Reality (AR) that overlays information onto a user's immediate view via see-through head-mounted displays (OST-HMDs). This technology has proven to be effective and energy-efficient to support the user and information interaction for everyday wearable intelligent systems. The aR viewing experience, however, is affected by varying real-world backgrou…

    Submitted 9 February, 2025; v1 submitted 27 January, 2025; originally announced January 2025.

  32. arXiv:2501.13497  [pdf, other]

    cs.SD cs.CL eess.AS

    DQ-Data2vec: Decoupling Quantization for Multilingual Speech Recognition

    Authors: Qijie Shao, Linhao Dong, Kun Wei, Sining Sun, Lei Xie

    Abstract: Data2vec is a self-supervised learning (SSL) approach that employs a teacher-student architecture for contextual representation learning via masked prediction, demonstrating remarkable performance in monolingual ASR. Previous studies have revealed that data2vec's shallow layers capture speaker and language information, middle layers encode phoneme and word features, while deep layers are responsib…

    Submitted 23 January, 2025; originally announced January 2025.

    Comments: Submitted to the IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP)

  33. arXiv:2501.13306  [pdf, other]

    cs.SD cs.CL eess.AS

    OSUM: Advancing Open Speech Understanding Models with Limited Resources in Academia

    Authors: Xuelong Geng, Kun Wei, Qijie Shao, Shuiyun Liu, Zhennan Lin, Zhixian Zhao, Guojian Li, Wenjie Tian, Peikun Chen, Yangze Li, Pengcheng Guo, Mingchen Shao, Shuiyuan Wang, Yuang Cao, Chengyou Wang, Tianyi Xu, Yuhang Dai, Xinfa Zhu, Yue Li, Li Zhang, Lei Xie

    Abstract: Large Language Models (LLMs) have made significant progress in various downstream tasks, inspiring the development of Speech Understanding Language Models (SULMs) to enable comprehensive speech-based interactions. However, most advanced SULMs are developed by the industry, leveraging large-scale datasets and computational resources that are not readily available to the academic community. Moreover…

    Submitted 16 February, 2025; v1 submitted 22 January, 2025; originally announced January 2025.

    Comments: OSUM Technical Report v2. The experimental results reported herein differ from those in v1 because of adding new data and training in more steps

  34. arXiv:2501.10114  [pdf, other]

    cs.AI

    Infrastructure for AI Agents

    Authors: Alan Chan, Kevin Wei, Sihao Huang, Nitarshan Rajkumar, Elija Perrier, Seth Lazar, Gillian K. Hadfield, Markus Anderljung

    Abstract: AI agents plan and execute interactions in open-ended environments. For example, OpenAI's Operator can use a web browser to do product comparisons and buy online goods. To facilitate beneficial interactions and mitigate harmful ones, much research focuses on directly modifying agent behaviour. For example, developers can train agents to follow user instructions. This focus on direct modifications…

    Submitted 16 May, 2025; v1 submitted 17 January, 2025; originally announced January 2025.

    Comments: Accepted to TMLR

  35. arXiv:2501.09606  [pdf, other]

    cs.CY

    Local US officials' views on the impacts and governance of AI: Evidence from 2022 and 2023 survey waves

    Authors: Sophia Hatz, Noemi Dreksler, Kevin Wei, Baobao Zhang

    Abstract: This paper presents a survey of local US policymakers' views on the future impact and regulation of AI. Our survey provides insight into US policymakers' expectations regarding the effects of AI on local communities and the nation, as well as their attitudes towards specific regulatory policies. Conducted in two waves (2022 and 2023), the survey captures changes in attitudes following the release…

    Submitted 16 January, 2025; originally announced January 2025.

  36. arXiv:2501.07865  [pdf, other]

    hep-ph astro-ph.HE physics.atom-ph quant-ph

    New Constraints on Axion Mediated Dipole-Dipole Interactions

    Authors: Zitong Xu, Xing Heng, Guoqing Tian, Di Gong, Lei Cong, Wei Ji, Dmitry Budker, Kai Wei

    Abstract: The search for axions sits at the intersection of solving critical problems in fundamental physics, including the strong CP problem in QCD, uncovering the nature of dark matter, and understanding the origin of the universe's matter-antimatter asymmetry. The measurement of axion-mediated spin-dependent interactions offers a powerful approach for axion detection. However, it has long been restricted…

    Submitted 14 January, 2025; originally announced January 2025.

  37. arXiv:2501.07120  [pdf, other]

    eess.IV cs.CV

    MSV-Mamba: A Multiscale Vision Mamba Network for Echocardiography Segmentation

    Authors: Xiaoxian Yang, Qi Wang, Kaiqi Zhang, Ke Wei, Jun Lyu, Lingchao Chen

    Abstract: Ultrasound imaging frequently encounters challenges, such as those related to elevated noise levels, diminished spatiotemporal resolution, and the complexity of anatomical structures. These factors significantly hinder the model's ability to accurately capture and analyze structural relationships and dynamic patterns across various regions of the heart. Mamba, an emerging model, is one of the most…

    Submitted 13 January, 2025; originally announced January 2025.

  38. arXiv:2412.18837  [pdf, other]

    quant-ph

    Experimental secure entanglement-free quantum remote sensing over 50 km of optical fiber

    Authors: Wenjie He, Chunfeng Huang, Rui Guan, Ye Chen, Zhenrong Zhang, Kejin Wei

    Abstract: Secure quantum remote sensing (SQRS) uses quantum states to gather information about distant objects or environments while ensuring secure data transmission against eavesdropping. It has potential applications in various fields, including environmental monitoring, military surveillance, and disaster response, where both data accuracy and transmission security are critical. Recent experiments have…

    Submitted 25 December, 2024; originally announced December 2024.

  39. arXiv:2412.14759  [pdf, ps, other]

    hep-ph hep-ex

    Probing the soft rescattering parameters in $B$ decays involving a scalar meson with QCD factorization

    Authors: Jing-Juan Qi, Zhen-Yang Wang, Zhen-Hua Zhang, Ke-Wei Wei, Xin-Heng Guo

    Abstract: In this work, the soft rescattering parameters in the $B^\pm\rightarrow \pi^\pm\pi^+\pi^-$ and $B^\pm\rightarrow K^\pm\pi^+\pi^-$ decays with the light scalar meson $f_0(500)$ as the intermediate resonance are studied within the QCD factorization. Considering the interference effect between $\rho(770)^0$ and $f_0(500)$, we utilize the experimentally more direct event yields for fitting and get the soft rescatt…

    Submitted 19 December, 2024; originally announced December 2024.

  40. arXiv:2412.05230  [pdf, other]

    quant-ph

    Dimensionality reduction for closed-loop quantum gate calibration

    Authors: Emma Berger, Vivek Maurya, Z. M. McIntyre, Ken Xuan Wei, Holger Haas, Daniel Puzzuoli

    Abstract: Numerical gate design typically makes use of high-dimensional parameterizations enabling sophisticated, highly expressive control pulses. Developing efficient experimental calibration methods for such gates is a long-standing challenge in quantum control, as on-device calibration requires the optimization of noisy experimental data over high-dimensional parameter spaces. To improve the efficiency…

    Submitted 6 December, 2024; originally announced December 2024.

    Comments: 14 pages, 7 figures

  41. arXiv:2410.20761  [pdf, other]

    physics.atom-ph cond-mat.quant-gas

    Dual-species Optical tweezer for Rb and K atoms

    Authors: Yangbo Wei, Kedi Wei, Shangjin Li, Bo Yan

    Abstract: The optical tweezer experiment with neutral atoms is a focal topic in cold atom physics due to its significant potential in quantum computing and simulation. Here, we present the realization of a dual-species optical tweezer for both Rb and K atoms, marking the first step towards creating a polar molecule optical tweezer array. Initially, Rb and K atoms are collected using a dual magneto-optical t…

    Submitted 28 October, 2024; originally announced October 2024.

    Comments: 6 pages, 4 figures

    Journal ref: Phys. Rev. A 110 (4), 043118 (2024)

  42. arXiv:2410.13042  [pdf, ps, other]

    cs.CY

    How Do AI Companies "Fine-Tune" Policy? Examining Regulatory Capture in AI Governance

    Authors: Kevin Wei, Carson Ezell, Nick Gabrieli, Chinmay Deshpande

    Abstract: Industry actors in the United States have gained extensive influence in conversations about the regulation of general-purpose artificial intelligence (AI) systems. Although industry participation is an important part of the policy process, it can also cause regulatory capture, whereby industry co-opts regulatory regimes to prioritize private over public welfare. Capture of AI policy by AI develope…

    Submitted 16 October, 2024; originally announced October 2024.

    Comments: 39 pages (14 pages main text), 3 figures, 9 tables. To be published in the Proceedings of the 2024 AAAI/ACM Conference on AI, Ethics, & Society (AIES)

    Journal ref: Proc. AAAI/ACM Conf. AI, Ethics & Soc., 7 (2024) 1539-1555

  43. arXiv:2410.12469  [pdf, ps, other]

    hep-ph hep-ex hep-th nucl-th

    Constraining the Fifth Force Using the Earth as a Spin and Mass Source from the Chinese Space Station

    Authors: Zheng-Ting Lai, Jun-Xu Lu, Li-Sheng Geng, Kai Wei, Wei Ji

    Abstract: We explore the potential of conducting an experiment on the Chinese Space Station (CSS) to constrain beyond-the-standard-model (BSM) long-range spin- and velocity-dependent interactions, which are mediated by the exchange of an ultralight $\left(m_{z^{\prime}}<10^{-10}\text{eV}\right)$ or massless intermediate vector boson. We demonstrate that the proposed experiment on the CSS offers several adva…

    Submitted 16 October, 2024; originally announced October 2024.

    Comments: 6 pages, 4 figures; comments welcome

  44. arXiv:2410.01180  [pdf, other]

    cs.CV cs.CL

    UAL-Bench: The First Comprehensive Unusual Activity Localization Benchmark

    Authors: Hasnat Md Abdullah, Tian Liu, Kangda Wei, Shu Kong, Ruihong Huang

    Abstract: Localizing unusual activities, such as human errors or surveillance incidents, in videos holds practical significance. However, current video understanding models struggle with localizing these unusual events likely because of their insufficient representation in models' pretraining datasets. To explore foundation models' capability in localizing unusual activity, we introduce UAL-Bench, a compreh…

    Submitted 1 October, 2024; originally announced October 2024.

    Journal ref: WACV (2025) 5801-5811

  45. arXiv:2409.19878  [pdf, other]

    cs.SD eess.AS

    HDMoLE: Mixture of LoRA Experts with Hierarchical Routing and Dynamic Thresholds for Fine-Tuning LLM-based ASR Models

    Authors: Bingshen Mu, Kun Wei, Qijie Shao, Yong Xu, Lei Xie

    Abstract: Recent advancements in integrating Large Language Models (LLM) with automatic speech recognition (ASR) have performed remarkably in general domains. While supervised fine-tuning (SFT) of all model parameters is often employed to adapt pre-trained LLM-based ASR models to specific domains, it imposes high computational costs and notably reduces their performance in general domains. In this paper, we…

    Submitted 3 January, 2025; v1 submitted 29 September, 2024; originally announced September 2024.

    Comments: Accepted by ICASSP 2025

  46. arXiv:2409.15665  [pdf, other]

    quant-ph

    Dynamically Optimized Nonadiabatic Holonomic Quantum Computation

    Authors: Hai Xu, Wanchun Li, Tao Chen, Kejin Wei, Chengxian Zhang

    Abstract: Nonadiabatic holonomic quantum computation (NHQC) is one of the promising approaches to realizing fault-tolerant quantum computation. However, due to the imperfect control in the experimental environments, the holonomic gate still needs to be further improved. Here, we propose a dynamically optimized NHQC (OPNHQC) scheme based on dynamically corrected gate technique. The scheme is implemented by c…

    Submitted 23 September, 2024; originally announced September 2024.

    Comments: 9 pages, 7 figures

  47. arXiv:2409.14297  [pdf, other]

    eess.SP

    DNN-based Enhanced DOA Sensing via Massive MIMO Receiver with Switches-based Hybrid Architecture

    Authors: Yifan Li, Kang Wei, Linqiong Jia, Jun Zou, Feng Shu, Yaoliang Song, Jiangzhou Wang

    Abstract: Switches-based hybrid architecture has attracted much attention, especially in direction-of-arrival (DOA) sensing, due to its ability to significantly reduce hardware cost by compressing massive multiple-input multiple-output (MIMO) arrays with switching networks. However, this structure leads to a degradation in the degrees of freedom (DOF) and accuracy of DOA estimation. To address t…

    Submitted 13 January, 2025; v1 submitted 21 September, 2024; originally announced September 2024.

  48. arXiv:2409.11214  [pdf, other]

    eess.AS cs.SD

    Ideal-LLM: Integrating Dual Encoders and Language-Adapted LLM for Multilingual Speech-to-Text

    Authors: Hongfei Xue, Wei Ren, Xuelong Geng, Kun Wei, Longhao Li, Qijie Shao, Linju Yang, Kai Diao, Lei Xie

    Abstract: Integrating audio encoders with LLMs through connectors has enabled these models to process and comprehend audio modalities, significantly enhancing speech-to-text tasks, including automatic speech recognition (ASR) and automatic speech translation (AST). However, these methods often overlook the critical aspect of language adaptation in multilingual settings, relying instead on multilingual data…

    Submitted 17 September, 2024; originally announced September 2024.

    Comments: 5 pages, 3 figures, submitted to ICASSP 2025

  49. arXiv:2409.09754  [pdf, other]

    cs.CV cs.RO eess.IV physics.optics

    Towards Single-Lens Controllable Depth-of-Field Imaging via Depth-Aware Point Spread Functions

    Authors: Xiaolong Qian, Qi Jiang, Yao Gao, Shaohua Gao, Zhonghua Yi, Lei Sun, Kai Wei, Haifeng Li, Kailun Yang, Kaiwei Wang, Jian Bai

    Abstract: Controllable Depth-of-Field (DoF) imaging commonly produces amazing visual effects based on heavy and expensive high-end lenses. However, confronted with the increasing demand for mobile scenarios, it is desirable to achieve a lightweight solution with Minimalist Optical Systems (MOS). This work centers around two major limitations of MOS, i.e., the severe optical aberrations and uncontrollable Do…

    Submitted 11 February, 2025; v1 submitted 15 September, 2024; originally announced September 2024.

    Comments: Accepted to IEEE Transactions on Computational Imaging (TCI). The source code and the established dataset will be publicly available at https://github.com/XiaolongQian/DCDI

  50. arXiv:2409.00601  [pdf, other]

    quant-ph

    Geometric two-qubit gates in silicon-based double quantum dots

    Authors: Yong-Yang Lu, Kejin Wei, Chengxian Zhang

    Abstract: Achieving high-fidelity two-qubit gates is crucial for spin qubits in silicon double quantum dots. However, two-qubit gates in experiments suffer readily from charge noise, which remains a key challenge. Geometric gates, which implement gate operations using pure geometric phases, are believed to be a powerful way to realize robust control. In this work, we theoretically propose feasibl…

    Submitted 31 August, 2024; originally announced September 2024.

    Comments: 10 pages, 6 figures