Showing 1–50 of 260 results for author: Yuan, M

  1. arXiv:2505.04883  [pdf, other]

    cs.IR cs.AI

    QBR: A Question-Bank-Based Approach to Fine-Grained Legal Knowledge Retrieval for the General Public

    Authors: Mingruo Yuan, Ben Kao, Tien-Hsuan Wu

    Abstract: Retrieval of legal knowledge by the general public is a challenging problem due to the technicality of the professional knowledge and the lack of fundamental understanding by laypersons on the subject. Traditional information retrieval techniques assume that users are capable of formulating succinct and precise queries for effective document retrieval. In practice, however, the wide gap between th…

    Submitted 7 May, 2025; originally announced May 2025.

  2. Bringing legal knowledge to the public by constructing a legal question bank using large-scale pre-trained language model

    Authors: Mingruo Yuan, Ben Kao, Tien-Hsuan Wu, Michael M. K. Cheung, Henry W. H. Chan, Anne S. Y. Cheung, Felix W. H. Chan, Yongxi Chen

    Abstract: Access to legal information is fundamental to access to justice. Yet accessibility refers not only to making legal documents available to the public, but also rendering legal information comprehensible to them. A vexing problem in bringing legal information to the public is how to turn formal legal documents such as legislation and judgments, which are often highly technical, to easily navigable a…

    Submitted 7 May, 2025; originally announced May 2025.

    Journal ref: Artificial Intelligence and Law 2024-09

  3. arXiv:2505.02865  [pdf, other]

    cs.CL cs.AI

    Accelerating Large Language Model Reasoning via Speculative Search

    Authors: Zhihai Wang, Jie Wang, Jilai Pan, Xilin Xia, Huiling Zhen, Mingxuan Yuan, Jianye Hao, Feng Wu

    Abstract: Tree-search-based reasoning methods have significantly enhanced the reasoning capability of large language models (LLMs) by facilitating the exploration of multiple intermediate reasoning steps, i.e., thoughts. However, these methods suffer from substantial inference latency, as they have to generate numerous reasoning thoughts, severely limiting LLM applicability. To address this challenge, we pr…

    Submitted 3 May, 2025; originally announced May 2025.

    Comments: Accepted by ICML2025

  4. arXiv:2505.02322  [pdf, other]

    cs.AI

    HyperTree Planning: Enhancing LLM Reasoning via Hierarchical Thinking

    Authors: Runquan Gui, Zhihai Wang, Jie Wang, Chi Ma, Huiling Zhen, Mingxuan Yuan, Jianye Hao, Defu Lian, Enhong Chen, Feng Wu

    Abstract: Recent advancements have significantly enhanced the performance of large language models (LLMs) in tackling complex reasoning tasks, achieving notable success in domains like mathematical and logical reasoning. However, these methods encounter challenges with complex planning tasks, primarily due to extended reasoning steps, diverse constraints, and the challenge of handling multiple distinct sub-…

    Submitted 4 May, 2025; originally announced May 2025.

    Comments: arXiv admin note: text overlap with arXiv:2406.14228 by other authors

  5. arXiv:2505.01743  [pdf, other]

    cs.CV cs.AI cs.LG

    An LLM-Empowered Low-Resolution Vision System for On-Device Human Behavior Understanding

    Authors: Siyang Jiang, Bufang Yang, Lilin Xu, Mu Yuan, Yeerzhati Abudunuer, Kaiwei Liu, Liekang Zeng, Hongkai Chen, Zhenyu Yan, Xiaofan Jiang, Guoliang Xing

    Abstract: The rapid advancements in Large Vision Language Models (LVLMs) offer the potential to surpass conventional labeling by generating richer, more detailed descriptions of on-device human behavior understanding (HBU) in low-resolution vision systems, such as depth, thermal, and infrared. However, existing large vision language model (LVLM) approaches are unable to understand low-resolution data well a…

    Submitted 3 May, 2025; originally announced May 2025.

  6. arXiv:2504.19636  [pdf, other]

    cs.AI cs.NE

    Fitness Landscape of Large Language Model-Assisted Automated Algorithm Search

    Authors: Fei Liu, Qingfu Zhang, Xialiang Tong, Kun Mao, Mingxuan Yuan

    Abstract: Large Language Models (LLMs) have demonstrated significant potential in algorithm design. However, when integrated into search frameworks for iterative algorithm search, the underlying fitness landscape--critical for understanding search behaviour--remains underexplored. In this paper, we illustrate and analyze the fitness landscape of LLM-assisted Algorithm Search (LAS) using a graph-based approac…

    Submitted 1 May, 2025; v1 submitted 28 April, 2025; originally announced April 2025.

  7. arXiv:2504.17490  [pdf, ps, other]

    cs.LG cs.AI

    Plasticine: Accelerating Research in Plasticity-Motivated Deep Reinforcement Learning

    Authors: Mingqi Yuan, Qi Wang, Guozheng Ma, Bo Li, Xin Jin, Yunbo Wang, Xiaokang Yang, Wenjun Zeng, Dacheng Tao

    Abstract: Developing lifelong learning agents is crucial for artificial general intelligence. However, deep reinforcement learning (RL) systems often suffer from plasticity loss, where neural networks gradually lose their ability to adapt during training. Despite its significance, this field lacks unified benchmarks and evaluation protocols. We introduce Plasticine, the first open-source framework for bench…

    Submitted 24 April, 2025; originally announced April 2025.

    Comments: 23 pages

  8. arXiv:2504.13443  [pdf, other]

    cs.AI cs.DC cs.MA econ.GN

    Trust, but verify

    Authors: Michael J. Yuan, Carlos Campoy, Sydney Lai, James Snewin, Ju Long

    Abstract: Decentralized AI agent networks, such as Gaia, allow individuals to run customized LLMs on their own computers and then provide services to the public. However, in order to maintain service quality, the network must verify that individual nodes are running their designated LLMs. In this paper, we demonstrate that in a cluster of mostly honest nodes, we can detect nodes that run unauthorized or in…

    Submitted 17 April, 2025; originally announced April 2025.

  9. arXiv:2504.12353  [pdf, other]

    q-bio.GN cs.LG stat.AP stat.ML

    TransST: Transfer Learning Embedded Spatial Factor Modeling of Spatial Transcriptomics Data

    Authors: Shuo Shuo Liu, Shikun Wang, Yuxuan Chen, Anil K. Rustgi, Ming Yuan, Jianhua Hu

    Abstract: Background: Spatial transcriptomics has emerged as a powerful tool in biomedical research because of its ability to capture both the spatial contexts and abundance of the complete RNA transcript profile in organs of interest. However, limitations of the technology such as the relatively low resolution and comparatively insufficient sequencing depth make it difficult to reliably extract real biolo…

    Submitted 15 April, 2025; originally announced April 2025.

  10. arXiv:2504.12324  [pdf, other]

    cs.CL cs.AI

    Cross-Document Cross-Lingual Natural Language Inference via RST-enhanced Graph Fusion and Interpretability Prediction

    Authors: Mengying Yuan, Wangzi Xuan, Fei Li

    Abstract: Natural Language Inference (NLI) is a fundamental task in both natural language processing and information retrieval. While NLI has developed many sub-directions such as sentence-level NLI, document-level NLI and cross-lingual NLI, Cross-Document Cross-Lingual NLI (CDCL-NLI) remains largely unexplored. In this paper, we propose a novel paradigm for CDCL-NLI that extends traditional NLI capabilitie…

    Submitted 11 April, 2025; originally announced April 2025.

  11. arXiv:2504.01541  [pdf, other]

    cs.IR cs.AI

    Hyperbolic Diffusion Recommender Model

    Authors: Meng Yuan, Yutian Xiao, Wei Chen, Chu Zhao, Deqing Wang, Fuzhen Zhuang

    Abstract: Diffusion models (DMs) have emerged as the new state-of-the-art family of deep generative models. To gain deeper insights into the limitations of diffusion models in recommender systems, we investigate the fundamental structural disparities between images and items. Consequently, items often exhibit distinct anisotropic and directional structures that are less prevalent in images. However, the tra…

    Submitted 10 April, 2025; v1 submitted 2 April, 2025; originally announced April 2025.

  12. arXiv:2503.23993  [pdf, other]

    cs.CV cs.AI

    DenseFormer: Learning Dense Depth Map from Sparse Depth and Image via Conditional Diffusion Model

    Authors: Ming Yuan, Sichao Wang, Chuang Zhang, Lei He, Qing Xu, Jianqiang Wang

    Abstract: The depth completion task is a critical problem in autonomous driving, involving the generation of dense depth maps from sparse depth maps and RGB images. Most existing methods employ a spatial propagation network to iteratively refine the depth map after obtaining an initial dense depth. In this paper, we propose DenseFormer, a novel method that integrates the diffusion model into the depth compl…

    Submitted 31 March, 2025; originally announced March 2025.

  13. arXiv:2503.23100  [pdf, other]

    cs.LG cs.CL

    Beyond Standard MoE: Mixture of Latent Experts for Resource-Efficient Language Models

    Authors: Zehua Liu, Han Wu, Ruifeng She, Xiaojin Fu, Xiongwei Han, Tao Zhong, Mingxuan Yuan

    Abstract: Mixture of Experts (MoE) has emerged as a pivotal architectural paradigm for efficient scaling of Large Language Models (LLMs), operating through selective activation of parameter subsets for each input token. Nevertheless, conventional MoE architectures encounter substantial challenges, including excessive memory utilization and communication overhead during training and inference, primarily attr…

    Submitted 29 March, 2025; originally announced March 2025.

  14. arXiv:2503.21760  [pdf, other]

    cs.CL

    MemInsight: Autonomous Memory Augmentation for LLM Agents

    Authors: Rana Salama, Jason Cai, Michelle Yuan, Anna Currey, Monica Sunkara, Yi Zhang, Yassine Benajiba

    Abstract: Large language model (LLM) agents have evolved to intelligently process information, make decisions, and interact with users or tools. A key capability is the integration of long-term memory capabilities, enabling these agents to draw upon historical interactions and knowledge. However, the growing memory size and need for semantic structuring pose significant challenges. In this work, we propose…

    Submitted 27 March, 2025; originally announced March 2025.

  15. arXiv:2503.20641  [pdf, other]

    cs.CL

    Unlocking Efficient Long-to-Short LLM Reasoning with Model Merging

    Authors: Han Wu, Yuxuan Yao, Shuqi Liu, Zehua Liu, Xiaojin Fu, Xiongwei Han, Xing Li, Hui-Ling Zhen, Tao Zhong, Mingxuan Yuan

    Abstract: The transition from System 1 to System 2 reasoning in large language models (LLMs) has marked significant advancements in handling complex tasks through deliberate, iterative thinking. However, this progress often comes at the cost of efficiency, as models tend to overthink, generating redundant reasoning steps without proportional improvements in output quality. Long-to-Short (L2S) reasoning has…

    Submitted 26 March, 2025; originally announced March 2025.

    Comments: Work in progress; technical report

  16. arXiv:2503.17620  [pdf, other]

    cs.HC

    A Case Study of Scalable Content Annotation Using Multi-LLM Consensus and Human Review

    Authors: Mingyue Yuan, Jieshan Chen, Zhenchang Xing, Gelareh Mohammadi, Aaron Quigley

    Abstract: Content annotation at scale remains challenging, requiring substantial human expertise and effort. This paper presents a case study in code documentation analysis, where we explore the balance between automation efficiency and annotation accuracy. We present MCHR (Multi-LLM Consensus with Human Review), a novel semi-automated framework that enhances annotation scalability through the systematic in…

    Submitted 27 April, 2025; v1 submitted 21 March, 2025; originally announced March 2025.

    Comments: 7 pages, GenAICHI: CHI 2025 Workshop on Generative AI and HCI

  17. arXiv:2503.13879  [pdf, other]

    cs.AI

    Bridging Social Psychology and LLM Reasoning: Conflict-Aware Meta-Review Generation via Cognitive Alignment

    Authors: Wei Chen, Han Ding, Meng Yuan, Zhao Zhang, Deqing Wang, Fuzhen Zhuang

    Abstract: The rapid growth of scholarly submissions has overwhelmed traditional peer review systems, driving the need for intelligent automation to preserve scientific rigor. While large language models (LLMs) show promise in automating manuscript critiques, their ability to synthesize high-stakes meta-reviews, which require conflict-aware reasoning and consensus derivation, remains underdeveloped. Existing…

    Submitted 21 March, 2025; v1 submitted 18 March, 2025; originally announced March 2025.

    Comments: 23 pages

  18. arXiv:2503.12946  [pdf, other]

    cs.AR cs.AI

    Open3DBench: Open-Source Benchmark for 3D-IC Backend Implementation and PPA Evaluation

    Authors: Yunqi Shi, Chengrui Gao, Wanqi Ren, Siyuan Xu, Ke Xue, Mingxuan Yuan, Chao Qian, Zhi-Hua Zhou

    Abstract: This work introduces Open3DBench, an open-source 3D-IC backend implementation benchmark built upon the OpenROAD-flow-scripts framework, enabling comprehensive evaluation of power, performance, area, and thermal metrics. Our proposed flow supports modular integration of 3D partitioning, placement, 3D routing, RC extraction, and thermal simulation, aligning with advanced 3D flows that rely on commer…

    Submitted 17 March, 2025; originally announced March 2025.

  19. arXiv:2503.11780  [pdf, other]

    cs.CV

    Rethinking Multi-modal Object Detection from the Perspective of Mono-Modality Feature Learning

    Authors: Tianyi Zhao, Boyang Liu, Yanglei Gao, Yiming Sun, Maoxun Yuan, Xingxing Wei

    Abstract: Multi-Modal Object Detection (MMOD), due to its stronger adaptability to various complex environments, has been widely applied in various applications. Extensive research is dedicated to the RGB-IR object detection, primarily focusing on how to integrate complementary features from RGB-IR modalities. However, they neglect the mono-modality insufficient learning problem that the decreased feature e…

    Submitted 14 March, 2025; originally announced March 2025.

    Comments: 10 pages, 6 figures

  20. arXiv:2503.11674  [pdf, other]

    cs.AR cs.AI

    Timing-Driven Global Placement by Efficient Critical Path Extraction

    Authors: Yunqi Shi, Siyuan Xu, Shixiong Kai, Xi Lin, Ke Xue, Mingxuan Yuan, Chao Qian

    Abstract: Timing optimization during the global placement of integrated circuits has been a significant focus for decades, yet it remains a complex, unresolved issue. Recent analytical methods typically use pin-level timing information to adjust net weights, which is fast and simple but neglects the path-based nature of the timing graph. The existing path-based methods, however, cannot balance the accuracy…

    Submitted 28 February, 2025; originally announced March 2025.

    Comments: Accepted by DATE'25 (Best Paper Award)

  21. arXiv:2503.06101  [pdf, other]

    cs.LG cs.AI

    ULTHO: Ultra-Lightweight yet Efficient Hyperparameter Optimization in Deep Reinforcement Learning

    Authors: Mingqi Yuan, Bo Li, Xin Jin, Wenjun Zeng

    Abstract: Hyperparameter optimization (HPO) is a billion-dollar problem in machine learning, which significantly impacts the training efficiency and model performance. However, achieving efficient and robust HPO in deep reinforcement learning (RL) is consistently challenging due to its high non-stationarity and computational cost. To tackle this problem, existing approaches attempt to adapt common HPO techn…

    Submitted 8 March, 2025; originally announced March 2025.

    Comments: 23 pages, 22 figures

  22. arXiv:2502.15830  [pdf, other]

    cs.SE cs.AI cs.CR

    Show Me Your Code! Kill Code Poisoning: A Lightweight Method Based on Code Naturalness

    Authors: Weisong Sun, Yuchen Chen, Mengzhe Yuan, Chunrong Fang, Zhenpeng Chen, Chong Wang, Yang Liu, Baowen Xu, Zhenyu Chen

    Abstract: Neural code models (NCMs) have demonstrated extraordinary capabilities in code intelligence tasks. Meanwhile, the security of NCMs and NCMs-based systems has garnered increasing attention. In particular, NCMs are often trained on large-scale data from potentially untrustworthy sources, providing attackers with the opportunity to manipulate them by inserting crafted samples into the data. This type…

    Submitted 20 February, 2025; originally announced February 2025.

    Comments: Accepted to the 47th International Conference on Software Engineering (ICSE 2025)

    MSC Class: 68-06 ACM Class: D.2.3; I.2.7

  23. arXiv:2502.15359   

    cs.AI cs.CL

    ARS: Automatic Routing Solver with Large Language Models

    Authors: Kai Li, Fei Liu, Zhenkun Wang, Xialiang Tong, Xiongwei Han, Mingxuan Yuan

    Abstract: Real-world Vehicle Routing Problems (VRPs) are characterized by a variety of practical constraints, making manual solver design both knowledge-intensive and time-consuming. Although there is increasing interest in automating the design of routing algorithms, existing research has explored only a limited array of VRP variants and fails to adequately address the complex and prevalent constraints enc…

    Submitted 28 February, 2025; v1 submitted 21 February, 2025; originally announced February 2025.

    Comments: Authorship is under discussion; arXiv release will follow finalization

  24. arXiv:2502.12594  [pdf, other]

    cs.CL

    PASER: Post-Training Data Selection for Efficient Pruned Large Language Model Recovery

    Authors: Bowei He, Lihao Yin, Hui-Ling Zhen, Xiaokun Zhang, Mingxuan Yuan, Chen Ma

    Abstract: Model pruning is an effective approach for compressing large language models. However, this process often leads to significant degradation of model capabilities. While post-training techniques such as instruction tuning are commonly employed to recover model performance, existing methods often overlook the uneven deterioration of model capabilities and incur high computational costs. Moreover, som…

    Submitted 18 February, 2025; originally announced February 2025.

  25. arXiv:2502.12570  [pdf, other]

    cs.CV

    GVTNet: Graph Vision Transformer For Face Super-Resolution

    Authors: Chao Yang, Yong Fan, Cheng Lu, Minghao Yuan, Zhijing Yang

    Abstract: Recent advances in face super-resolution research have utilized the Transformer architecture. This method processes the input image into a series of small patches. However, because of the strong correlation between different facial components in facial images, existing algorithms cannot handle the relationships between patches well when super-resolving low-resolution images, resul…

    Submitted 18 February, 2025; originally announced February 2025.

  26. arXiv:2502.12420  [pdf, other]

    cs.CL cs.AI

    Sens-Merging: Sensitivity-Guided Parameter Balancing for Merging Large Language Models

    Authors: Shuqi Liu, Han Wu, Bowei He, Xiongwei Han, Mingxuan Yuan, Linqi Song

    Abstract: Recent advances in large language models have led to numerous task-specialized fine-tuned variants, creating a need for efficient model merging techniques that preserve specialized capabilities while avoiding costly retraining. While existing task vector-based merging methods show promise, they typically apply uniform coefficients across all parameters, overlooking varying parameter importance bot…

    Submitted 19 February, 2025; v1 submitted 17 February, 2025; originally announced February 2025.

  27. arXiv:2502.12094  [pdf, other]

    cs.AI cs.CL

    A Study on Leveraging Search and Self-Feedback for Agent Reasoning

    Authors: Karthikeyan K, Michelle Yuan, Elman Mansimov, Katerina Margatina, Anurag Pratik, Daniele Bonadiman, Monica Sunkara, Yi Zhang, Yassine Benajiba

    Abstract: Recent works have demonstrated that incorporating search during inference can significantly improve reasoning capabilities of language agents. Some approaches may make use of the ground truth or rely on the model's own generated feedback. The search algorithm uses this feedback to then produce values that will update its criterion for exploring and exploiting various reasoning paths. In this study, we…

    Submitted 17 February, 2025; originally announced February 2025.

    Comments: Under review

  28. arXiv:2502.10749  [pdf, other]

    cs.CL cs.AI

    LoRE-Merging: Exploring Low-Rank Estimation For Large Language Model Merging

    Authors: Zehua Liu, Han Wu, Yuxuan Yao, Ruifeng She, Xiongwei Han, Tao Zhong, Mingxuan Yuan

    Abstract: While most current approaches rely on further training techniques, such as fine-tuning or reinforcement learning, to enhance model capacities, model merging stands out for its ability to improve models without requiring any additional training. In this paper, we propose a unified framework for model merging based on low-rank estimation of task vectors without the need for access to the base mode…

    Submitted 15 February, 2025; originally announced February 2025.

  29. arXiv:2502.10743  [pdf, other]

    cs.CL

    1bit-Merging: Dynamic Quantized Merging for Large Language Models

    Authors: Shuqi Liu, Han Wu, Bowei He, Zehua Liu, Xiongwei Han, Mingxuan Yuan, Linqi Song

    Abstract: Recent advances in large language models have led to specialized models excelling in specific domains, creating a need for efficient model merging techniques. While traditional merging approaches combine parameters into a single static model, they often compromise task-specific performance. However, task-specific routing methods maintain accuracy but introduce substantial storage overhead. We pres…

    Submitted 15 February, 2025; originally announced February 2025.

  30. arXiv:2502.07526  [pdf, other]

    cs.CV

    CodePhys: Robust Video-based Remote Physiological Measurement through Latent Codebook Querying

    Authors: Shuyang Chu, Menghan Xia, Mengyao Yuan, Xin Liu, Tapio Seppanen, Guoying Zhao, Jingang Shi

    Abstract: Remote photoplethysmography (rPPG) aims to measure non-contact physiological signals from facial videos, which has shown great potential in many applications. Most existing methods directly extract video-based rPPG features by designing neural networks for heart rate estimation. Although they can achieve acceptable results, the recovery of rPPG signal faces intractable challenges when interference…

    Submitted 11 February, 2025; originally announced February 2025.

  31. arXiv:2502.06892  [pdf, other]

    cs.LG cs.AI

    Certifying Language Model Robustness with Fuzzed Randomized Smoothing: An Efficient Defense Against Backdoor Attacks

    Authors: Bowei He, Lihao Yin, Hui-Ling Zhen, Jianping Zhang, Lanqing Hong, Mingxuan Yuan, Chen Ma

    Abstract: The widespread deployment of pre-trained language models (PLMs) has exposed them to textual backdoor attacks, particularly those planted during the pre-training stage. These attacks pose significant risks to high-reliability applications, as they can stealthily affect multiple downstream tasks. While certifying robustness against such threats is crucial, existing defenses struggle with the high-di…

    Submitted 9 February, 2025; originally announced February 2025.

    Comments: Accepted by ICLR 2025

  32. arXiv:2502.05731  [pdf, other]

    cs.HC

    Visual Text Mining with Progressive Taxonomy Construction for Environmental Studies

    Authors: Sam Yu-Te Lee, Cheng-Wei Hung, Mei-Hua Yuan, Kwan-Liu Ma

    Abstract: Environmental experts have developed the DPSIR (Driver, Pressure, State, Impact, Response) framework to systematically study and communicate key relationships between society and the environment. Using this framework requires experts to construct a DPSIR taxonomy from a corpus, annotate the documents, and identify DPSIR variables and relationships, which is laborious and inflexible. Automating it…

    Submitted 8 February, 2025; originally announced February 2025.

  33. arXiv:2502.04420  [pdf, other]

    cs.LG cs.AI cs.CL

    KVTuner: Sensitivity-Aware Layer-wise Mixed Precision KV Cache Quantization for Efficient and Nearly Lossless LLM Inference

    Authors: Xing Li, Zeyu Xing, Yiming Li, Linping Qu, Hui-Ling Zhen, Wulong Liu, Yiwu Yao, Sinno Jialin Pan, Mingxuan Yuan

    Abstract: KV cache quantization can improve Large Language Model (LLM) inference throughput and latency in long contexts and large batch-size scenarios while preserving LLM effectiveness. However, current methods have three unsolved issues: overlooking layer-wise sensitivity to KV cache quantization, high overhead of online fine-grained decision-making, and low flexibility to different LLMs and constrain…

    Submitted 24 February, 2025; v1 submitted 6 February, 2025; originally announced February 2025.

    Comments: 36 pages. Code: https://github.com/cmd2001/KVTuner

  34. arXiv:2502.04416  [pdf, other]

    cs.LG cs.AI

    CMoE: Fast Carving of Mixture-of-Experts for Efficient LLM Inference

    Authors: Zehua Pei, Lancheng Zou, Hui-Ling Zhen, Xianzhi Yu, Wulong Liu, Sinno Jialin Pan, Mingxuan Yuan, Bei Yu

    Abstract: Large language models (LLMs) achieve impressive performance by scaling model parameters, but this comes with significant inference overhead. Feed-forward networks (FFNs), which dominate LLM parameters, exhibit high activation sparsity in hidden neurons. To exploit this, researchers have proposed using a mixture-of-experts (MoE) architecture, where only a subset of parameters is activated. However,…

    Submitted 6 February, 2025; originally announced February 2025.

  35. arXiv:2502.04077  [pdf, other]

    cs.CL cs.LG

    AttentionPredictor: Temporal Pattern Matters for Efficient LLM Inference

    Authors: Qingyue Yang, Jie Wang, Xing Li, Zhihai Wang, Chen Chen, Lei Chen, Xianzhi Yu, Wulong Liu, Jianye Hao, Mingxuan Yuan, Bin Li

    Abstract: With the development of large language models (LLMs), efficient inference through Key-Value (KV) cache compression has attracted considerable attention, especially for long-context generation. To compress the KV cache, recent methods identify critical KV tokens through heuristic ranking with attention scores. However, these methods often struggle to accurately determine critical tokens as they neg…

    Submitted 25 February, 2025; v1 submitted 6 February, 2025; originally announced February 2025.

  36. arXiv:2502.03201  [pdf, ps, other]

    cs.LG

    SpaceGNN: Multi-Space Graph Neural Network for Node Anomaly Detection with Extremely Limited Labels

    Authors: Xiangyu Dong, Xingyi Zhang, Lei Chen, Mingxuan Yuan, Sibo Wang

    Abstract: Node Anomaly Detection (NAD) has gained significant attention in the deep learning community due to its diverse applications in real-world scenarios. Existing NAD methods primarily embed graphs within a single Euclidean space, while overlooking the potential of non-Euclidean spaces. Besides, to address the prevalent issue of limited supervision in real NAD tasks, previous methods tend to leverage…

    Submitted 5 February, 2025; originally announced February 2025.

  37. arXiv:2501.18876  [pdf]

    physics.chem-ph cs.LG

    QMe14S, A Comprehensive and Efficient Spectral Dataset for Small Organic Molecules

    Authors: Mingzhi Yuan, Zihan Zou, Wei Hu

    Abstract: Developing machine learning protocols for molecular simulations requires comprehensive and efficient datasets. Here we introduce the QMe14S dataset, comprising 186,102 small organic molecules featuring 14 elements (H, B, C, N, O, F, Al, Si, P, S, Cl, As, Se, Br) and 47 functional groups. Using density functional theory at the B3LYP/TZVP level, we optimized the geometries and calculated properties…

    Submitted 30 January, 2025; originally announced January 2025.

    Comments: 11 pages, 4 figures

  38. arXiv:2501.16607  [pdf, other]

    cs.DB cs.AI cs.CL cs.PL

    MCTS-SQL: An Effective Framework for Text-to-SQL with Monte Carlo Tree Search

    Authors: Shuozhi Yuan, Liming Chen, Miaomiao Yuan, Jin Zhao, Haoran Peng, Wenming Guo

    Abstract: Text-to-SQL is a fundamental and longstanding problem in the NLP area, aiming at converting natural language queries into SQL, enabling non-expert users to operate databases. Recent advances in LLMs have greatly improved text-to-SQL performance. However, challenges persist, especially when dealing with complex user queries. Current approaches (e.g., COT prompting and multi-agent frameworks) rely on…

    Submitted 27 January, 2025; originally announced January 2025.

    Comments: 8 pages, 5 figures

  39. arXiv:2501.14249  [pdf, other]

    cs.LG cs.AI cs.CL

    Humanity's Last Exam

    Authors: Long Phan, Alice Gatti, Ziwen Han, Nathaniel Li, Josephina Hu, Hugh Zhang, Chen Bo Calvin Zhang, Mohamed Shaaban, John Ling, Sean Shi, Michael Choi, Anish Agrawal, Arnav Chopra, Adam Khoja, Ryan Kim, Richard Ren, Jason Hausenloy, Oliver Zhang, Mantas Mazeika, Dmitry Dodonov, Tung Nguyen, Jaeho Lee, Daron Anderson, Mikhail Doroshenko, Alun Cennyth Stokes, et al. (1084 additional authors not shown)

    Abstract: Benchmarks are important tools for tracking the rapid advancements in large language model (LLM) capabilities. However, benchmarks are not keeping pace in difficulty: LLMs now achieve over 90% accuracy on popular benchmarks like MMLU, limiting informed measurement of state-of-the-art LLM capabilities. In response, we introduce Humanity's Last Exam (HLE), a multi-modal benchmark at the frontier of…

    Submitted 19 April, 2025; v1 submitted 24 January, 2025; originally announced January 2025.

    Comments: 29 pages, 6 figures

  40. arXiv:2501.12627  [pdf, other]

    cs.LG

    Deep Reinforcement Learning with Hybrid Intrinsic Reward Model

    Authors: Mingqi Yuan, Bo Li, Xin Jin, Wenjun Zeng

    Abstract: Intrinsic reward shaping has emerged as a prevalent approach to solving hard-exploration and sparse-rewards environments in reinforcement learning (RL). While single intrinsic rewards, such as curiosity-driven or novelty-based methods, have shown effectiveness, they often limit the diversity and efficiency of exploration. Moreover, the potential and principle of combining multiple intrinsic reward…

    Submitted 21 January, 2025; originally announced January 2025.

    Comments: 18 pages, 14 figures

  41. arXiv:2501.12620  [pdf, other]

    cs.LG cs.AI

    Adaptive Data Exploitation in Deep Reinforcement Learning

    Authors: Mingqi Yuan, Bo Li, Xin Jin, Wenjun Zeng

    Abstract: We introduce ADEPT: Adaptive Data ExPloiTation, a simple yet powerful framework to enhance the data efficiency and generalization in deep reinforcement learning (RL). Specifically, ADEPT adaptively manages the use of sampled data across different learning stages via multi-armed bandit (MAB) algorithms, optimizing data utilization while mitigating overfitting. Moreover, ADEPT can significan…

    Submitted 21 January, 2025; originally announced January 2025.

    Comments: 40 pages, 37 figures

  42. arXiv:2501.07564  [pdf, other]

    cs.LG

    E2ESlack: An End-to-End Graph-Based Framework for Pre-Routing Slack Prediction

    Authors: Saurabh Bodhe, Zhanguang Zhang, Atia Hamidizadeh, Shixiong Kai, Yingxue Zhang, Mingxuan Yuan

    Abstract: Pre-routing slack prediction remains a critical area of research in Electronic Design Automation (EDA). Despite numerous machine learning-based approaches targeting this task, there is still a lack of a truly end-to-end framework that engineers can use to obtain TNS/WNS metrics from raw circuit data at the placement stage. Existing works have demonstrated effectiveness in Arrival Time (AT) predict…

    Submitted 13 January, 2025; v1 submitted 13 January, 2025; originally announced January 2025.

  43. arXiv:2501.02952  [pdf, other]

    cs.NI eess.SP

    Online Collaborative Resource Allocation and Task Offloading for Multi-access Edge Computing

    Authors: Geng Sun, Minghua Yuan, Zemin Sun, Jiacheng Wang, Hongyang Du, Dusit Niyato, Zhu Han, Dong In Kim

    Abstract: Multi-access edge computing (MEC) is emerging as a promising paradigm to provide flexible computing services close to user devices (UDs). However, meeting the computation-hungry and delay-sensitive demands of UDs faces several challenges, including the resource constraints of MEC servers, inherent dynamic and complex features in the MEC system, and difficulty in dealing with the time-coupled and d…

    Submitted 6 January, 2025; originally announced January 2025.

  44. arXiv:2501.02800  [pdf, other]

    cs.CV cs.CE

    COph100: A comprehensive fundus image registration dataset from infants constituting the "RIDIRP" database

    Authors: Yan Hu, Mingdao Gong, Zhongxi Qiu, Jiabao Liu, Hongli Shen, Mingzhen Yuan, Xiaoqing Zhang, Heng Li, Hai Lu, Jiang Liu

    Abstract: Retinal image registration is vital for diagnostic therapeutic applications within the field of ophthalmology. Existing public datasets, focusing on adult retinal pathologies with high-quality images, have limited number of image pairs and neglect clinical challenges. To address this gap, we introduce COph100, a novel and challenging dataset known as the Comprehensive Ophthalmology Retinal Image R…

    Submitted 6 January, 2025; originally announced January 2025.

    Comments: 12 pages, 7 figures

    Journal ref: Scientific Data 2025

  45. arXiv:2501.02363  [pdf, other]

    cs.CV cs.MA

    V2X-DGPE: Addressing Domain Gaps and Pose Errors for Robust Collaborative 3D Object Detection

    Authors: Sichao Wang, Ming Yuan, Chuang Zhang, Qing Xu, Lei He, Jianqiang Wang

    Abstract: In V2X collaborative perception, the domain gaps between heterogeneous nodes pose a significant challenge for effective information fusion. Pose errors arising from latency and GPS localization noise further exacerbate the issue by leading to feature misalignment. To overcome these challenges, we propose V2X-DGPE, a high-accuracy and robust V2X feature-level collaborative perception framework. V2X…

    Submitted 25 January, 2025; v1 submitted 4 January, 2025; originally announced January 2025.

  46. arXiv:2412.20071  [pdf, other]

    cs.HC

    Towards Human-AI Synergy in UI Design: Enhancing Multi-Agent Based UI Generation with Intent Clarification and Alignment

    Authors: Mingyue Yuan, Jieshan Chen, Yongquan Hu, Sidong Feng, Mulong Xie, Gelareh Mohammadi, Zhenchang Xing, Aaron Quigley

    Abstract: In automated user interface (UI) design generation, a key challenge is the lack of support for iterative processes, as most systems only focus on end-to-end generation of designs as starting points. This results from (1) limited capabilities to fully interpret user design intent from text or images, and (2) a lack of transparency, which prevents designers from refining intermediate results. To add…

    Submitted 28 December, 2024; originally announced December 2024.

    Comments: 21 pages, 9 figures

  47. arXiv:2412.12775  [pdf, other]

    cs.IR cs.CR

    RemoteRAG: A Privacy-Preserving LLM Cloud RAG Service

    Authors: Yihang Cheng, Lan Zhang, Junyang Wang, Mu Yuan, Yunhao Yao

    Abstract: Retrieval-augmented generation (RAG) improves the service quality of large language models by retrieving relevant documents from credible literature and integrating them into the context of the user query. Recently, the rise of the cloud RAG service has made it possible for users to query relevant documents conveniently. However, directly sending queries to the cloud brings potential privacy leaka…

    Submitted 17 December, 2024; originally announced December 2024.

  48. arXiv:2412.12149   

    cs.LG cs.AI cs.CV

    MHSA: A Multi-scale Hypergraph Network for Mild Cognitive Impairment Detection via Synchronous and Attentive Fusion

    Authors: Manman Yuan, Weiming Jia, Xiong Luo, Jiazhen Ye, Peican Zhu, Junlin Li

    Abstract: The precise detection of mild cognitive impairment (MCI) is of significant importance in preventing the deterioration of patients in a timely manner. Although hypergraphs have enhanced performance by learning and analyzing brain networks, they often only depend on vector distances between features at a single scale to infer interactions. In this paper, we deal with a more arduous challenge, hyperg…

    Submitted 11 January, 2025; v1 submitted 10 December, 2024; originally announced December 2024.

    Comments: The submission was made prematurely and will be resubmitted after further development

  49. arXiv:2412.05449  [pdf, other]

    cs.CL cs.AI

    Towards Effective GenAI Multi-Agent Collaboration: Design and Evaluation for Enterprise Applications

    Authors: Raphael Shu, Nilaksh Das, Michelle Yuan, Monica Sunkara, Yi Zhang

    Abstract: AI agents powered by large language models (LLMs) have shown strong capabilities in problem solving. Through combining many intelligent agents, multi-agent collaboration has emerged as a promising approach to tackle complex, multi-faceted problems that exceed the capabilities of single AI agents. However, designing the collaboration protocols and evaluating the effectiveness of these systems remai…

    Submitted 6 December, 2024; originally announced December 2024.

    Comments: Technical report for multi-agent collaboration on AWS Bedrock Agents

  50. arXiv:2411.16158  [pdf, other]

    cs.LG cs.AI cs.AR

    MixPE: Quantization and Hardware Co-design for Efficient LLM Inference

    Authors: Yu Zhang, Mingzi Wang, Lancheng Zou, Wulong Liu, Hui-Ling Zhen, Mingxuan Yuan, Bei Yu

    Abstract: Transformer-based large language models (LLMs) have achieved remarkable success as model sizes continue to grow, yet their deployment remains challenging due to significant computational and memory demands. Quantization has emerged as a promising solution, and state-of-the-art quantization algorithms for LLMs introduce the need for mixed-precision matrix multiplication (mpGEMM), where lower-precis…

    Submitted 25 November, 2024; originally announced November 2024.