Showing 1–50 of 186 results for author: An, B

  1. arXiv:2507.02307  [pdf, ps, other]

    cs.CV

    Flow-CDNet: A Novel Network for Detecting Both Slow and Fast Changes in Bitemporal Images

    Authors: Haoxuan Li, Chenxu Wei, Haodong Wang, Xiaomeng Hu, Boyuan An, Lingyan Ran, Baosen Zhang, Jin Jin, Omirzhan Taukebayev, Amirkhan Temirbayev, Junrui Liu, Xiuwei Zhang

    Abstract: Change detection typically involves identifying regions with changes between bitemporal images taken at the same location. Besides significant changes, slow changes in bitemporal images are also important in real-life scenarios. For instance, weak changes often serve as precursors to major hazards in scenarios like slopes, dams, and tailings ponds. Therefore, designing a change detection network t…

    Submitted 3 July, 2025; originally announced July 2025.

    Comments: 18 pages, 8 figures

  2. arXiv:2506.21872  [pdf, ps, other]

    cs.LG cs.AI

    A Survey of Continual Reinforcement Learning

    Authors: Chaofan Pan, Xin Yang, Yanhua Li, Wei Wei, Tianrui Li, Bo An, Jiye Liang

    Abstract: Reinforcement Learning (RL) is an important machine learning paradigm for solving sequential decision-making problems. Recent years have witnessed remarkable progress in this field due to the rapid development of deep neural networks. However, the success of RL currently relies on extensive training data and computational resources. In addition, RL's limited ability to generalize across tasks rest…

    Submitted 26 June, 2025; originally announced June 2025.

    Comments: This work has been submitted to the IEEE TPAMI

  3. arXiv:2506.13187  [pdf, ps, other]

    cs.LG cs.AI cs.CL cs.CV

    Dynamic Context-oriented Decomposition for Task-aware Low-rank Adaptation with Less Forgetting and Faster Convergence

    Authors: Yibo Yang, Sihao Liu, Chuan Rao, Bang An, Tiancheng Shen, Philip H. S. Torr, Ming-Hsuan Yang, Bernard Ghanem

    Abstract: Conventional low-rank adaptation methods build adapters without considering data context, leading to sub-optimal fine-tuning performance and severe forgetting of inherent world knowledge. In this paper, we propose context-oriented decomposition adaptation (CorDA), a novel method that initializes adapters in a task-aware manner. Concretely, we develop context-oriented singular value decomposition,…

    Submitted 16 June, 2025; originally announced June 2025.

  4. arXiv:2506.12508  [pdf, ps, other]

    cs.AI

    AgentOrchestra: A Hierarchical Multi-Agent Framework for General-Purpose Task Solving

    Authors: Wentao Zhang, Ce Cui, Yilei Zhao, Rui Hu, Yang Liu, Yahui Zhou, Bo An

    Abstract: Recent advances in agent systems based on large language models (LLMs) have demonstrated strong capabilities in solving complex tasks. However, most current methods lack mechanisms for coordinating specialized agents and have limited ability to generalize to new or diverse domains. We introduce AgentOrchestra, a hierarchical multi-agent framework for general-purpose task solving that integrates high…

    Submitted 17 June, 2025; v1 submitted 14 June, 2025; originally announced June 2025.

  5. arXiv:2506.12110  [pdf, ps, other]

    econ.GN cs.AI

    EconGym: A Scalable AI Testbed with Diverse Economic Tasks

    Authors: Qirui Mi, Qipeng Yang, Zijun Fan, Wentian Fan, Heyang Ma, Chengdong Ma, Siyu Xia, Bo An, Jun Wang, Haifeng Zhang

    Abstract: Artificial intelligence (AI) has become a powerful tool for economic research, enabling large-scale simulation and policy optimization. However, applying AI effectively requires simulation platforms for scalable training and evaluation, yet existing environments remain limited to simplified, narrowly scoped tasks, falling short of capturing complex economic challenges such as demographic shifts, mu…

    Submitted 13 June, 2025; originally announced June 2025.

    Comments: 28 pages, 7 figures, 17 tables

  6. arXiv:2506.01369  [pdf, ps, other]

    cs.LG cs.AI

    Incentivizing LLMs to Self-Verify Their Answers

    Authors: Fuxiang Zhang, Jiacheng Xu, Chaojie Wang, Ce Cui, Yang Liu, Bo An

    Abstract: Large Language Models (LLMs) have demonstrated remarkable progress in complex reasoning tasks through both post-training and test-time scaling laws. While prevalent test-time scaling approaches are often realized by using external reward models to guide the model generation process, we find only marginal gains can be acquired when scaling a model post-trained on specific reasoning tasks. We identi…

    Submitted 2 June, 2025; originally announced June 2025.

  7. arXiv:2505.22312  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    Skywork Open Reasoner 1 Technical Report

    Authors: Jujie He, Jiacai Liu, Chris Yuhao Liu, Rui Yan, Chaojie Wang, Peng Cheng, Xiaoyu Zhang, Fuxiang Zhang, Jiacheng Xu, Wei Shen, Siyuan Li, Liang Zeng, Tianwen Wei, Cheng Cheng, Bo An, Yang Liu, Yahui Zhou

    Abstract: The success of DeepSeek-R1 underscores the significant role of reinforcement learning (RL) in enhancing the reasoning capabilities of large language models (LLMs). In this work, we present Skywork-OR1, an effective and scalable RL implementation for long Chain-of-Thought (CoT) models. Building on the DeepSeek-R1-Distill model series, our RL approach achieves notable performance gains, increasing a…

    Submitted 29 May, 2025; v1 submitted 28 May, 2025; originally announced May 2025.

  8. arXiv:2505.21959   

    cs.LG cs.CL

    EnsemW2S: Enhancing Weak-to-Strong Generalization with Large Language Model Ensembles

    Authors: Aakriti Agrawal, Mucong Ding, Zora Che, Chenghao Deng, Anirudh Satheesh, Bang An, Bayan Bruss, John Langford, Furong Huang

    Abstract: With Large Language Models (LLMs) rapidly approaching and potentially surpassing human-level performance, it has become imperative to develop approaches capable of effectively supervising and enhancing these powerful models using smaller, human-level models exposed to only human-level data. We address this critical weak-to-strong (W2S) generalization challenge by proposing a novel method aimed at…

    Submitted 4 June, 2025; v1 submitted 28 May, 2025; originally announced May 2025.

    Comments: Manuscript uploaded as version2 of arXiv:2410.04571

  9. arXiv:2505.16988  [pdf, other]

    cs.CL cs.AI cs.MA

    MASLab: A Unified and Comprehensive Codebase for LLM-based Multi-Agent Systems

    Authors: Rui Ye, Keduan Huang, Qimin Wu, Yuzhu Cai, Tian Jin, Xianghe Pang, Xiangrui Liu, Jiaqi Su, Chen Qian, Bohan Tang, Kaiqu Liang, Jiaao Chen, Yue Hu, Zhenfei Yin, Rongye Shi, Bo An, Yang Gao, Wenjun Wu, Lei Bai, Siheng Chen

    Abstract: LLM-based multi-agent systems (MAS) have demonstrated significant potential in enhancing single LLMs to address complex and diverse tasks in practical applications. Despite considerable advancements, the field lacks a unified codebase that consolidates existing methods, resulting in redundant re-implementation efforts, unfair comparisons, and high entry barriers for researchers. To address these c…

    Submitted 22 May, 2025; originally announced May 2025.

    Comments: 18 pages, 11 figures

  10. arXiv:2505.12299  [pdf, ps, other]

    cs.CL cs.AI

    Enhance Mobile Agents Thinking Process Via Iterative Preference Learning

    Authors: Kun Huang, Weikai Xu, Yuxuan Liu, Quandong Wang, Pengzhi Gao, Wei Liu, Jian Luan, Bin Wang, Bo An

    Abstract: The Chain of Action-Planning Thoughts (CoaT) paradigm has been shown to improve the reasoning performance of VLM-based mobile agents in GUI tasks. However, the scarcity of diverse CoaT trajectories limits the expressiveness and generalization ability of such agents. While self-training is commonly employed to address data scarcity, existing approaches either overlook the correctness of intermediat…

    Submitted 27 May, 2025; v1 submitted 18 May, 2025; originally announced May 2025.

    Comments: 9 pages, 8 figures, 7 tables

  11. arXiv:2505.11891  [pdf, ps, other]

    cs.CL cs.AI

    Mobile-Bench-v2: A More Realistic and Comprehensive Benchmark for VLM-based Mobile Agents

    Authors: Weikai Xu, Zhizheng Jiang, Yuxuan Liu, Pengzhi Gao, Wei Liu, Jian Luan, Yuanchun Li, Yunxin Liu, Bin Wang, Bo An

    Abstract: VLM-based mobile agents are increasingly popular due to their capabilities to interact with smartphone GUIs and XML-structured texts and to complete daily tasks. However, existing online benchmarks struggle with obtaining stable reward signals due to dynamic environmental changes. Offline benchmarks evaluate the agents through single-path trajectories, which stands in contrast to the inherently mu…

    Submitted 26 May, 2025; v1 submitted 17 May, 2025; originally announced May 2025.

  12. arXiv:2505.10978  [pdf, other]

    cs.LG cs.AI

    Group-in-Group Policy Optimization for LLM Agent Training

    Authors: Lang Feng, Zhenghai Xue, Tingcong Liu, Bo An

    Abstract: Recent advances in group-based reinforcement learning (RL) have driven frontier large language models (LLMs) in single-turn tasks like mathematical reasoning. However, their scalability to long-horizon LLM agent training remains limited. Unlike static tasks, agent-environment interactions unfold over many steps and often yield sparse or delayed rewards, making credit assignment across individual s…

    Submitted 16 May, 2025; originally announced May 2025.

    Comments: Preprint

  13. arXiv:2505.09986  [pdf, other]

    cs.CV eess.IV

    High Quality Underwater Image Compression with Adaptive Correction and Codebook-based Augmentation

    Authors: Yimin Zhou, Yichong Xia, Sicheng Pan, Bin Chen, Baoyi An, Haoqian Wang, Zhi Wang, Yaowei Wang, Zikun Zhou

    Abstract: With the increasing exploration and exploitation of the underwater world, underwater images have become a critical medium for human interaction with marine environments, driving extensive research into their efficient transmission and storage. However, contemporary underwater image compression algorithms fail to fully leverage the unique characteristics distinguishing underwater scenes from terres…

    Submitted 15 May, 2025; originally announced May 2025.

  14. arXiv:2505.09959  [pdf, ps, other]

    cs.LG

    Approximated Behavioral Metric-based State Projection for Federated Reinforcement Learning

    Authors: Zengxia Guo, Bohui An, Zhongqi Lu

    Abstract: Federated reinforcement learning (FRL) methods usually share the encrypted local state or policy information and help each client to learn from others while preserving everyone's privacy. In this work, we propose that sharing the approximated behavior metric-based state projection function is a promising way to enhance the performance of FRL and concurrently provides an effective protection of sen…

    Submitted 15 May, 2025; originally announced May 2025.

  15. arXiv:2505.09432  [pdf, ps, other]

    cs.LG stat.ML

    Establishing Linear Surrogate Regret Bounds for Convex Smooth Losses via Convolutional Fenchel-Young Losses

    Authors: Yuzhou Cao, Han Bao, Lei Feng, Bo An

    Abstract: Surrogate regret bounds, also known as excess risk bounds, bridge the gap between the convergence rates of surrogate and target losses, with linear bounds favorable for their lossless regret transfer. While convex smooth surrogate losses are appealing in particular due to the efficient estimation and optimization, the existence of a trade-off between the smoothness and linear regret bound has been…

    Submitted 14 May, 2025; v1 submitted 14 May, 2025; originally announced May 2025.

  16. arXiv:2505.05870  [pdf, ps, other]

    cs.CV cs.AI eess.IV

    Towards Facial Image Compression with Consistency Preserving Diffusion Prior

    Authors: Yimin Zhou, Yichong Xia, Bin Chen, Baoyi An, Haoqian Wang, Zhi Wang, Yaowei Wang, Zikun Zhou

    Abstract: With the widespread application of facial image data across various domains, the efficient storage and transmission of facial images has garnered significant attention. However, the existing learned face image compression methods often produce unsatisfactory reconstructed image quality at low bit rates. Simply adapting diffusion-based compression methods to facial compression tasks results in reco…

    Submitted 9 May, 2025; originally announced May 2025.

  17. arXiv:2505.03792  [pdf, ps, other]

    cs.LG cs.AI

    Towards Efficient Online Tuning of VLM Agents via Counterfactual Soft Reinforcement Learning

    Authors: Lang Feng, Weihao Tan, Zhiyi Lyu, Longtao Zheng, Haiyang Xu, Ming Yan, Fei Huang, Bo An

    Abstract: Online fine-tuning vision-language model (VLM) agents with reinforcement learning (RL) has shown promise for equipping agents with multi-step, goal-oriented capabilities in dynamic environments. However, their open-ended textual action space and non-end-to-end nature of action generation present significant challenges to effective online exploration in RL, e.g., explosion of the exploration space.…

    Submitted 3 June, 2025; v1 submitted 1 May, 2025; originally announced May 2025.

    Comments: ICML 2025

  18. arXiv:2504.21582  [pdf, other]

    cs.MA cs.AI

    MF-LLM: Simulating Population Decision Dynamics via a Mean-Field Large Language Model Framework

    Authors: Qirui Mi, Mengyue Yang, Xiangning Yu, Zhiyu Zhao, Cheng Deng, Bo An, Haifeng Zhang, Xu Chen, Jun Wang

    Abstract: Simulating collective decision-making involves more than aggregating individual behaviors; it emerges from dynamic interactions among individuals. While large language models (LLMs) offer strong potential for social simulation, achieving quantitative alignment with real-world data remains a key challenge. To bridge this gap, we propose the Mean-Field LLM (MF-LLM) framework, the first to incorporat…

    Submitted 19 May, 2025; v1 submitted 30 April, 2025; originally announced April 2025.

    Comments: 29 pages, 8 figures, 4 tables

  19. arXiv:2504.20965  [pdf, ps, other]

    cs.LG

    AegisLLM: Scaling Agentic Systems for Self-Reflective Defense in LLM Security

    Authors: Zikui Cai, Shayan Shabihi, Bang An, Zora Che, Brian R. Bartoldson, Bhavya Kailkhura, Tom Goldstein, Furong Huang

    Abstract: We introduce AegisLLM, a cooperative multi-agent defense against adversarial attacks and information leakage. In AegisLLM, a structured workflow of autonomous agents - orchestrator, deflector, responder, and evaluator - collaborate to ensure safe and compliant LLM outputs, while self-improving over time through prompt optimization. We show that scaling agentic reasoning system at test-time - both…

    Submitted 13 June, 2025; v1 submitted 29 April, 2025; originally announced April 2025.

    Comments: ICLR 2025 Workshop BuildingTrust

  20. arXiv:2504.18904  [pdf, other]

    cs.RO

    RoboVerse: Towards a Unified Platform, Dataset and Benchmark for Scalable and Generalizable Robot Learning

    Authors: Haoran Geng, Feishi Wang, Songlin Wei, Yuyang Li, Bangjun Wang, Boshi An, Charlie Tianyue Cheng, Haozhe Lou, Peihao Li, Yen-Jen Wang, Yutong Liang, Dylan Goetting, Chaoyi Xu, Haozhe Chen, Yuxi Qian, Yiran Geng, Jiageng Mao, Weikang Wan, Mingtong Zhang, Jiangran Lyu, Siheng Zhao, Jiazhao Zhang, Jialiang Zhang, Chengyang Zhao, Haoran Lu, et al. (12 additional authors not shown)

    Abstract: Data scaling and standardized evaluation benchmarks have driven significant advances in natural language processing and computer vision. However, robotics faces unique challenges in scaling data and establishing evaluation protocols. Collecting real-world data is resource-intensive and inefficient, while benchmarking in real-world scenarios remains highly complex. Synthetic data and simulation off…

    Submitted 26 April, 2025; originally announced April 2025.

  21. arXiv:2504.18041  [pdf, other]

    cs.CL cs.AI

    RAG LLMs are Not Safer: A Safety Analysis of Retrieval-Augmented Generation for Large Language Models

    Authors: Bang An, Shiyue Zhang, Mark Dredze

    Abstract: Efforts to ensure the safety of large language models (LLMs) include safety fine-tuning, evaluation, and red teaming. However, despite the widespread use of the Retrieval-Augmented Generation (RAG) framework, AI safety work focuses on standard LLMs, which means we know little about how RAG use cases change a model's safety profile. We conduct a detailed comparative analysis of RAG and non-RAG fram…

    Submitted 24 April, 2025; originally announced April 2025.

    Comments: NAACL 2025

  22. arXiv:2504.16073  [pdf, other]

    cs.CL

    Guiding VLM Agents with Process Rewards at Inference Time for GUI Navigation

    Authors: Zhiyuan Hu, Shiyun Xiong, Yifan Zhang, See-Kiong Ng, Anh Tuan Luu, Bo An, Shuicheng Yan, Bryan Hooi

    Abstract: Recent advancements in visual language models (VLMs) have notably enhanced their capabilities in handling complex Graphical User Interface (GUI) interaction tasks. Despite these improvements, current frameworks often struggle to generate correct actions in challenging GUI environments. State-of-the-art commercial VLMs are black-boxes, and fine-tuning open-source VLMs for GUI tasks requires signifi…

    Submitted 22 April, 2025; originally announced April 2025.

  23. arXiv:2504.15585  [pdf, ps, other]

    cs.CR cs.AI cs.CL cs.LG

    A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment

    Authors: Kun Wang, Guibin Zhang, Zhenhong Zhou, Jiahao Wu, Miao Yu, Shiqian Zhao, Chenlong Yin, Jinhu Fu, Yibo Yan, Hanjun Luo, Liang Lin, Zhihao Xu, Haolang Lu, Xinye Cao, Xinyun Zhou, Weifei Jin, Fanci Meng, Shicheng Xu, Junyuan Mao, Yu Wang, Hao Wu, Minghe Wang, Fan Zhang, Junfeng Fang, Wenjie Qu, et al. (78 additional authors not shown)

    Abstract: The remarkable success of Large Language Models (LLMs) has illuminated a promising pathway toward achieving Artificial General Intelligence for both academic and industrial communities, owing to their unprecedented performance across various applications. As LLMs continue to gain prominence in both research and commercial domains, their security and safety implications have become a growing concer…

    Submitted 8 June, 2025; v1 submitted 22 April, 2025; originally announced April 2025.

  24. arXiv:2504.14587  [pdf, other]

    cs.LG cs.IR

    Generative Auto-Bidding with Value-Guided Explorations

    Authors: Jingtong Gao, Yewen Li, Shuai Mao, Peng Jiang, Nan Jiang, Yejing Wang, Qingpeng Cai, Fei Pan, Peng Jiang, Kun Gai, Bo An, Xiangyu Zhao

    Abstract: Auto-bidding, with its strong capability to optimize bidding decisions within dynamic and competitive online environments, has become a pivotal strategy for advertising platforms. Existing approaches typically employ rule-based strategies or Reinforcement Learning (RL) techniques. However, rule-based strategies lack the flexibility to adapt to time-varying market conditions, and RL-based methods s…

    Submitted 25 April, 2025; v1 submitted 20 April, 2025; originally announced April 2025.

  25. arXiv:2504.12608  [pdf, other]

    cs.SE cs.AI

    Code Copycat Conundrum: Demystifying Repetition in LLM-based Code Generation

    Authors: Mingwei Liu, Juntao Li, Ying Wang, Xueying Du, Zuoyu Ou, Qiuyuan Chen, Bingxu An, Zhao Wei, Yong Xu, Fangming Zou, Xin Peng, Yiling Lou

    Abstract: Despite recent advances in Large Language Models (LLMs) for code generation, the quality of LLM-generated code still faces significant challenges. One significant issue is code repetition, which refers to the model's tendency to generate structurally redundant code, resulting in inefficiencies and reduced readability. To address this, we conduct the first empirical study to investigate the prevale…

    Submitted 16 April, 2025; originally announced April 2025.

  26. arXiv:2504.05732  [pdf, other]

    cs.CL

    LLM×MapReduce-V2: Entropy-Driven Convolutional Test-Time Scaling for Generating Long-Form Articles from Extremely Long Resources

    Authors: Haoyu Wang, Yujia Fu, Zhu Zhang, Shuo Wang, Zirui Ren, Xiaorong Wang, Zhili Li, Chaoqun He, Bo An, Zhiyuan Liu, Maosong Sun

    Abstract: Long-form generation is crucial for a wide range of practical applications, typically categorized into short-to-long and long-to-long generation. While short-to-long generations have received considerable attention, generating long texts from extremely long resources remains relatively underexplored. The primary challenge in long-to-long generation lies in effectively integrating and analyzing rel…

    Submitted 14 April, 2025; v1 submitted 8 April, 2025; originally announced April 2025.

  27. arXiv:2504.04562  [pdf, other]

    cs.RO cs.AI

    Planning Safety Trajectories with Dual-Phase, Physics-Informed, and Transportation Knowledge-Driven Large Language Models

    Authors: Rui Gan, Pei Li, Keke Long, Bocheng An, Junwei You, Keshu Wu, Bin Ran

    Abstract: Foundation models have demonstrated strong reasoning and generalization capabilities in driving-related tasks, including scene understanding, planning, and control. However, they still face challenges in hallucinations, uncertainty, and long inference latency. While existing foundation models have general knowledge of avoiding collisions, they often lack transportation-specific safety knowledge. T…

    Submitted 6 April, 2025; originally announced April 2025.

  28. arXiv:2503.10721  [pdf, other]

    cs.SE cs.AI

    From Understanding to Excelling: Template-Free Algorithm Design through Structural-Functional Co-Evolution

    Authors: Zhe Zhao, Haibin Wen, Pengkun Wang, Ye Wei, Zaixi Zhang, Xi Lin, Fei Liu, Bo An, Hui Xiong, Yang Wang, Qingfu Zhang

    Abstract: Large language models (LLMs) have greatly accelerated the automation of algorithm generation and optimization. However, current methods such as EoH and FunSearch mainly rely on predefined templates and expert-specified functions that focus solely on the local evolution of key functionalities. Consequently, they fail to fully leverage the synergistic benefits of the overall architecture and the pot…

    Submitted 13 March, 2025; originally announced March 2025.

    MSC Class: 68W20; 68T20 ACM Class: I.2.7

  29. arXiv:2503.09648  [pdf, other]

    cs.MA cs.CY

    A Survey on Trustworthy LLM Agents: Threats and Countermeasures

    Authors: Miao Yu, Fanci Meng, Xinyun Zhou, Shilong Wang, Junyuan Mao, Linsey Pang, Tianlong Chen, Kun Wang, Xinfeng Li, Yongfeng Zhang, Bo An, Qingsong Wen

    Abstract: With the rapid evolution of Large Language Models (LLMs), LLM-based agents and Multi-agent Systems (MAS) have significantly expanded the capabilities of LLM ecosystems. This evolution stems from empowering LLMs with additional modules such as memory, tools, environment, and even other agents. However, this advancement has also introduced more complex issues of trustworthiness, which previous resea…

    Submitted 12 March, 2025; originally announced March 2025.

  30. arXiv:2503.07697  [pdf, ps, other]

    cs.LG cs.CR

    PoisonedParrot: Subtle Data Poisoning Attacks to Elicit Copyright-Infringing Content from Large Language Models

    Authors: Michael-Andrei Panaitescu-Liess, Pankayaraj Pathmanathan, Yigitcan Kaya, Zora Che, Bang An, Sicheng Zhu, Aakriti Agrawal, Furong Huang

    Abstract: As the capabilities of large language models (LLMs) continue to expand, their usage has become increasingly prevalent. However, as reflected in numerous ongoing lawsuits regarding LLM-generated content, addressing copyright infringement remains a significant challenge. In this paper, we introduce PoisonedParrot: the first stealthy data poisoning attack that induces an LLM to generate copyrighted c…

    Submitted 5 June, 2025; v1 submitted 10 March, 2025; originally announced March 2025.

    Comments: 18 pages, 18 figures. Accepted at NAACL 2025

  31. arXiv:2503.06893  [pdf, other]

    cs.LG cs.AI

    Policy Regularization on Globally Accessible States in Cross-Dynamics Reinforcement Learning

    Authors: Zhenghai Xue, Lang Feng, Jiacheng Xu, Kang Kang, Xiang Wen, Bo An, Shuicheng Yan

    Abstract: To learn from data collected in diverse dynamics, Imitation from Observation (IfO) methods leverage expert state trajectories based on the premise that recovering expert state distributions in other dynamics facilitates policy learning in the current one. However, Imitation Learning inherently imposes a performance upper bound of learned policies. Additionally, as the environment dynamics change,…

    Submitted 9 March, 2025; originally announced March 2025.

    Comments: Preprint. Under Review

  32. arXiv:2502.19832  [pdf, other]

    cs.RO

    Tracailer: An Efficient Trajectory Planner for Tractor-Trailer Vehicles in Unstructured Environments

    Authors: Long Xu, Kaixin Chai, Boyuan An, Jiaxiang Gan, Qianhao Wang, Yuan Zhou, Xiaoying Li, Junxiao Lin, Zhichao Han, Chao Xu, Yanjun Cao, Fei Gao

    Abstract: The tractor-trailer vehicle (robot) consists of a drivable tractor and one or more non-drivable trailers connected via hitches. Compared to typical car-like robots, the addition of trailers provides greater transportation capability. However, this also complicates motion planning due to the robot's complex kinematics, high-dimensional state space, and deformable structure. To efficiently plan safe…

    Submitted 27 February, 2025; originally announced February 2025.

    Comments: 15 pages, 12 figures

  33. arXiv:2502.17421  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification

    Authors: Penghui Yang, Cunxiao Du, Fengzhuo Zhang, Haonan Wang, Tianyu Pang, Chao Du, Bo An

    Abstract: As Large Language Models (LLMs) can now process extremely long contexts, efficient inference over these extended inputs has become increasingly important, especially for emerging applications like LLM agents that highly depend on this capability. Speculative decoding (SD) offers a promising lossless acceleration technique compared to lossy alternatives such as quantization and model cascades. Howe…

    Submitted 17 June, 2025; v1 submitted 24 February, 2025; originally announced February 2025.

  34. arXiv:2502.11890  [pdf, other]

    cs.CL

    Revisiting Classification Taxonomy for Grammatical Errors

    Authors: Deqing Zou, Jingheng Ye, Yulu Liu, Yu Wu, Zishan Xu, Yinghui Li, Hai-Tao Zheng, Bingxu An, Zhao Wei, Yong Xu

    Abstract: Grammatical error classification plays a crucial role in language learning systems, but existing classification taxonomies often lack rigorous validation, leading to inconsistencies and unreliable feedback. In this paper, we revisit previous classification taxonomies for grammatical errors by introducing a systematic and qualitative evaluation framework. Our approach examines four aspects of a tax…

    Submitted 17 February, 2025; v1 submitted 17 February, 2025; originally announced February 2025.

    Comments: 26 pages, 4 figures and 5 tables

  35. arXiv:2501.17559  [pdf, other]

    cs.AI cs.GT

    Solving Urban Network Security Games: Learning Platform, Benchmark, and Challenge for AI Research

    Authors: Shuxin Zhuang, Shuxin Li, Tianji Yang, Muheng Li, Xianjie Shi, Bo An, Youzhi Zhang

    Abstract: After the great achievement of solving two-player zero-sum games, more and more AI researchers focus on solving multiplayer games. To facilitate the development of designing efficient learning algorithms for solving multiplayer games, we propose a multiplayer game platform for solving Urban Network Security Games (UNSG) that model real-world scenarios. That is, preventing criminal activit…

    Submitted 29 January, 2025; originally announced January 2025.

  36. arXiv:2501.14234  [pdf, other]

    eess.SP cs.IT

    STAR-RIS-Enabled Multi-Path Beam Routing with Passive Beam Splitting

    Authors: Bonan An, Weidong Mei, Yuanwei Liu, Dong Wang, Zhi Chen

    Abstract: Reconfigurable intelligent surfaces (RISs) can be densely deployed in the environment to create multi-reflection line-of-sight (LoS) links for signal coverage enhancement. However, conventional reflection-only RISs can only achieve half-space reflection, which limits the LoS path diversity. In contrast, simultaneously transmitting and reflecting RISs (STAR-RISs) can achieve full-space reflection a…

    Submitted 19 May, 2025; v1 submitted 23 January, 2025; originally announced January 2025.

  37. arXiv:2412.17018  [pdf, ps, other]

    cs.AI

    GAS: Generative Auto-bidding with Post-training Search

    Authors: Yewen Li, Shuai Mao, Jingtong Gao, Nan Jiang, Yunjian Xu, Qingpeng Cai, Fei Pan, Peng Jiang, Bo An

    Abstract: Auto-bidding is essential in facilitating online advertising by automatically placing bids on behalf of advertisers. Generative auto-bidding, which generates bids based on an adjustable condition using models like transformers and diffusers, has recently emerged as a new trend due to its potential to learn optimal strategies directly from data and adjust flexibly to preferences. However, generativ…

    Submitted 3 June, 2025; v1 submitted 22 December, 2024; originally announced December 2024.

  38. arXiv:2412.15365  [pdf, other]

    cs.LG

    LISA: Learning-Integrated Space Partitioning Framework for Traffic Accident Forecasting on Heterogeneous Spatiotemporal Data

    Authors: Bang An, Xun Zhou, Amin Vahedian, Nick Street, Jinping Guan, Jun Luo

    Abstract: Traffic accident forecasting is an important task for intelligent transportation management and emergency response systems. However, this problem is challenging due to the spatial heterogeneity of the environment. Existing data-driven methods mostly focus on studying homogeneous areas with limited size (e.g. a single urban area such as New York City) and fail to handle the heterogeneous accident p…

    Submitted 19 December, 2024; originally announced December 2024.

    Journal ref: IEEE International Conference on Data Mining, ICDM 2024

  39. arXiv:2412.15353  [pdf, other]

    cs.LG cs.AI

    GeoPro-Net: Learning Interpretable Spatiotemporal Prediction Models through Statistically-Guided Geo-Prototyping

    Authors: Bang An, Xun Zhou, Zirui Zhou, Ronilo Ragodos, Zenglin Xu, Jun Luo

    Abstract: The problem of forecasting spatiotemporal events such as crimes and accidents is crucial to public safety and city management. Besides accuracy, interpretability is also a key requirement for spatiotemporal forecasting models to justify the decisions. Interpretation of the spatiotemporal forecasting mechanism is, however, challenging due to the complexity of multi-source spatiotemporal features, t…

    Submitted 19 December, 2024; originally announced December 2024.

    Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence 39, 2025

  40. arXiv:2412.12310  [pdf, other]

    cs.CL

    Second Language (Arabic) Acquisition of LLMs via Progressive Vocabulary Expansion

    Authors: Jianqing Zhu, Huang Huang, Zhihang Lin, Juhao Liang, Zhengyang Tang, Khalid Almubarak, Abdulmohsen Alharthik, Bang An, Juncai He, Xiangbo Wu, Fei Yu, Junying Chen, Zhuoheng Ma, Yuhao Du, He Zhang, Emad A. Alghamdi, Lian Zhang, Ruoyu Sun, Haizhou Li, Benyou Wang, Jinchao Xu

    Abstract: This paper addresses the critical need for democratizing large language models (LLM) in the Arab world, a region that has seen slower progress in developing models comparable to state-of-the-art offerings like GPT-4 or ChatGPT 3.5, due to a predominant focus on mainstream languages (e.g., English and Chinese). One practical objective for an Arabic LLM is to utilize an Arabic-specific vocabulary fo…

    Submitted 16 December, 2024; originally announced December 2024.

  41. arXiv:2412.04448  [pdf, other]

    cs.CV

    MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation

    Authors: Longtao Zheng, Yifan Zhang, Hanzhong Guo, Jiachun Pan, Zhenxiong Tan, Jiahao Lu, Chuanxin Tang, Bo An, Shuicheng Yan

    Abstract: Recent advances in video diffusion models have unlocked new potential for realistic audio-driven talking video generation. However, achieving seamless audio-lip synchronization, maintaining long-term identity consistency, and producing natural, audio-aligned expressions in generated talking videos remain significant challenges. To address these challenges, we propose Memory-guided EMOtion-aware di…

    Submitted 5 December, 2024; originally announced December 2024.

    Comments: Project Page: https://memoavatar.github.io

  42. arXiv:2412.03253  [pdf, other]

    cs.CL

    Alignment at Pre-training! Towards Native Alignment for Arabic LLMs

    Authors: Juhao Liang, Zhenyang Cai, Jianqing Zhu, Huang Huang, Kewei Zong, Bang An, Mosen Alharthi, Juncai He, Lian Zhang, Haizhou Li, Benyou Wang, Jinchao Xu

    Abstract: The alignment of large language models (LLMs) is critical for developing effective and safe language models. Traditional approaches focus on aligning models during the instruction tuning or reinforcement learning stages, referred to in this paper as 'post alignment'. We argue that alignment during the pre-training phase, which we term 'native alignment', warrants investigation. Native alignment ai…

    Submitted 4 December, 2024; originally announced December 2024.

    Comments: Accepted to NeurIPS 2024 main conference. See https://github.com/FreedomIntelligence/AceGPT-v2

  43. arXiv:2411.19039  [pdf, other]

    cs.AI

    Mars-PO: Multi-Agent Reasoning System Preference Optimization

    Authors: Xiaoxuan Lou, Chaojie Wang, Bo An

    Abstract: Mathematical reasoning is a fundamental capability for large language models (LLMs), yet achieving high performance in this domain remains a significant challenge. The auto-regressive generation process often makes LLMs susceptible to errors, hallucinations, and inconsistencies, particularly during multi-step reasoning. In this paper, we propose Mars-PO, a novel framework to improve the mathematic…

    Submitted 28 November, 2024; originally announced November 2024.

  44. arXiv:2411.14487  [pdf]

    cs.CL cs.AI cs.CY

    Ensuring Safety and Trust: Analyzing the Risks of Large Language Models in Medicine

    Authors: Yifan Yang, Qiao Jin, Robert Leaman, Xiaoyu Liu, Guangzhi Xiong, Maame Sarfo-Gyamfi, Changlin Gong, Santiago Ferrière-Steinert, W. John Wilbur, Xiaojun Li, Jiaxin Yuan, Bang An, Kelvin S. Castro, Francisco Erramuspe Álvarez, Matías Stockle, Aidong Zhang, Furong Huang, Zhiyong Lu

    Abstract: The remarkable capabilities of Large Language Models (LLMs) make them increasingly compelling for adoption in real-world healthcare applications. However, the risks associated with using LLMs in medical applications have not been systematically characterized. We propose using five key principles for safe and trustworthy medical AI: Truthfulness, Resilience, Fairness, Robustness, and Privacy, along…

    Submitted 20 November, 2024; originally announced November 2024.

  45. arXiv:2411.08937  [pdf, other]

    cs.CV cs.LG

    Dual-Head Knowledge Distillation: Enhancing Logits Utilization with an Auxiliary Head

    Authors: Penghui Yang, Chen-Chen Zong, Sheng-Jun Huang, Lei Feng, Bo An

    Abstract: Traditional knowledge distillation focuses on aligning the student's predicted probabilities with both ground-truth labels and the teacher's predicted probabilities. However, the transition to predicted probabilities from logits would obscure certain indispensable information. To address this issue, it is intuitive to additionally introduce a logit-level loss function as a supplement to the widely…

    Submitted 28 May, 2025; v1 submitted 13 November, 2024; originally announced November 2024.

    Comments: Accepted by KDD 2025

  46. arXiv:2410.20428  [pdf, other]

    cs.CL cs.AI

    MedGo: A Chinese Medical Large Language Model

    Authors: Haitao Zhang, Bo An

    Abstract: Large models are a hot research topic in the field of artificial intelligence. Leveraging their generative capabilities has the potential to enhance the level and quality of medical services. In response to the limitations of current large language models, which often struggle with accuracy and have narrow capabilities in medical applications, this paper presents a Chinese medical large language m…

    Submitted 27 October, 2024; originally announced October 2024.

    Comments: 12 pages, 1 figure

  47. arXiv:2410.08193  [pdf, ps, other]

    cs.CL

    GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-time Alignment

    Authors: Yuancheng Xu, Udari Madhushani Sehwag, Alec Koppel, Sicheng Zhu, Bang An, Furong Huang, Sumitra Ganesh

    Abstract: Large Language Models (LLMs) exhibit impressive capabilities but require careful alignment with human preferences. Traditional training-time methods finetune LLMs using human preference datasets but incur significant training costs and require repeated training to handle diverse user preferences. Test-time alignment methods address this by using reward models (RMs) to guide frozen LLMs without ret…

    Submitted 11 June, 2025; v1 submitted 10 October, 2024; originally announced October 2024.

    Comments: Published at the Thirteenth International Conference on Learning Representations (ICLR 2025)

  48. arXiv:2410.04764  [pdf, other]

    cs.LG cs.GT

    Double Oracle Neural Architecture Search for Game Theoretic Deep Learning Models

    Authors: Aye Phyu Phyu Aung, Xinrun Wang, Ruiyu Wang, Hau Chan, Bo An, Xiaoli Li, J. Senthilnath

    Abstract: In this paper, we propose a new approach to train deep learning models using game theory concepts including Generative Adversarial Networks (GANs) and Adversarial Training (AT) where we deploy a double-oracle framework using best response oracles. GAN is essentially a two-player zero-sum game between the generator and the discriminator. The same concept can be applied to AT with attacker and class…

    Submitted 7 October, 2024; originally announced October 2024.

  49. arXiv:2410.02512  [pdf, other]

    cs.LG cs.AI

    SAFLEX: Self-Adaptive Augmentation via Feature Label Extrapolation

    Authors: Mucong Ding, Bang An, Yuancheng Xu, Anirudh Satheesh, Furong Huang

    Abstract: Data augmentation, a cornerstone technique in deep learning, is crucial in enhancing model performance, especially with scarce labeled data. While traditional techniques are effective, their reliance on hand-crafted methods limits their applicability across diverse data types and tasks. Although modern learnable augmentation methods offer increased adaptability, they are computationally expensive…

    Submitted 3 October, 2024; originally announced October 2024.

    Comments: ICLR 2024

  50. arXiv:2410.01575  [pdf, other]

    cs.GT cs.AI

    Computing Ex Ante Equilibrium in Heterogeneous Zero-Sum Team Games

    Authors: Naming Liu, Mingzhi Wang, Xihuai Wang, Weinan Zhang, Yaodong Yang, Youzhi Zhang, Bo An, Ying Wen

    Abstract: The ex ante equilibrium for two-team zero-sum games, where agents within each team collaborate to compete against the opposing team, is known to be the best a team can do for coordination. Many existing works on ex ante equilibrium solutions are aiming to extend the scope of ex ante equilibrium solving to large-scale team games based on Policy Space Response Oracle (PSRO). However, the joint team…

    Submitted 2 October, 2024; originally announced October 2024.