Showing 1–50 of 9,040 results for author: Liu, Y

Searching in archive cs.
  1. arXiv:2505.10559  [pdf, ps, other]

    cs.LG cs.AI physics.data-an stat.ML

    Neural Thermodynamic Laws for Large Language Model Training

    Authors: Ziming Liu, Yizhou Liu, Jeff Gore, Max Tegmark

    Abstract: Beyond neural scaling laws, little is known about the laws underlying large language models (LLMs). We introduce Neural Thermodynamic Laws (NTL) -- a new framework that offers fresh insights into LLM training dynamics. On the theoretical side, we demonstrate that key thermodynamic quantities (e.g., temperature, entropy, heat capacity, thermal conduction) and classical thermodynamic principles (e.g…

    Submitted 15 May, 2025; originally announced May 2025.

    Comments: 18 pages, 10 figures

  2. arXiv:2505.10551  [pdf, other]

    cs.CV cs.AI

    Does Feasibility Matter? Understanding the Impact of Feasibility on Synthetic Training Data

    Authors: Yiwen Liu, Jessica Bader, Jae Myung Kim

    Abstract: With the development of photorealistic diffusion models, models trained in part or fully on synthetic data achieve progressively better results. However, diffusion models still routinely generate images that would not exist in reality, such as a dog floating above the ground or with unrealistic texture artifacts. We define the concept of feasibility as whether attributes in a synthetic image could…

    Submitted 15 May, 2025; originally announced May 2025.

    Comments: CVPRW 2025

  3. arXiv:2505.10465  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    Superposition Yields Robust Neural Scaling

    Authors: Yizhou Liu, Ziming Liu, Jeff Gore

    Abstract: The success of today's large language models (LLMs) depends on the observation that larger models perform better. However, the origin of this neural scaling law -- the finding that loss decreases as a power law with model size -- remains unclear. Starting from two empirical principles -- that LLMs represent more things than the model dimensions (widths) they have (i.e., representations are superpo…

    Submitted 15 May, 2025; originally announced May 2025.

    Comments: 30 pages, 23 figures

  4. arXiv:2505.10415  [pdf, ps, other]

    cs.RO cs.HC

    Internal State Estimation in Groups via Active Information Gathering

    Authors: Xuebo Ji, Zherong Pan, Xifeng Gao, Lei Yang, Xinxin Du, Kaiyun Li, Yongjin Liu, Wenping Wang, Changhe Tu, Jia Pan

    Abstract: Accurately estimating human internal states, such as personality traits or behavioral patterns, is critical for enhancing the effectiveness of human-robot interaction, particularly in group settings. These insights are key in applications ranging from social navigation to autism diagnosis. However, prior methods are limited by scalability and passive observation, making real-time estimation in com…

    Submitted 15 May, 2025; originally announced May 2025.

  5. arXiv:2505.10402  [pdf, ps, other]

    cs.CL cs.AI cs.LG cs.SE

    Rethinking Repetition Problems of LLMs in Code Generation

    Authors: Yihong Dong, Yuchen Liu, Xue Jiang, Zhi Jin, Ge Li

    Abstract: With the advent of neural language models, the performance of code generation has been significantly boosted. However, the problem of repetitions during the generation process continues to linger. Previous work has primarily focused on content repetition, which is merely a fraction of the broader repetition problem in code generation. A more prevalent and challenging problem is structural repetiti…

    Submitted 15 May, 2025; originally announced May 2025.

    Comments: Accepted to ACL 2025 (main)

  6. arXiv:2505.10145  [pdf, ps, other]

    cs.AR cs.PF

    An Integrated UVM-TLM Co-Simulation Framework for RISC-V Functional Verification and Performance Evaluation

    Authors: Ruizhi Qiu, Yang Liu

    Abstract: The burgeoning RISC-V ecosystem necessitates efficient verification methodologies for complex processors. Traditional approaches often struggle to concurrently evaluate functional correctness and performance, or balance simulation speed with modeling accuracy. This paper introduces an integrated co-simulation framework leveraging Universal Verification Methodology (UVM) and Transaction-Level Model…

    Submitted 15 May, 2025; originally announced May 2025.

    Comments: 7 pages, 3 figures, This work is under consideration for conference publication

    ACM Class: C.1.1; I.6.4; B.7.2

  7. arXiv:2505.09926  [pdf, ps, other]

    cs.CV cs.AI

    AdaptCLIP: Adapting CLIP for Universal Visual Anomaly Detection

    Authors: Bin-Bin Gao, Yue Zhu, Jiangtao Yan, Yuezhi Cai, Weixi Zhang, Meng Wang, Jun Liu, Yong Liu, Lei Wang, Chengjie Wang

    Abstract: Universal visual anomaly detection aims to identify anomalies from novel or unseen vision domains without additional fine-tuning, which is critical in open scenarios. Recent studies have demonstrated that pre-trained vision-language models like CLIP exhibit strong generalization with just zero or a few normal images. However, existing methods struggle with designing prompt templates, complex token…

    Submitted 14 May, 2025; originally announced May 2025.

    Comments: 27 pages, 15 figures, 22 tables

  8. arXiv:2505.09757  [pdf, ps, other]

    cs.HC cs.AI cs.CY

    Trustless Autonomy: Understanding Motivations, Benefits and Governance Dilemma in Self-Sovereign Decentralized AI Agents

    Authors: Botao Amber Hu, Yuhan Liu, Helena Rong

    Abstract: The recent trend of self-sovereign Decentralized AI Agents (DeAgents) combines Large Language Model (LLM)-based AI agents with decentralization technologies such as blockchain smart contracts and trusted execution environments (TEEs). These tamper-resistant trustless substrates allow agents to achieve self-sovereignty through ownership of cryptowallet private keys and control of digital assets and…

    Submitted 14 May, 2025; originally announced May 2025.

    Comments: Submitted to CSCW 2026

  9. arXiv:2505.09702  [pdf, ps, other]

    cs.LG

    Enabling Group Fairness in Graph Unlearning via Bi-level Debiasing

    Authors: Yezi Liu, Prathyush Poduval, Wenjun Huang, Yang Ni, Hanning Chen, Mohsen Imani

    Abstract: Graph unlearning is a crucial approach for protecting user privacy by erasing the influence of user data on trained graph models. Recent developments in graph unlearning methods have primarily focused on maintaining model prediction performance while removing user information. However, we have observed that when user information is deleted from the model, the prediction distribution across differe…

    Submitted 14 May, 2025; originally announced May 2025.

  10. arXiv:2505.09569  [pdf, other]

    cs.SE

    MIGRATION-BENCH: Repository-Level Code Migration Benchmark from Java 8

    Authors: Linbo Liu, Xinle Liu, Qiang Zhou, Lin Chen, Yihan Liu, Hoan Nguyen, Behrooz Omidvar-Tehrani, Xi Shen, Jun Huan, Omer Tripp, Anoop Deoras

    Abstract: With the rapid advancement of powerful large language models (LLMs) in recent years, a wide range of software engineering tasks can now be addressed using LLMs, significantly enhancing productivity and scalability. Numerous benchmark datasets have been developed to evaluate the coding capabilities of these models, while they primarily focus on problem-solving and issue-resolution tasks. In contras…

    Submitted 14 May, 2025; originally announced May 2025.

  11. arXiv:2505.09561  [pdf, ps, other]

    cs.RO cs.AI cs.LG

    Learning Long-Context Diffusion Policies via Past-Token Prediction

    Authors: Marcel Torne, Andy Tang, Yuejiang Liu, Chelsea Finn

    Abstract: Reasoning over long sequences of observations and actions is essential for many robotic tasks. Yet, learning effective long-context policies from demonstrations remains challenging. As context length increases, training becomes increasingly expensive due to rising memory demands, and policy performance often degrades as a result of spurious correlations. Recent methods typically sidestep these iss…

    Submitted 14 May, 2025; originally announced May 2025.

    Comments: Videos are available at https://long-context-dp.github.io

  12. arXiv:2505.09406  [pdf, ps, other]

    cs.CV

    FreeDriveRF: Monocular RGB Dynamic NeRF without Poses for Autonomous Driving via Point-Level Dynamic-Static Decoupling

    Authors: Yue Wen, Liang Song, Yijia Liu, Siting Zhu, Yanzi Miao, Lijun Han, Hesheng Wang

    Abstract: Dynamic scene reconstruction for autonomous driving enables vehicles to perceive and interpret complex scene changes more precisely. Dynamic Neural Radiance Fields (NeRFs) have recently shown promising capability in scene modeling. However, many existing methods rely heavily on accurate pose inputs and multi-sensor data, leading to increased system complexity. To address this, we propose FreeDriv…

    Submitted 14 May, 2025; originally announced May 2025.

    Comments: 7 pages, 9 figures, accepted by ICRA2025

  13. arXiv:2505.09388  [pdf, other]

    cs.CL

    Qwen3 Technical Report

    Authors: An Yang, Anfeng Li, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Gao, Chengen Huang, Chenxu Lv, Chujie Zheng, Dayiheng Liu, Fan Zhou, Fei Huang, Feng Hu, Hao Ge, Haoran Wei, Huan Lin, Jialong Tang, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Yang, Jiaxi Yang, Jing Zhou, et al. (35 additional authors not shown)

    Abstract: In this work, we present Qwen3, the latest version of the Qwen model family. Qwen3 comprises a series of large language models (LLMs) designed to advance performance, efficiency, and multilingual capabilities. The Qwen3 series includes models of both dense and Mixture-of-Expert (MoE) architectures, with parameter scales ranging from 0.6 to 235 billion. A key innovation in Qwen3 is the integration…

    Submitted 14 May, 2025; originally announced May 2025.

  14. arXiv:2505.09343  [pdf, ps, other]

    cs.DC cs.AI cs.AR

    Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures

    Authors: Chenggang Zhao, Chengqi Deng, Chong Ruan, Damai Dai, Huazuo Gao, Jiashi Li, Liyue Zhang, Panpan Huang, Shangyan Zhou, Shirong Ma, Wenfeng Liang, Ying He, Yuqing Wang, Yuxuan Liu, Y. X. Wei

    Abstract: The rapid scaling of large language models (LLMs) has unveiled critical limitations in current hardware architectures, including constraints in memory capacity, computational efficiency, and interconnection bandwidth. DeepSeek-V3, trained on 2,048 NVIDIA H800 GPUs, demonstrates how hardware-aware model co-design can effectively address these challenges, enabling cost-efficient training and inferen…

    Submitted 14 May, 2025; originally announced May 2025.

    Comments: This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive version will appear as part of the Industry Track in Proceedings of the 52nd Annual International Symposium on Computer Architecture (ISCA '25)

  15. arXiv:2505.09284  [pdf, ps, other]

    cs.LG stat.ML

    Generating Full-field Evolution of Physical Dynamics from Irregular Sparse Observations

    Authors: Panqi Chen, Yifan Sun, Lei Cheng, Yang Yang, Weichang Li, Yang Liu, Weiqing Liu, Jiang Bian, Shikai Fang

    Abstract: Modeling and reconstructing multidimensional physical dynamics from sparse and off-grid observations presents a fundamental challenge in scientific research. Recently, diffusion-based generative modeling shows promising potential for physical simulation. However, current approaches typically operate on on-grid data with preset spatiotemporal resolution, but struggle with the sparsely observed and…

    Submitted 14 May, 2025; originally announced May 2025.

  16. arXiv:2505.09106  [pdf, other]

    cs.LG

    Argus: Federated Non-convex Bilevel Learning over 6G Space-Air-Ground Integrated Network

    Authors: Ya Liu, Kai Yang, Yu Zhu, Keying Yang, Haibo Zhao

    Abstract: The space-air-ground integrated network (SAGIN) has recently emerged as a core element in the 6G networks. However, traditional centralized and synchronous optimization algorithms are unsuitable for SAGIN due to infrastructureless and time-varying environments. This paper aims to develop a novel Asynchronous algorithm a.k.a. Argus for tackling non-convex and non-smooth decentralized federated bile…

    Submitted 13 May, 2025; originally announced May 2025.

    Comments: 17 pages, 11 figures

    MSC Class: 68T07; ACM Class: I.2

  17. arXiv:2505.08690  [pdf, ps, other]

    cs.CL

    Adaptive Schema-aware Event Extraction with Retrieval-Augmented Generation

    Authors: Sheng Liang, Hang Lv, Zhihao Wen, Yaxiong Wu, Yongyue Zhang, Hao Wang, Yong Liu

    Abstract: Event extraction (EE) is a fundamental task in natural language processing (NLP) that involves identifying and extracting event information from unstructured text. Effective EE in real-world scenarios requires two key steps: selecting appropriate schemas from hundreds of candidates and executing the extraction process. Existing research exhibits two critical gaps: (1) the rigid schema fixation in…

    Submitted 13 May, 2025; originally announced May 2025.

    Comments: 15 pages, 3 figures

    ACM Class: I.2.7

  18. arXiv:2505.08550  [pdf, other]

    cs.LG stat.ML

    OLinear: A Linear Model for Time Series Forecasting in Orthogonally Transformed Domain

    Authors: Wenzhen Yue, Yong Liu, Haoxuan Li, Hao Wang, Xianghua Ying, Ruohao Guo, Bowei Xing, Ji Shi

    Abstract: This paper presents $\mathbf{OLinear}$, a $\mathbf{linear}$-based multivariate time series forecasting model that operates in an $\mathbf{o}$rthogonally transformed domain. Recent forecasting models typically adopt the temporal forecast (TF) paradigm, which directly encodes and decodes time series in the time domain. However, the entangled step-wise dependencies in series data can hinder the perform…

    Submitted 14 May, 2025; v1 submitted 12 May, 2025; originally announced May 2025.

  19. arXiv:2505.08532  [pdf, ps, other]

    cs.SI cs.AI

    The Truth Becomes Clearer Through Debate! Multi-Agent Systems with Large Language Models Unmask Fake News

    Authors: Yuhan Liu, Yuxuan Liu, Xiaoqing Zhang, Xiuying Chen, Rui Yan

    Abstract: In today's digital environment, the rapid propagation of fake news via social networks poses significant social challenges. Most existing detection methods either employ traditional classification models, which suffer from low interpretability and limited generalization capabilities, or craft specific prompts for large language models (LLMs) to produce explanations and results directly, failing to…

    Submitted 13 May, 2025; originally announced May 2025.

    Comments: SIGIR 2025

  20. arXiv:2505.08523  [pdf, ps, other]

    cs.IT eess.SP

    Dual-UAV-Enabled Secure Communication and Sensing for A2G-ISAC Systems with Maneuverable Jamming

    Authors: Libiao Lou, Yuan Liu, Fotis Foukalas, Hongjiang Lei, Gaofeng Pan, Theodoros A. Tsiftsis, Hongwu Liu

    Abstract: In this paper, we propose a dual-unmanned aerial vehicle (UAV)-enabled secure communication and sensing (SCS) scheme for an air-to-ground integrated sensing and communication (ISAC) system, in which a dual-functional source UAV and jamming UAV collaborate to enhance both the secure communication and target sensing performance. From a perspective of hybrid monostatic-bistatic radar, the jamming UA…

    Submitted 13 May, 2025; originally announced May 2025.

    Comments: 13 pages, submitted to IEEE Journal

  21. arXiv:2505.08414  [pdf]

    eess.IV cs.CV

    An integrated language-vision foundation model for conversational diagnostics and triaging in primary eye care

    Authors: Zhi Da Soh, Yang Bai, Kai Yu, Yang Zhou, Xiaofeng Lei, Sahil Thakur, Zann Lee, Lee Ching Linette Phang, Qingsheng Peng, Can Can Xue, Rachel Shujuan Chong, Quan V. Hoang, Lavanya Raghavan, Yih Chung Tham, Charumathi Sabanayagam, Wei-Chi Wu, Ming-Chih Ho, Jiangnan He, Preeti Gupta, Ecosse Lamoureux, Seang Mei Saw, Vinay Nangia, Songhomitra Panda-Jonas, Jie Xu, Ya Xing Wang, et al. (6 additional authors not shown)

    Abstract: Current deep learning models are mostly task specific and lack a user-friendly interface to operate. We present Meta-EyeFM, a multi-function foundation model that integrates a large language model (LLM) with vision foundation models (VFMs) for ocular disease assessment. Meta-EyeFM leverages a routing mechanism to enable accurate task-specific analysis based on text queries. Using Low Rank Adaptati…

    Submitted 13 May, 2025; originally announced May 2025.

  22. arXiv:2505.08343  [pdf, other]

    cs.AI

    An Identifiable Cost-Aware Causal Decision-Making Framework Using Counterfactual Reasoning

    Authors: Ruichu Cai, Xi Chen, Jie Qiao, Zijian Li, Yuequn Liu, Wei Chen, Keli Zhang, Jiale Zheng

    Abstract: Decision making under abnormal conditions is a critical process that involves evaluating the current state and determining the optimal action to restore the system to a normal state at an acceptable cost. However, in such scenarios, existing decision-making frameworks highly rely on reinforcement learning or root cause analysis, resulting in them frequently neglecting the cost of the actions or fa…

    Submitted 13 May, 2025; originally announced May 2025.

  23. arXiv:2505.08213  [pdf, other]

    cs.RO

    HandCept: A Visual-Inertial Fusion Framework for Accurate Proprioception in Dexterous Hands

    Authors: Junda Huang, Jianshu Zhou, Honghao Guo, Yunhui Liu

    Abstract: As robotics progresses toward general manipulation, dexterous hands are becoming increasingly critical. However, proprioception in dexterous hands remains a bottleneck due to limitations in volume and generality. In this work, we present HandCept, a novel visual-inertial proprioception framework designed to overcome the challenges of traditional joint angle estimation methods. HandCept addresses t…

    Submitted 13 May, 2025; originally announced May 2025.

    Comments: 8 pages, 7 figures, journal

  24. arXiv:2505.08037  [pdf, other]

    cs.CL cs.LG

    TiSpell: A Semi-Masked Methodology for Tibetan Spelling Correction covering Multi-Level Error with Data Augmentation

    Authors: Yutong Liu, Feng Xiao, Ziyue Zhang, Yongbin Yu, Cheng Huang, Fan Gao, Xiangxiang Wang, Ma-bao Ban, Manping Fan, Thupten Tsering, Cheng Huang, Gadeng Luosang, Renzeng Duojie, Nyima Tashi

    Abstract: Multi-level Tibetan spelling correction addresses errors at both the character and syllable levels within a unified model. Existing methods focus mainly on single-level correction and lack effective integration of both levels. Moreover, there are no open-source datasets or augmentation methods tailored for this task in Tibetan. To tackle this, we propose a data augmentation approach using unlabele…

    Submitted 14 May, 2025; v1 submitted 12 May, 2025; originally announced May 2025.

    Comments: 14 pages, 7 figures

  25. arXiv:2505.07921  [pdf, ps, other]

    cs.LG cs.AI

    Self-cross Feature based Spiking Neural Networks for Efficient Few-shot Learning

    Authors: Qi Xu, Junyang Zhu, Dongdong Zhou, Hao Chen, Yang Liu, Jiangrong Shen, Qiang Zhang

    Abstract: Deep neural networks (DNNs) excel in computer vision tasks, especially few-shot learning (FSL), which is increasingly important for generalizing from limited examples. However, DNNs are computationally expensive with scalability issues in the real world. Spiking Neural Networks (SNNs), with their event-driven nature and low energy consumption, are particularly efficient in processing sparse and dynam…

    Submitted 14 May, 2025; v1 submitted 12 May, 2025; originally announced May 2025.

  26. arXiv:2505.07893  [pdf, other]

    cs.NI cs.LG eess.SP math.PR math.ST

    Channel Fingerprint Construction for Massive MIMO: A Deep Conditional Generative Approach

    Authors: Zhenzhou Jin, Li You, Xudong Li, Zhen Gao, Yuanwei Liu, Xiang-Gen Xia, Xiqi Gao

    Abstract: Accurate channel state information (CSI) acquisition for massive multiple-input multiple-output (MIMO) systems is essential for future mobile communication networks. Channel fingerprint (CF), also referred to as channel knowledge map, is a key enabler for intelligent environment-aware communication and can facilitate CSI acquisition. However, due to the cost limitations of practical sensing nodes…

    Submitted 11 May, 2025; originally announced May 2025.

    Comments: 15 pages, 7 figures

  27. arXiv:2505.07889  [pdf, ps, other]

    cs.CL

    BioProBench: Comprehensive Dataset and Benchmark in Biological Protocol Understanding and Reasoning

    Authors: Yuyang Liu, Liuzhenghao Lv, Xiancheng Zhang, Li Yuan, Yonghong Tian

    Abstract: Biological protocols are fundamental to reproducible and safe life science research. While LLMs excel on general tasks, their systematic evaluation on these highly specialized, accuracy-critical, and inherently procedural texts remains limited. In this work, we present BioProBench, the first large-scale, integrated multi-task benchmark for biological protocol understanding and reasoning. While lim…

    Submitted 11 May, 2025; originally announced May 2025.

  28. arXiv:2505.07882  [pdf, other]

    cs.AI cs.LG

    Enhancing Trust Management System for Connected Autonomous Vehicles Using Machine Learning Methods: A Survey

    Authors: Qian Xu, Lei Zhang, Yixiao Liu

    Abstract: Connected Autonomous Vehicles (CAVs) operate in dynamic, open, and multi-domain networks, rendering them vulnerable to various threats. Trust Management Systems (TMS) systematically organize essential steps in the trust mechanism, identifying malicious nodes against internal threats and external threats, as well as ensuring reliable decision-making for more cooperative tasks. Recent advances in ma…

    Submitted 10 May, 2025; originally announced May 2025.

    Comments: 31 pages, 9 figures

  29. arXiv:2505.07849  [pdf, ps, other]

    cs.SE cs.AI cs.IR

    SweRank: Software Issue Localization with Code Ranking

    Authors: Revanth Gangi Reddy, Tarun Suresh, JaeHyeok Doo, Ye Liu, Xuan Phi Nguyen, Yingbo Zhou, Semih Yavuz, Caiming Xiong, Heng Ji, Shafiq Joty

    Abstract: Software issue localization, the task of identifying the precise code locations (files, classes, or functions) relevant to a natural language issue description (e.g., bug report, feature request), is a critical yet time-consuming aspect of software development. While recent LLM-based agentic approaches demonstrate promise, they often incur significant latency and cost due to complex multi-step rea…

    Submitted 7 May, 2025; originally announced May 2025.

  30. arXiv:2505.07834  [pdf, other]

    cs.NI cs.AI cs.CR cs.PL

    ai.txt: A Domain-Specific Language for Guiding AI Interactions with the Internet

    Authors: Yuekang Li, Wei Song, Bangshuo Zhu, Dong Gong, Yi Liu, Gelei Deng, Chunyang Chen, Lei Ma, Jun Sun, Toby Walsh, Jingling Xue

    Abstract: We introduce ai.txt, a novel domain-specific language (DSL) designed to explicitly regulate interactions between AI models, agents, and web content, addressing critical limitations of the widely adopted robots.txt standard. As AI increasingly engages with online materials for tasks such as training, summarization, and content modification, existing regulatory methods lack the necessary granularity…

    Submitted 1 May, 2025; originally announced May 2025.

  31. arXiv:2505.07783  [pdf, ps, other]

    cs.LG

    Relative Overfitting and Accept-Reject Framework

    Authors: Yanxin Liu, Yunqi Zhang

    Abstract: Currently, the scaling law of Large Language Models (LLMs) faces challenges and bottlenecks. This paper posits that noise effects, stemming from changes in the signal-to-noise ratio under diminishing marginal returns, are the root cause of these issues. To control this noise, we investigated the differences between models with performance advantages and disadvantages, introducing the concept of "r…

    Submitted 12 May, 2025; originally announced May 2025.

  32. arXiv:2505.07692  [pdf, other]

    cs.DB

    ABase: the Multi-Tenant NoSQL Serverless Database for Diverse and Dynamic Workloads in Large-scale Cloud Environments

    Authors: Rong Kang, Yanbin Chen, Ye Liu, Fuxin Jiang, Qingshuo Li, Miao Ma, Jian Liu, Guangliang Zhao, Tieying Zhang, Jianjun Chen, Lei Zhang

    Abstract: Multi-tenant architectures enhance the elasticity and resource utilization of NoSQL databases by allowing multiple tenants to co-locate and share resources. However, in large-scale cloud environments, the diverse and dynamic nature of workloads poses significant challenges for multi-tenant NoSQL databases. Based on our practical observations, we have identified three crucial challenges: (1) the im…

    Submitted 12 May, 2025; originally announced May 2025.

    Comments: SIGMOD 2025 accepted

  33. arXiv:2505.07396  [pdf, ps, other]

    cs.CV cs.LG

    TUM2TWIN: Introducing the Large-Scale Multimodal Urban Digital Twin Benchmark Dataset

    Authors: Olaf Wysocki, Benedikt Schwab, Manoj Kumar Biswanath, Michael Greza, Qilin Zhang, Jingwei Zhu, Thomas Froech, Medhini Heeramaglore, Ihab Hijazi, Khaoula Kanna, Mathias Pechinger, Zhaiyu Chen, Yao Sun, Alejandro Rueda Segura, Ziyang Xu, Omar AbdelGafar, Mansour Mehranfar, Chandan Yeshwanth, Yueh-Cheng Liu, Hadi Yazdi, Jiapan Wang, Stefan Auer, Katharina Anders, Klaus Bogenberger, Andre Borrmann, et al. (9 additional authors not shown)

    Abstract: Urban Digital Twins (UDTs) have become essential for managing cities and integrating complex, heterogeneous data from diverse sources. Creating UDTs involves challenges at multiple process stages, including acquiring accurate 3D source data, reconstructing high-fidelity 3D models, maintaining models' updates, and ensuring seamless interoperability to downstream tasks. Current datasets are usually…

    Submitted 13 May, 2025; v1 submitted 12 May, 2025; originally announced May 2025.

    Comments: Submitted to the ISPRS Journal of Photogrammetry and Remote Sensing

  34. arXiv:2505.07263  [pdf, other]

    cs.CV

    Skywork-VL Reward: An Effective Reward Model for Multimodal Understanding and Reasoning

    Authors: Xiaokun Wang, Chris, Jiangbo Pei, Wei Shen, Yi Peng, Yunzhuo Hao, Weijie Qiu, Ai Jian, Tianyidan Xie, Xuchen Song, Yang Liu, Yahui Zhou

    Abstract: We propose Skywork-VL Reward, a multimodal reward model that provides reward signals for both multimodal understanding and reasoning tasks. Our technical approach comprises two key components: First, we construct a large-scale multimodal preference dataset that covers a wide range of tasks and scenarios, with responses collected from both standard vision-language models (VLMs) and advanced VLM rea…

    Submitted 12 May, 2025; originally announced May 2025.

  35. arXiv:2505.07062  [pdf, ps, other]

    cs.CV cs.AI

    Seed1.5-VL Technical Report

    Authors: Dong Guo, Faming Wu, Feida Zhu, Fuxing Leng, Guang Shi, Haobin Chen, Haoqi Fan, Jian Wang, Jianyu Jiang, Jiawei Wang, Jingji Chen, Jingjia Huang, Kang Lei, Liping Yuan, Lishu Luo, Pengfei Liu, Qinghao Ye, Rui Qian, Shen Yan, Shixiong Zhao, Shuai Peng, Shuangye Li, Sihang Yuan, Sijin Wu, Tianheng Cheng, et al. (172 additional authors not shown)

    Abstract: We present Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning. Seed1.5-VL is composed of a 532M-parameter vision encoder and a Mixture-of-Experts (MoE) LLM of 20B active parameters. Despite its relatively compact architecture, it delivers strong performance across a wide spectrum of public VLM benchmarks and internal evaluati…

    Submitted 11 May, 2025; originally announced May 2025.

  36. arXiv:2505.07003  [pdf, ps, other]

    cs.CV

    CMD: Controllable Multiview Diffusion for 3D Editing and Progressive Generation

    Authors: Peng Li, Suizhi Ma, Jialiang Chen, Yuan Liu, Chongyi Zhang, Wei Xue, Wenhan Luo, Alla Sheffer, Wenping Wang, Yike Guo

    Abstract: Recently, 3D generation methods have shown their powerful ability to automate 3D model creation. However, most 3D generation methods only rely on an input image or a text prompt to generate a 3D model, which lacks the control of each component of the generated 3D model. Any modifications of the input image lead to an entire regeneration of the 3D models. In this paper, we introduce a new method ca…

    Submitted 11 May, 2025; originally announced May 2025.

    Comments: Siggraph 2025

  37. arXiv:2505.06912  [pdf, ps, other]

    cs.CV

    Building a Human-Verified Clinical Reasoning Dataset via a Human LLM Hybrid Pipeline for Trustworthy Medical AI

    Authors: Chao Ding, Mouxiao Bian, Pengcheng Chen, Hongliang Zhang, Tianbin Li, Lihao Liu, Jiayuan Chen, Zhuoran Li, Yabei Zhong, Yongqi Liu, Haiqing Huang, Dongming Shan, Junjun He, Jie Xu

    Abstract: Despite strong performance in medical question-answering, the clinical adoption of Large Language Models (LLMs) is critically hampered by their opaque 'black-box' reasoning, limiting clinician trust. This challenge is compounded by the predominant reliance of current medical LLMs on corpora from scientific literature or synthetic data, which often lack the granular expert validation and high clini…

    Submitted 11 May, 2025; originally announced May 2025.

  38. arXiv:2505.06240  [pdf, ps, other]

    eess.SP cs.IT

    Pinching-Antenna Assisted Simultaneous Wireless Information and Power Transfer

    Authors: Yixuan Li, Ji Wang, Yuanwei Liu, Zhiguo Ding

    Abstract: This letter introduces a novel pinching-antenna-system (PASS) assisted simultaneous wireless information and power transfer (SWIPT), where multiple pinching antennas (PAs) are strategically activated on a waveguide to facilitate information transmission to multiple information receivers (IRs) and power transfer to multiple energy receivers (ERs) simultaneously. Leveraging the single-waveguide arc…

    Submitted 26 April, 2025; originally announced May 2025.

  39. arXiv:2505.05877  [pdf, other]

    cs.LG cs.AI

    Multi-Modal Molecular Representation Learning via Structure Awareness

    Authors: Rong Yin, Ruyue Liu, Xiaoshuai Hao, Xingrui Zhou, Yong Liu, Can Ma, Weiping Wang

    Abstract: Accurate extraction of molecular representations is a critical step in the drug discovery process. In recent years, significant progress has been made in molecular representation learning methods, among which multi-modal molecular representation methods based on images and 2D/3D topologies have become increasingly mainstream. However, these existing multi-modal approaches often directly fuse info…

    Submitted 11 May, 2025; v1 submitted 9 May, 2025; originally announced May 2025.

    Comments: Accepted by IEEE Transactions on Image Processing (TIP) 2025

  40. arXiv:2505.05591  [pdf, ps, other]

    cs.CV

    QuickSplat: Fast 3D Surface Reconstruction via Learned Gaussian Initialization

    Authors: Yueh-Cheng Liu, Lukas Höllein, Matthias Nießner, Angela Dai

    Abstract: Surface reconstruction is fundamental to computer vision and graphics, enabling applications in 3D modeling, mixed reality, robotics, and more. Existing approaches based on volumetric rendering obtain promising results, but optimize on a per-scene basis, resulting in a slow optimization that can struggle to model under-observed or textureless regions. We introduce QuickSplat, which learns data-dri…

    Submitted 8 May, 2025; originally announced May 2025.

    Comments: Project page: https://liu115.github.io/quicksplat, Video: https://youtu.be/2IA_gnFvFG8

  41. arXiv:2505.05509  [pdf, ps, other

    eess.IV cs.CV

    StereoINR: Cross-View Geometry Consistent Stereo Super Resolution with Implicit Neural Representation

    Authors: Yi Liu, Xinyi Liu, Panwang Xia, Qiong Wu, Yi Wan, Yongjun Zhang

    Abstract: Stereo image super-resolution (SSR) aims to enhance high-resolution details by leveraging information from stereo image pairs. However, existing SSR upsampling methods (e.g., pixel shuffle) often overlook cross-view geometric consistency and are limited to fixed-scale upsampling. The key issue is that previous upsampling methods use convolution to independently process de… ▽ More

    Submitted 7 May, 2025; originally announced May 2025.

  42. arXiv:2505.05505  [pdf, other

    cs.CV eess.IV

    Apply Hierarchical-Chain-of-Generation to Complex Attributes Text-to-3D Generation

    Authors: Yiming Qin, Zhu Xu, Yang Liu

    Abstract: Recent text-to-3D models can render high-quality assets, yet they still stumble on objects with complex attributes. The key obstacles are: (1) existing text-to-3D approaches typically lift text-to-image models to extract semantics via text encoders, while the text encoder exhibits limited comprehension ability for long descriptions, leading to deviated cross-attention focus, subsequently wrong att… ▽ More

    Submitted 7 May, 2025; originally announced May 2025.

    Comments: Project page here: https://hierarchical-chain-of-generation.github.io/

  43. arXiv:2505.05353  [pdf, ps, other

    cs.GT

    Weighted Envy-Freeness Revisited: Indivisible Resource and House Allocations

    Authors: Yuxi Liu, Mingyu Xiao

    Abstract: Envy-Freeness is one of the most fundamental and important concepts in fair allocation. Some recent studies have focused on the concept of weighted envy-freeness. Under this concept, each agent is assigned a weight, and their valuations are divided by their weights when assessing fairness. This concept can promote more fairness in some scenarios. But on the other hand, experimental research has sh… ▽ More

    Submitted 8 May, 2025; originally announced May 2025.

  44. arXiv:2505.05283  [pdf, ps, other

    cs.SE cs.AI

    Software Development Life Cycle Perspective: A Survey of Benchmarks for Code Large Language Models and Agents

    Authors: Kaixin Wang, Tianlin Li, Xiaoyu Zhang, Chong Wang, Weisong Sun, Yang Liu, Bin Shi

    Abstract: Code large language models (CodeLLMs) and agents have shown great promise in tackling complex software engineering tasks. Compared to traditional software engineering methods, CodeLLMs and agents offer stronger abilities, and can flexibly process inputs and outputs in both natural and code. Benchmarking plays a crucial role in evaluating the capabilities of CodeLLMs and agents, guiding their develo… ▽ More

    Submitted 8 May, 2025; v1 submitted 8 May, 2025; originally announced May 2025.

  45. arXiv:2505.05271  [pdf, other

    cs.CL cs.AI

    T-T: Table Transformer for Tagging-based Aspect Sentiment Triplet Extraction

    Authors: Kun Peng, Chaodong Tong, Cong Cao, Hao Peng, Qian Li, Guanlin Wu, Lei Jiang, Yanbing Liu, Philip S. Yu

    Abstract: Aspect sentiment triplet extraction (ASTE) aims to extract triplets composed of aspect terms, opinion terms, and sentiment polarities from given sentences. The table tagging method is a popular approach to addressing this task, which encodes a sentence into a 2-dimensional table, allowing for the tagging of relations between any two words. Previous efforts have focused on designing various downstr… ▽ More

    Submitted 8 May, 2025; originally announced May 2025.

    Comments: Accepted by IJCAI2025

  46. arXiv:2505.05261  [pdf, ps, other

    math.OC cs.LG

    ICNN-enhanced 2SP: Leveraging input convex neural networks for solving two-stage stochastic programming

    Authors: Yu Liu, Fabricio Oliveira

    Abstract: Two-stage stochastic programming (2SP) offers a basic framework for modelling decision-making under uncertainty, yet scalability remains a challenge due to the computational complexity of recourse function evaluation. Existing learning-based methods like Neural Two-Stage Stochastic Programming (Neur2SP) employ neural networks (NNs) as recourse function surrogates but rely on computationally intens… ▽ More

    Submitted 8 May, 2025; originally announced May 2025.

  47. arXiv:2505.05240  [pdf, other

    cs.CV

    PADriver: Towards Personalized Autonomous Driving

    Authors: Genghua Kou, Fan Jia, Weixin Mao, Yingfei Liu, Yucheng Zhao, Ziheng Zhang, Osamu Yoshie, Tiancai Wang, Ying Li, Xiangyu Zhang

    Abstract: In this paper, we propose PADriver, a novel closed-loop framework for personalized autonomous driving (PAD). Built upon Multi-modal Large Language Model (MLLM), PADriver takes streaming frames and personalized textual prompts as inputs. It autoregressively performs scene understanding, danger level estimation and action decision. The predicted danger level reflects the risk of the potential action… ▽ More

    Submitted 8 May, 2025; originally announced May 2025.

  48. arXiv:2505.05103  [pdf, other

    cs.CR cs.NI

    A Weighted Byzantine Fault Tolerance Consensus Driven Trusted Multiple Large Language Models Network

    Authors: Haoxiang Luo, Gang Sun, Yinqiu Liu, Dongcheng Zhao, Dusit Niyato, Hongfang Yu, Schahram Dustdar

    Abstract: Large Language Models (LLMs) have achieved remarkable success across a wide range of applications. However, individual LLMs often produce inconsistent, biased, or hallucinated outputs due to limitations in their training corpora and model architectures. Recently, collaborative frameworks such as the Multi-LLM Network (MultiLLMN) have been introduced, enabling multiple LLMs to interact and jointly… ▽ More

    Submitted 8 May, 2025; originally announced May 2025.

  49. arXiv:2505.05031  [pdf, ps, other

    cs.IR

    LSRP: A Leader-Subordinate Retrieval Framework for Privacy-Preserving Cloud-Device Collaboration

    Authors: Yingyi Zhang, Pengyue Jia, Xianneng Li, Derong Xu, Maolin Wang, Yichao Wang, Zhaocheng Du, Huifeng Guo, Yong Liu, Ruiming Tang, Xiangyu Zhao

    Abstract: Cloud-device collaboration leverages on-cloud Large Language Models (LLMs) for handling public user queries and on-device Small Language Models (SLMs) for processing private user data, collectively forming a powerful and privacy-preserving solution. However, existing approaches often fail to fully leverage the scalable problem-solving capabilities of on-cloud LLMs while underutilizing the advantag… ▽ More

    Submitted 8 May, 2025; originally announced May 2025.

  50. arXiv:2505.04889  [pdf, other

    cs.LG cs.CR

    FedRE: Robust and Effective Federated Learning with Privacy Preference

    Authors: Tianzhe Xiao, Yichen Li, Yu Zhou, Yining Qi, Yi Liu, Wei Wang, Haozhao Wang, Yi Wang, Ruixuan Li

    Abstract: Despite Federated Learning (FL) employing gradient aggregation at the server for distributed training to prevent the privacy leakage of raw data, private information can still be divulged through the analysis of uploaded gradients from clients. Substantial efforts have been made to integrate local differential privacy (LDP) into the system to achieve a strict privacy guarantee. However, existing m… ▽ More

    Submitted 7 May, 2025; originally announced May 2025.