Showing 101–150 of 3,897 results for author: Zhe

  1. arXiv:2505.04470  [pdf, other]

    math.CO math.DG

    Halin graphs with positive Lin-Lu-Yau curvature

    Authors: Kaizhe Chen, Huiqiu Lin, Shiping Liu, Zhe You

    Abstract: Halin graphs constitute an interesting class of planar and polyhedral graphs. A generalized Halin graph is obtained by connecting all leaves of a planar embedding of a tree via a cycle. A Halin graph is a generalized Halin graph having no vertex of degree two. We classify all generalized Halin graphs with positive Lin-Lu-Yau curvature.

    Submitted 7 May, 2025; originally announced May 2025.

    MSC Class: 05C10; 05C81; 51F99

  2. arXiv:2505.04093  [pdf, other]

    hep-ph

    Neutrino-jet correlations in charged-current SIDIS

    Authors: Weihua Yang, Jing Zhao, Zhe Zhang

    Abstract: Charged-current deep inelastic scattering plays a significant role in determining parton distribution functions with flavour separation. In this work, we present a systematic calculation of the charged-current semi-inclusive deep inelastic scattering (SIDIS) in the $eN$ collinear frame up to twist-3 level at leading order. Semi-inclusive refers to the process in which a jet is detected in addition…

    Submitted 6 May, 2025; originally announced May 2025.

  3. DCO$^+$ and DCN 1-0 survey toward a sample of Planck cold clumps

    Authors: Fu Mo, Junzhi Wang, Shu Liu, Yan Duan, Huanxue Feng, Yuqiang Li, Zhe Lu, Rui Luo, Chao Ou, Yani Xu, Zhuoying Yan

    Abstract: Deuterated molecules can be used to study the physical conditions and the astro-chemical evolution of molecular clouds. Large-sample surveys for deuterated molecules are needed to understand the enhancement of deuterated molecules from diffuse molecular gas to cold cores. A single-pointing survey toward the 559 Planck cold clumps of the Early Cold Core Catalogue (ECC) has been conducted using the…

    Submitted 6 May, 2025; originally announced May 2025.

    Comments: 37 pages, 12 figures, published in A&A

    Journal ref: 2025A&A...696A.140M

  4. arXiv:2505.02795  [pdf, other]

    cs.LG cs.AI cs.DC

    HSplitLoRA: A Heterogeneous Split Parameter-Efficient Fine-Tuning Framework for Large Language Models

    Authors: Zheng Lin, Yuxin Zhang, Zhe Chen, Zihan Fang, Xianhao Chen, Praneeth Vepakomma, Wei Ni, Jun Luo, Yue Gao

    Abstract: Recently, large language models (LLMs) have achieved remarkable breakthroughs, revolutionizing the natural language processing domain and beyond. Due to immense parameter sizes, fine-tuning these models with private data for diverse downstream tasks has become mainstream. Though federated learning (FL) offers a promising solution for fine-tuning LLMs without sharing raw data, substantial computing…

    Submitted 5 May, 2025; originally announced May 2025.

    Comments: 16 pages, 22 figures

  5. arXiv:2505.01992  [pdf, ps, other]

    astro-ph.GA

    Supermassive Black Holes with High Accretion Rates in Active Galactic Nuclei. XII. Reverberation Mapping Results for 15 PG Quasars from a Long-Duration High-Cadence Campaign

    Authors: Chen Hu, Sha-Sha Li, Sen Yang, Zi-Xu Yang, Wei-Jian Guo, Dong-Wei Bao, Bo-Wei Jiang, Pu Du, Yan-Rong Li, Ming Xiao, Yu-Yang Songsheng, Zhe Yu, Jin-Ming Bai, Luis C. Ho, Michael S. Brotherton, Jesús Aceituno, Hartmut Winkler, Jian-Min Wang

    Abstract: We present the first results from long-term high-cadence spectroscopic monitoring of 15 PG quasars with relatively strong Fe II emission as a part of a broader reverberation mapping campaign performed with the Calar Alto Observatory 2.2m telescope. The $V$-band, 5100 Å continuum, and H$β$ broad emission line light curves were measured for a set of quasars for between dozens to more than a hundred…

    Submitted 4 May, 2025; originally announced May 2025.

    Comments: 21 pages, 20 figures, published in ApJS, March 2021

    Journal ref: 2021, ApJS, 253, 20

  6. arXiv:2505.01981  [pdf, other]

    physics.plasm-ph

    Electrospray Thruster Plume Dynamics: Insights from Precise PP Coulomb Field Simulation

    Authors: Zhe Liu, Yinjian Zhao

    Abstract: Electrospray thrusters are one important type of micropropulsion systems being developed for next-generation space missions, yet the primary challenge to their operational lifespan is propellant overspray resulting from wide plume angles driven by Coulomb interactions among charged droplets. While existing models often employ truncated Coulomb field approximations, such simplifications compromise…

    Submitted 4 May, 2025; originally announced May 2025.

  7. arXiv:2505.01978  [pdf, other]

    quant-ph

    Generation of 95-qubit genuine entanglement and verification of symmetry-protected topological phases

    Authors: Tao Jiang, Jianbin Cai, Junxiang Huang, Naibin Zhou, Yukun Zhang, Jiahao Bei, Guoqing Cai, Sirui Cao, Fusheng Chen, Jiang Chen, Kefu Chen, Xiawei Chen, Xiqing Chen, Zhe Chen, Zhiyuan Chen, Zihua Chen, Wenhao Chu, Hui Deng, Zhibin Deng, Pei Ding, Xun Ding, Zhuzhengqi Ding, Shuai Dong, Bo Fan, Daojin Fan, et al. (130 additional authors not shown)

    Abstract: Symmetry-protected topological (SPT) phases are fundamental features of cluster states, serving as key resources for measurement-based quantum computation (MBQC). Generating large-scale cluster states and verifying their SPT phases are essential steps toward practical MBQC, which however still presents significant experimental challenges. In this work, we address these challenges by utilizing adva…

    Submitted 3 May, 2025; originally announced May 2025.

    Comments: Main text: 15 pages, 4 figures; supplementary materials: 42 pages, 19 figures. Total: 57 pages, 23 figures

  8. arXiv:2505.01766  [pdf, other]

    cs.CV cs.RO

    Multimodal Graph Representation Learning for Robust Surgical Workflow Recognition with Adversarial Feature Disentanglement

    Authors: Long Bai, Boyi Ma, Ruohan Wang, Guankun Wang, Beilei Cui, Zhongliang Jiang, Mobarakol Islam, Zhe Min, Jiewen Lai, Nassir Navab, Hongliang Ren

    Abstract: Surgical workflow recognition is vital for automating tasks, supporting decision-making, and training novice surgeons, ultimately improving patient safety and standardizing procedures. However, data corruption can lead to performance degradation due to issues like occlusion from bleeding or smoke in surgical scenes and problems with data storage and transmission. In this case, we explore a robust…

    Submitted 3 May, 2025; originally announced May 2025.

    Comments: Accepted by Information Fusion

  9. arXiv:2505.01476  [pdf, other]

    eess.IV cs.AI cs.CV

    CostFilter-AD: Enhancing Anomaly Detection through Matching Cost Filtering

    Authors: Zhe Zhang, Mingxiu Cai, Hanxiao Wang, Gaochang Wu, Tianyou Chai, Xiatian Zhu

    Abstract: Unsupervised anomaly detection (UAD) seeks to localize the anomaly mask of an input image with respect to normal samples. Either by reconstructing normal counterparts (reconstruction-based) or by learning an image feature embedding space (embedding-based), existing approaches fundamentally rely on image-level or feature-level matching to derive anomaly scores. Often, such a matching process is ina…

    Submitted 23 May, 2025; v1 submitted 2 May, 2025; originally announced May 2025.

    Comments: 25 pages, 12 figures, 20 tables, accepted by Forty-Second International Conference on Machine Learning (ICML 2025), link: https://icml.cc/virtual/2025/poster/46359

  10. arXiv:2505.01273  [pdf, other]

    cs.CL cs.AI

    Anti-adversarial Learning: Desensitizing Prompts for Large Language Models

    Authors: Xuan Li, Zhe Yin, Xiaodong Gu, Beijun Shen

    Abstract: With the widespread use of LLMs, preserving privacy in user prompts has become crucial, as prompts risk exposing privacy and sensitive data to the cloud LLMs. Traditional techniques like homomorphic encryption, secure multi-party computation, and federated learning face challenges due to heavy computational costs and user participation requirements, limiting their applicability in LLM scenarios. I…

    Submitted 25 April, 2025; originally announced May 2025.

  11. arXiv:2505.00938  [pdf, other]

    cs.CV cs.AI

    CDFormer: Cross-Domain Few-Shot Object Detection Transformer Against Feature Confusion

    Authors: Boyuan Meng, Xiaohan Zhang, Peilin Li, Zhe Wu, Yiming Li, Wenkai Zhao, Beinan Yu, Hui-Liang Shen

    Abstract: Cross-domain few-shot object detection (CD-FSOD) aims to detect novel objects across different domains with limited class instances. Feature confusion, including object-background confusion and object-object confusion, presents significant challenges in both cross-domain and few-shot settings. In this work, we introduce CDFormer, a cross-domain few-shot object detection transformer against feature…

    Submitted 1 May, 2025; originally announced May 2025.

  12. arXiv:2504.21801  [pdf, other]

    cs.CL cs.AI

    DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning for Subgoal Decomposition

    Authors: Z. Z. Ren, Zhihong Shao, Junxiao Song, Huajian Xin, Haocheng Wang, Wanjia Zhao, Liyue Zhang, Zhe Fu, Qihao Zhu, Dejian Yang, Z. F. Wu, Zhibin Gou, Shirong Ma, Hongxuan Tang, Yuxuan Liu, Wenjun Gao, Daya Guo, Chong Ruan

    Abstract: We introduce DeepSeek-Prover-V2, an open-source large language model designed for formal theorem proving in Lean 4, with initialization data collected through a recursive theorem proving pipeline powered by DeepSeek-V3. The cold-start training procedure begins by prompting DeepSeek-V3 to decompose complex problems into a series of subgoals. The proofs of resolved subgoals are synthesized into a ch…

    Submitted 30 April, 2025; originally announced April 2025.

  13. arXiv:2504.21296  [pdf, other]

    cs.LG cs.AI

    Fairness in Graph Learning Augmented with Machine Learning: A Survey

    Authors: Renqiang Luo, Ziqi Xu, Xikun Zhang, Qing Qing, Huafei Huang, Enyan Dai, Zhe Wang, Bo Yang

    Abstract: Augmenting specialised machine learning techniques into traditional graph learning models has achieved notable success across various domains, including federated graph learning, dynamic graph learning, and graph transformers. However, the intricate mechanisms of these specialised techniques introduce significant challenges in maintaining model fairness, potentially resulting in discriminatory out…

    Submitted 30 April, 2025; originally announced April 2025.

  14. arXiv:2504.21054  [pdf, other]

    cs.CR cs.AI

    FFCBA: Feature-based Full-target Clean-label Backdoor Attacks

    Authors: Yangxu Yin, Honglong Chen, Yudong Gao, Peng Sun, Liantao Wu, Zhe Li, Weifeng Liu

    Abstract: Backdoor attacks pose a significant threat to deep neural networks, as backdoored models would misclassify poisoned samples with specific triggers into target classes while maintaining normal performance on clean samples. Among these, multi-target backdoor attacks can simultaneously target multiple classes. However, existing multi-target backdoor attacks all follow the dirty-label paradigm, where…

    Submitted 29 April, 2025; originally announced April 2025.

  15. arXiv:2504.20820  [pdf]

    cond-mat.mtrl-sci physics.app-ph

    Experimental Observation of Extremely Strong Defect-Phonon Scatterings in Semiconductor Single Crystals

    Authors: Zifeng Huang, Jianbo Liang, Yuxiang Wang, Zixuan Sun, Naoteru Shigekawa, Ming Li, Runsheng Wang, Zhe Cheng

    Abstract: The role of doping in tailoring thermal transport in semiconductors is critical for efficient thermal management in electronic devices. While the effects of doping have been extensively studied to tune electrical properties, its impact on thermal transport has not yet been thoroughly explored, particularly with respect to experimental investigations into exceptionally strong non-Rayleigh defect-ph…

    Submitted 29 April, 2025; originally announced April 2025.

  16. arXiv:2504.20624  [pdf, other]

    cs.AI

    PaRT: Enhancing Proactive Social Chatbots with Personalized Real-Time Retrieval

    Authors: Zihan Niu, Zheyong Xie, Shaosheng Cao, Chonggang Lu, Zheyu Ye, Tong Xu, Zuozhu Liu, Yan Gao, Jia Chen, Zhe Xu, Yi Wu, Yao Hu

    Abstract: Social chatbots have become essential intelligent companions in daily scenarios ranging from emotional support to personal interaction. However, conventional chatbots with passive response mechanisms usually rely on users to initiate or sustain dialogues by bringing up new topics, resulting in diminished engagement and shortened dialogue duration. In this paper, we present PaRT, a novel framework…

    Submitted 29 April, 2025; originally announced April 2025.

  17. arXiv:2504.20193  [pdf, other]

    cs.LG

    ProFi-Net: Prototype-based Feature Attention with Curriculum Augmentation for WiFi-based Gesture Recognition

    Authors: Zhe Cui, Shuxian Zhang, Kangzhi Lou, Le-Nam Tran

    Abstract: This paper presents ProFi-Net, a novel few-shot learning framework for WiFi-based gesture recognition that overcomes the challenges of limited training data and sparse feature representations. ProFi-Net employs a prototype-based metric learning architecture enhanced with a feature-level attention mechanism, which dynamically refines the Euclidean distance by emphasizing the most discriminative fea…

    Submitted 28 April, 2025; originally announced April 2025.

    Comments: This paper was accepted at The 9th APWeb-WAIM joint International Conference on Web and Big Data

  18. arXiv:2504.20178  [pdf, other]

    cs.CV cs.LG

    A Transformer-based Multimodal Fusion Model for Efficient Crowd Counting Using Visual and Wireless Signals

    Authors: Zhe Cui, Yuli Li, Le-Nam Tran

    Abstract: Current crowd-counting models often rely on single-modal inputs, such as visual images or wireless signal data, which can result in significant information loss and suboptimal recognition performance. To address these shortcomings, we propose TransFusion, a novel multimodal fusion-based crowd-counting model that integrates Channel State Information (CSI) with image data. By leveraging the powerful…

    Submitted 28 April, 2025; originally announced April 2025.

    Comments: This paper was accepted at IEEE WCNC 2025

  19. arXiv:2504.19959  [pdf, ps, other]

    cs.AR

    From Concept to Practice: an Automated LLM-aided UVM Machine for RTL Verification

    Authors: Junhao Ye, Yuchen Hu, Ke Xu, Dingrong Pan, Qichun Chen, Jie Zhou, Shuai Zhao, Xinwei Fang, Xi Wang, Nan Guan, Zhe Jiang

    Abstract: Verification presents a major bottleneck in Integrated Circuit (IC) development, consuming nearly 70% of the total development effort. While the Universal Verification Methodology (UVM) is widely used in industry to improve verification efficiency through structured and reusable testbenches, constructing these testbenches and generating sufficient stimuli remain challenging. These challenges arise…

    Submitted 28 April, 2025; v1 submitted 28 April, 2025; originally announced April 2025.

  20. arXiv:2504.19432  [pdf, other]

    cs.CV cs.AI

    EarthMapper: Visual Autoregressive Models for Controllable Bidirectional Satellite-Map Translation

    Authors: Zhe Dong, Yuzhe Sun, Tianzhu Liu, Wangmeng Zuo, Yanfeng Gu

    Abstract: Satellite imagery and maps, as two fundamental data modalities in remote sensing, offer direct observations of the Earth's surface and human-interpretable geographic abstractions, respectively. The task of bidirectional translation between satellite images and maps (BSMT) holds significant potential for applications in urban planning and disaster response. However, this task presents two major cha…

    Submitted 27 April, 2025; originally announced April 2025.

  21. arXiv:2504.19099  [pdf, other]

    cs.SE cs.AI cs.AR

    VeriDebug: A Unified LLM for Verilog Debugging via Contrastive Embedding and Guided Correction

    Authors: Ning Wang, Bingkun Yao, Jie Zhou, Yuchen Hu, Xi Wang, Nan Guan, Zhe Jiang

    Abstract: Large Language Models (LLMs) have demonstrated remarkable potential in debugging for various programming languages. However, the application of LLMs to Verilog debugging remains insufficiently explored. Here, we present VeriDebug, an approach that integrates contrastive representation and guided correction capabilities for automated Verilog debugging. Unlike existing methods, VeriDebug employs an…

    Submitted 27 April, 2025; originally announced April 2025.

  22. arXiv:2504.18881  [pdf, other]

    cs.LG

    TSCAN: Context-Aware Uplift Modeling via Two-Stage Training for Online Merchant Business Diagnosis

    Authors: Hangtao Zhang, Zhe Li, Kairui Zhang

    Abstract: A primary challenge in ITE estimation is sample selection bias. Traditional approaches utilize treatment regularization techniques such as the Integral Probability Metrics (IPM), re-weighting, and propensity score modeling to mitigate this bias. However, these regularizations may introduce undesirable information loss and limit the performance of the model. Furthermore, treatment effects vary acro…

    Submitted 26 April, 2025; originally announced April 2025.

    Comments: 15 pages, 7 figures

  23. arXiv:2504.17187  [pdf, other]

    eess.SP

    DualAttWaveNet: Multiscale Attention Networks for Satellite Interference Detection

    Authors: Chunyu Yang, Boyu Yang, Kun Qiu, Zhe Chen, Yue Gao

    Abstract: The escalating overlap between non-geostationary orbit (NGSO) and geostationary orbit (GSO) satellite frequency allocations necessitates accurate interference detection methods that address two pivotal technical gaps: computationally efficient signal analysis for real-time operation, and robust anomaly discrimination under varying interference patterns. Existing deep learning approaches employ enc…

    Submitted 23 April, 2025; originally announced April 2025.

  24. arXiv:2504.16122  [pdf, other]

    cs.CY cs.AI

    SOTOPIA-S4: a user-friendly system for flexible, customizable, and large-scale social simulation

    Authors: Xuhui Zhou, Zhe Su, Sophie Feng, Jiaxu Zhou, Jen-tse Huang, Hsien-Te Kao, Spencer Lynch, Svitlana Volkova, Tongshuang Sherry Wu, Anita Woolley, Hao Zhu, Maarten Sap

    Abstract: Social simulation through large language model (LLM) agents is a promising approach to explore and validate hypotheses related to social science questions and LLM agents behavior. We present SOTOPIA-S4, a fast, flexible, and scalable social simulation system that addresses the technical barriers of current frameworks while enabling practitioners to generate multi-turn and multi-party LLM-based int…

    Submitted 19 April, 2025; originally announced April 2025.

    Comments: The first author and the second author contributed equally

  25. arXiv:2504.15804  [pdf, other]

    cs.AR cs.AI

    Insights from Verification: Training a Verilog Generation LLM with Reinforcement Learning with Testbench Feedback

    Authors: Ning Wang, Bingkun Yao, Jie Zhou, Yuchen Hu, Xi Wang, Nan Guan, Zhe Jiang

    Abstract: Large language models (LLMs) have shown strong performance in Verilog generation from natural language description. However, ensuring the functional correctness of the generated code remains a significant challenge. This paper introduces a method that integrates verification insights from testbench into the training of Verilog generation LLMs, aligning the training with the fundamental goal of har…

    Submitted 22 April, 2025; originally announced April 2025.

  26. arXiv:2504.15722  [pdf, other]

    stat.ML cs.LG

    From predictions to confidence intervals: an empirical study of conformal prediction methods for in-context learning

    Authors: Zhe Huang, Simone Rossi, Rui Yuan, Thomas Hannagan

    Abstract: Transformers have become a standard architecture in machine learning, demonstrating strong in-context learning (ICL) abilities that allow them to learn from the prompt at inference time. However, uncertainty quantification for ICL remains an open challenge, particularly in noisy regression tasks. This paper investigates whether ICL can be leveraged for distribution-free uncertainty estimation, pro…

    Submitted 22 April, 2025; originally announced April 2025.

  27. arXiv:2504.15721  [pdf, other]

    cs.AR

    BBAL: A Bidirectional Block Floating Point-Based Quantisation Accelerator for Large Language Models

    Authors: Xiaomeng Han, Yuan Cheng, Jing Wang, Junyang Lu, Hui Wang, X. x. Zhang, Ning Xu, Dawei Yang, Zhe Jiang

    Abstract: Large language models (LLMs), with their billions of parameters, pose substantial challenges for deployment on edge devices, straining both memory capacity and computational resources. Block Floating Point (BFP) quantisation reduces memory and computational overhead by converting high-overhead floating point operations into low-bit fixed point operations. However, BFP requires aligning all data to…

    Submitted 22 April, 2025; originally announced April 2025.

  28. arXiv:2504.15279  [pdf, other]

    cs.CV

    VisuLogic: A Benchmark for Evaluating Visual Reasoning in Multi-modal Large Language Models

    Authors: Weiye Xu, Jiahao Wang, Weiyun Wang, Zhe Chen, Wengang Zhou, Aijun Yang, Lewei Lu, Houqiang Li, Xiaohua Wang, Xizhou Zhu, Wenhai Wang, Jifeng Dai, Jinguo Zhu

    Abstract: Visual reasoning is a core component of human intelligence and a critical capability for advanced multimodal models. Yet current reasoning evaluations of multimodal large language models (MLLMs) often rely on text descriptions and allow language-based reasoning shortcuts, failing to measure genuine vision-centric reasoning. To address this, we introduce VisuLogic: a benchmark of 1,000 human-verifi…

    Submitted 21 April, 2025; originally announced April 2025.

    Comments: Code, data, and baselines are available at https://visulogic-benchmark.github.io/VisuLogic

  29. arXiv:2504.14352  [pdf, other]

    math.CO

    Connectivity versus Lin-Lu-Yau curvature

    Authors: Kaizhe Chen, Shiping Liu, Zhe You

    Abstract: We explore the interaction between connectivity and Lin-Lu-Yau curvature of graphs systematically. The intuition is that connected graphs with large Lin-Lu-Yau curvature also have large connectivity, and vice versa. We prove that the connectivity of a connected graph is lower bounded by the product of its minimum degree and its Lin-Lu-Yau curvature. On the other hand, if the connectivity of a grap…

    Submitted 19 April, 2025; originally announced April 2025.

    Comments: 22 pages

  30. arXiv:2504.14109  [pdf, other]

    stat.ME

    Time-varying treatment effect models in stepped-wedge cluster-randomized trials with multiple interventions

    Authors: Zhe Chen, Wei Wang, Yingying Lu, Scott D. Halpern, Katherine R. Courtright, Fan Li, Michael O. Harhay

    Abstract: The traditional model specification of stepped-wedge cluster-randomized trials assumes a homogeneous treatment effect across time while adjusting for fixed-time effects. However, when treatment effects vary over time, the constant effect estimator may be biased. In the general setting of stepped-wedge cluster-randomized trials with multiple interventions, we derive the expected value of the consta…

    Submitted 18 April, 2025; originally announced April 2025.

    Comments: 22 pages

  31. arXiv:2504.13914  [pdf, other]

    cs.CL

    Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning

    Authors: ByteDance Seed, :, Jiaze Chen, Tiantian Fan, Xin Liu, Lingjun Liu, Zhiqi Lin, Mingxuan Wang, Chengyi Wang, Xiangpeng Wei, Wenyuan Xu, Yufeng Yuan, Yu Yue, Lin Yan, Qiying Yu, Xiaochen Zuo, Chi Zhang, Ruofei Zhu, Zhecheng An, Zhihao Bai, Yu Bao, Xingyan Bin, Jiangjie Chen, Feng Chen, Hongmin Chen, et al. (249 additional authors not shown)

    Abstract: We introduce Seed1.5-Thinking, capable of reasoning through thinking before responding, resulting in improved performance on a wide range of benchmarks. Seed1.5-Thinking achieves 86.7 on AIME 2024, 55.0 on Codeforces and 77.3 on GPQA, demonstrating excellent reasoning abilities in STEM and coding. Beyond reasoning tasks, the method demonstrates notable generalization across diverse domains. For in…

    Submitted 29 April, 2025; v1 submitted 10 April, 2025; originally announced April 2025.

  32. arXiv:2504.13847  [pdf, other]

    cs.HC cs.CL

    Interview AI-ssistant: Designing for Real-Time Human-AI Collaboration in Interview Preparation and Execution

    Authors: Zhe Liu

    Abstract: Recent advances in large language models (LLMs) offer unprecedented opportunities to enhance human-AI collaboration in qualitative research methods, including interviews. While interviews are highly valued for gathering deep, contextualized insights, interviewers often face significant cognitive challenges, such as real-time information processing, question adaptation, and rapport maintenance. My…

    Submitted 3 March, 2025; originally announced April 2025.

    Comments: 4 pages, 2 figures, submitted and accepted by IUI 2025 Doctoral Consortium

  33. arXiv:2504.13807  [pdf, other]

    cs.RO

    DiffOG: Differentiable Policy Trajectory Optimization with Generalizability

    Authors: Zhengtong Xu, Zichen Miao, Qiang Qiu, Zhe Zhang, Yu She

    Abstract: Imitation learning-based visuomotor policies excel at manipulation tasks but often produce suboptimal action trajectories compared to model-based methods. Directly mapping camera data to actions via neural networks can result in jerky motions and difficulties in meeting critical constraints, compromising safety and robustness in real-world deployment. For tasks that require high robustness or stri…

    Submitted 13 May, 2025; v1 submitted 18 April, 2025; originally announced April 2025.

  34. arXiv:2504.13479  [pdf, other]

    cs.NI cs.DC cs.LG

    SFL-LEO: Asynchronous Split-Federated Learning Design for LEO Satellite-Ground Network Framework

    Authors: Jiasheng Wu, Jingjing Zhang, Zheng Lin, Zhe Chen, Xiong Wang, Wenjun Zhu, Yue Gao

    Abstract: Recently, the rapid development of LEO satellite networks spurs another widespread concern: data processing at satellites. However, achieving efficient computation at LEO satellites in highly dynamic satellite networks is challenging and remains an open problem when considering the constrained computation capability of LEO satellites. For the first time, we propose a novel distributed learning fram…

    Submitted 18 April, 2025; originally announced April 2025.

    Comments: 13 pages, 14 figures

  35. arXiv:2504.13207  [pdf, other]

    cs.GR cs.RO

    BEV-GS: Feed-forward Gaussian Splatting in Bird's-Eye-View for Road Reconstruction

    Authors: Wenhua Wu, Tong Zhao, Chensheng Peng, Lei Yang, Yintao Wei, Zhe Liu, Hesheng Wang

    Abstract: Road surface is the sole contact medium for wheels or robot feet. Reconstructing road surface is crucial for unmanned vehicles and mobile robots. Recent studies on Neural Radiance Fields (NeRF) and Gaussian Splatting (GS) have achieved remarkable results in scene reconstruction. However, they typically rely on multi-view image inputs and require prolonged optimization times. In this paper, we prop…

    Submitted 15 April, 2025; originally announced April 2025.

  36. arXiv:2504.12636  [pdf, other]

    cs.RO

    A0: An Affordance-Aware Hierarchical Model for General Robotic Manipulation

    Authors: Rongtao Xu, Jian Zhang, Minghao Guo, Youpeng Wen, Haoting Yang, Min Lin, Jianzheng Huang, Zhe Li, Kaidong Zhang, Liqiong Wang, Yuxuan Kuang, Meng Cao, Feng Zheng, Xiaodan Liang

    Abstract: Robotic manipulation faces critical challenges in understanding spatial affordances--the "where" and "how" of object interactions--essential for complex manipulation tasks like wiping a board or stacking objects. Existing methods, including modular-based and end-to-end approaches, often lack robust spatial reasoning capabilities. Unlike recent point-based and flow-based affordance methods that foc…

    Submitted 6 May, 2025; v1 submitted 17 April, 2025; originally announced April 2025.

  37. arXiv:2504.12292  [pdf, ps, other]

    cs.CV cs.AI cs.LG

    SHeaP: Self-Supervised Head Geometry Predictor Learned via 2D Gaussians

    Authors: Liam Schoneveld, Zhe Chen, Davide Davoli, Jiapeng Tang, Saimon Terazawa, Ko Nishino, Matthias Nießner

    Abstract: Accurate, real-time 3D reconstruction of human heads from monocular images and videos underlies numerous visual applications. As 3D ground truth data is hard to come by at scale, previous methods have sought to learn from abundant 2D videos in a self-supervised manner. Typically, this involves the use of differentiable mesh rendering, which is effective but faces limitations. To improve on this, w…

    Submitted 16 April, 2025; originally announced April 2025.

    Comments: For video demonstrations and additional materials please see https://nlml.github.io/sheap/

  38. arXiv:2504.11845  [pdf, other]

    cs.CV

    Boosting Multi-View Stereo with Depth Foundation Model in the Absence of Real-World Labels

    Authors: Jie Zhu, Bo Peng, Zhe Zhang, Bingzheng Liu, Jianjun Lei

    Abstract: Learning-based Multi-View Stereo (MVS) methods have made remarkable progress in recent years. However, how to effectively train the network without using real-world labels remains a challenging problem. In this paper, driven by the recent advancements of vision foundation models, a novel method termed DFM-MVS, is proposed to leverage the depth foundation model to generate the effective depth prior…

    Submitted 16 April, 2025; originally announced April 2025.

  39. arXiv:2504.11773  [pdf, other]

    cs.CV

    TacoDepth: Towards Efficient Radar-Camera Depth Estimation with One-stage Fusion

    Authors: Yiran Wang, Jiaqi Li, Chaoyi Hong, Ruibo Li, Liusheng Sun, Xiao Song, Zhe Wang, Zhiguo Cao, Guosheng Lin

    Abstract: Radar-Camera depth estimation aims to predict dense and accurate metric depth by fusing input images and Radar data. Model efficiency is crucial for this task in pursuit of real-time processing on autonomous vehicles and robotic platforms. However, due to the sparsity of Radar returns, the prevailing methods adopt multi-stage frameworks with intermediate quasi-dense depth, which are time-consuming…

    Submitted 16 April, 2025; originally announced April 2025.

    Comments: Accepted by CVPR 2025 (Oral Presentation)

  40. arXiv:2504.11702  [pdf, other]

    cs.LG cs.CR

    Clustering and analysis of user behaviour in blockchain: A case study of Planet IX

    Authors: Dorottya Zelenyanszki, Zhe Hou, Kamanashis Biswas, Vallipuram Muthukkumarasamy

    Abstract: Decentralised applications (dApps) that run on public blockchains have the benefit of trustworthiness and transparency as every activity that happens on the blockchain can be publicly traced through the transaction data. However, this introduces a potential privacy problem as this data can be tracked and analysed, which can reveal user-behaviour information. A user behaviour analysis pipeline was…

    Submitted 15 April, 2025; originally announced April 2025.

    Comments: 15 pages, 8 figures, submitted to Blockchain: Research and Applications

  41. arXiv:2504.11349  [pdf, other]

    cs.CV cs.AI cs.GR

    Explicit and Implicit Representations in AI-based 3D Reconstruction for Radiology: A Systematic Review

    Authors: Yuezhe Yang, Boyu Yang, Yaqian Wang, Yang He, Xingbo Dong, Zhe Jin

    Abstract: The demand for high-quality medical imaging in clinical practice and assisted diagnosis has made 3D reconstruction in radiological imaging a key research focus. Artificial intelligence (AI) has emerged as a promising approach to enhancing reconstruction accuracy while reducing acquisition and processing time, thereby minimizing patient radiation exposure and discomfort and ultimately benefiting cl…

    Submitted 17 May, 2025; v1 submitted 15 April, 2025; originally announced April 2025.

    Comments: 20 pages, 5 figures, submitted to Medical Image Analysis

    MSC Class: 68T45 ACM Class: I.4.5

  42. arXiv:2504.11148  [pdf, other]

    physics.optics

    Super time-resolved tomography

    Authors: Zhe Hu, Kalle Josefsson, Zisheng Yao, Francisco García-Moreno, Malgorzata Makowska, Yuhe Zhang, Pablo Villanueva-Perez

    Abstract: Understanding 3D fundamental processes is crucial for academic and industrial applications. Nowadays, X-ray time-resolved tomography, or tomoscopy, is a leading technique for in-situ and operando 4D (3D+time) characterization. Despite its ability to achieve 1000 tomograms per second at large-scale X-ray facilities, its applicability is limited by the centrifugal forces exerted on samples and the c…

    Submitted 15 April, 2025; originally announced April 2025.

  43. arXiv:2504.10772  [pdf]

    physics.optics

    Scanning-free three-dimensional fluorescent dipoles imaging by polarization self-interference digital holography (pSIDH)

    Authors: Tianlong Man, Wenxue Zhang, Lu Zhang, Ran Zheng, Hua Huang, Xinhui Liu, Hongqiang Zhou, Zhe Wang, Yuhong Wan

    Abstract: Polarization microscopy provides insights into the structure and orientational organization of biomolecules and their architectures in cells. These key functional signatures, which are natively 3D, can only be detected in 2D in a single measurement with conventional polarization microscopy. It is so far a challenging task to capture simultaneously the 3D structure and molecular orientation in a…

    Submitted 14 April, 2025; originally announced April 2025.

  44. arXiv:2504.10525  [pdf]

    q-bio.QM cs.CL cs.IR

    BioChemInsight: An Open-Source Toolkit for Automated Identification and Recognition of Optical Chemical Structures and Activity Data in Scientific Publications

    Authors: Zhe Wang, Fangtian Fu, Wei Zhang, Lige Yan, Yan Meng, Jianping Wu, Hui Wu, Gang Xu, Si Chen

    Abstract: Automated extraction of chemical structures and their bioactivity data is crucial for accelerating drug discovery and enabling data-driven pharmaceutical research. Existing optical chemical structure recognition (OCSR) tools fail to autonomously associate molecular structures with their bioactivity profiles, creating a critical bottleneck in structure-activity relationship (SAR) analysis. Here, we…

    Submitted 12 April, 2025; originally announced April 2025.

    Comments: 20 pages, 7 figures

  45. arXiv:2504.10479  [pdf, other]

    cs.CV

    InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

    Authors: Jinguo Zhu, Weiyun Wang, Zhe Chen, Zhaoyang Liu, Shenglong Ye, Lixin Gu, Hao Tian, Yuchen Duan, Weijie Su, Jie Shao, Zhangwei Gao, Erfei Cui, Xuehui Wang, Yue Cao, Yangzhou Liu, Xingguang Wei, Hongjie Zhang, Haomin Wang, Weiye Xu, Hao Li, Jiahao Wang, Nianchen Deng, Songze Li, Yinan He, Tan Jiang , et al. (26 additional authors not shown)

    Abstract: We introduce InternVL3, a significant advancement in the InternVL series featuring a native multimodal pre-training paradigm. Rather than adapting a text-only large language model (LLM) into a multimodal large language model (MLLM) that supports visual inputs, InternVL3 jointly acquires multimodal and linguistic capabilities from both diverse multimodal data and pure-text corpora during a single p…

    Submitted 18 April, 2025; v1 submitted 14 April, 2025; originally announced April 2025.

    Comments: Technical Report

  46. arXiv:2504.10474  [pdf, other]

    cs.RO

    Co-optimizing Physical Reconfiguration Parameters and Controllers for an Origami-inspired Reconfigurable Manipulator

    Authors: Zhe Chen, Li Chen, Hao Zhang, Jianguo Zhao

    Abstract: Reconfigurable robots that can change their physical configuration post-fabrication have demonstrated their potential in adapting to different environments or tasks. However, it is challenging to determine how to optimally adjust reconfigurable parameters for a given task, especially when the controller depends on the robot's configuration. In this paper, we address this problem using a tendon-driv…

    Submitted 14 April, 2025; originally announced April 2025.

  47. arXiv:2504.10160  [pdf, other]

    cs.CL cs.AI cs.LG

    MT-R1-Zero: Advancing LLM-based Machine Translation via R1-Zero-like Reinforcement Learning

    Authors: Zhaopeng Feng, Shaosheng Cao, Jiahan Ren, Jiayuan Su, Ruizhe Chen, Yan Zhang, Zhe Xu, Yao Hu, Jian Wu, Zuozhu Liu

    Abstract: Large-scale reinforcement learning (RL) methods have proven highly effective in enhancing the reasoning abilities of large language models (LLMs), particularly for tasks with verifiable solutions such as mathematics and coding. However, applying this idea to machine translation (MT), where outputs are flexibly formatted and difficult to automatically evaluate with explicit rules, remains underexpl…

    Submitted 14 April, 2025; originally announced April 2025.

    Comments: Work in progress. Our code is available at https://github.com/fzp0424/MT-R1-Zero

  48. arXiv:2504.09377  [pdf, other]

    cs.CV

    Beyond Degradation Conditions: All-in-One Image Restoration via HOG Transformers

    Authors: Jiawei Wu, Zhifei Yang, Zhe Wang, Zhi Jin

    Abstract: All-in-one image restoration, which aims to address diverse degradations within a unified framework, is critical for practical applications. However, existing methods rely on predicting and integrating degradation conditions, which can misactivate degradation-specific features in complex scenarios, limiting their restoration performance. To address this issue, we propose a novel all-in-one image r…

    Submitted 12 April, 2025; originally announced April 2025.

  49. arXiv:2504.09223  [pdf, other]

    cs.CV cs.AI cs.LG

    DL-QAT: Weight-Decomposed Low-Rank Quantization-Aware Training for Large Language Models

    Authors: Wenjin Ke, Zhe Li, Dong Li, Lu Tian, Emad Barsoum

    Abstract: Improving the efficiency of inference in Large Language Models (LLMs) is a critical area of research. Post-training Quantization (PTQ) is a popular technique, but it often faces challenges at low-bit levels, particularly in downstream tasks. Quantization-aware Training (QAT) can alleviate this problem, but it requires significantly more computational resources. To tackle this, we introduced Weight…

    Submitted 12 April, 2025; originally announced April 2025.

    Journal ref: https://aclanthology.org/2024.emnlp-industry.10/

  50. arXiv:2504.09189  [pdf]

    physics.ao-ph

    Low latency global carbon budget reveals a continuous decline of the land carbon sink during the 2023/24 El Nino event

    Authors: Piyu Ke, Philippe Ciais, Yitong Yao, Stephen Sitch, Wei Li, Yidi Xu, Xiaomeng Du, Xiaofan Gui, Ana Bastos, Sonke Zaehle, Ben Poulter, Thomas Colligan, Auke M. van der Woude, Wouter Peters, Zhu Liu, Zhe Jin, Xiangjun Tian, Yilong Wang, Junjie Liu, Sudhanshu Pandey, Chris O'Dell, Jiang Bian, Chuanlong Zhou, John Miller, Xin Lan , et al. (6 additional authors not shown)

    Abstract: The high growth rate of atmospheric CO2 in 2023 was found to be caused by a severe reduction of the global net land carbon sink. Here we update the global CO2 budget from January 1st to July 1st 2024, during which El Niño drought conditions continued to prevail in the Tropics but ceased by March 2024. We used three dynamic global vegetation models (DGVMs), machine learning emulators of ocean model…

    Submitted 12 April, 2025; originally announced April 2025.