Showing 201–250 of 7,142 results for author: Zhou, Y

  1. arXiv:2504.03651  [pdf, other]

    cs.DC cs.AI cs.LG

    Echo: Efficient Co-Scheduling of Hybrid Online-Offline Tasks for Large Language Model Serving

    Authors: Zhibin Wang, Shipeng Li, Xue Li, Yuhang Zhou, Zhonghui Zhang, Zibo Wang, Rong Gu, Chen Tian, Kun Yang, Sheng Zhong

    Abstract: Large language models have been widely deployed in various applications, encompassing both interactive online tasks and batched offline tasks. Given the burstiness and latency sensitivity of online tasks, over-provisioning resources is common practice. This allows for the integration of latency-insensitive offline tasks during periods of low online load, enhancing resource utilization. However, st…

    Submitted 1 March, 2025; originally announced April 2025.

  2. arXiv:2504.03639  [pdf, other]

    cs.CV

    Shape My Moves: Text-Driven Shape-Aware Synthesis of Human Motions

    Authors: Ting-Hsuan Liao, Yi Zhou, Yu Shen, Chun-Hao Paul Huang, Saayan Mitra, Jia-Bin Huang, Uttaran Bhattacharya

    Abstract: We explore how body shapes influence human motion synthesis, an aspect often overlooked in existing text-to-motion generation methods due to the ease of learning a homogenized, canonical body shape. However, this homogenization can distort the natural correlations between different body shapes and their motion dynamics. Our method addresses this gap by generating body-shape-aware human motions fro…

    Submitted 4 April, 2025; originally announced April 2025.

    Comments: CVPR 2025. Project page: https://shape-move.github.io

  3. arXiv:2504.03559  [pdf, other]

    hep-ph hep-ex physics.ins-det

    Constraints on dark matter boosted by supernova shock within the effective field theory framework from the CDEX-10 experiment

    Authors: J. Z. Wang, L. T. Yang, Q. Yue, K. J. Kang, Y. J. Li, H. P. An, Greeshma C., J. P. Chang, H. Chen, Y. H. Chen, J. P. Cheng, W. H. Dai, Z. Deng, C. H. Fang, X. P. Geng, H. Gong, Q. J. Guo, T. Guo, X. Y. Guo, L. He, J. R. He, H. X. Huang, T. C. Huang, S. Karmakar, H. B. Li, et al. (62 additional authors not shown)

    Abstract: Supernova shocks can boost dark matter (DM) particles to high, yet nonrelativistic, velocities, providing a suitable mechanism for analysis within the framework of the nonrelativistic effective field theory (NREFT). These accelerated DM sources extend the experimental ability to scan the parameter space of light DM into the sub-GeV region. In this study, we specifically analyze DM accelerated by t…

    Submitted 4 April, 2025; originally announced April 2025.

    Comments: 9 pages, 5 figures

  4. arXiv:2504.03515  [pdf, other]

    cs.RO cs.LG

    Dexterous Manipulation through Imitation Learning: A Survey

    Authors: Shan An, Ziyu Meng, Chao Tang, Yuning Zhou, Tengyu Liu, Fangqiang Ding, Shufang Zhang, Yao Mu, Ran Song, Wei Zhang, Zeng-Guang Hou, Hong Zhang

    Abstract: Dexterous manipulation, which refers to the ability of a robotic hand or multi-fingered end-effector to skillfully control, reorient, and manipulate objects through precise, coordinated finger movements and adaptive force modulation, enables complex interactions similar to human hand dexterity. With recent advances in robotics and machine learning, there is a growing demand for these systems to op…

    Submitted 24 April, 2025; v1 submitted 4 April, 2025; originally announced April 2025.

    Comments: 22 pages, 5 figures

  5. arXiv:2504.03471  [pdf, other]

    cs.CV

    Dynamic Importance in Diffusion U-Net for Enhanced Image Synthesis

    Authors: Xi Wang, Ziqi He, Yang Zhou

    Abstract: Traditional diffusion models typically employ a U-Net architecture. Previous studies have unveiled the roles of attention blocks in the U-Net. However, they overlook the dynamic evolution of their importance during the inference process, which hinders their further exploitation to improve image applications. In this study, we first theoretically proved that re-weighting the outputs of the Transfo…

    Submitted 5 May, 2025; v1 submitted 4 April, 2025; originally announced April 2025.

    Comments: Accepted to ICME 2025. Appendix & Code: https://github.com/Hytidel/UNetReweighting

  6. arXiv:2504.03294  [pdf, other]

    hep-ph nucl-th physics.atom-ph

    Relativistic dynamics of charmonia in strong magnetic fields

    Authors: Liuyuan Wen, Meijian Li, Yiyu Zhou, Yang Li, James P. Vary

    Abstract: We investigate the properties of charmonium systems in strong external magnetic fields using a relativistic light-front Hamiltonian approach within the Basis Light-Front Quantization (BLFQ) framework. By solving the eigenvalue problem for the invariant mass squared operator with confinement potentials and one-gluon-exchange interactions, we obtain the mass spectrum and wave functions under varying…

    Submitted 4 April, 2025; originally announced April 2025.

    Comments: 21 pages, 9 figures

  7. arXiv:2504.03164  [pdf, other]

    cs.CV cs.AI

    NuScenes-SpatialQA: A Spatial Understanding and Reasoning Benchmark for Vision-Language Models in Autonomous Driving

    Authors: Kexin Tian, Jingrui Mao, Yunlong Zhang, Jiwan Jiang, Yang Zhou, Zhengzhong Tu

    Abstract: Recent advancements in Vision-Language Models (VLMs) have demonstrated strong potential for autonomous driving tasks. However, their spatial understanding and reasoning, key capabilities for autonomous driving, still exhibit significant limitations. Notably, none of the existing benchmarks systematically evaluate VLMs' spatial reasoning capabilities in driving scenarios. To fill this gap, we propose…

    Submitted 6 April, 2025; v1 submitted 4 April, 2025; originally announced April 2025.

  8. arXiv:2504.03044  [pdf, other]

    nucl-th nucl-ex

    A unified algorithm for multi-particle correlations between azimuthal angle and transverse momentum in ultra-relativistic nuclear collisions

    Authors: Emil Gorm Dahlbæk Nielsen, Nina Nathanson, Kristjan Gulbrandsen, You Zhou

    Abstract: Multi-particle correlations between azimuthal angle and mean transverse momentum are a powerful tool for probing size and shape correlations in the initial conditions of heavy-ion collisions. These correlations have also been employed to investigate nuclear structure, including potential nuclear shape phase transitions at the energy frontier. However, their implementation is highly nontrivial, and…

    Submitted 3 April, 2025; originally announced April 2025.

    Comments: 10 pages, 3 figures, 1 table

  9. arXiv:2504.02436  [pdf, other]

    cs.CV

    SkyReels-A2: Compose Anything in Video Diffusion Transformers

    Authors: Zhengcong Fei, Debang Li, Di Qiu, Jiahua Wang, Yikun Dou, Rui Wang, Jingtao Xu, Mingyuan Fan, Guibin Chen, Yang Li, Yahui Zhou

    Abstract: This paper presents SkyReels-A2, a controllable video generation framework capable of assembling arbitrary visual elements (e.g., characters, objects, backgrounds) into synthesized videos based on textual prompts while maintaining strict consistency with reference images for each element. We term this task elements-to-video (E2V), whose primary challenges lie in preserving the fidelity of each ref…

    Submitted 3 April, 2025; originally announced April 2025.

  10. arXiv:2504.02316  [pdf, other]

    cs.CV cs.AI

    ConsDreamer: Advancing Multi-View Consistency for Zero-Shot Text-to-3D Generation

    Authors: Yuan Zhou, Shilong Jin, Litao Hua, Wanjun Lv, Haoran Duan, Jungong Han

    Abstract: Recent advances in zero-shot text-to-3D generation have revolutionized 3D content creation by enabling direct synthesis from textual descriptions. While state-of-the-art methods leverage 3D Gaussian Splatting with score distillation to enhance multi-view rendering through pre-trained text-to-image (T2I) models, they suffer from inherent view biases in T2I priors. These biases lead to inconsistent…

    Submitted 3 April, 2025; originally announced April 2025.

    Comments: 13 pages, 11 figures, 3 tables

  11. arXiv:2504.01823  [pdf, other]

    hep-ex

    Evidence of doubly OZI-suppressed decay $η_{c} \to ωφ$ in the radiative decay $J/ψ\to γη_{c}$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, H. Cai, et al. (680 additional authors not shown)

    Abstract: Using a sample of $(10087\pm44) \times 10^{6}$ $J/ψ$ events collected with the BESIII detector at the BEPCII collider, the first evidence for the doubly OZI-suppressed decay $η_{c} \to ωφ$ is reported with a significance of 4.0$σ$. The branching fraction of $η_{c} \to ωφ$ is measured to be $\mathcal{B}(η_{c} \to ωφ) = (3.86 \pm 0.92 \pm 0.62) \times 10^{-5}$, where the first uncertainty is statist…

    Submitted 2 April, 2025; originally announced April 2025.

  12. arXiv:2504.01763  [pdf, other]

    cond-mat.supr-con cond-mat.str-el

    Fully-gapped superconductivity with rotational symmetry breaking in pressurized kagome metal CsV$_3$Sb$_5$

    Authors: X. Y. Feng, Z. Zhao, J. Luo, Y. Z. Zhou, J. Yang, A. F. Fang, H. T. Yang, H. -J. Gao, R. Zhou, Guo-qing Zheng

    Abstract: The discovery of the kagome metal CsV$_3$Sb$_5$ has generated significant interest in its complex physical properties, particularly its superconducting behavior under different pressures, though its nature remains debated. Here, we performed low-temperature, high-pressure $^{121/123}$Sb nuclear quadrupole resonance (NQR) measurements to explore the superconducting pairing symmetry in CsV$_3$Sb…

    Submitted 2 April, 2025; originally announced April 2025.

    Comments: 25 pages, 4 figures, to appear in Nature Communications

  13. The Mini-SiTian Array: first-two-year operation

    Authors: Min He, Hong Wu, Liang Ge, Jian-feng Tian, Zheng Wang, Hai-yang Mu, Yu Zhang, Yang Huang, Jie Zheng, Zhou Fan, Zheng-yang Li, Hong-hui Gu, Heng-geng Han, Kai Xiao, Zhi-rui Li, Jun-jie Jin, Bei-chuan Wang, Jun Ma, Jin-hang Zou, Ying Wu, Jiu-peng Guo, Li-guo Fang, Zhi-gang Hou, Bo-wen Zhang, Yun-fei Xu, et al. (48 additional authors not shown)

    Abstract: The SiTian project, designed to utilize 60 telescopes distributed across multiple sites in China, is a next-generation time-domain survey initiative. As a pathfinder for the SiTian project, the Mini-SiTian (MST) has been proposed and implemented to test the SiTian's brain and data pipeline, and to evaluate the feasibility of its technology and science cases. Mounted at the Xinglong Observatory, th…

    Submitted 2 April, 2025; originally announced April 2025.

    Comments: 10 pages, 11 figures, Accepted for publication in a special issue of Research in Astronomy and Astrophysics on the Mini-SiTian Array

  14. arXiv:2504.01025  [pdf]

    eess.IV cs.AI cs.CV physics.med-ph

    Diagnosis of Pulmonary Hypertension by Integrating Multimodal Data with a Hybrid Graph Convolutional and Transformer Network

    Authors: Fubao Zhu, Yang Zhang, Gengmin Liang, Jiaofen Nan, Yanting Li, Chuang Han, Danyang Sun, Zhiguo Wang, Chen Zhao, Wenxuan Zhou, Jian He, Yi Xu, Iokfai Cheang, Xu Zhu, Yanli Zhou, Weihua Zhou

    Abstract: Early and accurate diagnosis of pulmonary hypertension (PH) is essential for optimal patient management. Differentiating between pre-capillary and post-capillary PH is critical for guiding treatment decisions. This study develops and validates a deep learning-based diagnostic model for PH, designed to classify patients as non-PH, pre-capillary PH, or post-capillary PH. This retrospective study ana…

    Submitted 27 March, 2025; originally announced April 2025.

    Comments: 23 pages, 8 figures, 4 tables

  15. arXiv:2504.00996  [pdf, other]

    cs.CV

    TurboFill: Adapting Few-step Text-to-image Model for Fast Image Inpainting

    Authors: Liangbin Xie, Daniil Pakhomov, Zhonghao Wang, Zongze Wu, Ziyan Chen, Yuqian Zhou, Haitian Zheng, Zhifei Zhang, Zhe Lin, Jiantao Zhou, Chao Dong

    Abstract: This paper introduces TurboFill, a fast image inpainting model that enhances a few-step text-to-image diffusion model with an inpainting adapter for high-quality and efficient inpainting. While standard diffusion models generate high-quality results, they incur high computational costs. We overcome this by training an inpainting adapter on a few-step distilled text-to-image model, DMD2, using a no…

    Submitted 1 April, 2025; originally announced April 2025.

    Comments: Project webpage available at https://liangbinxie.github.io/projects/TurboFill/

  16. arXiv:2504.00993  [pdf, other]

    cs.CL cs.AI

    MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs

    Authors: Juncheng Wu, Wenlong Deng, Xingxuan Li, Sheng Liu, Taomian Mi, Yifan Peng, Ziyang Xu, Yi Liu, Hyunjin Cho, Chang-In Choi, Yihan Cao, Hui Ren, Xiang Li, Xiaoxiao Li, Yuyin Zhou

    Abstract: Medical tasks such as diagnosis and treatment planning require precise and complex reasoning, particularly in life-critical domains. Unlike mathematical reasoning, medical reasoning demands meticulous, verifiable thought processes to ensure reliability and accuracy. However, there is a notable lack of datasets that provide transparent, step-by-step reasoning to validate and enhance the medical rea…

    Submitted 4 April, 2025; v1 submitted 1 April, 2025; originally announced April 2025.

    Comments: 18 pages, 11 figures, 6 tables. Project page: https://github.com/UCSC-VLAA/MedReason

  17. arXiv:2504.00966  [pdf, other]

    cs.RO eess.SY

    Time-optimal Convexified Reeds-Shepp Paths on a Sphere

    Authors: Sixu Li, Deepak Prakash Kumar, Swaroop Darbha, Yang Zhou

    Abstract: This article addresses time-optimal path planning for a vehicle capable of moving both forward and backward on a unit sphere with a unit maximum speed, and constrained by a maximum absolute turning rate $U_{max}$. The proposed formulation can be utilized for optimal attitude control of underactuated satellites, optimal motion planning for spherical rolling robots, and optimal path planning for mob…

    Submitted 1 April, 2025; originally announced April 2025.

  18. arXiv:2504.00869  [pdf, other]

    cs.CL cs.AI

    m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning with Large Language Models

    Authors: Xiaoke Huang, Juncheng Wu, Hui Liu, Xianfeng Tang, Yuyin Zhou

    Abstract: Test-time scaling has emerged as a powerful technique for enhancing the reasoning capabilities of large language models. However, its effectiveness in medical reasoning remains uncertain, as the medical domain fundamentally differs from mathematical tasks in terms of knowledge representation and decision-making processes. In this paper, we provide the first comprehensive investigation of test-time…

    Submitted 1 April, 2025; originally announced April 2025.

    Comments: 17 pages; 7 figures; Data, code, and models: https://github.com/UCSC-VLAA/m1

  19. arXiv:2504.00784  [pdf, other]

    cs.CV cs.LG

    CellVTA: Enhancing Vision Foundation Models for Accurate Cell Segmentation and Classification

    Authors: Yang Yang, Xijie Xu, Yixun Zhou, Jie Zheng

    Abstract: Cell instance segmentation is a fundamental task in digital pathology with broad clinical applications. Recently, vision foundation models, which are predominantly based on Vision Transformers (ViTs), have achieved remarkable success in pathology image analysis. However, their improvements in cell instance segmentation remain limited. A key challenge arises from the tokenization process in ViTs, w…

    Submitted 1 April, 2025; originally announced April 2025.

  20. arXiv:2504.00598  [pdf, other]

    cs.DC

    CFP: Low-overhead Profiling-based Intra-operator Parallelism Generation by Preserving Communication-Free Structures

    Authors: Weifang Hu, Xuanhua Shi, Chang Wu, Yunkai Zhang, Xuan Peng, Jiaqi Zhai, Hai Jin, Yongluan Zhou, Xuehai Qian

    Abstract: This paper introduces CFP, a system that searches intra-operator parallelism configurations by leveraging runtime profiles of actual parallel programs. The key idea is to profile a limited space by identifying a new structure named ParallelBlock, which is a group of operators with the property of communication-free tensor partition propagation: the partition of its input tensor can propagate through…

    Submitted 1 April, 2025; originally announced April 2025.

  21. arXiv:2504.00431  [pdf, other]

    cs.CV

    Enhancing Fundus Image-based Glaucoma Screening via Dynamic Global-Local Feature Integration

    Authors: Yuzhuo Zhou, Chi Liu, Sheng Shen, Siyu Le, Liwen Yu, Sihan Ouyang, Zongyuan Ge

    Abstract: With the advancements in medical artificial intelligence (AI), fundus image classifiers are increasingly being applied to assist in ophthalmic diagnosis. While existing classification models have achieved high accuracy on specific fundus datasets, they struggle to address real-world challenges such as variations in image quality across different imaging devices, discrepancies between training and…

    Submitted 1 April, 2025; originally announced April 2025.

  22. arXiv:2504.00299  [pdf, other]

    cs.AI

    Collaborative LLM Numerical Reasoning with Local Data Protection

    Authors: Min Zhang, Yuzhe Lu, Yun Zhou, Panpan Xu, Lin Lee Cheong, Chang-Tien Lu, Haozhu Wang

    Abstract: Numerical reasoning over documents, which demands both contextual understanding and logical inference, is challenging for low-capacity local models deployed on computation-constrained devices. Although such complex reasoning queries could be routed to powerful remote models like GPT-4, exposing local data raises significant data leakage concerns. Existing mitigation methods generate problem descri…

    Submitted 31 March, 2025; originally announced April 2025.

  23. arXiv:2504.00269  [pdf, other]

    math.PR math-ph

    Existence of Full Replica Symmetry Breaking for the Sherrington-Kirkpatrick Model at Low Temperature

    Authors: Yuxin Zhou

    Abstract: We prove the existence of full replica symmetry breaking (FRSB) for the Sherrington-Kirkpatrick (SK) model at low temperature. More specifically, we prove that slightly beyond the critical temperature, the Parisi measure for the SK model is supported on an interval starting at the origin and only has one jump discontinuity at the right endpoint.

    Submitted 15 April, 2025; v1 submitted 31 March, 2025; originally announced April 2025.

    Comments: 19 pages, 2 figures. Results improved in Theorem 1 and Corollary 2

  24. arXiv:2504.00035  [pdf, other]

    cs.CR cs.AI

    MiZero: The Shadowy Defender Against Text Style Infringements

    Authors: Ziwei Zhang, Juan Wen, Wanli Peng, Zhengxian Wu, Yinghan Zhou, Yiming Xue

    Abstract: In-Context Learning (ICL) and efficient fine-tuning methods have significantly enhanced the efficiency of applying Large Language Models (LLMs) to downstream tasks. However, they also raise concerns about the imitation and infringement of personal creative data. Current methods for data copyright protection primarily focus on content security but lack effectiveness in protecting the copyrights of te…

    Submitted 30 March, 2025; originally announced April 2025.

  25. arXiv:2503.24304  [pdf, other]

    astro-ph.GA

    Gravitational Waves from Massive Black Hole Mergers in ASTRID: Predictions for LISA

    Authors: Bonny Y. Wang, Yihao Zhou, William Chen, Nianyi Chen, Tiziana Di Matteo, Rupert Croft, Simeon Bird, Yueying Ni

    Abstract: We use the ASTRID cosmological simulation to forecast massive black hole (MBH) mergers detectable by LISA down to $z=0$. ASTRID directly models MBH dynamical friction, allowing a realistic tracking of their trajectory. It also incorporates relatively low-mass MBH seeds down to $5\times 10^{4}\mathrm{M}_{\odot}$, providing a more complete picture of LISA MBH mergers. We find that LISA MBH mergers i…

    Submitted 26 April, 2025; v1 submitted 31 March, 2025; originally announced March 2025.

    Comments: 19 Pages, 12 Figures; Submitted to ApJ

  26. arXiv:2503.24026  [pdf, other]

    cs.CV

    HumanDreamer: Generating Controllable Human-Motion Videos via Decoupled Generation

    Authors: Boyuan Wang, Xiaofeng Wang, Chaojun Ni, Guosheng Zhao, Zhiqin Yang, Zheng Zhu, Muyang Zhang, Yukun Zhou, Xinze Chen, Guan Huang, Lihong Liu, Xingang Wang

    Abstract: Human-motion video generation has been a challenging task, primarily due to the difficulty inherent in learning human body movements. While some approaches have attempted to drive human-centric video generation explicitly through pose control, these methods typically rely on poses derived from existing videos, thereby lacking flexibility. To address this, we propose HumanDreamer, a decoupled human…

    Submitted 31 March, 2025; v1 submitted 31 March, 2025; originally announced March 2025.

    Comments: Project Page: https://humandreamer.github.io

  27. arXiv:2503.23752  [pdf, other]

    cs.GR cs.CV

    StrokeFusion: Vector Sketch Generation via Joint Stroke-UDF Encoding and Latent Sequence Diffusion

    Authors: Jin Zhou, Yi Zhou, Pengfei Xu, Hui Huang

    Abstract: In the field of sketch generation, raster-format trained models often produce non-stroke artifacts, while vector-format trained models typically lack a holistic understanding of sketches, leading to compromised recognizability. Moreover, existing methods struggle to extract common features from similar elements (e.g., eyes of animals) appearing at varying positions across sketches. To address thes…

    Submitted 16 April, 2025; v1 submitted 31 March, 2025; originally announced March 2025.

  28. arXiv:2503.23751  [pdf, other]

    cs.CV

    Decoupled Distillation to Erase: A General Unlearning Method for Any Class-centric Tasks

    Authors: Yu Zhou, Dian Zheng, Qijie Mo, Renjie Lu, Kun-Yu Lin, Wei-Shi Zheng

    Abstract: In this work, we present DEcoupLEd Distillation To Erase (DELETE), a general and strong unlearning method for any class-centric tasks. To derive this, we first propose a theoretical framework to analyze the general form of unlearning loss and decompose it into forgetting and retention terms. Through the theoretical framework, we point out that a class of previous methods could be mainly formulated…

    Submitted 31 March, 2025; originally announced March 2025.

    Comments: CVPR 2025. Equal contributions from the first two authors

  29. arXiv:2503.23725  [pdf, other]

    cs.CV

    Exploring Temporal Dynamics in Event-based Eye Tracker

    Authors: Hongwei Ren, Xiaopeng Lin, Hongxiang Huang, Yue Zhou, Bojun Cheng

    Abstract: Eye-tracking is a vital technology for human-computer interaction, especially in wearable devices such as AR, VR, and XR. The realization of high-speed and high-precision eye-tracking using frame-based image sensors is constrained by their limited temporal resolution, which impairs the accurate capture of rapid ocular dynamics, such as saccades and blinks. Event cameras, inspired by biological vis…

    Submitted 31 March, 2025; originally announced March 2025.

    Comments: Accepted by CVPR 2025 Event-based Vision Workshop

  30. arXiv:2503.23169  [pdf, other]

    physics.optics

    Nonreciprocity and unidirectional invisibility in three optical modes with non-Markovian effects

    Authors: H. Yi, T. Z. Luan, W. Y. Hu, Cheng Shang, Yan-Hui Zhou, Zhi-Cheng Shi, H. Z. Shen

    Abstract: In this work, we construct systems of three coupled optical modes to obtain an effective Hamiltonian mediated by coherent dissipative coupling during adiabatic elimination of a large-dissipation mode. We investigate the cooperative effect of coherent and dissipative photon-photon couplings in an open cavity system, which leads to nonreciprocity with a considerably large isolation ratio and flexible contro…

    Submitted 29 March, 2025; originally announced March 2025.

    Comments: 20 pages, 11 figures

  31. arXiv:2503.23137  [pdf, other]

    cs.CV cs.CL

    When 'YES' Meets 'BUT': Can Large Models Comprehend Contradictory Humor Through Comparative Reasoning?

    Authors: Tuo Liang, Zhe Hu, Jing Li, Hao Zhang, Yiren Lu, Yunlai Zhou, Yiran Qiao, Disheng Liu, Jeirui Peng, Jing Ma, Yu Yin

    Abstract: Understanding humor, particularly when it involves complex, contradictory narratives that require comparative reasoning, remains a significant challenge for large vision-language models (VLMs). This limitation hinders AI's ability to engage in human-like reasoning and cultural expression. In this paper, we investigate this challenge through an in-depth analysis of comics that juxtapose panels to cre…

    Submitted 29 March, 2025; originally announced March 2025.

  32. arXiv:2503.22976  [pdf, other]

    cs.CV

    From Flatland to Space: Teaching Vision-Language Models to Perceive and Reason in 3D

    Authors: Jiahui Zhang, Yurui Chen, Yanpeng Zhou, Yueming Xu, Ze Huang, Jilin Mei, Junhui Chen, Yu-Jie Yuan, Xinyue Cai, Guowei Huang, Xingyue Quan, Hang Xu, Li Zhang

    Abstract: Recent advances in LVLMs have improved vision-language understanding, but they still struggle with spatial perception, limiting their ability to reason about complex 3D scenes. Unlike previous approaches that incorporate 3D representations into models to improve spatial understanding, we aim to unlock the potential of VLMs by leveraging spatially relevant image data. To this end, we introduce a no…

    Submitted 9 May, 2025; v1 submitted 29 March, 2025; originally announced March 2025.

    Comments: Project page: https://fudan-zvg.github.io/spar

  33. arXiv:2503.22923  [pdf, other]

    math.OC cs.LG stat.ML

    Nested Stochastic Gradient Descent for (Generalized) Sinkhorn Distance-Regularized Distributionally Robust Optimization

    Authors: Yufeng Yang, Yi Zhou, Zhaosong Lu

    Abstract: Distributionally robust optimization (DRO) is a powerful technique to train robust models against data distribution shift. This paper aims to solve regularized nonconvex DRO problems, where the uncertainty set is modeled by a so-called generalized Sinkhorn distance and the loss function is nonconvex and possibly unbounded. Such a distance allows modeling uncertainty of distributions with different…

    Submitted 28 March, 2025; originally announced March 2025.

    Comments: 30 pages, 20 figures, 1 table

  34. arXiv:2503.22728  [pdf, other]

    cs.SD cs.CV eess.AS

    Dual Audio-Centric Modality Coupling for Talking Head Generation

    Authors: Ao Fu, Ziqi Ni, Yi Zhou

    Abstract: The generation of audio-driven talking head videos is a key challenge in computer vision and graphics, with applications in virtual avatars and digital media. Traditional approaches often struggle with capturing the complex interaction between audio and facial dynamics, leading to lip synchronization and visual quality issues. In this paper, we propose a novel NeRF-based framework, Dual Audio-Cent…

    Submitted 26 March, 2025; originally announced March 2025.

    Comments: 9 pages, 4 figures

  35. arXiv:2503.22322  [pdf, other]

    astro-ph.CO

    LiteBIRD Science Goals and Forecasts: constraining isotropic cosmic birefringence

    Authors: E. de la Hoz, P. Diego-Palazuelos, J. Errard, A. Gruppuso, B. Jost, R. M. Sullivan, M. Bortolami, Y. Chinone, L. T. Hergt, E. Komatsu, Y. Minami, I. Obata, D. Paoletti, D. Scott, P. Vielva, D. Adak, R. Akizawa, A. Anand, J. Aumont, C. Baccigalupi, A. J. Banday, R. B. Barreiro, N. Bartolo, S. Basak, A. Basyrov, et al. (90 additional authors not shown)

    Abstract: Cosmic birefringence (CB) is the rotation of the photons' linear polarisation plane during propagation. Such an effect is a tracer of parity-violating extensions of standard electromagnetism and would probe the existence of a new cosmological field acting as dark matter or dark energy. It has become customary to employ cosmic microwave background (CMB) polarised data to probe such a phenomenon. Re…

    Submitted 28 March, 2025; originally announced March 2025.

    Comments: 54 pages, 22 figures

  36. arXiv:2503.22204  [pdf, other]

    cs.CV

    Segment then Splat: A Unified Approach for 3D Open-Vocabulary Segmentation based on Gaussian Splatting

    Authors: Yiren Lu, Yunlai Zhou, Yiran Qiao, Chaoda Song, Tuo Liang, Jing Ma, Yu Yin

    Abstract: Open-vocabulary querying in 3D space is crucial for enabling more intelligent perception in applications such as robotics, autonomous systems, and augmented reality. However, most existing methods rely on 2D pixel-level parsing, leading to multi-view inconsistencies and poor 3D object retrieval. Moreover, they are limited to static scenes and struggle with dynamic scenes due to the complexities of…

    Submitted 28 March, 2025; originally announced March 2025.

    Comments: Project page: https://vulab-ai.github.io/Segment-then-Splat/

  37. arXiv:2503.22150  [pdf, ps, other]

    math.AG

    Uniform vector bundles over $\mathbb{P}^4$

    Authors: Rong Du, Yuhang Zhou

    Abstract: There is a long-standing conjecture which states that every uniform algebraic vector bundle of rank $r<2n$ on the $n$-dimensional projective space $\mathbb{P}^n$ over an algebraically closed field of characteristic $0$ is homogeneous. This conjecture is valid for $n\leq3$. In this paper, we classify all uniform vector bundles of rank $r<8$ over $\mathbb{P}^4$ and show that the conjecture holds for…

    Submitted 28 March, 2025; originally announced March 2025.

  38. arXiv:2503.22126  [pdf, other]

    hep-ex

    Updated model-independent measurement of the strong-phase differences between $D^0$ and $\bar{D}^0 \to K^{0}_{S/L}π^+π^-$ decays

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, H. Cai, et al. (696 additional authors not shown)

    Abstract: The strong-phase differences between $D^0\to K_{S/L}^0π^+π^-$ and $\bar{D}^0\to K_{S/L}^0π^+π^-$ decays are one of the most important inputs in measuring the $C\!P$ violating angle $γ$ via $B^- \to D K^-$ decays. They also play a key role in studies of charm mixing and indirect $C\!P$ violation. In this paper, the strong-phase differences are determined in a model-independent way with quantum-corr…

    Submitted 18 April, 2025; v1 submitted 27 March, 2025; originally announced March 2025.

  39. arXiv:2503.22017  [pdf, other]

    cs.AR

    Performance Characterizations and Usage Guidelines of Samsung CXL Memory Module Hybrid Prototype

    Authors: Jianping Zeng, Shuyi Pei, Da Zhang, Yuchen Zhou, Amir Beygi, Xuebin Yao, Ramdas Kachare, Tong Zhang, Zongwang Li, Marie Nguyen, Rekha Pitchumani, Yang Soek Ki, Changhee Jung

    Abstract: The growing prevalence of data-intensive workloads, such as artificial intelligence (AI), machine learning (ML), high-performance computing (HPC), in-memory databases, and real-time analytics, has exposed limitations in conventional memory technologies like DRAM. While DRAM offers low latency and high throughput, it is constrained by high costs, scalability challenges, and volatility, making it le…

    Submitted 27 March, 2025; originally announced March 2025.

  40. arXiv:2503.21969  [pdf, other]

    cs.RO cs.AI

    Data-Agnostic Robotic Long-Horizon Manipulation with Vision-Language-Guided Closed-Loop Feedback

    Authors: Yuan Meng, Xiangtong Yao, Haihui Ye, Yirui Zhou, Shengqiang Zhang, Zhenshan Bing, Alois Knoll

    Abstract: Recent advances in language-conditioned robotic manipulation have leveraged imitation and reinforcement learning to enable robots to execute tasks from human commands. However, these methods often suffer from limited generalization, adaptability, and the lack of large-scale specialized datasets, unlike data-rich domains such as computer vision, making long-horizon task execution challenging. To ad…

    Submitted 27 March, 2025; originally announced March 2025.

    Comments: initial upload, 8 pages

  41. arXiv:2503.21739  [pdf, other]

    physics.optics

    Emergent Non-Markovian Gain in Open Quantum Systems

    Authors: H. Z. Shen, Cheng Shang, Yan-Hui Zhou, X. X. Yi

    Abstract: Non-Markovian dynamics go beyond the Markovian approximation by capturing memory effects and information backflow in open quantum systems, which are crucial for describing realistic physical processes. In this work, we study the exact non-Markovian dynamics of a driven cavity coupled to an anisotropic three-dimensional photonic-crystal environment via counterrotating-wave interactions. We derive a…

    Submitted 27 March, 2025; originally announced March 2025.

    Comments: 17 pages, 13 figures

  42. arXiv:2503.21413  [pdf, other]

    hep-ex

    First observation of $Λ_{c}(2595)^{+} \to Λ^{+}_{c}π^0π^0$ and $Λ_{c}(2625)^{+}\to Λ^{+}_{c}π^0π^0$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (657 additional authors not shown)

    Abstract: By analysing $e^+e^-$ annihilation data corresponding to an integrated luminosity of 368.48~pb$^{-1}$ collected at the centre-of-mass energies of $\sqrt{s} = 4.918$ and $4.951$~GeV with the BESIII detector, we report the first observation of $Λ_{c}(2595)^{+}$ and $Λ_{c}(2625)^{+}\to Λ^{+}_{c}π^0π^0$ with statistical significances of 7.9$σ$ and 11.8$σ$, respectively. The branching fractions of…

    Submitted 27 March, 2025; originally announced March 2025.

    Comments: 20 pages, 4 figures

  43. arXiv:2503.21364  [pdf, other]

    cs.CV

    LandMarkSystem Technical Report

    Authors: Zhenxiang Ma, Zhenyu Yang, Miao Tao, Yuanzhen Zhou, Zeyu He, Yuchang Zhang, Rong Fu, Hengjie Li

    Abstract: 3D reconstruction is vital for applications in autonomous driving, virtual reality, augmented reality, and the metaverse. Recent advancements such as Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS) have transformed the field, yet traditional deep learning frameworks struggle to meet the increasing demands for scene quality and scale. This paper introduces LandMarkSystem, a novel comp…

    Submitted 28 March, 2025; v1 submitted 27 March, 2025; originally announced March 2025.

  44. arXiv:2503.21311  [pdf, other]

    hep-ph hep-ex nucl-ex nucl-th

    Global analysis of fragmentation functions to light neutral hadrons

    Authors: Jun Gao, ChongYang Liu, Mengyang Li, XiaoMin Shen, Hongxi Xing, Yuxiang Zhao, Yiyu Zhou

    Abstract: Fragmentation functions (FFs) are crucial non-perturbative components in quantum chromodynamics (QCD), playing a vital role in predictions and understanding of the hadronization process. In this paper, we present the FFs for $K_S^0$, $η$, $π^0$ mesons, and $Λ$ baryons in the context of global QCD analysis. The data included in the fit are from single inclusive $e^+ e^-$ annihilation (SIA), semi-in…

    Submitted 27 March, 2025; originally announced March 2025.

    Comments: 62 pages, 53 figures

  45. arXiv:2503.21239  [pdf]

    eess.SP

    The Optimal Tradeoff Between PAPR and Ambiguity Functions for Generalized OFDM Waveform Set in ISAC Systems

    Authors: Bichai Wang, Xiuhong Wei, Xueru Li, Yongxing Zhou

    Abstract: Integrated sensing and communications (ISAC) has been identified as one of the six usage scenarios for IMT-2030. Compared with communication performance, sensing performance is much more vulnerable to interference, and the received backscattered sensing signal with target information is usually too weak to be detected. It is interesting to understand the optimal tradeoff between interference rejec…

    Submitted 27 March, 2025; originally announced March 2025.

    Comments: 13 pages, 8 figures

  46. arXiv:2503.21125  [pdf, other]

    cs.CV

    Omni-AD: Learning to Reconstruct Global and Local Features for Multi-class Anomaly Detection

    Authors: Jiajie Quan, Ao Tong, Yuxuan Cai, Xinwei He, Yulong Wang, Yang Zhou

    Abstract: In multi-class unsupervised anomaly detection (MUAD), reconstruction-based methods learn to map input images to normal patterns to identify anomalous pixels. However, this strategy easily falls into the well-known "learning shortcut" issue when decoders fail to capture normal patterns and reconstruct both normal and abnormal samples naively. To address that, we propose to learn the input features i…

    Submitted 28 March, 2025; v1 submitted 26 March, 2025; originally announced March 2025.

  47. arXiv:2503.20394  [pdf, other]

    cs.LG cs.AI

    FastFT: Accelerating Reinforced Feature Transformation via Advanced Exploration Strategies

    Authors: Tianqi He, Xiaohan Huang, Yi Du, Qingqing Long, Ziyue Qiao, Min Wu, Yanjie Fu, Yuanchun Zhou, Meng Xiao

    Abstract: Feature Transformation is crucial for classic machine learning that aims to generate feature combinations to enhance the performance of downstream tasks from a data-centric perspective. Current methodologies, such as manual expert-driven processes, iterative-feedback techniques, and exploration-generative tactics, have shown promise in automating such data engineering workflow by minimizing human…

    Submitted 26 March, 2025; originally announced March 2025.

    Comments: 14 pages, Accepted by ICDE 2025

  48. arXiv:2503.20376  [pdf, other]

    cs.IR

    Dewey Long Context Embedding Model: A Technical Report

    Authors: Dun Zhang, Panxiang Zou, Yudong Zhou

    Abstract: This technical report presents the training methodology and evaluation results of the open-source dewey_en_beta embedding model. The increasing demand for retrieval-augmented generation (RAG) systems and the expanding context window capabilities of large language models (LLMs) have created critical challenges for conventional embedding models. Current approaches often struggle to maintain semantic…

    Submitted 26 March, 2025; originally announced March 2025.

    Comments: 5 pages, 1 figure

  49. arXiv:2503.20258  [pdf, other]

    cs.CV cs.AI cs.LG

    Mamba-3D as Masked Autoencoders for Accurate and Data-Efficient Analysis of Medical Ultrasound Videos

    Authors: Jiaheng Zhou, Yanfeng Zhou, Wei Fang, Yuxing Tang, Le Lu, Ge Yang

    Abstract: Ultrasound videos are an important form of clinical imaging data, and deep learning-based automated analysis can improve diagnostic accuracy and clinical efficiency. However, the scarcity of labeled data and the inherent challenges of video analysis have impeded the advancement of related methods. In this work, we introduce E-ViM$^3$, a data-efficient Vision Mamba network that preserves the 3D str…

    Submitted 26 March, 2025; originally announced March 2025.

  50. arXiv:2503.20218  [pdf, other]

    cs.CV

    Video Motion Graphs

    Authors: Haiyang Liu, Zhan Xu, Fa-Ting Hong, Hsin-Ping Huang, Yi Zhou, Yang Zhou

    Abstract: We present Video Motion Graphs, a system designed to generate realistic human motion videos. Using a reference video and conditional signals such as music or motion tags, the system synthesizes new videos by first retrieving video clips with gestures matching the conditions and then generating interpolation frames to seamlessly connect clip boundaries. The core of our approach is HMInterp, a robus…

    Submitted 26 March, 2025; originally announced March 2025.

    Comments: 14 pages, 10 figures