Showing 1–50 of 489 results for author: Sha, S

  1. arXiv:2506.14229  [pdf, ps, other]

    cs.CV cs.AI

    HRGS: Hierarchical Gaussian Splatting for Memory-Efficient High-Resolution 3D Reconstruction

    Authors: Changbai Li, Haodong Zhu, Hanlin Chen, Juan Zhang, Tongfei Chen, Shuo Yang, Shuwei Shao, Wenhao Dong, Baochang Zhang

    Abstract: 3D Gaussian Splatting (3DGS) has made significant strides in real-time 3D scene reconstruction, but faces memory scalability issues in high-resolution scenarios. To address this, we propose Hierarchical Gaussian Splatting (HRGS), a memory-efficient framework with hierarchical block-level optimization. First, we generate a global, coarse Gaussian representation from low-resolution data. Then, we pa…

    Submitted 17 June, 2025; originally announced June 2025.

  2. arXiv:2506.13059  [pdf, ps, other]

    cs.CL cs.LG

    Multipole Attention for Efficient Long Context Reasoning

    Authors: Coleman Hooper, Sebastian Zhao, Luca Manolache, Sehoon Kim, Michael W. Mahoney, Yakun Sophia Shao, Kurt Keutzer, Amir Gholami

    Abstract: Large Reasoning Models (LRMs) have shown promising accuracy improvements on complex problem-solving tasks. While these models have attained high accuracy by leveraging additional computation at test time, they need to generate long chain-of-thought reasoning in order to think before answering, which requires generating thousands of tokens. While sparse attention methods can help reduce the KV cach…

    Submitted 15 June, 2025; originally announced June 2025.

    Comments: 15 pages

  3. arXiv:2506.11244  [pdf, ps, other]

    cs.CL

    Iterative Multilingual Spectral Attribute Erasure

    Authors: Shun Shao, Yftah Ziser, Zheng Zhao, Yifu Qiu, Shay B. Cohen, Anna Korhonen

    Abstract: Multilingual representations embed words with similar meanings to share a common semantic space across languages, creating opportunities to transfer debiasing effects between languages. However, existing methods for debiasing are unable to exploit this opportunity because they operate on individual languages. We present Iterative Multilingual Spectral Attribute Erasure (IMSAE), which identifies an…

    Submitted 12 June, 2025; originally announced June 2025.

    Comments: 8 pages, 3 figures

  4. arXiv:2506.10741  [pdf, ps, other]

    cs.CV

    PosterCraft: Rethinking High-Quality Aesthetic Poster Generation in a Unified Framework

    Authors: SiXiang Chen, Jianyu Lai, Jialin Gao, Tian Ye, Haoyu Chen, Hengyu Shi, Shitong Shao, Yunlong Lin, Song Fei, Zhaohu Xing, Yeying Jin, Junfeng Luo, Xiaoming Wei, Lei Zhu

    Abstract: Generating aesthetic posters is more challenging than simple design images: it requires not only precise text rendering but also the seamless integration of abstract artistic content, striking layouts, and overall stylistic harmony. To address this, we propose PosterCraft, a unified framework that abandons prior modular pipelines and rigid, predefined layouts, allowing the model to freely explore…

    Submitted 12 June, 2025; originally announced June 2025.

  5. arXiv:2506.04544  [pdf, other]

    cs.AR cs.AI cs.LG cs.PL

    hdl2v: A Code Translation Dataset for Enhanced LLM Verilog Generation

    Authors: Charles Hong, Brendan Roberts, Huijae An, Alex Um, Advay Ratan, Yakun Sophia Shao

    Abstract: Large language models (LLMs) are playing an increasingly large role in domains such as code generation, including hardware code generation, where Verilog is the key language. However, the amount of publicly available Verilog code pales in comparison to the amount of code available for software languages like Python. In this work, we present hdl2v ("HDL-to-Verilog"), a dataset which seeks to increa…

    Submitted 4 June, 2025; originally announced June 2025.

  6. arXiv:2506.04361  [pdf, ps, other]

    astro-ph.GA

    ViT-based Local Volume dwarf galaxy Identification (VIDA) in the CSST survey

    Authors: Han Qu, Zhen Yuan, Chengliang Wei, Chao Liu, Jiang Chang, Guoliang Li, Nicolas F. Martin, Chaowei Tsai, Shi Shao, Yu Luo, Ran Li, Xi Kang, Xiangxiang Xue, Zhou Fan

    Abstract: Identifying dwarf galaxies within the Local Volume is crucial for constraining the luminosity function of satellite galaxies in the nearby universe. We report the detection capabilities of dwarf galaxies within the Local Volume using the Chinese Space Station Telescope (CSST). Based on the simulated imaging data of CSST, we develop a detection and classification pipeline that combines traditional…

    Submitted 4 June, 2025; originally announced June 2025.

    Comments: 16 pages, 18 figures

  7. arXiv:2506.01942  [pdf, ps, other]

    cs.CV

    OD3: Optimization-free Dataset Distillation for Object Detection

    Authors: Salwa K. Al Khatib, Ahmed ElHagry, Shitong Shao, Zhiqiang Shen

    Abstract: Training large neural networks on large-scale datasets requires substantial computational resources, particularly for dense prediction tasks such as object detection. Although dataset distillation (DD) has been proposed to alleviate these demands by synthesizing compact datasets from larger ones, most existing work focuses solely on image classification, leaving the more complex detection setting…

    Submitted 2 June, 2025; originally announced June 2025.

    Comments: Equal Contribution of the first three authors

  8. arXiv:2506.00618  [pdf, ps, other]

    cs.AI

    RiOSWorld: Benchmarking the Risk of Multimodal Computer-Use Agents

    Authors: Jingyi Yang, Shuai Shao, Dongrui Liu, Jing Shao

    Abstract: With the rapid development of multimodal large language models (MLLMs), they are increasingly deployed as autonomous computer-use agents capable of accomplishing complex computer tasks. However, a pressing issue arises: Can the safety risk principles designed and aligned for general MLLMs in dialogue scenarios be effectively transferred to real-world computer-use scenarios? Existing research on ev…

    Submitted 4 June, 2025; v1 submitted 31 May, 2025; originally announced June 2025.

    Comments: 40 pages, 6 figures, Project Page: https://yjyddq.github.io/RiOSWorld.github.io/

  9. arXiv:2505.22863  [pdf, other]

    cs.HC cs.CL

    Large Language Models for Depression Recognition in Spoken Language Integrating Psychological Knowledge

    Authors: Yupei Li, Shuaijie Shao, Manuel Milling, Björn W. Schuller

    Abstract: Depression is a growing concern gaining attention in both public discourse and AI research. While deep neural networks (DNNs) have been used for recognition, they still lack real-world effectiveness. Large language models (LLMs) show strong potential but require domain-specific fine-tuning and struggle with non-textual cues. Since depression is often expressed through vocal tone and behaviour rath…

    Submitted 28 May, 2025; originally announced May 2025.

  10. arXiv:2505.18637  [pdf, ps, other]

    cs.IT

    Neural Coding Is Not Always Semantic: Towards The Standardized Coding Workflow in Semantic Communications

    Authors: Hai-Long Qin, Jincheng Dai, Sixian Wang, Xiaoqi Qin, Shuo Shao, Kai Niu, Wenjun Xu, Ping Zhang

    Abstract: Semantic communication, leveraging advanced deep learning techniques, emerges as a new paradigm that meets the requirements of next-generation wireless networks. However, current semantic communication systems, which employ neural coding for feature extraction from raw data, have not adequately addressed the fundamental question: Is general feature extraction through deep neural networks sufficien…

    Submitted 24 May, 2025; originally announced May 2025.

  11. arXiv:2505.18574  [pdf, ps, other]

    cs.PL cs.AI cs.AR cs.LG

    Autocomp: LLM-Driven Code Optimization for Tensor Accelerators

    Authors: Charles Hong, Sahil Bhatia, Alvin Cheung, Yakun Sophia Shao

    Abstract: Hardware accelerators, especially those designed for tensor processing, have become ubiquitous in today's computing landscape. However, even with significant efforts in building compilers, programming these tensor accelerators remains challenging, leaving much of their potential underutilized. Recently, large language models (LLMs), trained on large amounts of code, have shown significant promise…

    Submitted 5 June, 2025; v1 submitted 24 May, 2025; originally announced May 2025.

  12. arXiv:2505.16505  [pdf, ps, other]

    cs.CL cs.AI cs.HC

    Sparse Activation Editing for Reliable Instruction Following in Narratives

    Authors: Runcong Zhao, Chengyu Cao, Qinglin Zhu, Xiucheng Lv, Shun Shao, Lin Gui, Ruifeng Xu, Yulan He

    Abstract: Complex narrative contexts often challenge language models' ability to follow instructions, and existing benchmarks fail to capture these difficulties. To address this, we propose Concise-SAE, a training-free framework that improves instruction following by identifying and editing instruction-relevant neurons using only natural language instructions, without requiring labelled data. To thoroughly…

    Submitted 22 May, 2025; originally announced May 2025.

  13. arXiv:2505.14205  [pdf, ps, other]

    math.DS

    Structure theorems of commuting transformations and minimal $\mathbb{R}$-flows

    Authors: Song Shao, Hui Xu

    Abstract: In this paper, we develop several structure theorems concerning commuting transformations and minimal $\mathbb{R}$-flows. Specifically, we show that if $(X,S)$, $(X,T)$ are minimal systems with $S$ and $T$ being commutative, then they possess an identical higher-order regionally proximal relation. Consequently, both $(X, S)$ and $(X, T)$ share the same increasing sequence of pro-nilfactors. For mi…

    Submitted 20 May, 2025; originally announced May 2025.

    Comments: 41 pages

  14. arXiv:2505.14135  [pdf, other]

    cs.CV

    Hunyuan-Game: Industrial-grade Intelligent Game Creation Model

    Authors: Ruihuang Li, Caijin Zhou, Shoujian Zheng, Jianxiang Lu, Jiabin Huang, Comi Chen, Junshu Tang, Guangzheng Xu, Jiale Tao, Hongmei Wang, Donghao Li, Wenqing Yu, Senbo Wang, Zhimin Li, Yetshuan Shi, Haoyu Yang, Yukun Wang, Wenxun Dai, Jiaqi Li, Linqing Wang, Qixun Wang, Zhiyong Xu, Yingfang Zhang, Jiangfeng Xiong, Weijie Kong, et al. (33 additional authors not shown)

    Abstract: Intelligent game creation represents a transformative advancement in game development, utilizing generative artificial intelligence to dynamically generate and enhance game content. Despite notable progress in generative models, the comprehensive synthesis of high-quality game assets, including both images and videos, remains a challenging frontier. To create high-fidelity game content that simult…

    Submitted 28 May, 2025; v1 submitted 20 May, 2025; originally announced May 2025.

  15. arXiv:2505.11792  [pdf, ps, other]

    cs.AI

    Solver-Informed RL: Grounding Large Language Models for Authentic Optimization Modeling

    Authors: Yitian Chen, Jingfan Xia, Siyu Shao, Dongdong Ge, Yinyu Ye

    Abstract: Optimization modeling is fundamental to decision-making across diverse domains. Despite progress in automating optimization formulation from natural language descriptions, Large Language Models (LLMs) often struggle to generate formally correct and usable models against hallucinations, posing a challenge for reliable automation. Inspired by the success of Reinforcement Learning (RL) in enhancing L…

    Submitted 28 May, 2025; v1 submitted 16 May, 2025; originally announced May 2025.

  16. arXiv:2505.09243  [pdf, ps, other]

    astro-ph.GA

    The Shape and Mass of the Galactic Dark Matter Halo from the Axisymmetric Jeans Model

    Authors: Lan Zhang, Xiang-Xiang Xue, Ling Zhu, Ruizhi Zhang, Chengqun Yang, Shi Shao, Jiang Chang, Feilu Wang, Hao Tian, Gang Zhao, Chao Liu

    Abstract: We explore the density profile, shape, and virial mass of the Milky Way's dark matter halo using K giants (KG) from LAMOST and SDSS/SEGUE, as well as blue horizontal branch (BHB) stars from SDSS. Incorporating Gaia DR3 proper motions, we first investigate the velocity ellipsoid distribution within the $(R, |z|)$ space. The ellipsoids projected onto the $(v_R, v_z)$ plane exhibit near-spherical ali…

    Submitted 14 May, 2025; originally announced May 2025.

    Comments: 15 figures, 2 tables. Accepted for publication in ApJ

  17. arXiv:2505.04684  [pdf, ps, other]

    cond-mat.str-el cond-mat.mes-hall hep-th

    Parity anomaly from LSM: exact valley symmetries on the lattice

    Authors: Salvatore D. Pace, Minho Luke Kim, Arkya Chatterjee, Shu-Heng Shao

    Abstract: We show that the honeycomb tight-binding model hosts an exact microscopic avatar of its low-energy SU(2) valley symmetry and parity anomaly. Specifically, the SU(2) valley symmetry arises from a collection of conserved, integer-quantized charge operators that obey the Onsager algebra. Along with lattice reflection and time-reversal symmetries, this Onsager symmetry has a Lieb-Schultz-Mattis (LSM)…

    Submitted 7 May, 2025; originally announced May 2025.

    Comments: 7 pages plus appendices

    Report number: MIT-CTP/5869, YITP-SB-2025-10

  18. arXiv:2504.21738  [pdf, ps, other]

    cs.RO

    LangWBC: Language-directed Humanoid Whole-Body Control via End-to-end Learning

    Authors: Yiyang Shao, Xiaoyu Huang, Bike Zhang, Qiayuan Liao, Yuman Gao, Yufeng Chi, Zhongyu Li, Sophia Shao, Koushil Sreenath

    Abstract: General-purpose humanoid robots are expected to interact intuitively with humans, enabling seamless integration into daily life. Natural language provides the most accessible medium for this purpose. However, translating language into humanoid whole-body motion remains a significant challenge, primarily due to the gap between linguistic understanding and physical actions. In this work, we present…

    Submitted 30 April, 2025; originally announced April 2025.

  19. arXiv:2504.20706  [pdf, other]

    math.CO

    Every 2-connected, cubic, planar graph with faces of size at most 6 is Hamiltonian

    Authors: Sihong Shao, Yuxuan Wu

    Abstract: We prove that every 2-connected, cubic, planar graph with faces of size at most 6 is Hamiltonian, and show that the 6-face condition is tight. Our results push the connectivity condition of the Barnette-Goodey conjecture to the weakest possible.

    Submitted 29 April, 2025; originally announced April 2025.

  20. arXiv:2504.17504  [pdf, ps, other]

    math.DS

    On systems disjoint from all minimal systems

    Authors: Wen Huang, Song Shao, Hui Xu, Xiangdong Ye

    Abstract: Recently, Górska, Lemańczyk, and de la Rue characterized the class of automorphisms disjoint from all ergodic automorphisms. Inspired by their work, we provide several characterizations of systems that are disjoint from all minimal systems. For a topological dynamical system $(X,T)$, it is disjoint from all minimal systems if and only if there exist minimal subsets $(M_i)_{i\in\mathbb{N}}$ of…

    Submitted 24 April, 2025; originally announced April 2025.

    Comments: 32 pages

  21. arXiv:2504.17249  [pdf, other]

    cs.RO

    Demonstrating Berkeley Humanoid Lite: An Open-source, Accessible, and Customizable 3D-printed Humanoid Robot

    Authors: Yufeng Chi, Qiayuan Liao, Junfeng Long, Xiaoyu Huang, Sophia Shao, Borivoje Nikolic, Zhongyu Li, Koushil Sreenath

    Abstract: Despite significant interest and advancements in humanoid robotics, most existing commercially available hardware remains high-cost, closed-source, and non-transparent within the robotics community. This lack of accessibility and customization hinders the growth of the field and the broader development of humanoid technologies. To address these challenges and promote democratization in humanoid ro…

    Submitted 24 April, 2025; originally announced April 2025.

    Comments: Accepted in Robotics: Science and Systems (RSS) 2025

  22. arXiv:2504.16960  [pdf, other]

    cs.IT eess.IV

    Can Knowledge Improve Security? A Coding-Enhanced Jamming Approach for Semantic Communication

    Authors: Weixuan Chen, Qianqian Yang, Shuo Shao, Zhiguo Shi, Jiming Chen, Xuemin Shen

    Abstract: As semantic communication (SemCom) attracts growing attention as a novel communication paradigm, ensuring the security of transmitted semantic information over open wireless channels has become a critical issue. However, traditional encryption methods often introduce significant additional communication overhead to maintain stability, and conventional learning-based secure SemCom methods typically…

    Submitted 6 May, 2025; v1 submitted 23 April, 2025; originally announced April 2025.

  23. arXiv:2504.14152  [pdf, ps, other]

    cs.AR cs.LG

    FGMP: Fine-Grained Mixed-Precision Weight and Activation Quantization for Hardware-Accelerated LLM Inference

    Authors: Coleman Hooper, Charbel Sakr, Ben Keller, Rangharajan Venkatesan, Kurt Keutzer, Sophia Shao, Brucek Khailany

    Abstract: Quantization is a powerful tool to improve large language model (LLM) inference efficiency by utilizing more energy-efficient low-precision datapaths and reducing memory footprint. However, accurately quantizing LLM weights and activations to low precision is challenging without degrading model accuracy. We propose fine-grained mixed precision (FGMP) quantization, a post-training mixed-precision q…

    Submitted 18 April, 2025; originally announced April 2025.

  24. arXiv:2504.13151  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    MIB: A Mechanistic Interpretability Benchmark

    Authors: Aaron Mueller, Atticus Geiger, Sarah Wiegreffe, Dana Arad, Iván Arcuschin, Adam Belfki, Yik Siu Chan, Jaden Fiotto-Kaufman, Tal Haklay, Michael Hanna, Jing Huang, Rohan Gupta, Yaniv Nikankin, Hadas Orgad, Nikhil Prakash, Anja Reusch, Aruna Sankaranarayanan, Shun Shao, Alessandro Stolfo, Martin Tutek, Amir Zur, David Bau, Yonatan Belinkov

    Abstract: How can we know whether new mechanistic interpretability methods achieve real improvements? In pursuit of lasting evaluation standards, we propose MIB, a Mechanistic Interpretability Benchmark, with two tracks spanning four tasks and five models. MIB favors methods that precisely and concisely recover relevant causal pathways or causal variables in neural language models. The circuit localization…

    Submitted 9 June, 2025; v1 submitted 17 April, 2025; originally announced April 2025.

    Comments: Accepted to ICML 2025. Project website at https://mib-bench.github.io

  25. arXiv:2504.01570  [pdf, other]

    stat.ML cs.LG physics.comp-ph stat.ME

    Density estimation via mixture discrepancy and moments

    Authors: Zhengyang Lei, Sihong Shao

    Abstract: With the aim of generalizing histogram statistics to higher dimensional cases, density estimation via discrepancy based sequential partition (DSP) has been proposed [D. Li, K. Yang, W. Wong, Advances in Neural Information Processing Systems (2016) 1099-1107] to learn an adaptive piecewise constant approximation defined on a binary sequential partition of the underlying domain, where the star discr…

    Submitted 2 April, 2025; originally announced April 2025.

  26. arXiv:2504.00587  [pdf, ps, other]

    cs.MA cs.CL

    AgentNet: Decentralized Evolutionary Coordination for LLM-based Multi-Agent Systems

    Authors: Yingxuan Yang, Huacan Chai, Shuai Shao, Yuanyi Song, Siyuan Qi, Renting Rui, Weinan Zhang

    Abstract: The rapid advancement of large language models (LLMs) has enabled the development of multi-agent systems where multiple LLM-based agents collaborate on complex tasks. However, existing systems often rely on centralized coordination, leading to scalability bottlenecks, reduced adaptability, and single points of failure. Privacy and proprietary knowledge concerns further hinder cross-organizational…

    Submitted 29 May, 2025; v1 submitted 1 April, 2025; originally announced April 2025.

  27. arXiv:2503.20863  [pdf, other]

    hep-th cond-mat.str-el math.OA quant-ph

    Additivity, Haag duality, and non-invertible symmetries

    Authors: Shu-Heng Shao, Jonathan Sorce, Manu Srivastava

    Abstract: The algebraic approach to quantum field theory focuses on the properties of local algebras, whereas the study of (possibly non-invertible) global symmetries emphasizes global aspects of the theory and spacetime. We study connections between these two perspectives by examining how either of two core algebraic properties -- "additivity" or "Haag duality" -- is violated in a 1+1D CFT or lattice model…

    Submitted 26 March, 2025; originally announced March 2025.

    Comments: 22 pages

    Report number: MIT-CTP/5853, YITP-SB-2025-06

  28. arXiv:2503.20211  [pdf, other]

    cs.CV cs.RO

    Synthetic-to-Real Self-supervised Robust Depth Estimation via Learning with Motion and Structure Priors

    Authors: Weilong Yan, Ming Li, Haipeng Li, Shuwei Shao, Robby T. Tan

    Abstract: Self-supervised depth estimation from monocular cameras in diverse outdoor conditions, such as daytime, rain, and nighttime, is challenging due to the difficulty of learning universal representations and the severe lack of labeled real-world adverse data. Previous methods either rely on synthetic inputs and pseudo-depth labels or directly apply daytime strategies to adverse conditions, resulting i…

    Submitted 26 March, 2025; originally announced March 2025.

  29. arXiv:2503.13319  [pdf, other]

    cs.CV

    MagicDistillation: Weak-to-Strong Video Distillation for Large-Scale Few-Step Synthesis

    Authors: Shitong Shao, Hongwei Yi, Hanzhong Guo, Tian Ye, Daquan Zhou, Michael Lingelbach, Zhiqiang Xu, Zeke Xie

    Abstract: Recently, open-source video diffusion models (VDMs), such as WanX, Magic141 and HunyuanVideo, have been scaled to over 10 billion parameters. These large-scale VDMs have demonstrated significant improvements over smaller-scale VDMs across multiple dimensions, including enhanced visual quality and more natural motion dynamics. However, these models face two major limitations: (1) High inference ove…

    Submitted 31 March, 2025; v1 submitted 17 March, 2025; originally announced March 2025.

  30. arXiv:2503.12387  [pdf, other]

    cs.RO

    M2UD: A Multi-model, Multi-scenario, Uneven-terrain Dataset for Ground Robot with Localization and Mapping Evaluation

    Authors: Yanpeng Jia, Shiyi Wang, Shiliang Shao, Yue Wang, Fu Zhang, Ting Wang

    Abstract: Ground robots play a crucial role in inspection, exploration, rescue, and other applications. In recent years, advancements in LiDAR technology have made sensors more accurate, lightweight, and cost-effective. Therefore, researchers increasingly integrate sensors, for SLAM studies, providing robust technical support for ground robots and expanding their application domains. Public datasets are ess…

    Submitted 16 March, 2025; originally announced March 2025.

    Comments: 18 pages, 12 figures

  31. arXiv:2503.11254  [pdf, ps, other]

    math.OC

    A scalable sequential adaptive cubic regularization algorithm for optimization with general equality constraints

    Authors: Yonggang Pei, Shuai Shao, Mauricio Silva Louzeiro, Detong Zhu

    Abstract: The scalable adaptive cubic regularization method ($\mathrm{ARC_{q}K}$: Dussault et al. in Math. Program. Ser. A 207(1-2):191-225, 2024) has been recently proposed for unconstrained optimization. It has excellent convergence properties, complexity, and promising numerical performance. In this paper, we extend $\mathrm{ARC_{q}K}$ to large scale nonlinear optimization with general equality constrain…

    Submitted 27 March, 2025; v1 submitted 14 March, 2025; originally announced March 2025.

  32. arXiv:2503.09662  [pdf, other]

    cs.CV

    CoRe^2: Collect, Reflect and Refine to Generate Better and Faster

    Authors: Shitong Shao, Zikai Zhou, Dian Xie, Yuetong Fang, Tian Ye, Lichen Bai, Zeke Xie

    Abstract: Making text-to-image (T2I) generative model sample both fast and well represents a promising research direction. Previous studies have typically focused on either enhancing the visual quality of synthesized images at the expense of sampling efficiency or dramatically accelerating sampling without improving the base model's generative capacity. Moreover, nearly all inference methods have not been a…

    Submitted 12 March, 2025; originally announced March 2025.

  33. arXiv:2503.05978  [pdf, other]

    cs.CV

    MagicInfinite: Generating Infinite Talking Videos with Your Words and Voice

    Authors: Hongwei Yi, Tian Ye, Shitong Shao, Xuancheng Yang, Jiantong Zhao, Hanzhong Guo, Terrance Wang, Qingyu Yin, Zeke Xie, Lei Zhu, Wei Li, Michael Lingelbach, Daquan Zhou

    Abstract: We present MagicInfinite, a novel diffusion Transformer (DiT) framework that overcomes traditional portrait animation limitations, delivering high-fidelity results across diverse character types-realistic humans, full-body figures, and stylized anime characters. It supports varied facial poses, including back-facing views, and animates single or multiple characters with input masks for precise spe…

    Submitted 7 March, 2025; originally announced March 2025.

    Comments: MagicInfinite is publicly accessible at https://www.hedra.com/. More examples are at https://magicinfinite.github.io/

  34. arXiv:2503.05794  [pdf, other]

    cs.CR cs.AI cs.LG cs.SD eess.AS

    CBW: Towards Dataset Ownership Verification for Speaker Verification via Clustering-based Backdoor Watermarking

    Authors: Yiming Li, Kaiying Yan, Shuo Shao, Tongqing Zhai, Shu-Tao Xia, Zhan Qin, Dacheng Tao

    Abstract: With the increasing adoption of deep learning in speaker verification, large-scale speech datasets have become valuable intellectual property. To audit and prevent the unauthorized usage of these valuable released datasets, especially in commercial or open-source scenarios, we propose a novel dataset ownership verification method. Our approach introduces a clustering-based backdoor watermark (CBW)…

    Submitted 5 April, 2025; v1 submitted 1 March, 2025; originally announced March 2025.

    Comments: 14 pages. The journal extension of our ICASSP'21 paper (arXiv:2010.11607)

  35. arXiv:2503.02925  [pdf, other]

    cond-mat.str-el hep-th quant-ph

    Gauging non-invertible symmetries on the lattice

    Authors: Sahand Seifnashri, Shu-Heng Shao, Xinping Yang

    Abstract: We provide a general prescription for gauging finite non-invertible symmetries in 1+1d lattice Hamiltonian systems. Our primary example is the Rep(D$_8$) fusion category generated by the Kennedy-Tasaki transformation, which is the simplest anomaly-free non-invertible symmetry on a spin chain of qubits. We explicitly compute its lattice F-symbols and illustrate our prescription for a particular (no…

    Submitted 4 March, 2025; originally announced March 2025.

    Comments: 66 pages, 1 figure, 1 table

    Report number: MIT-CTP/5842, YITP-SB-2025-03

  36. PCL: Prompt-based Continual Learning for User Modeling in Recommender Systems

    Authors: Mingdai Yang, Fan Yang, Yanhui Guo, Shaoyuan Xu, Tianchen Zhou, Yetian Chen, Simone Shao, Jia Liu, Yan Gao

    Abstract: User modeling in large e-commerce platforms aims to optimize user experiences by incorporating various customer activities. Traditional models targeting a single task often focus on specific business metrics, neglecting the comprehensive user behavior, and thus limiting their effectiveness. To develop more generalized user representations, some existing work adopts Multi-task Learning (MTL) approac…

    Submitted 26 February, 2025; originally announced February 2025.

    Comments: 5 pages. Accepted by WWW'25 as a short paper

  37. arXiv:2502.18508  [pdf, other]

    cs.CR cs.AI cs.CV cs.LG

    REFINE: Inversion-Free Backdoor Defense via Model Reprogramming

    Authors: Yukun Chen, Shuo Shao, Enhao Huang, Yiming Li, Pin-Yu Chen, Zhan Qin, Kui Ren

    Abstract: Backdoor attacks on deep neural networks (DNNs) have emerged as a significant security threat, allowing adversaries to implant hidden malicious behaviors during the model training phase. Pre-processing-based defense, which is one of the most important defense paradigms, typically focuses on input transformations or backdoor trigger inversion (BTI) to deactivate or eliminate embedded backdoor trigg…

    Submitted 22 February, 2025; originally announced February 2025.

    Comments: This paper is accepted by ICLR 2025. The first two authors contributed equally to this work. Our code is available at BackdoorBox (https://github.com/THUYimingLi/BackdoorBox) and GitHub repository (https://github.com/WhitolfChen/REFINE). 28 pages

  38. arXiv:2502.17088  [pdf, other]

    astro-ph.GA

    Where are the earliest stars relics in the simulated Milky Way analogues?

    Authors: Hang Yang, Liang Gao, Qi Guo, Haining Li, Shi Shao, Gang Zhao

    Abstract: Using 6 Milky Way analogues with two different numerical resolutions from the Auriga simulation, we investigate the total mass, spatial distribution and kinematics of the earliest stars relics in the Milky Way at $z=0$. These relics (second generation stars) formed over a wide redshift range, from about $z=22$ to $z=4$, with an average formation redshift of $z \sim 10.0$, and comprise about…

    Submitted 24 February, 2025; originally announced February 2025.

    Comments: 9 pages, 6 figures. Submitted to ApJ

  39. arXiv:2502.13575  [pdf, ps, other]

    cs.LG

    ETS: Efficient Tree Search for Inference-Time Scaling

    Authors: Coleman Hooper, Sehoon Kim, Suhong Moon, Kerem Dilmen, Monishwaran Maheswaran, Nicholas Lee, Michael W. Mahoney, Sophia Shao, Kurt Keutzer, Amir Gholami

    Abstract: Test-time compute scaling has emerged as a new axis along which to improve model accuracy, where additional computation is used at inference time to allow the model to think longer for more challenging problems. One promising approach for test-time compute scaling is search against a process reward model, where a model generates multiple potential candidates at each step of the search, and these p…

    Submitted 11 June, 2025; v1 submitted 19 February, 2025; originally announced February 2025.

    Comments: 15 pages

  40. arXiv:2502.08537  [pdf]

    cond-mat.str-el cond-mat.mes-hall cond-mat.mtrl-sci physics.optics

    Broken symmetries associated with a Kagome chiral charge order

    Authors: Zi-Jia Cheng, Md Shafayat Hossain, Qi Zhang, Sen Shao, Jinjin Liu, Yilin Zhao, Mohammad Yahyavi, Yu-Xiao Jiang, Jia-Xin Yin, Xian Yang, Yongkai Li, Tyler A. Cochran, Maksim Litskevich, Byunghoon Kim, Junyi Zhang, Yugui Yao, Luis Balicas, Zhiwei Wang, Guoqing Chang, M. Zahid Hasan

    Abstract: Chirality or handedness manifests in all fields of science, ranging from cell biology, molecular interaction, and catalysis to different branches of physics. In condensed matter physics, chirality is intrinsic to enigmatic quantum phases, such as chiral charge density waves and chiral superconductivity. Here, the underlying chiral response is subtle and leads to broken symmetries in the ground sta…

    Submitted 12 February, 2025; originally announced February 2025.

    Comments: in press

    Journal ref: Nature Communications volume 16, Article number: 3782 (2025)

  41. arXiv:2502.08048  [pdf, other]

    physics.optics hep-ex

    Efficiently Laser Driven Terahertz Surface Plasmon Polaritons on Long Metal Wire

    Authors: Shuoting Shao, Xiangbing Wang, Rong Huang, Guangyue Hu, Min Chen, Huibo Tang, Longyu Kuang, Yuxi Liu, Yuqiu Gu, Yongkun Ding, Ruxin Li, Hongbin Zhuo, Mingyang Yu

    Abstract: We experimentally demonstrate a novel scheme for efficiently generating intense terahertz (THz) surface plasmon polaritons (SPPs) on a sub-wavelength-diameter meter-long metal wire. Driven by a subrelativistic femtosecond laser (a0=0.3, 3 mJ) focused at the wire's midpoint, single-cycle ten-megawatt THz SPPs are excited and propagating bidirectionally along it over 25 cm. The measured laser-to-SPP…

    Submitted 21 February, 2025; v1 submitted 11 February, 2025; originally announced February 2025.

  42. arXiv:2502.07701   

    cs.CV

    Magic 1-For-1: Generating One Minute Video Clips within One Minute

    Authors: Hongwei Yi, Shitong Shao, Tian Ye, Jiantong Zhao, Qingyu Yin, Michael Lingelbach, Li Yuan, Yonghong Tian, Enze Xie, Daquan Zhou

    Abstract: In this technical report, we present Magic 1-For-1 (Magic141), an efficient video generation model with optimized memory consumption and inference latency. The key idea is simple: factorize the text-to-video generation task into two separate easier tasks for diffusion step distillation, namely text-to-image generation and image-to-video generation. We verify that with the same optimization algorit…

    Submitted 16 February, 2025; v1 submitted 11 February, 2025; originally announced February 2025.

    Comments: Serious updates are needed

  43. arXiv:2502.07644  [pdf, other]

    cs.AI

    SymGPT: Auditing Smart Contracts via Combining Symbolic Execution with Large Language Models

    Authors: Shihao Xia, Mengting He, Shuai Shao, Tingting Yu, Yiying Zhang, Linhai Song

    Abstract: To govern smart contracts running on Ethereum, multiple Ethereum Request for Comment (ERC) standards have been developed, each having a set of rules to guide the behaviors of smart contracts. Violating the ERC rules could cause serious security issues and financial loss, signifying the importance of verifying smart contracts follow ERCs. Today's practices of such verification are to manually audit…

    Submitted 12 February, 2025; v1 submitted 11 February, 2025; originally announced February 2025.

    Comments: 16 pages. arXiv admin note: text overlap with arXiv:2404.04306

  44. arXiv:2501.15509  [pdf, other]

    cs.CR cs.AI cs.LG

    FIT-Print: Towards False-claim-resistant Model Ownership Verification via Targeted Fingerprint

    Authors: Shuo Shao, Haozhe Zhu, Hongwei Yao, Yiming Li, Tianwei Zhang, Zhan Qin, Kui Ren

    Abstract: Model fingerprinting is a widely adopted approach to safeguard the copyright of open-source models by detecting and preventing their unauthorized reuse without modifying the protected model. However, in this paper, we reveal that existing fingerprinting methods are vulnerable to false claim attacks where adversaries falsely assert ownership of third-party non-reused models. We find that this vulne…

    Submitted 23 May, 2025; v1 submitted 26 January, 2025; originally announced January 2025.

  45. arXiv:2501.13260  [pdf]

    cond-mat.str-el cond-mat.mes-hall cond-mat.mtrl-sci

    Field induced density wave in a kagome superconductor

    Authors: Md Shafayat Hossain, Qi Zhang, Julian Ingham, Jinjin Liu, Sen Shao, Yangmu Li, Yuxin Wang, Bal K. Pokharel, Zi-Jia Cheng, Yu-Xiao Jiang, Maksim Litskevich, Byunghoon Kim, Xian Yang, Yongkai Li, Tyler A. Cochran, Yugui Yao, Dragana Popović, Zhiwei Wang, Guoqing Chang, Ronny Thomale, Luis Balicas, M. Zahid Hasan

    Abstract: On the kagome lattice, electrons benefit from the simultaneous presence of band topology, flat electronic bands, and van Hove singularities, forming competing or cooperating orders. Understanding the interrelation between these distinct order parameters remains a significant challenge, leaving much of the associated physics unexplored. In the kagome superconductor KV3Sb5, which exhibits a charge d…

    Submitted 22 January, 2025; originally announced January 2025.

  46. arXiv:2501.09520  [pdf, other]

    cs.IT

    RWZC: A Model-Driven Approach for Learning-based Robust Wyner-Ziv Coding

    Authors: Yuxuan Shi, Shuo Shao, Yongpeng Wu, Wenjun Zhang, Merouane Debbah

    Abstract: In this paper, a novel learning-based Wyner-Ziv coding framework is considered under a distributed image transmission scenario, where the correlated source is only available at the receiver. Unlike other learnable frameworks, our approach demonstrates robustness to non-stationary source correlation, where the overlapping information between image pairs varies. Specifically, we first model the affi…

    Submitted 5 February, 2025; v1 submitted 16 January, 2025; originally announced January 2025.

    Comments: 14 pages, 17 figures, accepted by IEEE Journal on Selected Areas in Communications

  47. arXiv:2501.09282  [pdf, other]

    quant-ph

    Diatomic and Polyatomic Heteronuclear Ultralong-Range Rydberg Molecules

    Authors: Qing Li, Shi-Yao Shao, Li-Hua Zhang, Bang Liu, Zheng-Yuan Zhang, Jun Zhang, Qi-Feng Wang, Han-Chao Chen, Yu Ma, Tian-Yu Han, Dong-Sheng Ding, Bao-Sen Shi

    Abstract: Ultra-long-range Rydberg molecules (ULRMs) have attracted significant interest due to their unique electronic properties and potential applications in quantum technologies. We theoretically investigate the formation and characteristics of heteronuclear ULRMs, focusing on Rb-Cs systems. We explore the vibrational energy levels of heteronuclear nD ULRMs and compare them with homonuclear counterparts…

    Submitted 15 January, 2025; originally announced January 2025.

  48. arXiv:2501.00051  [pdf, other]

    cs.LG cs.AI eess.SY

    DDD-GenDT: Dynamic Data-driven Generative Digital Twin Framework

    Authors: Yu-Zheng Lin, Qinxuan Shi, Zhanglong Yang, Banafsheh Saber Latibari, Sicong Shao, Soheil Salehi, Pratik Satam

    Abstract: Digital twin (DT) technology has emerged as a transformative approach to simulate, predict, and optimize the behavior of physical systems, with applications that span manufacturing, healthcare, climate science, and more. However, the development of DT models often faces challenges such as high data requirements, integration complexity, and limited adaptability to dynamic changes in physical system…

    Submitted 27 December, 2024; originally announced January 2025.

  49. arXiv:2412.18606  [pdf, other]

    cond-mat.str-el hep-th quant-ph

    Lattice T-duality from non-invertible symmetries in quantum spin chains

    Authors: Salvatore D. Pace, Arkya Chatterjee, Shu-Heng Shao

    Abstract: Dualities of quantum field theories are challenging to realize in lattice models of qubits. In this work, we explore one of the simplest dualities, T-duality of the compact boson CFT, and its realization in quantum spin chains. In the special case of the XX model, we uncover an exact lattice T-duality, which is associated with a non-invertible symmetry that exchanges two lattice U(1) symmetries. T…

    Submitted 8 April, 2025; v1 submitted 24 December, 2024; originally announced December 2024.

    Comments: 45 pages plus appendices. v2: published version

    Report number: MIT-CTP/5815, YITP-SB-2024-34

    Journal ref: SciPost Phys. 18, 121 (2025)

  50. arXiv:2412.18263  [pdf, other]

    cs.LG math-ph physics.chem-ph physics.comp-ph quant-ph

    High-Rank Irreducible Cartesian Tensor Decomposition and Bases of Equivariant Spaces

    Authors: Shihao Shao, Yikang Li, Zhouchen Lin, Qinghua Cui

    Abstract: Irreducible Cartesian tensors (ICTs) play a crucial role in the design of equivariant graph neural networks, as well as in theoretical chemistry and chemical physics. Meanwhile, the design space of available linear operations on tensors that preserve symmetry presents a significant challenge. The ICT decomposition and a basis of this equivariant space are difficult to obtain for high-rank tensors.…

    Submitted 19 March, 2025; v1 submitted 24 December, 2024; originally announced December 2024.

    Comments: 48 pages