Showing 1–50 of 168 results for author: Li, A

Searching in archive eess.

  1. arXiv:2507.01766  [pdf, ps, other]

    cs.IT eess.SP

    Reconfigurable Intelligent Surface aided Integrated-Navigation-and-Communication in Urban Canyons: A Satellite Selection Approach

    Authors: Tianwei Hou, Da Guan, Xin Sun, Anna Li, Wenqiang Yi, Yuanwei Liu, Arumugam Nallanathan

    Abstract: This study investigates the application of a simultaneous transmitting and reflecting reconfigurable intelligent surface (STAR-RIS)-aided medium-Earth-orbit (MEO) satellite network for providing both global positioning services and communication services in the urban canyons, where the direct satellite-user links are obstructed. Superposition coding (SC) and successive interference cancellation (S…

    Submitted 2 July, 2025; originally announced July 2025.

  2. arXiv:2506.19266  [pdf]

    q-bio.NC cs.CV eess.IV

    Convergent and divergent connectivity patterns of the arcuate fasciculus in macaques and humans

    Authors: Jiahao Huang, Ruifeng Li, Wenwen Yu, Anan Li, Xiangning Li, Mingchao Yan, Lei Xie, Qingrun Zeng, Xueyan Jia, Shuxin Wang, Ronghui Ju, Feng Chen, Qingming Luo, Hui Gong, Andrew Zalesky, Xiaoquan Yang, Yuanjing Feng, Zheng Wang

    Abstract: The organization and connectivity of the arcuate fasciculus (AF) in nonhuman primates remain contentious, especially concerning how its anatomy diverges from that of humans. Here, we combined cross-scale single-neuron tracing - using viral-based genetic labeling and fluorescence micro-optical sectioning tomography in macaques (n = 4; age 3 - 11 years) - with whole-brain tractography from 11.7T dif…

    Submitted 2 July, 2025; v1 submitted 23 June, 2025; originally announced June 2025.

    Comments: 34 pages, 6 figures

  3. arXiv:2506.17184  [pdf, ps, other]

    cs.RO eess.SY

    Judo: A User-Friendly Open-Source Package for Sampling-Based Model Predictive Control

    Authors: Albert H. Li, Brandon Hung, Aaron D. Ames, Jiuguang Wang, Simon Le Cleac'h, Preston Culbertson

    Abstract: Recent advancements in parallel simulation and successful robotic applications are spurring a resurgence in sampling-based model predictive control. To build on this progress, however, the robotics community needs common tooling for prototyping, evaluating, and deploying sampling-based controllers. We introduce Judo, a software package designed to address this need. To facilitate rapid prototyping…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: Accepted at the 2025 RSS Workshop on Fast Motion Planning and Control in the Era of Parallelism. 5 Pages

  4. arXiv:2506.16733  [pdf]

    eess.IV cs.CV

    A Prior-Guided Joint Diffusion Model in Projection Domain for PET Tracer Conversion

    Authors: Fang Chen, Weifeng Zhang, Xingyu Ai, BingXuan Li, An Li, Qiegen Liu

    Abstract: Positron emission tomography (PET) is widely used to assess metabolic activity, but its application is limited by the availability of radiotracers. 18F-labeled fluorodeoxyglucose (18F-FDG) is the most commonly used tracer but shows limited effectiveness for certain tumors. In contrast, 6-18F-fluoro-3,4-dihydroxy-L-phenylalanine (18F-DOPA) offers higher specificity for neuroendocrine tumors and neu…

    Submitted 22 June, 2025; v1 submitted 20 June, 2025; originally announced June 2025.

  5. arXiv:2506.16537  [pdf]

    cs.RO eess.SY

    Agile, Autonomous Spacecraft Constellations with Disruption Tolerant Networking to Monitor Precipitation and Urban Floods

    Authors: Sreeja Roy-Singh, Alan P. Li, Vinay Ravindra, Roderick Lammers, Marc Sanchez Net

    Abstract: Fully re-orientable small spacecraft are now supported by commercial technologies, allowing them to point their instruments in any direction and capture images, with short notice. When combined with improved onboard processing, and implemented on a constellation of inter-communicable satellites, this intelligent agility can significantly increase responsiveness to transient or evolving phenomena.…

    Submitted 23 June, 2025; v1 submitted 19 June, 2025; originally announced June 2025.

    Journal ref: Robotics Science and Systems (RSS 2025) - Space Robotics Workshop

  6. arXiv:2506.05171  [pdf, other]

    eess.SY cs.AI

    Towards provable probabilistic safety for scalable embodied AI systems

    Authors: Linxuan He, Qing-Shan Jia, Ang Li, Hongyan Sang, Ling Wang, Jiwen Lu, Tao Zhang, Jie Zhou, Yi Zhang, Yisen Wang, Peng Wei, Zhongyuan Wang, Henry X. Liu, Shuo Feng

    Abstract: Embodied AI systems, comprising AI models and physical plants, are increasingly prevalent across various applications. Due to the rarity of system failures, ensuring their safety in complex operating environments remains a major challenge, which severely hinders their large-scale deployment in safety-critical domains, such as autonomous vehicles, medical devices, and robotics. While achieving prov…

    Submitted 5 June, 2025; originally announced June 2025.

  7. arXiv:2505.22923  [pdf, ps, other]

    eess.IV cs.CV

    Plug-and-Play Posterior Sampling for Blind Inverse Problems

    Authors: Anqi Li, Weijie Gan, Ulugbek S. Kamilov

    Abstract: We introduce Blind Plug-and-Play Diffusion Models (Blind-PnPDM) as a novel framework for solving blind inverse problems where both the target image and the measurement operator are unknown. Unlike conventional methods that rely on explicit priors or separate parameter estimation, our approach performs posterior sampling by recasting the problem into an alternating Gaussian denoising scheme. We lev…

    Submitted 28 May, 2025; originally announced May 2025.

    Comments: arXiv admin note: text overlap with arXiv:2305.12672

  8. arXiv:2505.15200  [pdf, ps, other]

    cs.IT eess.SP

    Performance Analysis of Fluid Antenna System under Spatially-Correlated Rician Fading Channels

    Authors: Jiangsheng Huangfu, Zhengyu Song, Tianwei Hou, Anna Li, Yuanwei Liu, Arumugam Nallanathan, Kai-Kit Wong

    Abstract: Fluid antenna systems (FAS) are among the most promising technologies for the sixth generation (6G) mobile communication networks. Unlike traditional fixed-position multiple-input multiple-output (MIMO) systems, a FAS possesses position reconfigurability to switch on-demand among $N$ predefined ports over a prescribed space. This paper explores the performance of a single-input single-output (SISO…

    Submitted 21 May, 2025; originally announced May 2025.

  9. arXiv:2505.10786  [pdf, ps, other]

    eess.SP cs.HC

    Bridging BCI and Communications: A MIMO Framework for EEG-to-ECoG Wireless Channel Modeling

    Authors: Jiaheng Wang, Zhenyu Wang, Tianheng Xu, Yuan Si, Ang Li, Ting Zhou, Xi Zhao, Honglin Hu

    Abstract: As a method to connect the human brain and external devices, brain-computer interfaces (BCIs) are receiving extensive research attention. Recently, the integration of communication theory with BCI has emerged as a popular trend, offering potential to enhance system performance and shape next-generation communications. A key challenge in this field is modeling the brain wireless communication channel…

    Submitted 15 May, 2025; originally announced May 2025.

  10. arXiv:2503.22837  [pdf, other]

    eess.SY

    A Cooperative Compliance Control Framework for Socially Optimal Mixed Traffic Routing

    Authors: Anni Li, Ting Bai, Yingqing Chen, Christos G. Cassandras, Andreas A. Malikopoulos

    Abstract: In mixed traffic environments, where Connected and Automated Vehicles (CAVs) coexist with potentially non-cooperative Human-Driven Vehicles (HDVs), the self-centered behavior of human drivers may compromise the efficiency, optimality, and safety of the overall traffic network. In this paper, we propose a Cooperative Compliance Control (CCC) framework for mixed traffic routing, where a Social Plann…

    Submitted 28 March, 2025; originally announced March 2025.

  11. arXiv:2503.12936  [pdf, other]

    eess.AS

    FNSE-SBGAN: Far-field Speech Enhancement with Schrodinger Bridge and Generative Adversarial Networks

    Authors: Tong Lei, Qinwen Hu, Ziyao Lin, Andong Li, Rilin Chen, Meng Yu, Dong Yu, Jing Lu

    Abstract: The prevailing method for neural speech enhancement predominantly utilizes fully-supervised deep learning with simulated pairs of far-field noisy-reverberant speech and clean speech. Nonetheless, these models frequently demonstrate restricted generalizability to mixtures recorded in real-world conditions. To address this issue, this study investigates training enhancement models directly on real m…

    Submitted 15 April, 2025; v1 submitted 17 March, 2025; originally announced March 2025.

    Comments: 13 pages, 6 figures

  12. arXiv:2503.12601  [pdf, other]

    eess.SY

    Routing Guidance for Emerging Transportation Systems with Improved Dynamic Trip Equity

    Authors: Ting Bai, Anni Li, Gehui Xu, Christos G. Cassandras, Andreas A. Malikopoulos

    Abstract: In this paper, we present a dynamic routing guidance system that optimizes route recommendations for individual vehicles within an emerging transportation system while enhancing travelers' trip equity. We develop a framework to quantify trip quality and equity in a dynamic travel environment, providing new insights into how routing guidance influences equity in road transportation. Our approach en…

    Submitted 1 April, 2025; v1 submitted 16 March, 2025; originally announced March 2025.

  13. arXiv:2503.12233  [pdf, ps, other]

    cs.IT eess.SP

    Robust Full-Space Physical Layer Security for STAR-RIS-Aided Wireless Networks: Eavesdropper with Uncertain Location and Channel

    Authors: Han Xiao, Xiaoyan Hu, Ang Li, Wenjie Wang, Kun Yang

    Abstract: A robust full-space physical layer security (PLS) transmission scheme is proposed in this paper considering the full-space wiretapping challenge of wireless networks supported by simultaneous transmitting and reflecting reconfigurable intelligent surface (STAR-RIS). Different from the existing schemes, the proposed PLS scheme takes account of the uncertainty on the eavesdropper's position within t…

    Submitted 15 March, 2025; originally announced March 2025.

  14. arXiv:2503.00510  [pdf, other]

    eess.IV cs.CV

    NeuroSymAD: A Neuro-Symbolic Framework for Interpretable Alzheimer's Disease Diagnosis

    Authors: Yexiao He, Ziyao Wang, Yuning Zhang, Tingting Dan, Tianlong Chen, Guorong Wu, Ang Li

    Abstract: Alzheimer's disease (AD) diagnosis is complex, requiring the integration of imaging and clinical data for accurate assessment. While deep learning has shown promise in brain MRI analysis, it often functions as a black box, limiting interpretability and lacking mechanisms to effectively integrate critical clinical data such as biomarkers, medical history, and demographic information. To bridge this…

    Submitted 1 March, 2025; originally announced March 2025.

  15. arXiv:2502.13395  [pdf]

    cs.SD cs.LG eess.AS eess.SP physics.optics

    Unsupervised CP-UNet Framework for Denoising DAS Data with Decay Noise

    Authors: Tianye Huang, Aopeng Li, Xiang Li, Jing Zhang, Sijing Xian, Qi Zhang, Mingkong Lu, Guodong Chen, Liangming Xiong, Xiangyun Hu

    Abstract: Distributed acoustic sensor (DAS) technology leverages optical fiber cables to detect acoustic signals, providing cost-effective and dense monitoring capabilities. It offers several advantages including resistance to extreme conditions, immunity to electromagnetic interference, and accurate detection. However, DAS typically exhibits a lower signal-to-noise ratio (S/N) compared to geophones and is…

    Submitted 18 February, 2025; originally announced February 2025.

    Comments: 13 pages, 8 figures

  16. arXiv:2502.12412  [pdf, other]

    cs.LG eess.IV

    Incomplete Graph Learning: A Comprehensive Survey

    Authors: Riting Xia, Huibo Liu, Anchen Li, Xueyan Liu, Yan Zhang, Chunxu Zhang, Bo Yang

    Abstract: Graph learning is a prevalent field that operates on ubiquitous graph data. Effective graph learning methods can extract valuable information from graphs. However, these methods are non-robust and affected by missing attributes in graphs, resulting in sub-optimal outcomes. This has led to the emergence of incomplete graph learning, which aims to process and learn from incomplete graphs to achieve…

    Submitted 17 February, 2025; originally announced February 2025.

  17. arXiv:2502.12002  [pdf, other]

    cs.SD cs.CV eess.AS

    NaturalL2S: End-to-End High-quality Multispeaker Lip-to-Speech Synthesis with Differential Digital Signal Processing

    Authors: Yifan Liang, Fangkun Liu, Andong Li, Xiaodong Li, Chengshi Zheng

    Abstract: Recent advancements in visual speech recognition (VSR) have promoted progress in lip-to-speech synthesis, where pre-trained VSR models enhance the intelligibility of synthesized speech by providing valuable semantic information. The success achieved by cascade frameworks, which combine pseudo-VSR with pseudo-text-to-speech (TTS) or implicitly utilize the transcribed text, highlights the benefits o…

    Submitted 17 February, 2025; originally announced February 2025.

  18. arXiv:2502.00248  [pdf, other]

    math.OC cs.LG eess.SY

    Provably-Stable Neural Network-Based Control of Nonlinear Systems

    Authors: Anran Li, John P. Swensen, Mehdi Hosseinzadeh

    Abstract: In recent years, Neural Networks (NNs) have been employed to control nonlinear systems due to their potential capability in dealing with situations that might be difficult for conventional nonlinear control schemes. However, to the best of our knowledge, the current literature on NN-based control lacks theoretical guarantees for stability and tracking performance. This precludes the application of…

    Submitted 31 January, 2025; originally announced February 2025.

    Journal ref: Engineering Applications of Artificial Intelligence, volume 138, pages 109252, year 2024

  19. arXiv:2501.16215  [pdf, other]

    cs.AI cs.LG eess.SP

    Enhancing Visual Inspection Capability of Multi-Modal Large Language Models on Medical Time Series with Supportive Conformalized and Interpretable Small Specialized Models

    Authors: Huayu Li, Xiwen Chen, Ci Zhang, Stuart F. Quan, William D. S. Killgore, Shu-Fen Wung, Chen X. Chen, Geng Yuan, Jin Lu, Ao Li

    Abstract: Large language models (LLMs) exhibit remarkable capabilities in visual inspection of medical time-series data, achieving proficiency comparable to human clinicians. However, their broad scope limits domain-specific precision, and proprietary weights hinder fine-tuning for specialized datasets. In contrast, small specialized models (SSMs) excel in targeted tasks but lack the contextual reasoning re…

    Submitted 27 January, 2025; originally announced January 2025.

  20. arXiv:2501.13465  [pdf, other]

    cs.SD eess.AS

    Neural Vocoders as Speech Enhancers

    Authors: Andong Li, Zhihang Sun, Fengyuan Hao, Xiaodong Li, Chengshi Zheng

    Abstract: Speech enhancement (SE) and neural vocoding are traditionally viewed as separate tasks. In this work, we observe them under a common thread: the rank behavior of these processes. This observation prompts two key questions: "Can a model designed for one task's rank degradation be adapted for the other?" and "Is it possible to address both tasks using a unified model?" Our empirical fi…

    Submitted 23 January, 2025; originally announced January 2025.

    Comments: 6 pages, 3 figures

  21. arXiv:2501.09799  [pdf, ps, other]

    eess.IV

    Scan-Adaptive MRI Undersampling Using Neighbor-based Optimization (SUNO)

    Authors: Siddhant Gautam, Angqi Li, Nicole Seiberlich, Jeffrey A. Fessler, Saiprasad Ravishankar

    Abstract: Accelerated MRI involves collecting partial k-space measurements to reduce acquisition time, patient discomfort, and motion artifacts, and typically uses regular undersampling patterns or hand-designed schemes. Recent works have studied population-adaptive sampling patterns that are learned from a group of patients (or scans) based on population-specific metrics. However, such a general sampling p…

    Submitted 9 June, 2025; v1 submitted 16 January, 2025; originally announced January 2025.

  22. arXiv:2412.20023  [pdf]

    math.OC cs.LG eess.SY

    Global Search of Optimal Spacecraft Trajectories using Amortization and Deep Generative Models

    Authors: Ryne Beeson, Anjian Li, Amlan Sinha

    Abstract: Preliminary spacecraft trajectory optimization is a parameter dependent global search problem that aims to provide a set of solutions that are of high quality and diverse. In the case of numerical solution, it is dependent on the original optimal control problem, the choice of a control transcription, and the behavior of a gradient based numerical solver. In this paper we formulate the parameteriz…

    Submitted 27 December, 2024; originally announced December 2024.

    Comments: 47 pages, 23 figures, initial content of this paper appears in Paper 23-352 at the AAS/AIAA Astrodynamics Specialist Conference, Big Sky, MT, August 13-17 2023

  23. arXiv:2412.19099  [pdf, other]

    cs.SD eess.AS

    BSDB-Net: Band-Split Dual-Branch Network with Selective State Spaces Mechanism for Monaural Speech Enhancement

    Authors: Cunhang Fan, Enrui Liu, Andong Li, Jianhua Tao, Jian Zhou, Jiahao Li, Chengshi Zheng, Zhao Lv

    Abstract: Although the complex spectrum-based speech enhancement (SE) methods have achieved significant performance, coupling amplitude and phase can lead to a compensation effect, where amplitude information is sacrificed to compensate for the phase that is harmful to SE. In addition, to further improve the performance of SE, many modules are stacked onto SE, resulting in increased model complexity that lim…

    Submitted 26 December, 2024; originally announced December 2024.

    Comments: Accepted by AAAI 2025

  24. arXiv:2411.07833  [pdf, other]

    cs.RO eess.SY

    Robust Adaptive Safe Robotic Grasping with Tactile Sensing

    Authors: Yitaek Kim, Jeeseop Kim, Albert H. Li, Aaron D. Ames, Christoffer Sloth

    Abstract: Robotic grasping requires safe force interaction to prevent a grasped object from being damaged or slipping out of the hand. In this vein, this paper proposes an integrated framework for grasping with formal safety guarantees based on Control Barrier Functions. We first design contact force and force closure constraints, which are enforced by a safety filter to accomplish safe grasping with finger…

    Submitted 12 November, 2024; originally announced November 2024.

  25. arXiv:2410.22028  [pdf, other]

    eess.SP

    MU-MIMO Symbol-Level Precoding for QAM Constellations with Maximum Likelihood Receivers

    Authors: X. Tong, A. Li, L. Lei, X. Hu, F. Dong, S. Chatzinotas, C. Masouros

    Abstract: In this paper, we investigate symbol-level precoding (SLP) and efficient decoding techniques for downlink transmission, where we focus on scenarios where the base station (BS) transmits multiple QAM constellation streams to users equipped with multiple receive antennas. We begin by formulating a joint symbol-level transmit precoding and receive combining optimization problem. This coupled problem…

    Submitted 29 October, 2024; originally announced October 2024.

    Comments: 13 pages, 8 figures

  26. arXiv:2410.17574  [pdf, other]

    cs.LG cs.SD eess.AS

    Adversarial Domain Adaptation for Metal Cutting Sound Detection: Leveraging Abundant Lab Data for Scarce Industry Data

    Authors: Mir Imtiaz Mostafiz, Eunseob Kim, Adrian Shuai Li, Elisa Bertino, Martin Byung-Guk Jun, Ali Shakouri

    Abstract: Cutting state monitoring in the milling process is crucial for improving manufacturing efficiency and tool life. Cutting sound detection using machine learning (ML) models, inspired by experienced machinists, can be employed as a cost-effective and non-intrusive monitoring method in a complex manufacturing environment. However, labeling industry data for training is costly and time-consuming. More…

    Submitted 23 October, 2024; originally announced October 2024.

    Comments: 8 pages, 3 figures, 3 tables, First two named Authors have equal contribution (Co-first author)

  27. arXiv:2410.11270  [pdf, other]

    cs.NI eess.SP

    Energy Efficient Transmission Parameters Selection Method Using Reinforcement Learning in Distributed LoRa Networks

    Authors: Ryotai Airiyoshi, Mikio Hasegawa, Tomoaki Ohtsuki, Aohan Li

    Abstract: With the increase in demand for Internet of Things (IoT) applications, the number of IoT devices has drastically grown, making spectrum resources seriously insufficient. Transmission collisions and retransmissions increase power consumption. Therefore, even in long-range (LoRa) networks, selecting appropriate transmission parameters, such as channel and transmission power, is essential to improve…

    Submitted 21 January, 2025; v1 submitted 15 October, 2024; originally announced October 2024.

    Comments: 6 pages, 5 figures, conference

  28. arXiv:2410.11097  [pdf, other]

    eess.AS cs.AI cs.SD

    DMOSpeech: Direct Metric Optimization via Distilled Diffusion Model in Zero-Shot Speech Synthesis

    Authors: Yinghao Aaron Li, Rithesh Kumar, Zeyu Jin

    Abstract: Diffusion models have demonstrated significant potential in speech synthesis tasks, including text-to-speech (TTS) and voice cloning. However, their iterative denoising processes are computationally intensive, and previous distillation attempts have shown consistent quality degradation. Moreover, existing TTS approaches are limited by non-differentiable components or iterative sampling that preven…

    Submitted 19 February, 2025; v1 submitted 14 October, 2024; originally announced October 2024.

  29. arXiv:2410.06170  [pdf, other]

    cs.LG eess.SY

    QGym: Scalable Simulation and Benchmarking of Queuing Network Controllers

    Authors: Haozhe Chen, Ang Li, Ethan Che, Tianyi Peng, Jing Dong, Hongseok Namkoong

    Abstract: Queuing network control determines the allocation of scarce resources to manage congestion, a fundamental problem in manufacturing, communications, and healthcare. Compared to standard RL problems, queueing problems are distinguished by unique challenges: i) a system operating in continuous time, ii) high stochasticity, and iii) long horizons over which the system can become unstable (exploding de…

    Submitted 8 October, 2024; originally announced October 2024.

  30. arXiv:2410.05739  [pdf, other]

    cs.SD cs.AI eess.AS

    Array2BR: An End-to-End Noise-immune Binaural Audio Synthesis from Microphone-array Signals

    Authors: Cheng Chi, Xiaoyu Li, Andong Li, Yuxuan Ke, Xiaodong Li, Chengshi Zheng

    Abstract: Telepresence technology aims to provide an immersive virtual presence for remote conference applications, and it is extremely important to synthesize high-quality binaural audio signals for this aim. Because the ambient noise is often inevitable in practical application scenarios, it is highly desired that binaural audio signals without noise can be obtained from microphone-array signals directly.…

    Submitted 8 October, 2024; originally announced October 2024.

  31. arXiv:2410.02976  [pdf, other]

    cs.LG eess.SY math.OC

    Learning Optimal Control and Dynamical Structure of Global Trajectory Search Problems with Diffusion Models

    Authors: Jannik Graebner, Anjian Li, Amlan Sinha, Ryne Beeson

    Abstract: Spacecraft trajectory design is a global search problem, where previous work has revealed specific solution structures that can be captured with data-driven methods. This paper explores two global search problems in the circular restricted three-body problem: hybrid cost function of minimum fuel/time-of-flight and transfers to energy-dependent invariant manifolds. These problems display a fundamen…

    Submitted 29 December, 2024; v1 submitted 3 October, 2024; originally announced October 2024.

    Comments: This paper was presented at the AAS/AIAA Astrodynamics Specialist Conference

  32. arXiv:2410.00392  [pdf, other]

    eess.SY cs.AR

    MERIT: Multimodal Wearable Vital Sign Waveform Monitoring

    Authors: Yongyang Tang, Zhe Chen, Ang Li, Tianyue Zheng, Zheng Lin, Jia Xu, Pin Lv, Zhe Sun, Yue Gao

    Abstract: Cardiovascular disease (CVD) is the leading cause of death and premature mortality worldwide, with occupational environments significantly influencing CVD risk, underscoring the need for effective cardiac monitoring and early warning systems. Existing methods of monitoring vital signs require subjects to remain stationary, which is impractical for daily monitoring as individuals are often in motio…

    Submitted 21 November, 2024; v1 submitted 1 October, 2024; originally announced October 2024.

    Comments: 8 pages, 10 figures

  33. arXiv:2409.10058  [pdf, other]

    eess.AS cs.SD

    StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion

    Authors: Yinghao Aaron Li, Xilin Jiang, Cong Han, Nima Mesgarani

    Abstract: The rapid development of large-scale text-to-speech (TTS) models has led to significant advancements in modeling diverse speaker prosody and voices. However, these models often face issues such as slow inference speeds, reliance on complex pre-trained neural codec representations, and difficulties in achieving naturalness and high similarity to reference speakers. To address these challenges, this…

    Submitted 16 September, 2024; originally announced September 2024.

  34. arXiv:2408.11849  [pdf, other]

    cs.CL cs.AI eess.AS

    Style-Talker: Finetuning Audio Language Model and Style-Based Text-to-Speech Model for Fast Spoken Dialogue Generation

    Authors: Yinghao Aaron Li, Xilin Jiang, Jordan Darefsky, Ge Zhu, Nima Mesgarani

    Abstract: The rapid advancement of large language models (LLMs) has significantly propelled the development of text-based chatbots, demonstrating their capability to engage in coherent and contextually relevant dialogues. However, extending these advancements to enable end-to-end speech-to-speech conversation bots remains a formidable challenge, primarily due to the extensive dataset and computational resou…

    Submitted 13 August, 2024; originally announced August 2024.

    Comments: CoLM 2024

  35. arXiv:2407.09732  [pdf, other]

    eess.AS cs.LG cs.SD

    Speech Slytherin: Examining the Performance and Efficiency of Mamba for Speech Separation, Recognition, and Synthesis

    Authors: Xilin Jiang, Yinghao Aaron Li, Adrian Nicolas Florea, Cong Han, Nima Mesgarani

    Abstract: It is too early to conclude that Mamba is a better alternative to transformers for speech before comparing Mamba with transformers in terms of both performance and efficiency in multiple speech-related tasks. To reach this conclusion, we propose and evaluate three models for three tasks: Mamba-TasNet for speech separation, ConMamba for speech recognition, and VALL-M for speech synthesis. We compar…

    Submitted 12 July, 2024; originally announced July 2024.

  36. arXiv:2406.17804  [pdf, other]

    physics.med-ph cs.AI cs.CV cs.LG eess.IV

    A Review of Electromagnetic Elimination Methods for low-field portable MRI scanner

    Authors: Wanyu Bian, Panfeng Li, Mengyao Zheng, Chihang Wang, Anying Li, Ying Li, Haowei Ni, Zixuan Zeng

    Abstract: This paper analyzes conventional and deep learning methods for eliminating electromagnetic interference (EMI) in MRI systems. We compare traditional analytical and adaptive techniques with advanced deep learning approaches. Key strengths and limitations of each method are highlighted. Recent advancements in active EMI elimination, such as external EMI receiver coils, are discussed alongside deep l…

    Submitted 13 November, 2024; v1 submitted 22 June, 2024; originally announced June 2024.

    Comments: Accepted by 2024 5th International Conference on Machine Learning and Computer Application

    Journal ref: Proceedings of the 2024 5th International Conference on Machine Learning and Computer Application (ICMLCA), 2024, pp. 614-618

  37. arXiv:2406.16870  [pdf, other]

    eess.SY

    Robust Optimal Lane-changing Control for Connected Autonomous Vehicles in Mixed Traffic

    Authors: Anni Li, Andres S. Chavez Armijos, Christos G. Cassandras

    Abstract: We derive time and energy-optimal policies for a Connected Autonomous Vehicle (CAV) to execute lane change maneuvers in mixed traffic, i.e., in the presence of both CAVs and Human Driven Vehicles (HDVs). These policies are also shown to be robust with respect to the unpredictable behavior of HDVs by exploiting CAV cooperation which can eliminate or greatly reduce the interaction between CAVs and H…

    Submitted 15 March, 2024; originally announced June 2024.

    Comments: arXiv admin note: text overlap with arXiv:2303.16948

  38. SMRU: Split-and-Merge Recurrent-based UNet for Acoustic Echo Cancellation and Noise Suppression

    Authors: Zhihang Sun, Andong Li, Rilin Chen, Hao Zhang, Meng Yu, Yi Zhou, Dong Yu

    Abstract: The proliferation of deep neural networks has spawned the rapid development of acoustic echo cancellation and noise suppression, and plenty of prior arts have been proposed, which yield promising performance. Nevertheless, they rarely consider the deployment generality in different processing scenarios, such as edge devices, and cloud processing. To this end, this paper proposes a general model, t…

    Submitted 24 January, 2025; v1 submitted 16 June, 2024; originally announced June 2024.

    Comments: 8 pages, Accepted to SLT 2024

    Journal ref: 2024 IEEE Spoken Language Technology Workshop (SLT), pp. 317-324, 2024

  39. arXiv:2406.00758  [pdf, other]

    eess.IV cs.CV cs.MM

    Once-for-All: Controllable Generative Image Compression with Dynamic Granularity Adaption

    Authors: Anqi Li, Feng Li, Yuxi Liu, Runmin Cong, Yao Zhao, Huihui Bai

    Abstract: Although recent generative image compression methods have demonstrated impressive potential in optimizing the rate-distortion-perception trade-off, they still face the critical challenge of flexible rate adaption to diverse compression necessities and scenarios. To overcome this challenge, this paper proposes a Controllable Generative Image Compression framework, termed Control-GIC, the first capa…

    Submitted 4 December, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

  40. arXiv:2405.17594  [pdf, other]

    eess.SY

    Towards Achieving Cooperation Compliance of Human Drivers in Mixed Traffic

    Authors: Anni Li, Christos G. Cassandras

    Abstract: We consider a mixed-traffic environment in transportation systems, where Connected and Automated Vehicles (CAVs) coexist with potentially non-cooperative Human-Driven Vehicles (HDVs). We develop a cooperation compliance control framework to incentivize HDVs to align their behavior with socially optimal objectives using a "refundable toll" scheme so as to achieve a desired compliance probability…

    Submitted 27 May, 2024; originally announced May 2024.

  41. arXiv:2405.16446  [pdf, ps, other]

    eess.SP

    A New Solution for MU-MISO Symbol-Level Precoding: Extrapolation and Deep Unfolding

    Authors: Mu Liang, Ang Li, Xiaoyan Hu, Christos Masouros

    Abstract: Constructive interference (CI) precoding, which converts the harmful multi-user interference into beneficial signals, is a promising and efficient interference management scheme in multi-antenna communication systems. However, CI-based symbol-level precoding (SLP) experiences high computational complexity as the number of symbol slots increases within a transmission block, rendering it unaffordabl…

    Submitted 26 May, 2024; originally announced May 2024.

  42. arXiv:2405.04167  [pdf, other]

    cs.CV eess.IV

    Bridging the Synthetic-to-Authentic Gap: Distortion-Guided Unsupervised Domain Adaptation for Blind Image Quality Assessment

    Authors: Aobo Li, Jinjian Wu, Yongxu Liu, Leida Li

    Abstract: The annotation of blind image quality assessment (BIQA) is labor-intensive and time-consuming, especially for authentic images. Training on synthetic data is expected to be beneficial, but synthetically trained models often suffer from poor generalization in real domains due to domain gaps. In this work, we make a key observation that introducing more distortion types in the synthetic dataset may…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Accepted by CVPR2024

  43. arXiv:2404.15364  [pdf, other]

    eess.SP cs.AI cs.CV cs.LG

    MP-DPD: Low-Complexity Mixed-Precision Neural Networks for Energy-Efficient Digital Predistortion of Wideband Power Amplifiers

    Authors: Yizhuo Wu, Ang Li, Mohammadreza Beikmirza, Gagan Deep Singh, Qinyu Chen, Leo C. N. de Vreede, Morteza Alavi, Chang Gao

    Abstract: Digital Pre-Distortion (DPD) enhances signal quality in wideband RF power amplifiers (PAs). As signal bandwidths expand in modern radio systems, DPD's energy consumption increasingly impacts overall system efficiency. Deep Neural Networks (DNNs) offer promising advancements in DPD, yet their high complexity hinders their practical deployment. This paper introduces open-source mixed-precision (MP)…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: Accepted to IEEE Microwave and Wireless Technology Letters (MWTL)

  44. arXiv:2403.09096  [pdf, other]

    eess.IV cs.CV

    Deep unfolding Network for Hyperspectral Image Super-Resolution with Automatic Exposure Correction

    Authors: Yuan Fang, Yipeng Liu, Jie Chen, Zhen Long, Ao Li, Chong-Yung Chi, Ce Zhu

    Abstract: In recent years, the fusion of high spatial resolution multispectral image (HR-MSI) and low spatial resolution hyperspectral image (LR-HSI) has been recognized as an effective method for HSI super-resolution (HSI-SR). However, both HSI and MSI may be acquired under extreme conditions such as night or poorly illuminated scenarios, which may cause different exposure levels, thereby seriously downgr…

    Submitted 14 March, 2024; originally announced March 2024.

  45. arXiv:2402.15944  [pdf, other]

    cs.IT eess.SP

    On A Class of Greedy Sparse Recovery Algorithms

    Authors: Gang Li, Qiuwei Li, Shuang Li, Wu Angela Li

    Abstract: Sparse signal recovery deals with finding the sparsest solution of an under-determined linear system $x = Q s$. In this paper, we propose a novel greedy approach to addressing the challenges from such a problem. Such an approach is based on a characterization of solutions to the system, which allows us to work on the sparse recovery in the $s$-space directly with a given measure. With $l_2$-based m…

    Submitted 2 March, 2025; v1 submitted 24 February, 2024; originally announced February 2024.

  46. arXiv:2402.04882  [pdf, other]

    cs.NE cs.AI cs.LG cs.SD eess.AS

    LMUFormer: Low Complexity Yet Powerful Spiking Model With Legendre Memory Units

    Authors: Zeyu Liu, Gourav Datta, Anni Li, Peter Anthony Beerel

    Abstract: Transformer models have demonstrated high accuracy in numerous applications but have high complexity and lack sequential processing capability, making them ill-suited for many streaming applications at the edge where devices are heavily resource-constrained. Thus motivated, many researchers have proposed reformulating the transformer models as RNN modules which modify the self-attention computation…

    Submitted 19 January, 2024; originally announced February 2024.

    Comments: The 12th International Conference on Learning Representations (ICLR 2024)

  47. arXiv:2402.03710  [pdf, ps, other]

    eess.AS cs.CL cs.SD

    Listen, Chat, and Remix: Text-Guided Soundscape Remixing for Enhanced Auditory Experience

    Authors: Xilin Jiang, Cong Han, Yinghao Aaron Li, Nima Mesgarani

    Abstract: In daily life, we encounter a variety of sounds, both desirable and undesirable, with limited control over their presence and volume. Our work introduces "Listen, Chat, and Remix" (LCR), a novel multimodal sound remixer that controls each sound source in a mixture based on user-provided text instructions. LCR distinguishes itself with a user-friendly text interface and its unique ability to remix…

    Submitted 10 June, 2025; v1 submitted 6 February, 2024; originally announced February 2024.

    Comments: Accepted by IEEE Journal of Selected Topics in Signal Processing (JSTSP)

  48. arXiv:2401.00166  [pdf, ps, other]

    cs.IT eess.SP

    Block-Level MU-MISO Interference Exploitation Precoding: Optimal Structure and Explicit Duality

    Authors: Junwen Yang, Ang Li, Xuewen Liao, Christos Masouros, A. L. Swindlehurst

    Abstract: This paper investigates block-level interference exploitation (IE) precoding for multi-user multiple-input single-output (MU-MISO) downlink systems. To overcome the need for symbol-level IE precoding to frequently update the precoding matrix, we propose to jointly optimize all the precoders or transmit signals within a transmission block. The resultant precoders only need to be updated once per bl…

    Submitted 30 December, 2023; originally announced January 2024.

    Comments: Submitted to IEEE

  49. Patient-Adaptive and Learned MRI Data Undersampling Using Neighborhood Clustering

    Authors: Siddhant Gautam, Angqi Li, Saiprasad Ravishankar

    Abstract: There has been much recent interest in adapting undersampled trajectories in MRI based on training data. In this work, we propose a novel patient-adaptive MRI sampling algorithm based on grouping scans within a training set. Scan-adaptive sampling patterns are optimized together with an image reconstruction network for the training scans. The training optimization alternates between determining th…

    Submitted 31 March, 2024; v1 submitted 13 December, 2023; originally announced December 2023.

  50. arXiv:2311.16456  [pdf, other]

    cs.CV eess.IV

    Spiking Neural Networks with Dynamic Time Steps for Vision Transformers

    Authors: Gourav Datta, Zeyu Liu, Anni Li, Peter A. Beerel

    Abstract: Spiking Neural Networks (SNNs) have emerged as a popular spatio-temporal computing paradigm for complex vision tasks. Recently proposed SNN training algorithms have significantly reduced the number of time steps (down to 1) for improved latency and energy efficiency; however, they target only convolutional neural networks (CNNs). These algorithms, when applied on the recently spotlighted vision tra…

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: Under review