
Showing 1–50 of 104 results for author: Qin, Z

Searching in archive eess.

  1. arXiv:2506.16032  [pdf, ps, other]

    cs.LG eess.SP math.OC

    A Scalable Factorization Approach for High-Order Structured Tensor Recovery

    Authors: Zhen Qin, Michael B. Wakin, Zhihui Zhu

    Abstract: Tensor decompositions, which represent an $N$-order tensor using approximately $N$ factors of much smaller dimensions, can significantly reduce the number of parameters. This is particularly beneficial for high-order tensors, as the number of entries in a tensor grows exponentially with the order. Consequently, they are widely used in signal recovery and data analysis across domains such as signal…

    Submitted 19 June, 2025; originally announced June 2025.

  2. arXiv:2506.02863  [pdf, ps, other]

    eess.AS cs.AI cs.SD

    CapSpeech: Enabling Downstream Applications in Style-Captioned Text-to-Speech

    Authors: Helin Wang, Jiarui Hai, Dading Chong, Karan Thakkar, Tiantian Feng, Dongchao Yang, Junhyeok Lee, Laureano Moro Velazquez, Jesus Villalba, Zengyi Qin, Shrikanth Narayanan, Mounya Elhilali, Najim Dehak

    Abstract: Recent advancements in generative artificial intelligence have significantly transformed the field of style-captioned text-to-speech synthesis (CapTTS). However, adapting CapTTS to real-world applications remains challenging due to the lack of standardized, comprehensive datasets and limited research on downstream tasks built upon CapTTS. To address these gaps, we introduce CapSpeech, a new benchm…

    Submitted 3 June, 2025; originally announced June 2025.

  3. arXiv:2504.13574  [pdf, other]

    cs.LG cs.CV eess.IV

    MAAM: A Lightweight Multi-Agent Aggregation Module for Efficient Image Classification Based on the MindSpore Framework

    Authors: Zhenkai Qin, Feng Zhu, Huan Zeng, Xunyi Nong

    Abstract: The demand for lightweight models in image classification tasks under resource-constrained environments necessitates a balance between computational efficiency and robust feature representation. Traditional attention mechanisms, despite their strong feature modeling capability, often struggle with high computational complexity and structural rigidity, limiting their applicability in scenarios with…

    Submitted 18 April, 2025; originally announced April 2025.

  4. arXiv:2504.01344  [pdf, other]

    eess.SY

    IRS Assisted Decentralized Learning for Wideband Spectrum Sensing

    Authors: Sicheng Liu, Qun Wang, Zhuwei Qin, Weishan Zhang, Jingyi Wang, Xiang Ma

    Abstract: The increasing demand for reliable connectivity in industrial environments necessitates effective spectrum utilization strategies, especially in the context of shared spectrum bands. However, the dynamic spectrum-sharing mechanisms often lead to significant interference and critical failures, creating a trade-off between spectrum scarcity and under-utilization. This paper addresses these chall…

    Submitted 2 April, 2025; originally announced April 2025.

  5. arXiv:2503.22064  [pdf, other]

    cs.AI eess.SY

    Multi-Task Semantic Communications via Large Models

    Authors: Wanli Ni, Zhijin Qin, Haofeng Sun, Xiaoming Tao, Zhu Han

    Abstract: Artificial intelligence (AI) promises to revolutionize the design, optimization and management of next-generation communication systems. In this article, we explore the integration of large AI models (LAMs) into semantic communications (SemCom) by leveraging their multi-modal data processing and generation capabilities. Although LAMs bring unprecedented abilities to extract semantics from raw data…

    Submitted 27 March, 2025; originally announced March 2025.

    Comments: 7 pages, 6 figures

  6. arXiv:2503.10641  [pdf, other]

    eess.SY cs.AI cs.RO

    Estimating Control Barriers from Offline Data

    Authors: Hongzhan Yu, Seth Farrell, Ryo Yoshimitsu, Zhizhen Qin, Henrik I. Christensen, Sicun Gao

    Abstract: Learning-based methods for constructing control barrier functions (CBFs) are gaining popularity for ensuring safe robot control. A major limitation of existing methods is their reliance on extensive sampling over the state space or online system interaction in simulation. In this work we propose a novel framework for learning neural CBFs through a fixed, sparsely-labeled dataset collected prior to…

    Submitted 20 February, 2025; originally announced March 2025.

    Comments: This paper has been accepted to ICRA 2025

  7. arXiv:2503.05794  [pdf, other]

    cs.CR cs.AI cs.LG cs.SD eess.AS

    CBW: Towards Dataset Ownership Verification for Speaker Verification via Clustering-based Backdoor Watermarking

    Authors: Yiming Li, Kaiying Yan, Shuo Shao, Tongqing Zhai, Shu-Tao Xia, Zhan Qin, Dacheng Tao

    Abstract: With the increasing adoption of deep learning in speaker verification, large-scale speech datasets have become valuable intellectual property. To audit and prevent the unauthorized usage of these valuable released datasets, especially in commercial or open-source scenarios, we propose a novel dataset ownership verification method. Our approach introduces a clustering-based backdoor watermark (CBW)…

    Submitted 5 April, 2025; v1 submitted 1 March, 2025; originally announced March 2025.

    Comments: 14 pages. The journal extension of our ICASSP'21 paper (arXiv:2010.11607)

  8. arXiv:2502.17168  [pdf, other]

    eess.SP

    SpikACom: A Neuromorphic Computing Framework for Green Communications

    Authors: Yanzhen Liu, Zhijin Qin, Yongxu Zhu, Geoffrey Ye Li

    Abstract: The ever-growing power consumption of wireless communication systems necessitates more energy-efficient algorithms. This paper introduces SpikACom (Spiking Adaptive Communication), a neuromorphic computing-based framework for power-intensive wireless communication tasks. SpikACom leverages brain-inspired spiking neural networks (SNNs) for efficient signal processing. It is designed for dynam…

    Submitted 24 February, 2025; originally announced February 2025.

  9. arXiv:2502.07236  [pdf, other]

    eess.IV

    Adaptive Sampling and Joint Semantic-Channel Coding under Dynamic Channel Environment

    Authors: Zhiyuan Qi, Yulong Feng, Zhijin Qin

    Abstract: Deep learning enabled semantic communications are attracting extensive attention. However, most works normally ignore the data acquisition process and suffer from robustness issues under dynamic channel environment. In this paper, we propose an adaptive joint sampling-semantic-channel coding (Adaptive-JSSCC) framework. Specifically, we propose a semantic-aware sampling and reconstruction method to…

    Submitted 10 February, 2025; originally announced February 2025.

  10. arXiv:2501.15217  [pdf, other]

    cs.LG eess.SY

    Predictive Lagrangian Optimization for Constrained Reinforcement Learning

    Authors: Tianqi Zhang, Puzhen Yuan, Guojian Zhan, Ziyu Lin, Yao Lyu, Zhenzhi Qin, Jingliang Duan, Liping Zhang, Shengbo Eben Li

    Abstract: Constrained optimization is popularly seen in reinforcement learning for addressing complex control tasks. From the perspective of dynamic system, iteratively solving a constrained optimization problem can be framed as the temporal evolution of a feedback control system. Classical constrained optimization methods, such as penalty and Lagrangian approaches, inherently use proportional and integral…

    Submitted 25 January, 2025; originally announced January 2025.

  11. arXiv:2501.05859  [pdf, other]

    eess.AS

    Large Model Empowered Streaming Speech Semantic Communications

    Authors: Zhenzi Weng, Zhijin Qin, Geoffrey Ye Li

    Abstract: In this paper, we introduce a large model-empowered streaming semantic communication system for speech transmission across various languages, named LSSC-ST. Specifically, we devise an edge-device collaborative semantic communication architecture by offloading the intricate semantic extraction and channel coding modules to edge servers, thereby reducing the computational burden on local devices. To…

    Submitted 21 February, 2025; v1 submitted 10 January, 2025; originally announced January 2025.

  12. arXiv:2412.16827  [pdf, other]

    cs.IT eess.SP

    Optimal Error Analysis of Channel Estimation for IRS-assisted MIMO Systems

    Authors: Zhen Qin, Zhihui Zhu

    Abstract: As intelligent reflecting surface (IRS) has emerged as a new and promising technology capable of configuring the wireless environment favorably, channel estimation for IRS-assisted multiple-input multiple-output (MIMO) systems has garnered extensive attention in recent years. While various algorithms have been proposed to address this challenge, there is a lack of rigorous theoretical error analys…

    Submitted 21 December, 2024; originally announced December 2024.

  13. arXiv:2412.02538  [pdf, other]

    cs.IT cs.LG eess.SP

    On Privacy, Security, and Trustworthiness in Distributed Wireless Large AI Models (WLAM)

    Authors: Zhaohui Yang, Wei Xu, Le Liang, Yuanhao Cui, Zhijin Qin, Merouane Debbah

    Abstract: Combining wireless communication with large artificial intelligence (AI) models can open up a myriad of novel application scenarios. In sixth generation (6G) networks, ubiquitous communication and computing resources allow large AI models to serve democratic large AI models-related services to enable real-time applications like autonomous vehicles, smart cities, and Internet of Things (IoT) ecosys…

    Submitted 4 December, 2024; v1 submitted 3 December, 2024; originally announced December 2024.

    Comments: 12 pages, 4 figures

  14. arXiv:2411.04452  [pdf, other]

    quant-ph eess.SP math.OC

    Optimal Allocation of Pauli Measurements for Low-rank Quantum State Tomography

    Authors: Zhen Qin, Casey Jameson, Zhexuan Gong, Michael B. Wakin, Zhihui Zhu

    Abstract: The process of reconstructing quantum states from experimental measurements, accomplished through quantum state tomography (QST), plays a crucial role in verifying and benchmarking quantum devices. A key challenge of QST is to find out how the accuracy of the reconstruction depends on the number of state copies used in the measurements. When multiple measurement settings are used, the total number…

    Submitted 7 November, 2024; originally announced November 2024.

  15. arXiv:2410.21000  [pdf, other]

    eess.IV cs.AI cs.CV

    Efficient Bilinear Attention-based Fusion for Medical Visual Question Answering

    Authors: Zhilin Zhang, Jie Wang, Zhanghao Qin, Ruiqi Zhu, Xiaoliang Gong

    Abstract: Medical Visual Question Answering (MedVQA) has attracted growing interest at the intersection of medical image understanding and natural language processing for clinical applications. By interpreting medical images and providing precise answers to relevant clinical inquiries, MedVQA has the potential to support diagnostic decision-making and reduce workload across various fields like radiology. Wh…

    Submitted 11 May, 2025; v1 submitted 28 October, 2024; originally announced October 2024.

    Comments: To be published in 2025 International Joint Conference on Neural Networks (IJCNN)

  16. arXiv:2410.20326  [pdf, other]

    eess.SY cs.RO

    SEEV: Synthesis with Efficient Exact Verification for ReLU Neural Barrier Functions

    Authors: Hongchao Zhang, Zhizhen Qin, Sicun Gao, Andrew Clark

    Abstract: Neural Control Barrier Functions (NCBFs) have shown significant promise in enforcing safety constraints on nonlinear autonomous systems. State-of-the-art exact approaches to verifying safety of NCBF-based controllers exploit the piecewise-linear structure of ReLU neural networks, however, such approaches still rely on enumerating all of the activation regions of the network near the safety boundar…

    Submitted 26 October, 2024; originally announced October 2024.

  17. arXiv:2410.17343  [pdf]

    eess.SP cs.AI cs.LG

    EEG-DIF: Early Warning of Epileptic Seizures through Generative Diffusion Model-based Multi-channel EEG Signals Forecasting

    Authors: Zekun Jiang, Wei Dai, Qu Wei, Ziyuan Qin, Kang Li, Le Zhang

    Abstract: Multi-channel EEG signals are commonly used for the diagnosis and assessment of diseases such as epilepsy. Currently, various EEG diagnostic algorithms based on deep learning have been developed. However, most research efforts focus solely on diagnosing and classifying current signal data but do not consider the prediction of future trends for early warning. Additionally, since multi-channel EEG c…

    Submitted 22 October, 2024; originally announced October 2024.

    Comments: 9 pages, 4 figures, 3 tables, accepted by ACM BCB 2024

  18. arXiv:2410.15224  [pdf, other]

    math.OC cs.LG eess.SP

    Robust Low-rank Tensor Train Recovery

    Authors: Zhen Qin, Zhihui Zhu

    Abstract: Tensor train (TT) decomposition represents an $N$-order tensor using $O(N)$ matrices (i.e., factors) of small dimensions, achieved through products among these factors. Due to its compact representation, TT decomposition has found wide applications, including various tensor recovery problems in signal processing and quantum information. In this paper, we study the problem of reconstructing a TT fo…

    Submitted 19 October, 2024; originally announced October 2024.

  19. arXiv:2410.04475  [pdf, other]

    cs.IT eess.SP

    Partial reciprocity-based precoding matrix prediction in FDD massive MIMO with mobility

    Authors: Ziao Qin, Haifan Yin

    Abstract: The timely precoding of frequency division duplex (FDD) massive multiple-input multiple-output (MIMO) systems is a substantial challenge in practice, especially in mobile environments. In order to improve the precoding performance and reduce the precoding complexity, we propose a partial reciprocity-based precoding matrix prediction scheme and further reduce its complexity by exploiting the channe…

    Submitted 6 October, 2024; originally announced October 2024.

    Comments: 5 pages, 4 figures, 1 tabs

  20. arXiv:2410.02583  [pdf, other]

    quant-ph cs.IT eess.SP math.OC

    Sample-Efficient Quantum State Tomography for Structured Quantum States in One Dimension

    Authors: Zhen Qin, Casey Jameson, Alireza Goldar, Michael B. Wakin, Zhexuan Gong, Zhihui Zhu

    Abstract: While quantum state tomography (QST) remains the gold standard for benchmarking and verifying quantum devices, it requires an exponentially large number of measurements and classical computational resources for generic quantum many-body systems, making it impractical even for intermediate-size quantum devices. Fortunately, many physical quantum states often exhibit certain low-dimensional structur…

    Submitted 1 May, 2025; v1 submitted 3 October, 2024; originally announced October 2024.

  21. arXiv:2410.01070  [pdf, other]

    cs.NI eess.SP

    Meta Learning Based Adaptive Cooperative Perception in Nonstationary Vehicular Networks

    Authors: Kaige Qu, Zixiong Qin, Weihua Zhuang

    Abstract: To accommodate high network dynamics in real-time cooperative perception (CP), reinforcement learning (RL) based adaptive CP schemes have been proposed, to allow adaptive switchings between CP and stand-alone perception modes among connected and autonomous vehicles. The traditional offline-training online-execution RL framework suffers from performance degradation under nonstationary network condi…

    Submitted 1 October, 2024; originally announced October 2024.

  22. arXiv:2409.07482  [pdf, other]

    eess.SP cs.AI

    VSLLaVA: a pipeline of large multimodal foundation model for industrial vibration signal analysis

    Authors: Qi Li, Jinfeng Huang, Hongliang He, Xinran Zhang, Feibin Zhang, Zhaoye Qin, Fulei Chu

    Abstract: Large multimodal foundation models have been extensively utilized for image recognition tasks guided by instructions, yet there remains a scarcity of domain expertise in industrial vibration signal analysis. This paper presents a pipeline named VSLLaVA that leverages a large language model to integrate expert knowledge for identification of signal parameters and diagnosis of faults. Within this pi…

    Submitted 3 September, 2024; originally announced September 2024.

  23. arXiv:2408.04535  [pdf, other]

    eess.IV cs.AI

    Synchronous Multi-modal Semantic Communication System with Packet-level Coding

    Authors: Yun Tian, Jingkai Ying, Zhijin Qin, Ye Jin, Xiaoming Tao

    Abstract: Although the semantic communication with joint semantic-channel coding design has shown promising performance in transmitting data of different modalities over physical layer channels, the synchronization and packet-level forward error correction of multimodal semantics have not been well studied. Due to the independent design of semantic encoders, synchronizing multimodal features in both the sem…

    Submitted 10 August, 2024; v1 submitted 8 August, 2024; originally announced August 2024.

    Comments: 12 pages, 9 figures

  24. arXiv:2407.14140  [pdf, other]

    eess.SP

    A Secure and Efficient Distributed Semantic Communication System for Heterogeneous Internet of Things

    Authors: Weihao Zeng, Xinyu Xu, Qianyun Zhang, Jiting Shi, Zhenyu Guan, Shufeng Li, Zhijin Qin

    Abstract: Semantic communications are expected to improve the transmission efficiency in Internet of Things (IoT) networks. However, the distributed nature of networks and heterogeneity of devices challenge the secure utilization of semantic communication systems. In this paper, we develop a distributed semantic communication system that achieves the security and efficiency during update and usage phases. A…

    Submitted 11 December, 2024; v1 submitted 19 July, 2024; originally announced July 2024.

  25. arXiv:2406.16314  [pdf, other]

    eess.AS

    DreamVoice: Text-Guided Voice Conversion

    Authors: Jiarui Hai, Karan Thakkar, Helin Wang, Zengyi Qin, Mounya Elhilali

    Abstract: Generative voice technologies are rapidly evolving, offering opportunities for more personalized and inclusive experiences. Traditional one-shot voice conversion (VC) requires a target recording during inference, limiting ease of usage in generating desired voice timbres. Text-guided generation offers an intuitive solution to convert voices to desired "DreamVoices" according to the users' needs. O…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Accepted at INTERSPEECH 2024

  26. arXiv:2406.06002  [pdf, other]

    cs.LG eess.SP math.OC

    Computational and Statistical Guarantees for Tensor-on-Tensor Regression with Tensor Train Decomposition

    Authors: Zhen Qin, Zhihui Zhu

    Abstract: Recently, a tensor-on-tensor (ToT) regression model has been proposed to generalize tensor recovery, encompassing scenarios like scalar-on-tensor regression and tensor-on-vector regression. However, the exponential growth in tensor complexity poses challenges for storage and computation in ToT regression. To overcome this hurdle, tensor decompositions have been introduced, with the tensor train (T…

    Submitted 1 May, 2025; v1 submitted 9 June, 2024; originally announced June 2024.

    Comments: arXiv admin note: text overlap with arXiv:2401.02592

  27. arXiv:2405.12580  [pdf, other]

    eess.SP eess.IV

    Hybrid Digital-Analog Semantic Communications

    Authors: Huiqiang Xie, Zhijin Qin, Zhu Han, Khaled B. Letaief

    Abstract: Digital and analog semantic communications (SemCom) face inherent limitations such as data security concerns in analog SemCom, as well as leveling-off and cliff-edge effects in digital SemCom. In order to overcome these challenges, we propose a novel SemCom framework and a corresponding system called HDA-DeepSC, which leverages a hybrid digital-analog approach for multimedia transmission. This is…

    Submitted 27 May, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

    Comments: 13 pages, 8 figures

  28. arXiv:2405.08096  [pdf, ps, other]

    eess.AS cs.SD

    Semantic MIMO Systems for Speech-to-Text Transmission

    Authors: Zhenzi Weng, Zhijin Qin, Huiqiang Xie, Xiaoming Tao, Khaled B. Letaief

    Abstract: Semantic communications have been utilized to execute numerous intelligent tasks by transmitting task-related semantic information instead of bits. In this article, we propose a semantic-aware speech-to-text transmission system for the single-user multiple-input multiple-output (MIMO) and multi-user MIMO communication scenarios, named SAC-ST. Particularly, a semantic communication system to serve…

    Submitted 5 October, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

  29. arXiv:2404.19477  [pdf, other]

    eess.SP

    Hybrid Bit and Semantic Communications

    Authors: Kaiwen Yu, Renhe Fan, Gang Wu, Zhijin Qin

    Abstract: Semantic communication technology is regarded as a method surpassing the Shannon limit of bit transmission, capable of effectively enhancing transmission efficiency. However, current approaches that directly map content to transmission symbols are challenging to deploy in practice, imposing significant limitations on the development of semantic communication. To address this challenge, we propose…

    Submitted 30 April, 2024; originally announced April 2024.

  30. arXiv:2404.17867  [pdf, other]

    cs.CV eess.IV

    Are Watermarks Bugs for Deepfake Detectors? Rethinking Proactive Forensics

    Authors: Xiaoshuai Wu, Xin Liao, Bo Ou, Yuling Liu, Zheng Qin

    Abstract: AI-generated content has accelerated the topic of media synthesis, particularly Deepfake, which can manipulate our portraits for positive or malicious purposes. Before releasing these threatening face images, one promising forensics solution is the injection of robust watermarks to track their own provenance. However, we argue that current watermarking models, originally devised for genuine images…

    Submitted 27 April, 2024; originally announced April 2024.

    Comments: Accepted by IJCAI 2024

  31. arXiv:2403.09222  [pdf, other]

    eess.SP

    A Robust Semantic Communication System for Image

    Authors: Xiang Peng, Zhijin Qin, Xiaoming Tao, Jianhua Lu, Khaled B. Letaief

    Abstract: Semantic communications have gained significant attention as a promising approach to address the transmission bottleneck, especially with the continuous development of 6G techniques. Distinct from the well investigated physical channel impairments, this paper focuses on semantic impairments in image, particularly those arising from adversarial perturbations. Specifically, we propose a novel metric…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: 6 pages

  32. arXiv:2403.05187  [pdf, ps, other]

    eess.AS

    Robust Semantic Communications for Speech Transmission

    Authors: Zhenzi Weng, Zhijin Qin, Geoffrey Ye Li

    Abstract: In this paper, we propose a robust semantic communication system for speech transmission, named Ross-S2T, by delivering the essential semantic information. Specifically, we consider the speech-to-text translation (S2TT) as the transmission goal. First, a new deep semantic encoder is developed to convert speech in the source language to textual features associated with the target language, facilita…

    Submitted 4 July, 2025; v1 submitted 8 March, 2024; originally announced March 2024.

  33. Computational Offloading in Semantic-Aware Cloud-Edge-End Collaborative Networks

    Authors: Zelin Ji, Zhijin Qin

    Abstract: The trend of massive connectivity pushes forward the explosive growth of end devices. The emergence of various applications has prompted a demand for pervasive connectivity and more efficient computing paradigms. On the other hand, the lack of computational capacity of the end devices restricts the implementation of the intelligent applications, and becomes a bottleneck of the multiple access for…

    Submitted 19 May, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: Submitted to IEEE JSTSP

  34. arXiv:2402.13073  [pdf, other]

    eess.SP

    Towards Intelligent Communications: Large Model Empowered Semantic Communications

    Authors: Huiqiang Xie, Zhijin Qin, Xiaoming Tao, Zhu Han

    Abstract: Deep learning enabled semantic communications have shown great potential to significantly improve transmission efficiency and alleviate spectrum scarcity, by effectively exchanging the semantics behind the data. Recently, the emergence of large models, boasting billions of parameters, has unveiled remarkable human-like intelligence, offering a promising avenue for advancing semantic communication…

    Submitted 19 March, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

    Comments: 7 pages, 6 figures

  35. arXiv:2401.02592  [pdf, other]

    stat.ML cs.LG eess.SP math.OC

    Guaranteed Nonconvex Factorization Approach for Tensor Train Recovery

    Authors: Zhen Qin, Michael B. Wakin, Zhihui Zhu

    Abstract: In this paper, we provide the first convergence guarantee for the factorization approach. Specifically, to avoid the scaling ambiguity and to facilitate theoretical analysis, we optimize over the so-called left-orthogonal TT format which enforces orthonormality among most of the factors. To ensure the orthonormal structure, we utilize the Riemannian gradient descent (RGD) for optimizing those fact…

    Submitted 21 December, 2024; v1 submitted 4 January, 2024; originally announced January 2024.

    Journal ref: Journal of Machine Learning Research (December 2024)

  36. arXiv:2401.00859  [pdf, ps, other]

    eess.IV cs.CV cs.LG

    Federated Multi-View Synthesizing for Metaverse

    Authors: Yiyu Guo, Zhijin Qin, Xiaoming Tao, Geoffrey Ye Li

    Abstract: The metaverse is expected to provide immersive entertainment, education, and business applications. However, virtual reality (VR) transmission over wireless networks is data- and computation-intensive, making it critical to introduce novel solutions that meet stringent quality-of-service requirements. With recent advances in edge intelligence and deep learning, we have developed a novel multi-view…

    Submitted 18 December, 2023; originally announced January 2024.

  37. arXiv:2312.01479  [pdf, other]

    cs.SD cs.LG eess.AS

    OpenVoice: Versatile Instant Voice Cloning

    Authors: Zengyi Qin, Wenliang Zhao, Xumin Yu, Xin Sun

    Abstract: We introduce OpenVoice, a versatile voice cloning approach that requires only a short audio clip from the reference speaker to replicate their voice and generate speech in multiple languages. OpenVoice represents a significant advancement in addressing the following open challenges in the field: 1) Flexible Voice Style Control. OpenVoice enables granular control over voice styles, including emotio…

    Submitted 18 August, 2024; v1 submitted 3 December, 2023; originally announced December 2023.

    Comments: Technical Report

  38. arXiv:2311.04685  [pdf, other]

    eess.IV

    An End-Cloud Computing Enabled Surveillance Video Transmission System

    Authors: Dingxi Yang, Zhijin Qin, Liting Wang, Xiaoming Tao, Fang Cui, Hengjiang Wang

    Abstract: The enormous data volume of video poses a significant burden on the network. Particularly, transferring high-definition surveillance videos to the cloud consumes a significant amount of spectrum resources. To address these issues, we propose a surveillance video transmission system enabled by end-cloud computing. Specifically, the cameras actively down-sample the original video and then a redundan…

    Submitted 8 November, 2023; originally announced November 2023.

  39. arXiv:2310.06246  [pdf, other]

    eess.IV

    Compression Ratio Learning and Semantic Communications for Video Imaging

    Authors: Bowen Zhang, Zhijin Qin, Geoffrey Ye Li

    Abstract: Camera sensors have been widely used in intelligent robotic systems. Developing camera sensors with high sensing efficiency has always been important to reduce the power, memory, and other related resources. Inspired by recent success on programmable sensors and deep optic methods, we design a novel video compressed sensing system with spatially-variant compression ratios, which achieves higher im…

    Submitted 9 October, 2023; originally announced October 2023.

  40. arXiv:2308.12619  [pdf, other]

    cs.IT eess.SP

    Low-complexity eigenvector prediction-based precoding matrix prediction in massive MIMO with mobility

    Authors: Ziao Qin, Haifan Yin, Weidong Li

    Abstract: In practical massive multiple-input multiple-output (MIMO) systems, the precoding matrix is often obtained from the eigenvectors of channel matrices and is challenging to update in time due to finite computation resources at the base station, especially in mobile scenarios. In order to reduce the precoding complexity while enhancing the spectral efficiency (SE), a novel precoding matrix prediction…

    Submitted 30 June, 2024; v1 submitted 24 August, 2023; originally announced August 2023.

    Comments: 13pages, 8 figures, 1 table, journal

  41. arXiv:2307.03246  [pdf, other]

    eess.IV

    Semantic-Aware Image Compressed Sensing

    Authors: Bowen Zhang, Zhijin Qin, Geoffrey Ye Li

    Abstract: Deep learning based image compressed sensing (CS) has achieved great success. However, existing CS systems mainly adopt a fixed measurement matrix to images, ignoring the fact the optimal measurement numbers and bases are different for different images. To further improve the sensing efficiency, we propose a novel semantic-aware image CS system. In our system, the encoder first uses a fixed number…

    Submitted 10 July, 2023; v1 submitted 6 July, 2023; originally announced July 2023.

    Comments: Modified version

  42. Meta Federated Reinforcement Learning for Distributed Resource Allocation

    Authors: Zelin Ji, Zhijin Qin, Xiaoming Tao

    Abstract: In cellular networks, resource allocation is usually performed in a centralized way, which brings huge computation complexity to the base station (BS) and high transmission overhead. This paper explores a distributed resource allocation method that aims to maximize energy efficiency (EE) while ensuring the quality of service (QoS) for users. Specifically, in order to address wireless channel condi…

    Submitted 9 July, 2023; v1 submitted 6 July, 2023; originally announced July 2023.

    Comments: Submitted to TWC

  43. arXiv:2306.09432  [pdf, other]

    quant-ph cs.IT eess.SP

    Quantum State Tomography for Matrix Product Density Operators

    Authors: Zhen Qin, Casey Jameson, Zhexuan Gong, Michael B. Wakin, Zhihui Zhu

    Abstract: The reconstruction of quantum states from experimental measurements, often achieved using quantum state tomography (QST), is crucial for the verification and benchmarking of quantum devices. However, performing QST for a generic unstructured quantum state requires an enormous number of state copies that grows exponentially with the number of individual quanta in the system, even for the mos…

    Submitted 18 February, 2024; v1 submitted 15 June, 2023; originally announced June 2023.

    Journal ref: IEEE Transactions on Information Theory (PP. 5030 - 5056, Volume: 70, July 2024)

  44. arXiv:2305.06543  [pdf, other]

    cs.IT eess.SP

    QoE-based Semantic-Aware Resource Allocation for Multi-Task Networks

    Authors: Lei Yan, Zhijin Qin, Chunfeng Li, Rui Zhang, Yongzhao Li, Xiaoming Tao

    Abstract: By transmitting task-related information only, semantic communications yield significant performance gains over conventional communications. However, the lack of mature semantic theory about semantic information quantification and performance evaluation makes it challenging to perform resource allocation for semantic communications, especially when multiple tasks coexist in the network. To cope wi…

    Submitted 8 April, 2024; v1 submitted 10 May, 2023; originally announced May 2023.

    Comments: This work has been accepted by IEEE Transactions on Wireless Communications. arXiv admin note: text overlap with arXiv:2205.14530

  45. arXiv:2304.14598  [pdf, other]

    cs.IT eess.SP

    A manifold learning-based CSI feedback framework for FDD massive MIMO

    Authors: Yandi Cao, Haifan Yin, Ziao Qin, Weidong Li, Weimin Wu, Mérouane Debbah

    Abstract: Massive multi-input multi-output (MIMO) in Frequency Division Duplex (FDD) mode suffers from heavy feedback overhead for Channel State Information (CSI). In this paper, a novel manifold learning-based CSI feedback framework (MLCF) is proposed to reduce the feedback and improve the spectral efficiency for FDD massive MIMO. Manifold learning (ML) is an effective method for dimensionality reduction.…

    Submitted 23 August, 2024; v1 submitted 27 April, 2023; originally announced April 2023.

    Comments: 14 pages, 7 figures, 2 tables, to appear in IEEE Trans. Commun.

  46. arXiv:2304.11838  [pdf, other]

    eess.SP

    Dynamic Compressive Sensing based on RLS for Underwater Acoustic Communications

    Authors: Zhen Qin

    Abstract: Sparse structures are widely recognized and utilized in channel estimation. Two typical mechanisms, namely proportionate updating (PU) and zero-attracting (ZA) techniques, achieve better performance, but their computational complexity is higher than that of their non-sparse counterparts. In this paper, we propose a dynamic compressive sensing (DCS) technique based on the recursive least squares (RLS) algorithm which can simultaneously achi…

    Submitted 4 May, 2023; v1 submitted 24 April, 2023; originally announced April 2023.

  47. arXiv:2304.09850  [pdf, other]

    cs.RO eess.SY

    Patching Approximately Safe Value Functions Leveraging Local Hamilton-Jacobi Reachability Analysis

    Authors: Sander Tonkens, Alex Toofanian, Zhizhen Qin, Sicun Gao, Sylvia Herbert

    Abstract: Safe value functions, such as control barrier functions, characterize a safe set and synthesize a safety filter, overriding unsafe actions, for a dynamic system. While function approximators like neural networks can synthesize approximately safe value functions, they typically lack formal guarantees. In this paper, we propose a local dynamic programming-based approach to "patch" approximately safe…

    Submitted 6 September, 2024; v1 submitted 19 April, 2023; originally announced April 2023.

    Comments: 8 pages, IEEE Conference on Decision and Control (CDC), 2024 (In Press)

  48. arXiv:2303.12335  [pdf, other]

    eess.SP

    Semantic Communication with Memory

    Authors: Huiqiang Xie, Zhijin Qin, Geoffrey Ye Li

    Abstract: While semantic communication transmits efficiently thanks to its strong capability to extract essential semantic information, it is still far from intelligent, human-like communication. In this paper, we introduce an essential component, memory, into semantic communications to mimic human communications. In particular, we investigate a deep learning (DL) based semantic commun…

    Submitted 22 March, 2023; originally announced March 2023.

    Comments: 12 pages

  49. arXiv:2302.09222  [pdf, other]

    cs.IT eess.SP

    A review of codebooks for CSI feedback in 5G new radio and beyond

    Authors: Ziao Qin, Haifan Yin

    Abstract: Codebooks have been indispensable for wireless communication standards since the first release of Long-Term Evolution in 2009. They offer an efficient way to acquire the channel state information (CSI) for multiple antenna systems. Nowadays, a codebook is not limited to a set of pre-defined precoders; it refers to an increasingly sophisticated CSI feedback framework. In this paper, we…

    Submitted 13 June, 2023; v1 submitted 17 February, 2023; originally announced February 2023.

    Comments: 11 pages, 7 figures, 1 table, magazine review

  50. arXiv:2302.08645  [pdf, ps, other]

    eess.SP eess.IV

    Semantic Communications with Variable-Length Coding for Extended Reality

    Authors: Bowen Zhang, Zhijin Qin, Geoffrey Ye Li

    Abstract: Wireless extended reality (XR) has attracted wide attention as a promising technology to improve users' mobility and quality of experience. However, the ultra-high data rate requirement of wireless XR has hindered its development for many years. To overcome this challenge, we develop a semantic communication framework, where semantically unimportant information is highly compressed or discarded i…

    Submitted 11 March, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

    Comments: 1. Update the performance of VL-SCC in Fig. 8 under the new rate allocation architecture. 2. Give a fair comparison between VL-SCC and SCC in Fig. 9. 3. Fix the typo in the LDPC rate (1/3 changed to 2/3). 4. Reduce L=32 to 16, and update the bpp.