Showing 1–50 of 74 results for author: Shi, S

Searching in archive eess.
  1. arXiv:2507.01624  [pdf, ps, other]

    eess.SP

    Frequency-switching Array Enhanced Physical-Layer Security in Terahertz Bands: A Movable Antenna Perspective

    Authors: Cong Zhou, Changsheng You, Shuo Shi, Weidong Mei

    Abstract: In this paper, we propose a new frequency-switching array (FSA) enhanced physical-layer security (PLS) system in terahertz bands, where the carrier frequency can be flexibly switched and small frequency offsets can be imposed on each antenna at Alice, so as to eliminate information wiretapping by undesired eavesdroppers. First, we analytically show that by flexibly controlling the carrier frequenc…

    Submitted 2 July, 2025; originally announced July 2025.

    Comments: In this paper, we propose to enhance physical-layer security by using a new frequency-switching array, which is equivalent to movable antennas

  2. arXiv:2505.19744  [pdf, ps, other]

    eess.SY

    Scalable quantile predictions of peak loads for non-residential customer segments

    Authors: Shaohong Shi, Jacco Heres, Simon H. Tindemans

    Abstract: Electrical grid congestion has emerged as an immense challenge in Europe, making the forecasting of load and its associated metrics increasingly crucial. Among these metrics, peak load is fundamental. Non-time-resolved models of peak load have their advantages of being simple and compact, and among them Velander's formula (VF) is widely used in distribution network planning. However, several aspec…

    Submitted 26 May, 2025; originally announced May 2025.

    Comments: submitted to IEEE PES ISGT (Innovative Smart Grid Technologies) Europe 2025 conference

  3. arXiv:2505.17513  [pdf, ps, other]

    cs.LG cs.CL cs.SD eess.AS

    What You Read Isn't What You Hear: Linguistic Sensitivity in Deepfake Speech Detection

    Authors: Binh Nguyen, Shuji Shi, Ryan Ofman, Thai Le

    Abstract: Recent advances in text-to-speech technologies have enabled realistic voice generation, fueling audio-based deepfake attacks such as fraud and impersonation. While audio anti-spoofing systems are critical for detecting such threats, prior work has predominantly focused on acoustic-level perturbations, leaving the impact of linguistic variation largely unexplored. In this paper, we investigate the…

    Submitted 23 May, 2025; originally announced May 2025.

    Comments: 15 pages, 2 figures

    MSC Class: 53-04

  4. arXiv:2505.15515  [pdf, other]

    eess.SY

    From learning to safety: A Direct Data-Driven Framework for Constrained Control

    Authors: Kanghui He, Shengling Shi, Ton van den Boom, Bart De Schutter

    Abstract: Ensuring safety in the sense of constraint satisfaction for learning-based control is a critical challenge, especially in the model-free case. While safety filters address this challenge in the model-based setting by modifying unsafe control inputs, they typically rely on predictive models derived from physics or data. This reliance limits their applicability for advanced model-free learning contr…

    Submitted 21 May, 2025; originally announced May 2025.

  5. arXiv:2504.19262  [pdf, other]

    eess.SP

    Super-resolution Wideband Beam Training for Near-field Communications with Ultra-low Overhead

    Authors: Cong Zhou, Changsheng You, Shuo Shi, Jiasi Zhou, Chenyu Wu

    Abstract: In this paper, we propose a super-resolution wideband beam training method for near-field communications, which is able to achieve ultra-low overhead. To this end, we first study the multi-beam characteristic of a sparse uniform linear array (S-ULA) in the wideband. Interestingly, we show that this leads to a new beam pattern property, called rainbow blocks, where the S-ULA generates multiple grat…

    Submitted 2 May, 2025; v1 submitted 27 April, 2025; originally announced April 2025.

  6. arXiv:2504.04765  [pdf, other]

    eess.SY

    Multi-Agent Deep Reinforcement Learning for Multiple Anesthetics Collaborative Control

    Authors: Huijie Li, Yide Yu, Si Shi, Anmin Hu, Jian Huo, Wei Lin, Chaoran Wu, Wuman Luo

    Abstract: Automated control of personalized multiple anesthetics in clinical Total Intravenous Anesthesia (TIVA) is crucial yet challenging. Current systems, including target-controlled infusion (TCI) and closed-loop systems, either rely on relatively static pharmacokinetic/pharmacodynamic (PK/PD) models or focus on single anesthetic control, limiting personalization and collaborative control. To address th…

    Submitted 7 April, 2025; originally announced April 2025.

  7. arXiv:2503.20681  [pdf, other]

    eess.AS cs.CV cs.LG cs.SD

    Benchmarking Machine Learning Methods for Distributed Acoustic Sensing

    Authors: Shuaikai Shi, Qijun Zong

    Abstract: Distributed acoustic sensing (DAS) technology represents an innovative fiber-optic-based sensing methodology that enables real-time acoustic signal monitoring through the detection of minute perturbations along optical fibers. This sensing approach offers compelling advantages, including extensive measurement ranges, exceptional spatial resolution, and an expansive dynamic measurement spectrum.…

    Submitted 26 March, 2025; originally announced March 2025.

  8. arXiv:2503.04681  [pdf, other]

    eess.SP

    Mixed Near-field and Far-field Target Localization for Low-altitude Economy

    Authors: Cong Zhou, Changsheng You, Chao Zhou, Hongqiang Cheng, Shuo Shi

    Abstract: In this paper, we study efficient mixed near-field and far-field target localization methods for low-altitude economy, by capitalizing on extremely large-scale multiple-input multiple-output (XL-MIMO) communication systems. Compared with existing works, we address three new challenges in localization, arising from 1) half-wavelength antenna spacing constraint, 2) hybrid uniform planar array (UPA)…

    Submitted 6 March, 2025; originally announced March 2025.

    Comments: An effective mixed near-field and far-field target localization method by employing typical wireless communication infrastructures is proposed in this paper

  9. arXiv:2412.10625  [pdf, other]

    math.OC eess.SY

    Certainty-Equivalence Model Predictive Control: Stability, Performance, and Beyond

    Authors: Changrui Liu, Shengling Shi, Bart De Schutter

    Abstract: Handling model mismatch is a common challenge in model-based controller design, particularly in model predictive control (MPC). While robust MPC is effective in managing uncertainties, its conservatism often makes it less desirable in practice. Certainty-equivalence MPC (CE-MPC), which relies on a nominal model, offers an appealing alternative due to its design simplicity and low computational req…

    Submitted 28 March, 2025; v1 submitted 13 December, 2024; originally announced December 2024.

    Comments: 16 pages with some proofs omitted for brevity; simulation is included. Submitted to IEEE Transactions on Automatic Control

  10. arXiv:2412.08577  [pdf, other]

    cs.SD cs.MM eess.AS

    Mel-Refine: A Plug-and-Play Approach to Refine Mel-Spectrogram in Audio Generation

    Authors: Hongming Guo, Ruibo Fu, Yizhong Geng, Shuai Liu, Shuchen Shi, Tao Wang, Chunyu Qiang, Chenxing Li, Ya Li, Zhengqi Wen, Yukun Liu, Xuefei Liu

    Abstract: Text-to-audio (TTA) model is capable of generating diverse audio from textual prompts. However, most mainstream TTA models, which predominantly rely on Mel-spectrograms, still face challenges in producing audio with rich content. The intricate details and texture required in Mel-spectrograms for such audio often surpass the models' capacity, leading to outputs that are blurred or lack coherence. I…

    Submitted 11 December, 2024; originally announced December 2024.

  11. arXiv:2409.20019  [pdf, other]

    physics.optics eess.SP physics.app-ph

    Integrated RF Photonic Front-End Capable of Simultaneous Cascaded Functions

    Authors: Shangqing Shi, Kaixuan Ye, Chuangchuang Wei, Martijn van den Berg, Binfeng Yun, David Marpaung

    Abstract: Integrated microwave photonic (MWP) front-ends are capable of ultra-broadband signal reception and processing. However, state-of-the-art demonstrations are limited to performing only one specific functionality at any given time, which fails to meet the demands of advanced radio frequency applications in real-world electromagnetic environments. In this paper, we present a major departure from the c…

    Submitted 30 September, 2024; originally announced September 2024.

  12. arXiv:2409.11909  [pdf, other]

    cs.SD eess.AS

    Mixture of Experts Fusion for Fake Audio Detection Using Frozen wav2vec 2.0

    Authors: Zhiyong Wang, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Xiaopeng Wang, Yuankun Xie, Xin Qi, Shuchen Shi, Yi Lu, Yukun Liu, Chenxing Li, Xuefei Liu, Guanjun Li

    Abstract: Speech synthesis technology has posed a serious threat to speaker verification systems. Currently, the most effective fake audio detection methods utilize pretrained models, and integrating features from various layers of pretrained model further enhances detection performance. However, most of the previously proposed fusion methods require fine-tuning the pretrained models, resulting in exces…

    Submitted 18 September, 2024; originally announced September 2024.

    Comments: submitted to ICASSP2025

  13. arXiv:2409.11835  [pdf, other]

    cs.SD cs.AI eess.AS

    DPI-TTS: Directional Patch Interaction for Fast-Converging and Style Temporal Modeling in Text-to-Speech

    Authors: Xin Qi, Ruibo Fu, Zhengqi Wen, Tao Wang, Chunyu Qiang, Jianhua Tao, Chenxing Li, Yi Lu, Shuchen Shi, Zhiyong Wang, Xiaopeng Wang, Yuankun Xie, Yukun Liu, Xuefei Liu, Guanjun Li

    Abstract: In recent years, speech diffusion models have advanced rapidly. Alongside the widely used U-Net architecture, transformer-based models such as the Diffusion Transformer (DiT) have also gained attention. However, current DiT speech models treat Mel spectrograms as general images, which overlooks the specific acoustic properties of speech. To address these limitations, we propose a method called Dir…

    Submitted 18 September, 2024; originally announced September 2024.

    Comments: Submitted to ICASSP2025

  14. arXiv:2409.09381  [pdf, other]

    eess.AS cs.AI cs.SD

    Text Prompt is Not Enough: Sound Event Enhanced Prompt Adapter for Target Style Audio Generation

    Authors: Chenxu Xiong, Ruibo Fu, Shuchen Shi, Zhengqi Wen, Jianhua Tao, Tao Wang, Chenxing Li, Chunyu Qiang, Yuankun Xie, Xin Qi, Guanjun Li, Zizheng Yang

    Abstract: Current mainstream audio generation methods primarily rely on simple text prompts, often failing to capture the nuanced details necessary for multi-style audio generation. To address this limitation, the Sound Event Enhanced Prompt Adapter is proposed. Unlike traditional static global style transfer, this method extracts style embedding through cross-attention between text and reference audio for…

    Submitted 14 September, 2024; originally announced September 2024.

    Comments: 5 pages, 2 figures, submitted to ICASSP 2025

  15. arXiv:2409.03883  [pdf, other]

    eess.SY

    Data-informativity conditions for structured linear systems with implications for dynamic networks

    Authors: Paul M. J. Van den Hof, Shengling Shi, Stefanie J. M. Fonken, Karthik R. Ramaswamy, Håkan Hjalmarsson, Arne G. Dankers

    Abstract: When estimating models of a multivariable dynamic system, a typical condition for consistency is to require the input signals to be persistently exciting, which is guaranteed if the input spectrum is positive definite for a sufficient number of frequencies. In this paper it is investigated how such a condition can be relaxed by exploiting prior structural information on the multivariable system, s…

    Submitted 20 November, 2024; v1 submitted 5 September, 2024; originally announced September 2024.

    Comments: 16 pages, 4 figures

  16. arXiv:2408.10852  [pdf, other]

    cs.SD eess.AS

    EELE: Exploring Efficient and Extensible LoRA Integration in Emotional Text-to-Speech

    Authors: Xin Qi, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Shuchen Shi, Yi Lu, Zhiyong Wang, Xiaopeng Wang, Yuankun Xie, Yukun Liu, Guanjun Li, Xuefei Liu, Yongwei Li

    Abstract: In the current era of Artificial Intelligence Generated Content (AIGC), a Low-Rank Adaptation (LoRA) method has emerged. It uses a plugin-based approach to learn new knowledge with lower parameter quantities and computational costs, and it can be plugged in and out based on the specific sub-tasks, offering high flexibility. However, the current application schemes primarily incorporate LoRA into t…

    Submitted 20 August, 2024; originally announced August 2024.

  17. arXiv:2407.12038  [pdf, ps, other]

    eess.AS cs.AI

    ICAGC 2024: Inspirational and Convincing Audio Generation Challenge 2024

    Authors: Ruibo Fu, Rui Liu, Chunyu Qiang, Yingming Gao, Yi Lu, Shuchen Shi, Tao Wang, Ya Li, Zhengqi Wen, Chen Zhang, Hui Bu, Yukun Liu, Xin Qi, Guanjun Li

    Abstract: The Inspirational and Convincing Audio Generation Challenge 2024 (ICAGC 2024) is part of the ISCSLP 2024 Competitions and Challenges track. While current text-to-speech (TTS) technology can generate high-quality audio, its ability to convey complex emotions and controlled detail content remains limited. This constraint leads to a discrepancy between the generated audio and human subjective percept…

    Submitted 31 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

    Comments: ISCSLP 2024 Challenge description and results

  18. arXiv:2407.09972  [pdf, ps, other]

    cs.LG cs.CR eess.IV

    MedLeak: Multimodal Medical Data Leakage in Secure Federated Learning with Crafted Models

    Authors: Shanghao Shi, Md Shahedul Haque, Abhijeet Parida, Chaoyu Zhang, Marius George Linguraru, Y. Thomas Hou, Syed Muhammad Anwar, Wenjing Lou

    Abstract: Federated learning (FL) allows participants to collaboratively train machine learning models while keeping their data local, making it ideal for collaborations among healthcare institutions on sensitive data. However, in this paper, we propose a novel privacy attack called MedLeak, which allows a malicious FL server to recover high-quality site-specific private medical data from the client model u…

    Submitted 29 June, 2025; v1 submitted 13 July, 2024; originally announced July 2024.

    Comments: Accepted by the IEEE/ACM conference on Connected Health: Applications, Systems and Engineering Technologies 2025 (CHASE'25)

  19. arXiv:2407.05421  [pdf, other]

    eess.AS cs.SD

    ASRRL-TTS: Agile Speaker Representation Reinforcement Learning for Text-to-Speech Speaker Adaptation

    Authors: Ruibo Fu, Xin Qi, Zhengqi Wen, Jianhua Tao, Tao Wang, Chunyu Qiang, Zhiyong Wang, Yi Lu, Xiaopeng Wang, Shuchen Shi, Yukun Liu, Xuefei Liu, Shuai Zhang

    Abstract: Speaker adaptation, which involves cloning voices from unseen speakers in the Text-to-Speech task, has garnered significant interest due to its numerous applications in multi-media fields. Despite recent advancements, existing methods often struggle with inadequate speaker representation accuracy and overfitting, particularly in limited reference speeches scenarios. To address these challenges, we…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: The audio demo is available at https://7xin.github.io/ASRRL/

  20. arXiv:2406.17801  [pdf, other]

    cs.SD cs.CL eess.AS

    A multi-speaker multi-lingual voice cloning system based on vits2 for limmits 2024 challenge

    Authors: Xiaopeng Wang, Yi Lu, Xin Qi, Zhiyong Wang, Yuankun Xie, Shuchen Shi, Ruibo Fu

    Abstract: This paper presents the development of a speech synthesis system for the LIMMITS'24 Challenge, focusing primarily on Track 2. The objective of the challenge is to establish a multi-speaker, multi-lingual Indic Text-to-Speech system with voice cloning capabilities, covering seven Indian languages with both male and female speakers. The system was trained using challenge data and fine-tuned for few-…

    Submitted 22 June, 2024; originally announced June 2024.

  21. arXiv:2406.14931  [pdf, other]

    eess.SP

    Multi-beam Training for Near-field Communications in High-frequency Bands

    Authors: Cong Zhou, Changsheng You, Zixuan Huang, Shuo Shi, Yi Gong, Chan-Byoung Chae, Kaibin Huang

    Abstract: In this paper, we study efficient multi-beam training design for near-field communications to reduce the beam training overhead of conventional single-beam training methods. In particular, the array-division based multi-beam training method, which is widely used in far-field communications, cannot be directly applied to the near-field scenario, since different sub-arrays may observe different user…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: In this paper, a novel near-field multi-beam training scheme is proposed by sparsely activating a portion of antennas to form a sparse linear array

  22. arXiv:2406.10591  [pdf, other]

    eess.AS cs.AI cs.CV cs.MM cs.SD

    MINT: a Multi-modal Image and Narrative Text Dubbing Dataset for Foley Audio Content Planning and Generation

    Authors: Ruibo Fu, Shuchen Shi, Hongming Guo, Tao Wang, Chunyu Qiang, Zhengqi Wen, Jianhua Tao, Xin Qi, Yi Lu, Xiaopeng Wang, Zhiyong Wang, Yukun Liu, Xuefei Liu, Shuai Zhang, Guanjun Li

    Abstract: Foley audio, critical for enhancing the immersive experience in multimedia content, faces significant challenges in the AI-generated content (AIGC) landscape. Despite advancements in AIGC technologies for text and image generation, the foley audio dubbing remains rudimentary due to difficulties in cross-modal scene matching and content correlation. Current text-to-audio technology, which relies on…

    Submitted 15 June, 2024; originally announced June 2024.

  23. arXiv:2406.08112  [pdf, other]

    cs.SD cs.AI eess.AS

    Codecfake: An Initial Dataset for Detecting LLM-based Deepfake Audio

    Authors: Yi Lu, Yuankun Xie, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Zhiyong Wang, Xin Qi, Xuefei Liu, Yongwei Li, Yukun Liu, Xiaopeng Wang, Shuchen Shi

    Abstract: With the proliferation of Large Language Model (LLM) based deepfake audio, there is an urgent need for effective detection methods. Previous deepfake audio generation methods typically involve a multi-step generation process, with the final step using a vocoder to predict the waveform from handcrafted features. However, LLM-based audio is directly generated from discrete neural codecs in an end-to…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted by INTERSPEECH 2024. arXiv admin note: substantial text overlap with arXiv:2405.04880

  24. arXiv:2406.04683  [pdf, other]

    cs.SD eess.AS

    PPPR: Portable Plug-in Prompt Refiner for Text to Audio Generation

    Authors: Shuchen Shi, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Tao Wang, Chunyu Qiang, Yi Lu, Xin Qi, Xuefei Liu, Yukun Liu, Yongwei Li, Zhiyong Wang, Xiaopeng Wang

    Abstract: Text-to-Audio (TTA) aims to generate audio that corresponds to the given text description, playing a crucial role in media production. The text descriptions in TTA datasets lack rich variations and diversity, resulting in a drop in TTA model performance when faced with complex text. To address this issue, we propose a method called Portable Plug-in Prompt Refiner, which utilizes rich knowledge abo…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: accepted by INTERSPEECH2024

  25. arXiv:2406.04262  [pdf, other]

    eess.SP

    Near-field Beam Training with Sparse DFT Codebook

    Authors: Cong Zhou, Chenyu Wu, Changsheng You, Shuo Shi

    Abstract: Extremely large-scale array (XL-array) has emerged as one promising technology to improve the spectral efficiency and spatial resolution of future sixth generation (6G) wireless systems. The upsurge in the number of antennas renders communication users more likely to be located in the near-field region, which requires a more accurate spherical (instead of planar) wavefront propagation modeling…

    Submitted 18 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: In this paper, we propose a novel sparse DFT codebook to reduce near-field beam training overhead, which is equivalent to sparsely activating the dense array

  26. arXiv:2406.03247  [pdf, other]

    cs.SD eess.AS

    Genuine-Focused Learning using Mask AutoEncoder for Generalized Fake Audio Detection

    Authors: Xiaopeng Wang, Ruibo Fu, Zhengqi Wen, Zhiyong Wang, Yuankun Xie, Yukun Liu, Jianhua Tao, Xuefei Liu, Yongwei Li, Xin Qi, Yi Lu, Shuchen Shi

    Abstract: The generalization of Fake Audio Detection (FAD) is critical due to the emergence of new spoofing techniques. Traditional FAD methods often focus solely on distinguishing between genuine and known spoofed audio. We propose a Genuine-Focused Learning (GFL) framework guided, aiming for highly generalized FAD, called GFL-FAD. This method incorporates a Counterfactual Reasoning Enhanced Representation…

    Submitted 9 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted by INTERSPEECH 2024

  27. arXiv:2406.03237  [pdf, other]

    cs.SD eess.AS

    Generalized Fake Audio Detection via Deep Stable Learning

    Authors: Zhiyong Wang, Ruibo Fu, Zhengqi Wen, Yuankun Xie, Yukun Liu, Xiaopeng Wang, Xuefei Liu, Yongwei Li, Jianhua Tao, Yi Lu, Xin Qi, Shuchen Shi

    Abstract: Although current fake audio detection approaches have achieved remarkable success on specific datasets, they often fail when evaluated with datasets from different distributions. Previous studies typically address distribution shift by focusing on using extra data or applying extra loss restrictions during training. However, these methods either require a substantial amount of data or complicate t…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: accepted by INTERSPEECH2024

  28. arXiv:2405.04066  [pdf, other]

    cs.SI eess.SY

    Characterizing Regional Importance in Cities with Human Mobility Motifs in Metro Networks

    Authors: Shuyang Shi, Ding Lyu, Lin Wang, Xiaofan Wang, Guanrong Chen

    Abstract: Uncovering higher-order spatiotemporal dependencies within human mobility networks offers valuable insights into the analysis of urban structures. In most existing studies, human mobility networks are typically constructed by aggregating all trips without distinguishing who takes which specific trip. Instead, we claim individual mobility motifs, higher-order structures generated by daily trips of…

    Submitted 7 May, 2024; originally announced May 2024.

  29. arXiv:2404.13786  [pdf, other]

    eess.SY cs.AI cs.DC cs.LG

    Soar: Design and Deployment of A Smart Roadside Infrastructure System for Autonomous Driving

    Authors: Shuyao Shi, Neiwen Ling, Zhehao Jiang, Xuan Huang, Yuze He, Xiaoguang Zhao, Bufang Yang, Chen Bian, Jingfei Xia, Zhenyu Yan, Raymond Yeung, Guoliang Xing

    Abstract: Recently, smart roadside infrastructure (SRI) has demonstrated the potential of achieving fully autonomous driving systems. To explore the potential of infrastructure-assisted autonomous driving, this paper presents the design and deployment of Soar, the first end-to-end SRI system specifically designed to support autonomous driving systems. Soar consists of both software and hardware components ca…

    Submitted 21 April, 2024; originally announced April 2024.

  30. arXiv:2404.07620  [pdf, other]

    eess.IV cs.CV

    Diffusion Probabilistic Multi-cue Level Set for Reducing Edge Uncertainty in Pancreas Segmentation

    Authors: Yue Gou, Yuming Xing, Shengzhu Shi, Zhichang Guo

    Abstract: Accurately segmenting the pancreas remains a huge challenge. Traditional methods encounter difficulties in semantic localization due to the small volume and distorted structure of the pancreas, while deep learning methods encounter challenges in obtaining accurate edges because of low contrast and organ overlapping. To overcome these issues, we propose a multi-cue level set method based on the dif…

    Submitted 11 April, 2024; originally announced April 2024.

  31. arXiv:2401.05690  [pdf, other]

    cs.IT eess.SP

    Sparse Array Enabled Near-Field Communications: Beam Pattern Analysis and Hybrid Beamforming Design

    Authors: Cong Zhou, Changsheng You, Haodong Zhang, Li Chen, Shuo Shi

    Abstract: Extremely large-scale array (XL-array) has emerged as a promising technology to enable near-field communications for achieving enhanced spectrum efficiency and spatial resolution, by drastically increasing the number of antennas. However, this also inevitably incurs higher hardware and energy cost, which may not be affordable in future wireless systems. To address this issue, we propose in this pa…

    Submitted 11 January, 2024; originally announced January 2024.

    Comments: In this paper, we propose to exploit sparse arrays for enabling near-field communications and characterize its unique beam pattern for facilitating its hybrid beamforming design

  32. arXiv:2312.11255  [pdf, other]

    eess.SY

    State-action control barrier functions: Imposing safety on learning-based control with low online computational costs

    Authors: Kanghui He, Shengling Shi, Ton van den Boom, Bart De Schutter

    Abstract: Learning-based control with safety guarantees usually requires real-time safety certification and modifications of possibly unsafe learning-based policies. The control barrier function (CBF) method uses a safety filter containing a constrained optimization problem to produce safe policies. However, finding a valid CBF for a general nonlinear system requires a complex function parameterization, whi…

    Submitted 24 October, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  33. arXiv:2311.03974  [pdf, ps, other]

    cs.IT eess.SP

    NOMA Enabled Multi-Access Edge Computing: A Joint MU-MIMO Precoding and Computation Offloading Design

    Authors: Deyou Zhang, Meng Wang, Shuo Shi, Ming Xiao

    Abstract: This letter investigates computation offloading and transmit precoding co-design for multi-access edge computing (MEC), where multiple MEC users (MUs) equipped with multiple antennas access the MEC server in a non-orthogonal multiple access manner. We aim to minimize the total energy consumption of all MUs while satisfying the latency constraints by jointly optimizing the computational frequency,…

    Submitted 7 November, 2023; originally announced November 2023.

  34. arXiv:2311.02679  [pdf, other]

    eess.SY cs.LG

    Regret Analysis of Learning-Based Linear Quadratic Gaussian Control with Additive Exploration

    Authors: Archith Athrey, Othmane Mazhar, Meichen Guo, Bart De Schutter, Shengling Shi

    Abstract: In this paper, we analyze the regret incurred by a computationally efficient exploration strategy, known as naive exploration, for controlling unknown partially observable systems within the Linear Quadratic Gaussian (LQG) framework. We introduce a two-phase control algorithm called LQG-NAIVE, which involves an initial phase of injecting Gaussian input signals to obtain a system model, followed by…

    Submitted 24 November, 2023; v1 submitted 5 November, 2023; originally announced November 2023.

  35. arXiv:2310.15937  [pdf, other]

    eess.SY

    A Behavioral Perspective on Models of Linear Dynamical Networks with Manifest Variables

    Authors: Shengling Shi, Zhiyong Sun, Bart De Schutter

    Abstract: Networks of dynamical systems play an important role in various domains and have motivated many studies on the control and analysis of linear dynamical networks. For linear network models considered in these studies, it is typically pre-determined what signal channels are inputs and what are outputs. These models do not capture the practical need to incorporate different experimental situations, w…

    Submitted 5 May, 2024; v1 submitted 24 October, 2023; originally announced October 2023.

  36. arXiv:2309.00223  [pdf, other]

    eess.AS cs.CL cs.SD

    The FruitShell French synthesis system at the Blizzard 2023 Challenge

    Authors: Xin Qi, Xiaopeng Wang, Zhiyong Wang, Wang Liu, Mingming Ding, Shuchen Shi

    Abstract: This paper presents a French text-to-speech synthesis system for the Blizzard Challenge 2023. The challenge consists of two tasks: generating high-quality speech from female speakers and generating speech that closely resembles specific individuals. Regarding the competition data, we conducted a screening process to remove missing or erroneous text data. We organized all symbols except for phoneme…

    Submitted 25 September, 2024; v1 submitted 31 August, 2023; originally announced September 2023.

  37. arXiv:2307.03423  [pdf, other]

    eess.IV cs.CV cs.LG

    Hyperspectral and Multispectral Image Fusion Using the Conditional Denoising Diffusion Probabilistic Model

    Authors: Shuaikai Shi, Lijun Zhang, Jie Chen

    Abstract: Hyperspectral images (HSI) have a large amount of spectral information reflecting the characteristics of matter, while their spatial resolution is low due to the limitations of imaging technology. Complementary to this are multispectral images (MSI), e.g., RGB images, with high spatial resolution but insufficient spectral bands. Hyperspectral and multispectral image fusion is a technique for acqui…

    Submitted 7 July, 2023; originally announced July 2023.

  38. arXiv:2307.03413  [pdf, other]

    cs.CV eess.IV

    Unsupervised Hyperspectral and Multispectral Images Fusion Based on the Cycle Consistency

    Authors: Shuaikai Shi, Lijun Zhang, Yoann Altmann, Jie Chen

    Abstract: Hyperspectral images (HSI) with abundant spectral information reflected materials property usually perform low spatial resolution due to the hardware limits. Meanwhile, multispectral images (MSI), e.g., RGB images, have a high spatial resolution but deficient spectral signatures. Hyperspectral and multispectral image fusion can be cost-effective and efficient for acquiring both high spatial resolu…

    Submitted 7 July, 2023; originally announced July 2023.

  39. arXiv:2306.15723  [pdf, other]

    eess.SY

    Approximate Dynamic Programming for Constrained Piecewise Affine Systems with Stability and Safety Guarantees

    Authors: Kanghui He, Shengling Shi, Ton van den Boom, Bart De Schutter

    Abstract: Infinite-horizon optimal control of constrained piecewise affine (PWA) systems has been approximately addressed by hybrid model predictive control (MPC), which, however, has computational limitations, both in offline design and online implementation. In this paper, we consider an alternative approach based on approximate dynamic programming (ADP), an important class of methods in reinforcement lea…

    Submitted 13 December, 2024; v1 submitted 27 June, 2023; originally announced June 2023.

  40. arXiv:2305.19069  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG

    Multi-source adversarial transfer learning for ultrasound image segmentation with limited similarity

    Authors: Yifu Zhang, Hongru Li, Tao Yang, Rui Tao, Zhengyuan Liu, Shimeng Shi, Jiansong Zhang, Ning Ma, Wujin Feng, Zhanhu Zhang, Xinyu Zhang

    Abstract: Lesion segmentation of ultrasound medical images based on deep learning techniques is a widely used method for diagnosing diseases. Although there is a large amount of ultrasound image data in medical centers and other places, labeled ultrasound datasets are a scarce resource, and it is likely that no datasets are available for new tissues/organs. Transfer learning provides the possibility to solv…

    Submitted 30 May, 2023; originally announced May 2023.

    Comments: Submitted to Applied Soft Computing Journal

  41. arXiv:2305.11438  [pdf, other]

    cs.CL eess.AS

    Phonetic and Prosody-aware Self-supervised Learning Approach for Non-native Fluency Scoring

    Authors: Kaiqi Fu, Shaojun Gao, Shuju Shi, Xiaohai Tian, Wei Li, Zejun Ma

    Abstract: Speech fluency/disfluency can be evaluated by analyzing a range of phonetic and prosodic features. Deep neural networks are commonly trained to map fluency-related features into the human scores. However, the effectiveness of deep learning-based models is constrained by the limited amount of labeled training samples. To address this, we introduce a self-supervised learning (SSL) approach that take…

    Submitted 19 May, 2023; originally announced May 2023.

  42. arXiv:2305.10983  [pdf, other]

    cs.CV eess.IV

    Assessor360: Multi-sequence Network for Blind Omnidirectional Image Quality Assessment

    Authors: Tianhe Wu, Shuwei Shi, Haoming Cai, Mingdeng Cao, Jing Xiao, Yinqiang Zheng, Yujiu Yang

    Abstract: Blind Omnidirectional Image Quality Assessment (BOIQA) aims to objectively assess the human perceptual quality of omnidirectional images (ODIs) without relying on pristine-quality image information. It is becoming more significant with the increasing advancement of virtual reality (VR) technology. However, the quality assessment of ODIs is severely hampered by the fact that the existing BOIQA pipe…

    Submitted 10 October, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

  43. arXiv:2305.01871  [pdf]

    physics.med-ph eess.IV

    Convolutional neural network-based single-shot speckle tracking for x-ray phase-contrast imaging

    Authors: Serena Qinyun Z. Shi, Nadav Shapira, Peter B. Noël, Sebastian Meyer

    Abstract: X-ray phase-contrast imaging offers enhanced sensitivity for weakly-attenuating materials, such as breast and brain tissue, but has yet to be widely implemented clinically due to high coherence requirements and expensive x-ray optics. Speckle-based phase contrast imaging has been proposed as an affordable and simple alternative; however, obtaining high-quality phase-contrast images requires accura…

    Submitted 2 May, 2023; originally announced May 2023.

  44. arXiv:2302.12511  [pdf, ps, other]

    cs.IT eess.SP

    Two-Stage Hierarchical Beam Training for Near-Field Communications

    Authors: Chenyu Wu, Changsheng You, Yuanwei Liu, Li Chen, Shuo Shi

    Abstract: Extremely large-scale array (XL-array) has emerged as a promising technology to improve the spectrum efficiency and spatial resolution of future wireless systems. However, the huge number of antennas renders the users more likely to locate in the near-field (instead of the far-field) region of the XL-array with spherical wavefront propagation. This inevitably incurs prohibitively high beam trainin…

    Submitted 24 February, 2023; originally announced February 2023.

    Comments: We proposed a novel two-stage hierarchical beam training method for near-field communication systems. This paper has been submitted to IEEE for possible publication

  45. arXiv:2302.10444  [pdf, other]

    eess.AS cs.SD

    Leveraging phone-level linguistic-acoustic similarity for utterance-level pronunciation scoring

    Authors: Wei Liu, Kaiqi Fu, Xiaohai Tian, Shuju Shi, Wei Li, Zejun Ma, Tan Lee

    Abstract: Recent studies on pronunciation scoring have explored the effect of introducing phone embeddings as reference pronunciation, but mostly in an implicit manner, i.e., addition or concatenation of reference phone embedding and actual pronunciation of the target phone as the phone-level pronunciation quality representation. In this paper, we propose to use linguistic-acoustic similarity to explicitly…

    Submitted 13 March, 2023; v1 submitted 21 February, 2023; originally announced February 2023.

    Comments: Accepted by ICASSP 2023

  46. arXiv:2302.09928  [pdf, other]

    eess.AS

    An ASR-free Fluency Scoring Approach with Self-Supervised Learning

    Authors: Wei Liu, Kaiqi Fu, Xiaohai Tian, Shuju Shi, Wei Li, Zejun Ma, Tan Lee

    Abstract: A typical fluency scoring system generally relies on an automatic speech recognition (ASR) system to obtain time stamps in input speech for either the subsequent calculation of fluency-related features or directly modeling speech fluency with an end-to-end approach. This paper describes a novel ASR-free approach for automatic fluency assessment using self-supervised learning (SSL). Specifically, w…

    Submitted 13 March, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

    Comments: Accepted by ICASSP 2023

  47. arXiv:2301.07876  [pdf, ps, other]

    eess.SY cs.LG

    Suboptimality analysis of receding horizon quadratic control with unknown linear systems and its applications in learning-based control

    Authors: Shengling Shi, Anastasios Tsiamis, Bart De Schutter

    Abstract: This work analyzes how the trade-off between the modeling error, the terminal value function error, and the prediction horizon affects the performance of a nominal receding-horizon linear quadratic (LQ) controller. By developing a novel perturbation result of the Riccati difference equation, a novel performance upper bound is obtained and suggests that for many cases, the prediction horizon can be…

    Submitted 27 June, 2025; v1 submitted 18 January, 2023; originally announced January 2023.

  48. arXiv:2209.08209  [pdf, other]

    eess.SY

    RISE-Based Adaptive Control with Mass-Inertia Parameter Estimation for Aerial Transportation of Multi-Rotor UAVs

    Authors: Shuyang Shi, Yuzhu Li, Wei Dong

    Abstract: This paper proposes an adaptive tracking strategy with mass-inertia estimation for aerial transportation problems of multi-rotor UAVs. The dynamic model of multi-rotor UAVs with disturbances is firstly developed with a linearly parameterized form. Subsequently, a cascade controller with the robust integral of the sign of the error (RISE) terms is applied to smooth the control inputs and address bo…

    Submitted 16 September, 2022; originally announced September 2022.

  49. Towards V2I Age-aware Fairness Access: A DQN Based Intelligent Vehicular Node Training and Test Method

    Authors: Qiong Wu, Shuai Shi, Ziyang Wan, Qiang Fan, Pingyi Fan, Cui Zhang

    Abstract: Vehicles on the road exchange data with base station (BS) frequently through vehicle to infrastructure (V2I) communications to ensure the normal use of vehicular applications, where the IEEE 802.11 distributed coordination function (DCF) is employed to allocate a minimum contention window (MCW) for channel access. Each vehicle may change its MCW to achieve more access opportunities at the expense…

    Submitted 3 March, 2023; v1 submitted 2 August, 2022; originally announced August 2022.

    Comments: This paper has been accepted by Chinese Journal of Electronics. Simulation codes have been provided at: https://github.com/qiongwu86/Age-Fairness

  50. arXiv:2207.00792  [pdf, ps, other]

    cs.IT eess.SP

    Two-Timescale Design for STAR-RIS Aided NOMA Systems

    Authors: Chenyu Wu, Changsheng You, Yuanwei Liu, Shuo Shi, Marco Di Renzo

    Abstract: Simultaneously transmitting and reflecting reconfigurable intelligent surfaces (STAR-RISs) have emerged as a promising technology for achieving full-space coverage. Prior works on STAR-RISs mostly assumed the full and instantaneous channel state information (CSI) is available, which, however, is practically difficult to obtain due to the large number of elements. To address it, we investigate STAR…

    Submitted 2 July, 2022; originally announced July 2022.

    Comments: 30 pages, 10 figures