
Showing 1–50 of 200 results for author: Han, S

Searching in archive eess.
  1. arXiv:2507.06833  [pdf, ps, other]

    eess.SP

    Enhancing Environment Generalizability for Deep Learning-Based CSI Feedback

    Authors: Haoyu Wang, Shuangfeng Han, Xiaoyun Wang, Zhi Sun

    Abstract: Accurate and low-overhead channel state information (CSI) feedback is essential to boost the capacity of frequency division duplex (FDD) massive multiple-input multiple-output (MIMO) systems. Deep learning-based CSI feedback significantly outperforms conventional approaches. Nevertheless, current deep learning-based CSI feedback algorithms exhibit limited generalizability to unseen environments, w…

    Submitted 9 July, 2025; originally announced July 2025.

  2. arXiv:2507.03609  [pdf, ps, other]

    eess.SP

    Implicit Neural Representation of Beamforming for Continuous Aperture Array (CAPA) System

    Authors: Shiyong Chen, Jia Guo, Shengqian Han

    Abstract: In this paper, a learning-based approach for optimizing downlink beamforming in continuous aperture array (CAPA) systems is proposed, where a MIMO scenario in which both the base station (BS) and the user are equipped with CAPA is considered. As the beamforming in the CAPA system is a function that maps a coordinate on the aperture to the beamforming weight at the coordinate, a DNN called BeaINR is pr…

    Submitted 4 July, 2025; originally announced July 2025.

    Comments: 5 pages, 3 figures

  3. arXiv:2506.16741  [pdf, ps, other]

    eess.AS cs.AI

    RapFlow-TTS: Rapid and High-Fidelity Text-to-Speech with Improved Consistency Flow Matching

    Authors: Hyun Joon Park, Jeongmin Liu, Jin Sob Kim, Jeong Yeol Yang, Sung Won Han, Eunwoo Song

    Abstract: We introduce RapFlow-TTS, a rapid and high-fidelity TTS acoustic model that leverages velocity consistency constraints in flow matching (FM) training. Although ordinary differential equation (ODE)-based TTS generation achieves natural-quality speech, it typically requires a large number of generation steps, resulting in a trade-off between quality and inference speed. To address this challenge, Ra…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: Accepted at Interspeech 2025

  4. arXiv:2506.09375  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    CoLMbo: Speaker Language Model for Descriptive Profiling

    Authors: Massa Baali, Shuo Han, Syed Abdul Hannan, Purusottam Samal, Karanveer Singh, Soham Deshmukh, Rita Singh, Bhiksha Raj

    Abstract: Speaker recognition systems are often limited to classification tasks and struggle to generate detailed speaker characteristics or provide context-rich descriptions. These models primarily extract embeddings for speaker identification but fail to capture demographic attributes such as dialect, gender, and age in a structured manner. This paper introduces CoLMbo, a Speaker Language Model (SLM) that…

    Submitted 10 June, 2025; originally announced June 2025.

  5. arXiv:2506.06400  [pdf, ps, other]

    eess.IV cs.CV

    ResPF: Residual Poisson Flow for Efficient and Physically Consistent Sparse-View CT Reconstruction

    Authors: Changsheng Fang, Yongtong Liu, Bahareh Morovati, Shuo Han, Yu Shi, Li Zhou, Shuyi Fan, Hengyong Yu

    Abstract: Sparse-view computed tomography (CT) is a practical solution to reduce radiation dose, but the resulting ill-posed inverse problem poses significant challenges for accurate image reconstruction. Although deep learning and diffusion-based methods have shown promising results, they often lack physical interpretability or suffer from high computational costs due to iterative sampling starting from ra…

    Submitted 5 June, 2025; originally announced June 2025.

  6. arXiv:2506.02197  [pdf, ps, other]

    eess.IV cs.CV

    NTIRE 2025 Challenge on RAW Image Restoration and Super-Resolution

    Authors: Marcos V. Conde, Radu Timofte, Zihao Lu, Xiangyu Kong, Xiaoxia Xing, Fan Wang, Suejin Han, MinKyu Park, Tianyu Zhang, Xin Luo, Yeda Chen, Dong Liu, Li Pang, Yuhang Yang, Hongzhong Wang, Xiangyong Cao, Ruixuan Jiang, Senyan Xu, Siyuan Jiang, Xueyang Fu, Zheng-Jun Zha, Tianyu Hao, Yuhong He, Ruoqi Li, Yueqi Yang, et al. (14 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2025 RAW Image Restoration and Super-Resolution Challenge, highlighting the proposed solutions and results. New methods for RAW Restoration and Super-Resolution could be essential in modern Image Signal Processing (ISP) pipelines; however, this problem is not as explored as in the RGB domain. The goal of this challenge is twofold: (i) restore RAW images with blur and…

    Submitted 4 June, 2025; v1 submitted 2 June, 2025; originally announced June 2025.

    Comments: CVPR 2025 - New Trends in Image Restoration and Enhancement (NTIRE)

  7. arXiv:2506.01460  [pdf, ps, other]

    cs.SD eess.AS

    Few-step Adversarial Schrödinger Bridge for Generative Speech Enhancement

    Authors: Seungu Han, Sungho Lee, Juheon Lee, Kyogu Lee

    Abstract: Deep generative models have recently been employed for speech enhancement to generate perceptually valid clean speech on large-scale datasets. Several diffusion models have been proposed, and more recently, a tractable Schrödinger Bridge has been introduced to transport between the clean and noisy speech distributions. However, these models often suffer from an iterative reverse process and requir…

    Submitted 2 June, 2025; originally announced June 2025.

    Comments: Accepted to Interspeech 2025

  8. arXiv:2505.13867  [pdf, other]

    eess.SP

    Generalizable Learning for Frequency-Domain Channel Extrapolation under Distribution Shift

    Authors: Haoyu Wang, Zhi Sun, Shuangfeng Han, Xiaoyun Wang, Zhaocheng Wang

    Abstract: Frequency-domain channel extrapolation is effective in reducing pilot overhead for massive multiple-input multiple-output (MIMO) systems. Recently, deep learning (DL)-based channel extrapolators have become a promising candidate for modeling complex frequency-domain dependency. Nevertheless, current DL extrapolators fail to operate in unseen environments under distribution shift, which poses challen…

    Submitted 19 May, 2025; originally announced May 2025.

  9. arXiv:2504.19522  [pdf, other]

    eess.SP

    A Model-based DNN for Learning HMIMO Beamforming

    Authors: Shiyong Chen, Shengqian Han

    Abstract: Holographic MIMO (HMIMO) is a promising technique for large-scale MIMO systems to enhance spectral efficiency while maintaining low hardware cost and power consumption. Existing alternating optimization algorithms can effectively optimize the hybrid beamforming of HMIMO to improve the system performance, while their high computational complexity hinders real-time application. In this paper, we pro…

    Submitted 28 April, 2025; originally announced April 2025.

    Comments: 5 pages, 4 figures

    MSC Class: 68T07; 90B18; 94A05

  10. arXiv:2504.13276  [pdf, other]

    eess.SY

    Strategic Planning of Stealthy Backdoor Attacks in Markov Decision Processes

    Authors: Xinyi Wei, Shuo Han, Ahmed H. Hemida, Charles A. Kamhoua, Jie Fu

    Abstract: This paper investigates backdoor attack planning in stochastic control systems modeled as Markov Decision Processes (MDPs). In a backdoor attack, the adversary provides a control policy that behaves well in the original MDP to pass the testing phase. However, when such a policy is deployed with a trigger policy, which perturbs the system dynamics at runtime, it optimizes the attacker's objective i…

    Submitted 17 April, 2025; originally announced April 2025.

  11. arXiv:2504.04097  [pdf, other]

    eess.SY cs.RO

    Risk-Aware Robot Control in Dynamic Environments Using Belief Control Barrier Functions

    Authors: Shaohang Han, Matti Vahs, Jana Tumova

    Abstract: Ensuring safety for autonomous robots operating in dynamic environments can be challenging due to factors such as unmodeled dynamics, noisy sensor measurements, and partial observability. To account for these limitations, it is common to maintain a belief distribution over the true state. This belief could be a non-parametric, sample-based representation to capture uncertainty more flexibly. In th…

    Submitted 5 April, 2025; originally announced April 2025.

  12. arXiv:2504.00361  [pdf, ps, other]

    eess.SP

    Adaptive Radar Detection in joint Range and Azimuth based on the Hierarchical Latent Variable Model

    Authors: Linjie Yan, Chengpeng Hao, Sudan Han, Giuseppe Ricci, Zhanhao Hu, Danilo Orlando

    Abstract: This paper focuses on the design of a robust decision scheme capable of operating in target-rich scenarios with unknown signal signatures (including their range positions, angles of arrival, and number) in a background of Gaussian disturbance. To solve the problem at hand, a novel estimation procedure is conceived resorting to the expectation-maximization algorithm in conjunction with the hierarch…

    Submitted 31 March, 2025; originally announced April 2025.

  13. arXiv:2503.20490  [pdf, other]

    eess.SY

    Model Predictive Control for Tracking Bounded References With Arbitrary Dynamics

    Authors: Shibo Han, Bonan Hou, Yuhao Zhang, Xiaotong Shi, Xingwei Zhao

    Abstract: In this article, a model predictive control (MPC) method is proposed for constrained linear systems to track bounded references with arbitrary dynamics. Besides the control inputs to be determined, an artificial reference is introduced as an additional decision variable, which serves as an intermediate target to cope with sudden changes of the reference and enlarges the domain of attraction. The cost function penalizes…

    Submitted 26 March, 2025; originally announced March 2025.

  14. arXiv:2503.12308  [pdf]

    eess.SP

    AI-driven 6G Air Interface: Technical Usage Scenarios and Balanced Design Methodology

    Authors: Xiaoyun Wang, Shuangfeng Han, Zhiming Liu, Qixing Wang, Jiangzhou Wang, Chih-Lin I

    Abstract: This paper systematically analyzes the typical application scenarios and key technical challenges of AI in 6G air interface transmission, covering important areas such as performance enhancement of single functional modules, joint optimization of multiple functional modules, and low-complexity solutions to complex mathematical problems. Innovatively, a three-dimensional joint optimization design c…

    Submitted 15 March, 2025; originally announced March 2025.

    Comments: 19 pages, in Chinese language, 1 figure, 20 references

  15. arXiv:2503.11787  [pdf, ps, other]

    cs.CV eess.IV

    ECLARE: Efficient cross-planar learning for anisotropic resolution enhancement

    Authors: Samuel W. Remedios, Shuwen Wei, Shuo Han, Jinwei Zhang, Aaron Carass, Kurt G. Schilling, Dzung L. Pham, Jerry L. Prince, Blake E. Dewey

    Abstract: In clinical imaging, magnetic resonance (MR) image volumes are often acquired as stacks of 2D slices with decreased scan times, improved signal-to-noise ratio, and image contrasts unique to 2D MR pulse sequences. While this is sufficient for clinical evaluation, automated algorithms designed for 3D analysis perform poorly on multi-slice 2D MR volumes, especially those with thick slices and gaps be…

    Submitted 21 May, 2025; v1 submitted 14 March, 2025; originally announced March 2025.

  16. arXiv:2503.09398  [pdf, other]

    eess.SP cs.LG eess.SY math.GR

    Precoder Learning by Leveraging Unitary Equivariance Property

    Authors: Yilun Ge, Shuyao Liao, Shengqian Han, Chenyang Yang

    Abstract: Incorporating mathematical properties of a wireless policy to be learned into the design of deep neural networks (DNNs) is effective for enhancing learning efficiency. Multi-user precoding policy in multi-antenna system, which is the mapping from channel matrix to precoding matrix, possesses a permutation equivariance property, which has been harnessed to design the parameter sharing structure of…

    Submitted 12 March, 2025; originally announced March 2025.

  17. arXiv:2503.08125  [pdf, other]

    eess.SP

    Quantization Design for Deep Learning-Based CSI Feedback

    Authors: Manru Yin, Shengqian Han, Chenyang Yang

    Abstract: Deep learning-based autoencoders have been employed to compress and reconstruct channel state information (CSI) in frequency-division duplex systems. Practical implementations require judicious quantization of encoder outputs for digital transmission. In this paper, we propose a novel quantization module with bit allocation among encoder outputs and develop a method for jointly training the module a…

    Submitted 11 March, 2025; originally announced March 2025.

  18. arXiv:2503.06875  [pdf, other]

    eess.SP

    Distributed Resource Block Allocation for Wideband Cell-free System

    Authors: Yang Ma, Shengqian Han, Chenyang Yang

    Abstract: This paper studies distributed resource block (RB) allocation in wideband orthogonal frequency-division multiplexing (OFDM) cell-free systems. We propose a novel distributed sequential algorithm and its two variants, which optimize RB allocation based on the information obtained through over-the-air (OTA) transmissions between access points (APs) and user equipments, enabling local decision update…

    Submitted 9 March, 2025; originally announced March 2025.

  19. arXiv:2503.06638  [pdf, other]

    eess.SP

    Learning of Uplink Resource Allocation with Multiuser QoS Constraints

    Authors: Manru Yin, Shengqian Han, Chenyang Yang

    Abstract: In this paper, the joint optimization of uplink multiuser power and resource block (RB) allocation is studied, where each user has quality of service (QoS) constraints on both long- and short-blocklength transmissions. The objective is to minimize the consumption of RBs for meeting the QoS requirements, leading to a mixed-integer nonlinear programming (MINLP) problem. We resort to deep learning to…

    Submitted 9 March, 2025; originally announced March 2025.

  20. arXiv:2503.06077  [pdf, other]

    eess.SP

    Gradient-Driven Graph Neural Networks for Learning Digital and Hybrid Precoder

    Authors: Lin Zhang, Shengqian Han, Chenyang Yang

    Abstract: The optimization of multi-user multi-input multi-output (MU-MIMO) precoders is a widely recognized challenging problem. Existing work has demonstrated the potential of graph neural networks (GNNs) in learning precoding policies. However, existing GNNs often exhibit poor generalizability for the numbers of users or antennas. In this paper, we develop a gradient-driven GNN design method for the lear…

    Submitted 8 March, 2025; originally announced March 2025.

  21. arXiv:2503.04497  [pdf, other]

    eess.SP cs.LG

    Precoder Learning for Weighted Sum Rate Maximization

    Authors: Mingyu Deng, Shengqian Han

    Abstract: Weighted sum rate maximization (WSRM) for precoder optimization effectively balances performance and fairness among users. Recent studies have demonstrated the potential of deep learning in precoder optimization for sum rate maximization. However, the WSRM problem necessitates a redesign of neural network architectures to incorporate user weights into the input. In this paper, we propose a novel d…

    Submitted 6 March, 2025; originally announced March 2025.

  22. arXiv:2503.04233  [pdf, other]

    eess.SP

    Learning Wideband User Scheduling and Hybrid Precoding with Graph Neural Networks

    Authors: Shengjie Liu, Chenyang Yang, Shengqian Han

    Abstract: Spatial-frequency scheduling and hybrid precoding in wideband multi-user multi-antenna systems have never been learned jointly due to the challenges arising from the massive user combinations on resource blocks (RBs) and the shared analog precoder among RBs. In this paper, we strive to jointly learn the scheduling and precoding policies with graph neural networks (GNNs), which have emerged as a po…

    Submitted 6 March, 2025; originally announced March 2025.

  23. arXiv:2502.20311  [pdf, other]

    cs.LG cs.SD eess.AS

    Adapting Automatic Speech Recognition for Accented Air Traffic Control Communications

    Authors: Marcus Yu Zhe Wee, Justin Juin Hng Wong, Lynus Lim, Joe Yu Wei Tan, Prannaya Gupta, Dillion Lim, En Hao Tew, Aloysius Keng Siew Han, Yong Zhi Lim

    Abstract: Effective communication in Air Traffic Control (ATC) is critical to maintaining aviation safety, yet the challenges posed by accented English remain largely unaddressed in Automatic Speech Recognition (ASR) systems. Existing models struggle with transcription accuracy for Southeast Asian-accented (SEA-accented) speech, particularly in noisy ATC environments. This study presents the development of…

    Submitted 27 February, 2025; originally announced February 2025.

  24. arXiv:2502.18777  [pdf, other]

    eess.IV

    Hyperspectral image reconstruction by deep learning with super-Rayleigh speckles

    Authors: Ziyan Chen, Zhentao Liu, Jianrong Wu, Shensheng Han

    Abstract: Ghost imaging via sparsity constraints (GISC) spectral camera modulates the three-dimensional (3D) hyperspectral image into a two-dimensional (2D) compressive image with speckles in a single shot. It obtains a 3D hyperspectral image (HSI) by reconstruction algorithms. The rapid development of deep learning has provided a new method for 3D HSI reconstruction. Moreover, the imaging performance of th…

    Submitted 25 February, 2025; originally announced February 2025.

  25. arXiv:2502.17502  [pdf, ps, other]

    cs.OH cs.NI eess.SY

    Complex Electromagnetic Space Combat System-of-systems Modeling and Key Node Identification Method

    Authors: Xiao Liu, Sudan Han, Jinlin Peng

    Abstract: With the application of advanced science and technology in the military field, modern warfare has developed into a confrontation between systems. The combat system-of-systems (CSoS) has numerous nodes, multiple attributes and complex interactions, and its research and analysis are facing great difficulties. Electromagnetic space is an important dimension of modern warfare. Modeling and analyzing t…

    Submitted 21 February, 2025; originally announced February 2025.

    Comments: Conference paper, already accepted but not published

  26. arXiv:2502.07065  [pdf, other]

    eess.SY

    Active Inference through Incentive Design in Markov Decision Processes

    Authors: Xinyi Wei, Chongyang Shi, Shuo Han, Ahmed H. Hemida, Charles A. Kamhoua, Jie Fu

    Abstract: We present a method for active inference with partial observations in stochastic systems through incentive design, also known as the leader-follower game. Consider a leader agent who aims to infer a follower agent's type given a finite set of possible types. Different types of followers differ in either the dynamical model, the reward function, or both. We assume the leader can partially observe a…

    Submitted 10 February, 2025; originally announced February 2025.

    Comments: 8 pages

  27. arXiv:2502.04476  [pdf, other]

    cs.SD cs.AI eess.AS

    ADIFF: Explaining audio difference using natural language

    Authors: Soham Deshmukh, Shuo Han, Rita Singh, Bhiksha Raj

    Abstract: Understanding and explaining differences between audio recordings is crucial for fields like audio forensics, quality assessment, and audio generation. This involves identifying and describing audio events, acoustic scenes, signal characteristics, and their emotional impact on listeners. This paper stands out as the first work to comprehensively study the task of explaining audio differences and t…

    Submitted 6 February, 2025; originally announced February 2025.

    Comments: Accepted at ICLR 2025. Dataset and checkpoints are available at: https://github.com/soham97/ADIFF

  28. arXiv:2501.15116  [pdf, other]

    eess.SP

    Path Evolution Model for Endogenous Channel Digital Twin towards 6G Wireless Networks

    Authors: Haoyu Wang, Zhi Sun, Shuangfeng Han, Xiaoyun Wang, Shidong Zhou, Zhaocheng Wang

    Abstract: Massive Multiple Input Multiple Output (MIMO) is critical for boosting 6G wireless network capacity. Nevertheless, high-dimensional Channel State Information (CSI) acquisition becomes the bottleneck of 6G massive MIMO systems. Recently, Channel Digital Twin (CDT), which replicates physical entities in wireless channels, has been proposed, providing site-specific prior knowledge for CSI acquisition.…

    Submitted 25 January, 2025; originally announced January 2025.

  29. arXiv:2501.09877  [pdf, other]

    eess.AS cs.LG

    CLAP-S: Support Set Based Adaptation for Downstream Fiber-optic Acoustic Recognition

    Authors: Jingchen Sun, Shaobo Han, Wataru Kohno, Changyou Chen

    Abstract: Contrastive Language-Audio Pretraining (CLAP) models have demonstrated unprecedented performance in various acoustic signal recognition tasks. Fiber-optic-based acoustic recognition is one of the most important downstream tasks and plays a significant role in environmental sensing. Adapting CLAP for fiber-optic acoustic recognition has become an active research area. As a non-conventional acoustic…

    Submitted 16 January, 2025; originally announced January 2025.

    Comments: Accepted to ICASSP 2025

  30. arXiv:2501.06176  [pdf, other]

    cs.NI eess.SP

    GR-WiFi: A GNU Radio based WiFi Platform with Single-User and Multi-User MIMO Capability

    Authors: Natong Lin, Zelin Yun, Shengli Zhou, Song Han

    Abstract: Since its first release, WiFi has been highly successful in providing wireless local area networks. The ever-evolving IEEE 802.11 standards continue to add new features to keep up with the trend of increasing numbers of mobile devices and the growth of Internet of Things (IoT) applications. Unfortunately, the lack of open-source IEEE 802.11 testbeds in the community limits the development and perf…

    Submitted 10 January, 2025; originally announced January 2025.

    Comments: 11 pages, 18 figures

  31. arXiv:2501.02572  [pdf, other]

    cs.NI cs.AI eess.SY

    Energy Optimization of Multi-task DNN Inference in MEC-assisted XR Devices: A Lyapunov-Guided Reinforcement Learning Approach

    Authors: Yanzan Sun, Jiacheng Qiu, Guangjin Pan, Shugong Xu, Shunqing Zhang, Xiaoyun Wang, Shuangfeng Han

    Abstract: Extended reality (XR), blending virtual and real worlds, is a key application of future networks. While AI advancements enhance XR capabilities, they also impose significant computational and energy challenges on lightweight XR devices. In this paper, we developed a distributed queue model for multi-task DNN inference, addressing issues of resource competition and queue coupling. In response to th…

    Submitted 5 January, 2025; originally announced January 2025.

    Comments: 13 pages, 7 figures. This work has been submitted to the IEEE for possible publication

  32. arXiv:2501.00842  [pdf, other]

    cs.CR eess.IV eess.SP

    A Survey of Secure Semantic Communications

    Authors: Rui Meng, Song Gao, Dayu Fan, Haixiao Gao, Yining Wang, Xiaodong Xu, Bizhu Wang, Suyu Lv, Zhidi Zhang, Mengying Sun, Shujun Han, Chen Dong, Xiaofeng Tao, Ping Zhang

    Abstract: Semantic communication (SemCom) is regarded as a promising and revolutionary technology in 6G, aiming to transcend the constraints of "Shannon's trap" by filtering out redundant information and extracting the core of effective data. Compared to traditional communication paradigms, SemCom offers several notable advantages, such as reducing the burden on data transmission, enhancing network managem…

    Submitted 26 March, 2025; v1 submitted 1 January, 2025; originally announced January 2025.

    Comments: 160 pages, 27 figures

  33. arXiv:2412.05322  [pdf, other]

    eess.IV cs.AI cs.CV

    $ρ$-NeRF: Leveraging Attenuation Priors in Neural Radiance Field for 3D Computed Tomography Reconstruction

    Authors: Li Zhou, Changsheng Fang, Bahareh Morovati, Yongtong Liu, Shuo Han, Yongshun Xu, Hengyong Yu

    Abstract: This paper introduces $ρ$-NeRF, a self-supervised approach that sets a new standard in novel view synthesis (NVS) and computed tomography (CT) reconstruction by modeling a continuous volumetric radiance field enriched with physics-based attenuation priors. The $ρ$-NeRF represents a three-dimensional (3D) volume through a fully-connected neural network that takes a single continuous four-dimensiona…

    Submitted 3 December, 2024; originally announced December 2024.

    Comments: The paper was submitted to CVPR 2025

  34. arXiv:2412.02985  [pdf, other]

    eess.SY

    Robust Model Predictive Control for Constrained Uncertain Systems Based on Concentric Container and Varying Tube

    Authors: Shibo Han, Yuhao Zhang, Xiaotong Shi, Xingwei Zhao

    Abstract: This paper proposes a novel robust model predictive control (RMPC) method for the stabilization of constrained systems subject to additive disturbance (AD) and multiplicative disturbance (MD). Concentric containers are introduced to facilitate the characterization of MD, and varying tubes are constructed to bound reachable states. By restricting states and the corresponding inputs in containers wi…

    Submitted 3 December, 2024; originally announced December 2024.

    Comments: 13 pages, 6 figures

  35. arXiv:2411.12776  [pdf, other]

    eess.IV cs.CR cs.MM

    Cross-Layer Encrypted Semantic Communication Framework for Panoramic Video Transmission

    Authors: Haixiao Gao, Mengying Sun, Xiaodong Xu, Bingxuan Xu, Shujun Han, Bizhu Wang, Sheng Jiang, Chen Dong, Ping Zhang

    Abstract: In this paper, we propose a cross-layer encrypted semantic communication (CLESC) framework for panoramic video transmission, incorporating feature extraction, encoding, encryption, cyclic redundancy check (CRC), and retransmission processes to achieve compatibility between semantic communication and traditional communication systems. Additionally, we propose an adaptive cross-layer transmission me…

    Submitted 19 November, 2024; originally announced November 2024.

  36. arXiv:2411.09906  [pdf, other]

    cs.CR eess.SY

    A Survey of Machine Learning-based Physical-Layer Authentication in Wireless Communications

    Authors: Rui Meng, Bingxuan Xu, Xiaodong Xu, Mengying Sun, Bizhu Wang, Shujun Han, Suyu Lv, Ping Zhang

    Abstract: To ensure secure and reliable communication in wireless systems, authenticating the identities of numerous nodes is imperative. Traditional cryptography-based authentication methods suffer from issues such as low compatibility, reliability, and high complexity. Physical-Layer Authentication (PLA) is emerging as a promising complement due to its exploitation of unique properties in wireless environ…

    Submitted 3 December, 2024; v1 submitted 14 November, 2024; originally announced November 2024.

    Comments: 111 pages, 9 figures

  37. arXiv:2411.04833  [pdf, other]

    eess.SY

    Finding Control Invariant Sets via Lipschitz Constants of Linear Programs

    Authors: Matti Vahs, Shaohang Han, Jana Tumova

    Abstract: Control invariant sets play an important role in safety-critical control and find broad application in numerous fields such as obstacle avoidance for mobile robots. However, finding valid control invariant sets of dynamical systems under input limitations is notoriously difficult. We present an approach to safely expand an initial set while always guaranteeing that the set is control invariant. Sp…

    Submitted 7 November, 2024; originally announced November 2024.

  38. arXiv:2410.12160  [pdf, other]

    cs.LG eess.SY

    When to Trust Your Data: Enhancing Dyna-Style Model-Based Reinforcement Learning With Data Filter

    Authors: Yansong Li, Zeyu Dong, Ertai Luo, Yu Wu, Shuo Wu, Shuo Han

    Abstract: Reinforcement learning (RL) algorithms can be divided into two classes: model-free algorithms, which are sample-inefficient, and model-based algorithms, which suffer from model bias. Dyna-style algorithms combine these two approaches by using simulated data from an estimated environmental model to accelerate model-free training. However, their efficiency is compromised when the estimated model is…

    Submitted 15 October, 2024; originally announced October 2024.

  39. arXiv:2410.10758  [pdf]

    eess.SP cs.AI

    Arrhythmia Classification Using Graph Neural Networks Based on Correlation Matrix

    Authors: Seungwoo Han

    Abstract: With advancements in graph neural networks, there has been increasing interest in applying them to ECG signal analysis. In this study, we generated an adjacency matrix using the correlation matrix of extracted features and applied a graph neural network to classify arrhythmias. The proposed model was compared with existing approaches from the literature. The results demonstrated that precis…

    Submitted 10 February, 2025; v1 submitted 14 October, 2024; originally announced October 2024.

    Comments: Corrected typos

  40. arXiv:2410.06767  [pdf, ps, other]

    cs.IT eess.SP

    On the Achievable Error Rate Performance of Pilot-Aided Simultaneous Communication and Localisation

    Authors: Shuaishuai Han, Emad Alsusa, Mohammad Ahmad Al-Jarrah, Mahmoud AlaaEldin

    Abstract: This paper investigates the symbol error rate (SER) performance of the pilot-aided simultaneous communication and localisation (PASCAL) system. A scenario where multiple drones transmit communication signals to a base station (BS), which needs to simultaneously decode the signals and continuously locate the drones' positions during the communication session, is considered. The BS operates in two s…

    Submitted 26 November, 2024; v1 submitted 9 October, 2024; originally announced October 2024.

    Comments: 13 pages, 10 figures

  41. arXiv:2409.16439  [pdf, other]

    eess.SY

    Active Perception with Initial-State Uncertainty: A Policy Gradient Method

    Authors: Chongyang Shi, Shuo Han, Michael Dorothy, Jie Fu

    Abstract: This paper studies the synthesis of an active perception policy that maximizes the information leakage of the initial state in a stochastic system modeled as a hidden Markov model (HMM). Specifically, the emission function of the HMM is controllable with a set of perception or sensor query actions. Given the goal is to infer the initial state from partial observations in the HMM, we use Shannon co…

    Submitted 24 September, 2024; originally announced September 2024.

  42. arXiv:2409.15596  [pdf, other]

    eess.SP

    Computational Ghost Imaging with Low-Density Parity-Check Code

    Authors: Shuang Liu, Yunkai Hu, Jinquan Qi, Shensheng Han, Zihuai Lin

    Abstract: Ghost imaging (GI) is a high-resolution imaging technology that has been a subject of interest to many fields in the past 20 years. Most GI researchers focus on the reconstruction of signal under-sampling; nevertheless, how to use information redundancy to improve the result's belief in a complex environment has hardly been studied. Motivated by this, we propose a computational GI system based on…

    Submitted 23 September, 2024; originally announced September 2024.

  43. arXiv:2409.15353  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Contextualization of ASR with LLM using phonetic retrieval-based augmentation

    Authors: Zhihong Lei, Xingyu Na, Mingbin Xu, Ernest Pusateri, Christophe Van Gysel, Yuanyuan Zhang, Shiyi Han, Zhen Huang

    Abstract: Large language models (LLMs) have shown superb capability of modeling multimodal signals including audio and text, allowing the model to generate spoken or textual response given a speech input. However, it remains a challenge for the model to recognize personal named entities, such as contacts in a phone book, when the input modality is speech. In this work, we start with a speech recognition tas…

    Submitted 11 September, 2024; originally announced September 2024.

  44. arXiv:2409.07770  [pdf, other]

    eess.AS cs.AI

    Universal Pooling Method of Multi-layer Features from Pretrained Models for Speaker Verification

    Authors: Jin Sob Kim, Hyun Joon Park, Wooseok Shin, Sung Won Han

    Abstract: Recent advancements in automatic speaker verification (ASV) studies have been achieved by leveraging large-scale pretrained networks. In this study, we analyze the approaches toward such a paradigm and underline the significance of interlayer information processing as a result. Accordingly, we present a novel approach for exploiting the multilayered nature of pretrained models for ASV, which compr…

    Submitted 12 September, 2024; originally announced September 2024.

    Comments: Preprint

  45. arXiv:2409.07467  [pdf, other]

    cs.SD cs.MM eess.AS

    Flexible Control in Symbolic Music Generation via Musical Metadata

    Authors: Sangjun Han, Jiwon Ham, Chaeeun Lee, Heejin Kim, Soojong Do, Sihyuk Yi, Jun Seo, Seoyoon Kim, Yountae Jung, Woohyung Lim

    Abstract: In this work, we introduce the demonstration of symbolic music generation, focusing on providing short musical motifs that serve as the central theme of the narrative. For the generation, we adopt an autoregressive model which takes musical metadata as inputs and generates 4 bars of multitrack MIDI sequences. During training, we randomly drop tokens from the musical metadata to guarantee flexible…

    Submitted 28 August, 2024; originally announced September 2024.

  46. arXiv:2409.06137  [pdf, other]

    eess.AS cs.SD eess.SP

    DeWinder: Single-Channel Wind Noise Reduction using Ultrasound Sensing

    Authors: Kuang Yuan, Shuo Han, Swarun Kumar, Bhiksha Raj

    Abstract: The quality of audio recordings in outdoor environments is often degraded by the presence of wind. Mitigating the impact of wind noise on the perceptual quality of single-channel speech remains a significant challenge due to its non-stationary characteristics. Prior work in noise suppression treats wind noise as a general background noise without explicit modeling of its characteristics. In this p…

    Submitted 9 September, 2024; originally announced September 2024.

  47. arXiv:2409.01465  [pdf, other]

    eess.SY math.OC

    Terminal Soft Landing Guidance Law Using Analytic Gravity Turn Trajectory

    Authors: Seungyeop Han, Byeong-Un Jo, Koki Ho

    Abstract: This paper presents an innovative terminal landing guidance law that utilizes an analytic solution derived from the gravity turn trajectory. The characteristics of the derived solution are thoroughly investigated, and the solution is employed to generate a reference velocity vector that satisfies terminal landing conditions. A nonlinear control law is applied to effectively track the reference vel…

    Submitted 2 September, 2024; originally announced September 2024.

    Journal ref: Journal of Guidance, Control, and Dynamics, 47(6), 2024, 1-14

  48. arXiv:2409.00986  [pdf, other]

    cs.CV cs.CL eess.AS eess.IV

    Personalized Lip Reading: Adapting to Your Unique Lip Movements with Vision and Language

    Authors: Jeong Hun Yeo, Chae Won Kim, Hyunjun Kim, Hyeongseop Rha, Seunghee Han, Wen-Huang Cheng, Yong Man Ro

    Abstract: Lip reading aims to predict spoken language by analyzing lip movements. Despite advancements in lip reading technologies, performance degrades when models are applied to unseen speakers due to their sensitivity to variations in visual information such as lip appearances. To address this challenge, speaker adaptive lip reading technologies have advanced by focusing on effectively adapting a lip rea…

    Submitted 1 January, 2025; v1 submitted 2 September, 2024; originally announced September 2024.

    Comments: Code available: https://github.com/JeongHun0716/Personalized-Lip-Reading

  49. arXiv:2408.15789  [pdf, other]

    eess.SY

    A Stochastic Robust Adaptive Systems Level Approach to Stabilizing Large-Scale Uncertain Markovian Jump Linear Systems

    Authors: SooJean Han, Minwoo M. Kim, Ieun Choo

    Abstract: We propose a unified framework for robustly and adaptively stabilizing large-scale networked uncertain Markovian jump linear systems (MJLS) under external disturbances and mode switches that can change the network's topology. Adaptation is achieved by using minimal information on the disturbance to identify modes that are consistent with observable data. Robust control is achieved by extending the…

    Submitted 28 August, 2024; originally announced August 2024.

    Comments: Full version of paper accepted to the 63rd IEEE Conference on Decision and Control (CDC) 2024

  50. arXiv:2408.10235  [pdf, other]

    eess.SP cs.HC cs.LG

    Multi-Source EEG Emotion Recognition via Dynamic Contrastive Domain Adaptation

    Authors: Yun Xiao, Yimeng Zhang, Xiaopeng Peng, Shuzheng Han, Xia Zheng, Dingyi Fang, Xiaojiang Chen

    Abstract: Electroencephalography (EEG) provides reliable indications of human cognition and mental states. Accurate emotion recognition from EEG remains challenging due to signal variations among individuals and across measurement sessions. We introduce a multi-source dynamic contrastive domain adaptation method (MS-DCDA) based on differential entropy (DE) features, in which coarse-grained inter-domain and…

    Submitted 23 December, 2024; v1 submitted 3 August, 2024; originally announced August 2024.

    Journal ref: Biomedical Signal Processing and Control, vol. 102, p. 107337, Apr. 2025