-
FundaQ-8: A Clinically-Inspired Scoring Framework for Automated Fundus Image Quality Assessment
Authors:
Lee Qi Zun,
Oscar Wong Jin Hao,
Nor Anita Binti Che Omar,
Zalifa Zakiah Binti Asnir,
Mohamad Sabri bin Sinal Zainal,
Goh Man Fye
Abstract:
Automated fundus image quality assessment (FIQA) remains a challenge due to variations in image acquisition and subjective expert evaluations. We introduce FundaQ-8, a novel expert-validated framework for systematically assessing fundus image quality using eight critical parameters, including field coverage, anatomical visibility, illumination, and image artifacts. Using FundaQ-8 as a structured scoring reference, we develop a ResNet18-based regression model to predict continuous quality scores in the 0 to 1 range. The model is trained on 1800 fundus images from real-world clinical sources and Kaggle datasets, using transfer learning, mean squared error optimization, and standardized preprocessing. Validation against the EyeQ dataset and statistical analyses confirm the framework's reliability and clinical interpretability. Incorporating FundaQ-8 into deep learning models for diabetic retinopathy grading also improves diagnostic robustness, highlighting the value of quality-aware training in real-world screening applications.
Submitted 25 June, 2025;
originally announced June 2025.
-
Intelligent Reflecting Surfaces for THz Communications: Fundamentals, Key Solutions, and System Prototyping
Authors:
Qingqing Wu,
Yanze Zhu,
Qiaoyan Peng,
Wanming Hao,
Yanzhao Hou,
Fengyuan Yang,
Wencai Yan,
Guoning Wang,
Wen Chen,
Chi Qiu
Abstract:
Intelligent reflecting surfaces (IRSs) have emerged as a cost-effective technology for terahertz (THz) communications by enabling programmable control of the wireless environment. This paper provides a comprehensive overview of IRSs-aided THz communications, covering hardware designs, advanced signal processing techniques, and practical deployment strategies. It first examines key THz reconfigurable metasurface architectures, including electronic, optical, phase-change material, and micro-electromechanical systems (MEMS)-based implementations, highlighting their reconfiguration mechanisms and challenges. Then, fundamental effects including near field and beam squint in wideband THz systems are analyzed, along with their impacts on system performance. The paper further explores conventional and beam-squint-assisted channel estimation methods, innovative beam management strategies, and deployment considerations across large- and small-scale scenarios. Practical experiments at 220 gigahertz (GHz) validate the effectiveness of IRS in improving signal strength and communication reliability for both single-user and multi-user setups.
Submitted 20 June, 2025;
originally announced June 2025.
-
Recent Advances in Near-Field Beam Training and Channel Estimation for XL-MIMO Systems
Authors:
Ming Zeng,
Ji Wang,
Xingwang Li,
Wanming Hao,
Zheng Chu,
Wenwu Xie,
Xianbin Wang,
Quoc-Viet Pham
Abstract:
Extremely large-scale multiple-input multiple-output (XL-MIMO) is a key technology for next-generation wireless communication systems. By deploying significantly more antennas than conventional massive MIMO systems, XL-MIMO promises substantial improvements in spectral efficiency. However, due to the drastically increased array size, the conventional planar wave channel model is no longer accurate, necessitating a transition to a near-field spherical wave model. This shift challenges traditional beam training and channel estimation methods, which were designed for planar wave propagation. In this article, we present a comprehensive review of state-of-the-art beam training and channel estimation techniques for XL-MIMO systems. We analyze the fundamental principles, key methodologies, and recent advancements in this area, highlighting their respective strengths and limitations in addressing the challenges posed by the near-field propagation environment. Furthermore, we explore open research challenges that remain unresolved to provide valuable insights for researchers and engineers working toward the development of next-generation XL-MIMO communication systems.
Submitted 7 April, 2025;
originally announced April 2025.
-
A Distributed Deep Koopman Learning Algorithm for Control
Authors:
Wenjian Hao,
Zehui Lu,
Devesh Upadhyay,
Shaoshuai Mou
Abstract:
This paper proposes a distributed data-driven framework, named distributed deep Koopman learning for control (DDKC), to address the challenge of learning dynamics from a large amount of training data for optimal control purposes. Given a system state-input trajectory and a multi-agent system (MAS), the key idea of DDKC is to assign each agent in the MAS an offline partial trajectory; each agent then approximates the unknown dynamics linearly, relying on a deep neural network (DNN) and Koopman operator theory, and communicates with the other agents to reach a consensus on the approximated dynamics across the MAS. Simulations on a surface vehicle first show that the proposed method achieves consensus on the learned dynamics and that the dynamics learned by each agent achieve reasonably small estimation errors over the testing data. Furthermore, simulations combined with model predictive control (MPC) to drive the surface vehicle in goal-tracking and station-keeping tasks demonstrate that the dynamics learned by DDKC are precise enough to be used for optimal control design.
Submitted 10 December, 2024;
originally announced December 2024.
-
I Can Hear You: Selective Robust Training for Deepfake Audio Detection
Authors:
Zirui Zhang,
Wei Hao,
Aroon Sankoh,
William Lin,
Emanuel Mendiola-Ortiz,
Junfeng Yang,
Chengzhi Mao
Abstract:
Recent advances in AI-generated voices have intensified the challenge of detecting deepfake audio, posing risks for scams and the spread of disinformation. To tackle this issue, we establish the largest public voice dataset to date, named DeepFakeVox-HQ, comprising 1.3 million samples, including 270,000 high-quality deepfake samples from 14 diverse sources. Despite previously reported high accuracy, existing deepfake voice detectors struggle with our diversely collected dataset, and their detection success rates drop even further under realistic corruptions and adversarial attacks. We conduct a holistic investigation into factors that enhance model robustness and show that incorporating a diversified set of voice augmentations is beneficial. Moreover, we find that the best detection models often rely on high-frequency features, which are imperceptible to humans and can be easily manipulated by an attacker. To address this, we propose the F-SAT: Frequency-Selective Adversarial Training method focusing on high-frequency components. Empirical results demonstrate that using our training dataset boosts baseline model performance (without robust training) by 33%, and our robust training further improves accuracy by 7.7% on clean samples and by 29.3% on corrupted and attacked samples, over the state-of-the-art RawNet3 model.
Submitted 31 October, 2024;
originally announced November 2024.
-
Beamforming Design for Intelligent Reflecting Surface Aided Near-Field THz Communications
Authors:
Chi Qiu,
Qingqing Wu,
Wen Chen,
Meng Hua,
Wanming Hao,
Mengnan Jian,
Fen Hou
Abstract:
Intelligent reflecting surface (IRS) operating in the terahertz (THz) band has recently gained considerable interest due to its high spectrum bandwidth. Because large-scale IRSs are employed, there is a high probability that the transceivers will be situated within the near-field region of the IRS. Thus, the near-field beam split effect poses a major challenge for the design of wideband IRS beamforming: it causes the radiation beam to deviate from its intended location, leading to significant gain losses and limiting the efficient use of available bandwidths. While delay-based IRS has emerged as a potential solution, current beamforming schemes generally assume time delays (TDs) of unbounded range. In this letter, we first investigate the near-field beam split issue at the IRS. Then, we extend the piece-wise far-field model to the IRS and, based on it, propose a double-layer delta-delay (DLDD) IRS beamforming scheme. Specifically, we employ an element-grouping strategy, and the TD imposed on each sub-surface of the IRS is achieved by a series of TD modules. This method significantly reduces the required range of TDs. Numerical results show that the proposed DLDD IRS beamforming scheme can effectively mitigate the near-field beam split and achieve near-optimal performance.
Submitted 27 May, 2025; v1 submitted 10 October, 2024;
originally announced October 2024.
-
Distributed Deep Koopman Learning for Nonlinear Dynamics
Authors:
Wenjian Hao,
Lili Wang,
Ayush Rai,
Shaoshuai Mou
Abstract:
Koopman operator theory has proven to be highly significant in system identification, even for challenging scenarios involving nonlinear time-varying systems (NTVS). In this context, we examine a network of connected agents, each with limited observation capabilities, aiming to estimate the dynamics of an NTVS collaboratively. Drawing inspiration from Koopman operator theory, deep neural networks, and distributed consensus, we introduce a distributed algorithm for deep Koopman learning of the dynamics of an NTVS. This approach enables individual agents to approximate the entire dynamics despite having access to only partial state observations. We guarantee consensus not only on the estimated dynamics but also on its structure, i.e., the matrices encountered in the linear equation of the lifted Koopman system. We provide theoretical insights into the convergence of the learning process and accompanying numerical simulations.
Submitted 17 September, 2024;
originally announced September 2024.
-
Seed-Music: A Unified Framework for High Quality and Controlled Music Generation
Authors:
Ye Bai,
Haonan Chen,
Jitong Chen,
Zhuo Chen,
Yi Deng,
Xiaohong Dong,
Lamtharn Hantrakul,
Weituo Hao,
Qingqing Huang,
Zhongyi Huang,
Dongya Jia,
Feihu La,
Duc Le,
Bochen Li,
Chumin Li,
Hui Li,
Xingxing Li,
Shouda Liu,
Wei-Tsung Lu,
Yiqing Lu,
Andrew Shaw,
Janne Spijkervet,
Yakun Sun,
Bo Wang,
Ju-Chiang Wang
, et al. (13 additional authors not shown)
Abstract:
We introduce Seed-Music, a suite of music generation systems capable of producing high-quality music with fine-grained style control. Our unified framework leverages both auto-regressive language modeling and diffusion approaches to support two key music creation workflows: controlled music generation and post-production editing. For controlled music generation, our system enables vocal music generation with performance controls from multi-modal inputs, including style descriptions, audio references, musical scores, and voice prompts. For post-production editing, it offers interactive tools for editing lyrics and vocal melodies directly in the generated audio.
We encourage readers to listen to demo audio examples at https://team.doubao.com/seed-music.
Submitted 19 September, 2024; v1 submitted 13 September, 2024;
originally announced September 2024.
-
Resource Management for IRS-Assisted Full-Duplex Integrated Sensing, Communication and Computing Systems
Authors:
Wanming Hao,
Xue Wu,
Xingwang Li,
Gangcan Sun,
Qingqing Wu,
Liang Yang
Abstract:
In this paper, we investigate an intelligent reflecting surface (IRS) assisted full-duplex (FD) integrated sensing, communication and computing system. Specifically, an FD base station (BS) provides service for uplink and downlink transmission, and a local cache is connected to the BS through a backhaul link to store data. Meanwhile, active sensing elements are deployed on the IRS to receive target echo signals. On this basis, in order to evaluate the overall performance of the system under consideration, we propose a system utility maximization problem while ensuring the sensing quality, where the utility is expressed as the difference between the sum of communication throughput and total computation bits (offloading bits and local computation bits) and the total backhaul cost for content delivery. The problem is difficult to solve due to the highly non-convex coupling of the optimization variables. To effectively solve this problem, we first design the most effective caching strategy. Then, we develop an algorithm based on weighted minimum mean square error, the alternating direction method of multipliers, the majorization-minimization framework, semi-definite relaxation techniques, and several complex transformations to jointly solve for the optimization variables. Finally, simulation results are provided to verify the utility performance of the proposed algorithm and demonstrate the advantages of the proposed scheme compared with the baseline scheme.
Submitted 31 August, 2024;
originally announced September 2024.
-
Latency Minimization for IRS-enhanced Wideband MEC Networks with Practical Reflection Model
Authors:
N. Li,
W. Hao,
X. Li,
Z. Zhu,
Z. Tang,
S. Yang
Abstract:
Intelligent reflecting surface (IRS) has been considered an efficient way to boost the computation capability of mobile edge computing (MEC) systems, especially when the communication link is blocked or the communication signal is weak. However, most existing works are restricted to narrow-band channels and an ideal IRS reflection model, which is not practical and may lead to significant performance degradation in realistic systems. To further exploit the benefits of IRS in MEC systems, we consider an IRS-enhanced wideband MEC system with a practical IRS reflection model. With the aim of minimizing the weighted latency of all devices, the offloading data volume, edge computing resource, BS's receiving vector, and IRS passive beamforming are jointly optimized. Since the formulated problem is non-convex, we employ the block coordinate descent (BCD) technique to decouple it into two subproblems for alternately optimizing computing and communication settings. The effectiveness and convergence of the proposed algorithm are validated via numerical analyses. In addition, simulation results demonstrate that the proposed algorithm can achieve lower latency compared to that based on the ideal IRS reflection model, which confirms the necessity of considering a practical model when designing an IRS-enhanced wideband MEC system.
Submitted 14 July, 2024;
originally announced July 2024.
-
Music Era Recognition Using Supervised Contrastive Learning and Artist Information
Authors:
Qiqi He,
Xuchen Song,
Weituo Hao,
Ju-Chiang Wang,
Wei-Tsung Lu,
Wei Li
Abstract:
Does popular music from the 60s sound different than that of the 90s? Prior studies have shown that there exist variations in patterns and regularities related to instrumentation changes and growing loudness across multi-decadal trends. This indicates that perceiving the era of a song from musical features such as audio and artist information is possible. Music era information can be an important feature for playlist generation and recommendation. However, the release year of a song can be inaccessible in many circumstances. This paper addresses a novel task of music era recognition. We formulate the task as a music classification problem and propose solutions based on supervised contrastive learning. An audio-based model is developed to predict the era from audio. For the case where the artist information is available, we extend the audio-based model to take multimodal inputs and develop a framework, called MultiModal Contrastive (MMC) learning, to enhance the training. Experimental results on the Million Song Dataset demonstrate that the audio-based model achieves 54% accuracy within a tolerance range of 3 years; incorporating the artist information with the MMC framework for training leads to a further 9% improvement.
Submitted 7 July, 2024;
originally announced July 2024.
-
Deep Koopman Learning using Noisy Data
Authors:
Wenjian Hao,
Devesh Upadhyay,
Shaoshuai Mou
Abstract:
This paper proposes a data-driven framework to learn a finite-dimensional approximation of a Koopman operator for approximating the state evolution of a dynamical system under noisy observations. To this end, our proposed solution has two main advantages. First, the proposed method only requires the measurement noise to be bounded. Second, the proposed method modifies the existing deep Koopman operator formulations by characterizing the effect of the measurement noise on the Koopman operator learning and then mitigating it by updating the tunable parameter of the observable functions of the Koopman operator, making it easy to implement. The performance of the proposed method is demonstrated on several standard benchmarks. We then compare the presented method with similar methods proposed in the latest literature on Koopman learning.
Submitted 21 May, 2025; v1 submitted 26 May, 2024;
originally announced May 2024.
-
Towards Consistent Object Detection via LiDAR-Camera Synergy
Authors:
Kai Luo,
Hao Wu,
Kefu Yi,
Kailun Yang,
Wei Hao,
Rongdong Hu
Abstract:
As human-machine interaction continues to evolve, the capacity for environmental perception is becoming increasingly crucial. Integrating the two most common types of sensory data, images, and point clouds, can enhance detection accuracy. Currently, there is no existing model capable of detecting an object's position in both point clouds and images while also determining their corresponding relationship. This information is invaluable for human-machine interactions, offering new possibilities for their enhancement. In light of this, this paper introduces an end-to-end Consistency Object Detection (COD) algorithm framework that requires only a single forward inference to simultaneously obtain an object's position in both point clouds and images and establish their correlation. Furthermore, to assess the accuracy of the object correlation between point clouds and images, this paper proposes a new evaluation metric, Consistency Precision (CP). To verify the effectiveness of the proposed framework, an extensive set of experiments has been conducted on the KITTI and DAIR-V2X datasets. The study also explored how the proposed consistency detection method performs on images when the calibration parameters between images and point clouds are disturbed, compared to existing post-processing methods. The experimental results demonstrate that the proposed method exhibits excellent detection performance and robustness, achieving end-to-end consistency detection. The source code will be made publicly available at https://github.com/xifen523/COD.
Submitted 9 August, 2024; v1 submitted 2 May, 2024;
originally announced May 2024.
-
Vertiport Navigation Requirements and Multisensor Architecture Considerations for Urban Air Mobility
Authors:
Omar Garcia Crespillo,
Chen Zhu,
Maximilian Simonetti,
Daniel Gerbeth,
Young-Hee Lee,
Wenhan Hao
Abstract:
Communication, Navigation and Surveillance (CNS) technologies are key enablers for future safe operation of drones in urban environments. However, the design of navigation technologies for these new applications is more challenging compared to e.g., civil aviation. On the one hand, the use cases and operations in urban environments are expected to have stringent requirements in terms of accuracy, integrity, continuity and availability. On the other hand, airborne sensors may not be based on high-quality equipment as in civil aviation and solutions need to rely on tighter multisensor solutions, whose safety is difficult to assess. In this work, we first provide some initial navigation requirements related to precision approach operations based on recently proposed vertiport designs. Then, we provide an overview of a possible multisensor navigation architecture solution able to support these types of operations and we comment on the challenges of each of the subsystems. Finally, initial proof of concept for some navigation sensor subsystems is presented based on flight trials performed during the German Aerospace Center (DLR) project HorizonUAM.
Submitted 13 November, 2023;
originally announced November 2023.
-
Resource Allocation for RIS-Empowered Wireless Communications: Low-Complexity and Robust Designs
Authors:
Ming Zeng,
Wanming Hao,
Zhangjie Peng,
Zheng Chu,
Xingwang Li,
Changsheng You,
Cunhua Pan
Abstract:
This article delves into advancements in resource allocation techniques tailored for systems utilizing reconfigurable intelligent surfaces (RIS), with a primary focus on achieving low-complexity and resilient solutions. The investigation of low-complexity approaches for RIS holds significant relevance, primarily owing to the intricate characteristics inherent in RIS-based systems and the need of deploying large-scale RIS arrays. Concurrently, the exploration of robust solutions aims to address the issue of hardware impairments occurring at both the transceivers and RIS components in practical RIS-assisted systems. In the realm of both low-complexity and robust resource allocation, this article not only elucidates the fundamental techniques underpinning these methodologies but also offers comprehensive numerical results for illustrative purposes. The necessity of adopting resource allocation strategies that are both low in complexity and resilient is thoroughly established. Ultimately, this article provides prospective research avenues in the domain of low-complexity and robust resource allocation techniques tailored for RIS-assisted systems.
Submitted 6 November, 2023;
originally announced November 2023.
-
Wideband Beamforming for STAR-RIS-assisted THz Communications with Three-Side Beam Split
Authors:
Wencai Yan,
Wanming Hao,
Gangcan Sun,
Chongwen Huang,
Qingqing Wu
Abstract:
In this paper, we consider simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS)-assisted THz communications with three-side beam split. In addition to the beam split at the base station (BS), we analyze the double-side beam split at the STAR-RIS for the first time. To relieve the double-side beam split effect, we propose a time delayer (TD)-based fully-connected structure at the STAR-RIS. As a further advance, a sub-connected structure with low hardware complexity and low power consumption is developed, where multiple STAR-RIS elements share one TD. Meanwhile, considering the practical scenario, we investigate a multi-STAR-RIS and multi-user communication system, and a sum rate maximization problem is formulated by jointly optimizing the hybrid analog/digital beamforming and time delays at the BS as well as the double-layer phase-shift coefficients, time delays and amplitude coefficients at the STAR-RISs. Based on this, we first allocate users for each STAR-RIS, and then derive the analog beamforming and time delays at the BS, and the double-layer phase-shift coefficients and time delays at each STAR-RIS. Next, we develop an alternating optimization algorithm to calculate the digital beamforming at the BS and amplitude coefficients at the STAR-RISs. Finally, the numerical results verify the effectiveness of the proposed schemes.
Submitted 21 October, 2023;
originally announced October 2023.
-
Beamforming Design for the Distributed RISs-aided THz Communications with Double-Layer True Time Delays
Authors:
Gangcan Sun,
Wencai Yan,
Wanming Hao,
Chongwen Huang,
Chau Yuen
Abstract:
In this paper, we investigate the reconfigurable intelligent surface (RIS)-aided terahertz (THz) communication system with the sparse radio frequency chains antenna structure at the base station (BS). To overcome the beam split of the BS, different from the conventional single-layer true-time-delay (TTD) scheme, we propose a double-layer TTD scheme that can effectively reduce the number of large-r…
▽ More
In this paper, we investigate the reconfigurable intelligent surface (RIS)-aided terahertz (THz) communication system with the sparse radio frequency chains antenna structure at the base station (BS). To overcome the beam split of the BS, different from the conventional single-layer true-time-delay (TTD) scheme, we propose a double-layer TTD scheme that can effectively reduce the number of large-range delay devices, which involve additional insertion loss and amplification circuitry. Next, we analyze the system performance under the proposed double-layer TTD scheme. To relieve the beam split of the RIS, we consider multiple distributed RISs to replace an ultra-large size RIS. Based on this, we formulate an achievable rate maximization problem for the distributed RISs-aided THz communications via jointly optimizing the hybrid analog/digital beamforming, time delays of the double-layer TTD network and reflection coefficients of RISs. Considering the practical hardware limitation, the finite-resolution phase shift, time delay and reflection phase are constrained. To solve the formulated problem, we first design an analog beamforming scheme including optimizing phase shift and time delay based on the RISs' locations. Then, an alternatively optimization algorithm is proposed to obtain the digital beamforming and reflection coefficients based on the minimum mean square error and coordinate update techniques. Finally, simulation results show the effectiveness of the proposed scheme.
Submitted 21 October, 2023;
originally announced October 2023.
-
Resource Management for IRS-assisted WP-MEC Networks with Practical Phase Shift Model
Authors:
Nana Li,
Wanming Hao,
Fuhui Zhou,
Zheng Chu,
Shouyi Yang,
Pei Xiao
Abstract:
Wireless powered mobile edge computing (WP-MEC) has been recognized as a promising solution to enhance the computational capability and sustainable energy supply for low-power wireless devices (WDs). However, when the communication links between the hybrid access point (HAP) and WDs are hostile, the energy transfer efficiency and task offloading rate are compromised. To tackle this problem, we propose to employ multiple intelligent reflecting surfaces (IRSs) to WP-MEC networks. Based on the practical IRS phase shift model, we formulate a total computation rate maximization problem by jointly optimizing downlink/uplink IRSs passive beamforming, downlink energy beamforming and uplink multi-user detection (MUD) vector at HAPs, task offloading power and local computing frequency of WDs, and the time slot allocation. Specifically, we first derive the optimal time allocation for downlink wireless energy transmission (WET) to IRSs and the corresponding energy beamforming. Next, with fixed time allocation for the downlink WET to WDs, the original optimization problem can be divided into two independent subproblems. For the WD charging subproblem, the optimal IRSs passive beamforming is derived by utilizing the successive convex approximation (SCA) method and the penalty-based optimization technique, and for the offloading computing subproblem, we propose a joint optimization framework based on the fractional programming (FP) method. Finally, simulation results validate that our proposed optimization method based on the practical phase shift model can achieve a higher total computation rate compared to the baseline schemes.
Submitted 7 September, 2023;
originally announced September 2023.
-
InstructME: An Instruction Guided Music Edit And Remix Framework with Latent Diffusion Models
Authors:
Bing Han,
Junyu Dai,
Weituo Hao,
Xinyan He,
Dong Guo,
Jitong Chen,
Yuxuan Wang,
Yanmin Qian,
Xuchen Song
Abstract:
Music editing primarily entails the modification of instrument tracks or remixing in the whole, which offers a novel reinterpretation of the original piece through a series of operations. These music processing methods hold immense potential across various applications but demand substantial expertise. Prior methodologies, although effective for image and audio modifications, falter when directly applied to music. This is attributed to music's distinctive data nature, where such methods can inadvertently compromise the intrinsic harmony and coherence of music. In this paper, we develop InstructME, an Instruction guided Music Editing and remixing framework based on latent diffusion models. Our framework fortifies the U-Net with multi-scale aggregation in order to maintain consistency before and after editing. In addition, we introduce chord progression matrix as condition information and incorporate it in the semantic space to improve melodic harmony while editing. For accommodating extended musical pieces, InstructME employs a chunk transformer, enabling it to discern long-term temporal dependencies within music sequences. We tested InstructME in instrument-editing, remixing, and multi-round editing. Both subjective and objective evaluations indicate that our proposed method significantly surpasses preceding systems in music quality, text relevance and harmony. Demo samples are available at https://musicedit.github.io/
Submitted 12 December, 2023; v1 submitted 28 August, 2023;
originally announced August 2023.
-
Joint Beamforming Optimization for Active STAR-RIS-Assisted ISAC Systems
Authors:
Shuang Zhang,
Wanming Hao,
Gangcan Sun,
Chongwen Huang,
Zhengyu Zhu,
Xingwang Li,
Chau Yuen
Abstract:
In this paper, we investigate an active simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) assisted integrated sensing and communications (ISAC) system, where the dual-function base station (DFBS) operates in full-duplex (FD) mode to provide communication services and perform target sensing simultaneously. Meanwhile, we consider a scenario with multiple targets and multiple users as well as the self-interference at the FD DFBS. Through jointly optimizing the DFBS and active STAR-RIS beamforming under different work modes, our purpose is to achieve the maximum communication sum-rate, while satisfying the minimum radar signal-to-interference-plus-noise ratio (SINR) constraint, the active STAR-RIS hardware constraints and the total power constraint of DFBS and active STAR-RIS. To tackle the complex non-convex optimization problem formulated, an efficient alternating optimization algorithm is proposed. Specifically, the fractional programming method is first leveraged to turn the original problem into a more tractable one, and subsequently the transformed problem is decomposed into several sub-problems. Next, we develop a derivation method to obtain the closed-form expression of the radar receiving beamforming, and then the DFBS transmit beamforming is optimized under the radar SINR requirement and total power constraints. After that, the active STAR-RIS reflection and transmission beamforming are optimized by majorization minimization, complex circle manifold and convex optimization techniques. Finally, the proposed schemes are evaluated through numerical simulations to show their benefits and efficiency.
Submitted 13 November, 2024; v1 submitted 11 August, 2023;
originally announced August 2023.
-
Physical Layer Security for NOMA Systems: Requirements, Issues, and Recommendations
Authors:
Saeid Pakravan,
Jean-Yves Chouinard,
Xingwang Li,
Ming Zeng,
Wanming Hao,
Quoc-Viet Pham,
Octavia A. Dobre
Abstract:
Non-orthogonal multiple access (NOMA) has been viewed as a potential candidate for the upcoming generation of wireless communication systems. Compared to traditional orthogonal multiple access (OMA), multiplexing users in the same time-frequency resource block can increase the number of served users and improve the spectral efficiency of the systems. Nevertheless, from a security viewpoint, when multiple users are utilizing the same time-frequency resource, there may be concerns regarding keeping information confidential. In this context, physical layer security (PLS) has been introduced as a supplement of protection to conventional encryption techniques by making use of the random nature of wireless transmission media for ensuring communication secrecy. Recent years have seen significant interest in PLS being applied to NOMA networks. Numerous scenarios have been investigated to assess the security of NOMA systems, including when active and passive eavesdroppers are present, as well as when these systems are combined with relay and reconfigurable intelligent surfaces (RIS). Additionally, the security of ambient backscatter (AmB)-NOMA systems is another issue that has lately drawn a lot of attention. In this paper, a thorough analysis of the PLS-assisted NOMA systems research state-of-the-art is presented. In this regard, we begin by outlining the foundations of NOMA and PLS, respectively. Following that, we discuss the PLS performances for NOMA systems in four categories depending on the type of the eavesdropper, the existence of relay, RIS, and AmB systems in different conditions. Finally, a thorough explanation of the most recent PLS-assisted NOMA systems is given.
Submitted 10 August, 2023;
originally announced August 2023.
-
Adaptive Policy Learning to Additional Tasks
Authors:
Wenjian Hao,
Zehui Lu,
Zihao Liang,
Tianyu Zhou,
Shaoshuai Mou
Abstract:
This paper develops a policy learning method for tuning a pre-trained policy to adapt to additional tasks without altering the original task. A method named Adaptive Policy Gradient (APG) is proposed in this paper, which combines Bellman's principle of optimality with the policy gradient approach to improve the convergence rate. This paper provides theoretical analysis which guarantees the convergence rate and sample complexity of $\mathcal{O}(1/T)$ and $\mathcal{O}(1/ε)$, respectively, where $T$ denotes the number of iterations and $ε$ denotes the accuracy of the resulting stationary policy. Furthermore, several challenging numerical simulations, including cartpole, lunar lander, and robot arm, are provided to show that APG obtains similar performance compared to existing deterministic policy gradient methods while utilizing much less data and converging at a faster rate.
Submitted 24 May, 2023;
originally announced May 2023.
-
Optimal Control of Nonlinear Systems with Unknown Dynamics
Authors:
Wenjian Hao,
Paulo C. Heredia,
Shaoshuai Mou
Abstract:
This paper presents a data-driven method for finding a closed-loop optimal controller, which minimizes a specified infinite-horizon cost function for systems with unknown dynamics given any arbitrary initial state. Suppose the closed-loop optimal controller can be parameterized by a given class of functions, hereafter referred to as the policy. The proposed method introduces a novel gradient estimation framework, which approximates the gradient of the cost function with respect to the policy parameters via integrating the Koopman operator with the classical concept of actor-critic. This enables the policy parameters to be tuned iteratively using gradient descent to achieve an optimal controller, leveraging the linearity of the Koopman operator. The convergence analysis of the proposed framework is provided. The effectiveness of the method is demonstrated through comparisons with a model-free reinforcement learning approach, and its control performance is further evaluated through simulations against model-based optimal control methods that solve the same optimal control problem utilizing the exact system dynamics.
Submitted 21 May, 2025; v1 submitted 24 May, 2023;
originally announced May 2023.
-
A Data-Driven Approach for Inverse Optimal Control
Authors:
Zihao Liang,
Wenjian Hao,
Shaoshuai Mou
Abstract:
This paper proposes a data-driven, iterative approach for inverse optimal control (IOC), which aims to learn the objective function of a nonlinear optimal control system given its states and inputs. The approach solves the IOC problem in a challenging situation when the system dynamics is unknown. The key idea of the proposed approach comes from the deep Koopman representation of the unknown system, which employs a deep neural network to represent observables for the Koopman operator. By assuming the objective function to be learned is parameterized as a linear combination of features with unknown weights, the proposed approach for IOC is able to achieve a Koopman representation of the unknown dynamics and the unknown weights in objective function together. Simulation is provided to verify the proposed approach.
Submitted 31 March, 2023;
originally announced April 2023.
-
ALCAP: Alignment-Augmented Music Captioner
Authors:
Zihao He,
Weituo Hao,
Wei-Tsung Lu,
Changyou Chen,
Kristina Lerman,
Xuchen Song
Abstract:
Music captioning has gained significant attention in the wake of the rising prominence of streaming media platforms. Traditional approaches often prioritize either the audio or lyrics aspect of the music, inadvertently ignoring the intricate interplay between the two. However, a comprehensive understanding of music necessitates the integration of both these elements. In this study, we delve into this overlooked realm by introducing a method to systematically learn multimodal alignment between audio and lyrics through contrastive learning. This not only recognizes and emphasizes the synergy between audio and lyrics but also paves the way for models to achieve deeper cross-modal coherence, thereby producing high-quality captions. We provide both theoretical and empirical results demonstrating the advantage of the proposed method, which achieves new state-of-the-art on two music captioning datasets.
Submitted 21 October, 2023; v1 submitted 21 December, 2022;
originally announced December 2022.
-
Deep Koopman Learning of Nonlinear Time-Varying Systems
Authors:
Wenjian Hao,
Bowen Huang,
Wei Pan,
Di Wu,
Shaoshuai Mou
Abstract:
This paper presents a data-driven approach to approximate the dynamics of a nonlinear time-varying system (NTVS) by a linear time-varying system (LTVS), which results from the Koopman operator and deep neural networks. Analysis of the approximation error between states of the NTVS and the resulting LTVS is presented. Simulations on a representative NTVS show that the proposed method achieves small approximation errors, even when the system changes rapidly. Furthermore, simulations in an example of quadcopters demonstrate the computational efficiency of the proposed approach.
Submitted 21 June, 2023; v1 submitted 12 October, 2022;
originally announced October 2022.
-
The Far-/Near-Field Beam Squint and Solutions for THz Intelligent Reflecting Surface Communications
Authors:
Wanming Hao,
Xiaobei You,
Fuhui Zhou,
Zheng Chu,
Gangcan Sun,
Pei Xiao
Abstract:
Terahertz (THz) and intelligent reflecting surface (IRS) have been regarded as two promising technologies to improve the capacity and coverage for future 6G networks. An IRS is usually equipped with large-scale elements when implemented at THz frequency. In this case, the near-field model and beam squint should be considered. Therefore, in this paper, we investigate the far-field and near-field beam squint problems in THz IRS communications for the first time. The far-field and near-field channel models are constructed based on the different electromagnetic radiation characteristics. Next, we first analyze the far-field beam squint and its effect on the beam gain based on the cascaded base station (BS)-IRS-user channel model, and then the near-field case is studied. To overcome the far-field and near-field beam squint effects, we propose to apply a delay adjustable metasurface (DAM) to the IRS, and develop a scheme of optimizing the reflecting phase shifts and time delays of IRS elements, which effectively eliminates the beam gain loss caused by beam squint. Finally, simulations are conducted to demonstrate the effectiveness of our proposed schemes in combating the near-field and far-field beam squint.
Submitted 25 August, 2022;
originally announced August 2022.
-
Beamforming Analysis and Design for Wideband THz Reconfigurable Intelligent Surface Communications
Authors:
Wencai Yan,
Wanming Hao,
Chongwen Huang,
Gangcan Sun,
Osamu Muta,
Haris Gacanin,
Chau Yuen
Abstract:
Reconfigurable intelligent surface (RIS)-aided terahertz (THz) communications have been regarded as a promising candidate for future 6G networks because of their ultra-wide bandwidth and ultra-low power consumption. However, there exists the beam split problem, especially when the base station (BS) or RIS is equipped with large-scale antennas, which may lead to serious array gain loss. Therefore, in this paper, we investigate the beam split and beamforming design problems in the THz RIS communications. Specifically, we first analyze the beam split effect caused by different RIS sizes, shapes and deployments. On this basis, we apply the fully connected time delayer phase shifter hybrid beamforming architecture at the BS and deploy distributed RISs to cooperatively mitigate the beam split effect. We aim to maximize the achievable sum rate by jointly optimizing the hybrid analog/digital beamforming, time delays at the BS and reflection coefficients at the RISs. To solve the formulated problem, we first design the analog beamforming and time delays based on different RISs physical directions, and then it is transformed into an optimization problem by jointly optimizing the digital beamforming and reflection coefficients. Next, we propose an alternating iterative optimization algorithm to deal with it. Specifically, for given reflection coefficients, we propose an iterative algorithm based on the minimum mean square error technique to obtain the digital beamforming. Afterwards, we apply LDR and MCQT methods to transform the original problem to a QCQP, which can be solved by ADMM technique to obtain the reflection coefficients. Finally, the digital beamforming and reflection coefficients are obtained via repeating the above processes until convergence. Simulation results verify that the proposed scheme can effectively alleviate the beam split effect and improve the system capacity.
Submitted 23 June, 2023; v1 submitted 25 July, 2022;
originally announced July 2022.
-
Identification of cancer-keeping genes as therapeutic targets by finding network control hubs
Authors:
Xizhe Zhang,
Chunyu Pan,
Xinru Wei,
Meng Yu,
Shuangjie Liu,
Jun An,
Jieping Yang,
Baojun Wei,
Wenjun Hao,
Yang Yao,
Yuyan Zhu,
Weixiong Zhang
Abstract:
Finding cancer driver genes has been a focal theme of cancer research and clinical studies. One of the recent approaches is based on network structural controllability that focuses on finding a control scheme and driver genes that can steer the cell from an arbitrary state to a designated state. While theoretically sound, this approach is impractical for many reasons, e.g., the control scheme is often not unique and half of the nodes may be driver genes for the cell. We developed a novel approach that transcends structural controllability. Instead of considering driver genes for one control scheme, we considered control hub genes that reside in the middle of a control path of every control scheme. Control hubs are the most vulnerable spots for controlling the cell and exogenous stimuli on them may render the cell uncontrollable. We adopted control hubs as cancer-keeping genes (CKGs) and applied them to a gene regulatory network of bladder cancer (BLCA). All the genes on the cell cycle and p53 signaling pathways in BLCA are CKGs, confirming the importance of these genes and the two pathways in cancer. A smaller set of 35 sensitive CKGs (sCKGs) for BLCA was identified by removing network links. Six sCKGs (RPS6KA3, FGFR3, N-cadherin (CDH2), EP300, caspase-1, and FN1) were subjected to small-interfering-RNA knockdown in four cell lines to validate their effects on the proliferation or migration of cancer cells. Knocking down RPS6KA3 in a mouse model of BLCA significantly inhibited the growth of tumor xenografts. Combined, our results demonstrated the value of CKGs as therapeutic targets for cancer therapy and the potential of CKGs as an effective means for studying and characterizing cancer etiology.
Submitted 13 June, 2022;
originally announced June 2022.
-
Min-Max Latency Optimization for IRS-aided Cell-Free Mobile Edge Computing Systems
Authors:
Nana Li,
Wanming Hao,
Fuhui Zhou,
Shouyi Yang,
Naofal Al-Dhahir
Abstract:
Mobile-edge computing (MEC) is expected to provide low-latency computation service for wireless devices (WDs). However, when WDs are located at the cell edge or communication links between base stations (BSs) and WDs are blocked, the offloading latency will be large. To address this issue, we propose an intelligent reflecting surface (IRS)-assisted cell-free MEC system consisting of multiple BSs and IRSs for improving the transmission environment. Consequently, we formulate a min-max latency optimization problem by jointly designing multi-user detection (MUD) matrices, IRSs' reflecting beamforming vectors, WDs' transmit power and edge computing resource, subject to constraints on edge computing capability and IRSs phase shifts. To solve it, an alternating optimization algorithm based on the block coordinate descent (BCD) technique is proposed, in which the original non-convex problem is decoupled into two subproblems for alternately optimizing computing and communication parameters. In particular, we optimize the MUD matrix based on the second-order cone programming (SOCP) technique, and then develop two efficient algorithms to optimize IRSs' reflecting vectors based on the semi-definite relaxation (SDR) and successive convex approximation (SCA) techniques, respectively. Numerical results show that employing IRSs in cell-free MEC systems outperforms conventional MEC systems, with a latency reduction of up to about 60%. Moreover, numerical results confirm that our proposed algorithms enjoy a fast convergence, which is beneficial for practical implementation.
Submitted 8 June, 2022;
originally announced June 2022.
-
Securing Reconfigurable Intelligent Surface-Aided Cell-Free Networks
Authors:
Wanming Hao,
Junjie Li,
Gangcan Sun,
Ming Zeng,
Octavia A. Dobre
Abstract:
In this paper, we investigate the physical layer security in the reconfigurable intelligent surface (RIS)-aided cell-free networks. A maximum weighted sum secrecy rate problem is formulated by jointly optimizing the active beamforming (BF) at the base stations and passive BF at the RISs. To handle this non-trivial problem, we adopt the alternating optimization to decouple the original problem into two sub-problems, which are solved using the semidefinite relaxation and continuous convex approximation theory. To decrease the complexity for obtaining overall channel state information (CSI), we extend the proposed framework to the case that only requires part of the RISs' CSI. This is achieved via deliberately discarding the RIS that has a small contribution to the user's secrecy rate. Based on this, we formulate a mixed integer non-linear programming problem, and the linear conic relaxation is used to obtain the solutions. Finally, the simulation results show that the proposed schemes can obtain a higher secrecy rate than the existing ones.
Submitted 14 February, 2022;
originally announced February 2022.
-
Ultra Wide Band THz IRS Communications: Applications, Challenges, Key Techniques, and Research Opportunities
Authors:
Wanming Hao,
Fuhui Zhou,
Ming Zeng,
Octavia A. Dobre,
Naofal Al-Dhahir
Abstract:
Terahertz (THz) communication is a promising technology for future wireless networks due to its ultra-wide bandwidth. However, THz signals suffer from severe attenuation and poor diffraction capability, making them vulnerable to blocking obstacles. To compensate for these two shortcomings and improve the system performance, an intelligent reflecting surface (IRS) can be exploited to change the propagation direction and enhance the signal strength. In this article, we investigate this promising ultra wide band (UWB) THz IRS communication paradigm. We start by motivating our research and describing several potential application scenarios. Then, we identify major challenges faced by UWB THz IRS communications. To overcome these challenges, several effective key techniques are developed, i.e., the time delayer-based sparse radio frequency antenna structure, delay hybrid precoding and IRS deployment. Simulation results are also presented to compare the system performance for these proposed techniques, thus demonstrating their effectiveness. Finally, we highlight several open issues and research opportunities for UWB THz IRS communications.
Submitted 14 February, 2022;
originally announced February 2022.
-
Intelligent Reflecting Surface Assisted Integrated Sensing and Communications for mmWave Channels
Authors:
Zhengyu Zhu,
Zheng Li,
Zheng Chu,
Gangcan Sun,
Wanming Hao,
Pei Xiao,
Inkyu Lee
Abstract:
This paper proposes an intelligent reflecting surface (IRS) assisted integrated sensing and communication (ISAC) system operating at the millimeter-wave (mmWave) band. Specifically, the ISAC system combines communication and radar operations, detecting and communicating simultaneously with multiple targets and users. The IRS dynamically controls the amplitude or phase of the radio signal via reflecting elements to reconfigure the radio propagation environment and enhance the transmission rate of the ISAC system. By jointly designing the radar signal covariance (RSC) matrix, the beamforming vector of the communication system, and the IRS phase shift, the ISAC system transmission rate can be improved while matching the desired waveform for radar. The problem is non-convex due to multivariate coupling, and thus we decompose it into two separate subproblems. First, a closed-form solution of the RSC matrix is derived from the desired radar waveform. Next, the quadratic transformation (QT) technique is applied to the subproblem, and then alternating optimization (AO) is employed to determine the communication beamforming vector and the IRS phase shift. For computing the IRS phase shift, we adopt both the majorization minimization (MM) and the manifold optimization (MO). Also, we derive a closed-form solution for the formulated problem, effectively decreasing computational complexity. Furthermore, a trade-off factor is introduced to balance the performance of communication and sensing. Finally, the simulations verify the effectiveness of the algorithm and demonstrate that the IRS can improve the performance of the ISAC system.
Submitted 8 April, 2024; v1 submitted 5 January, 2022;
originally announced February 2022.
-
A Novel Transmission Policy for Intelligent Reflecting Surface Assisted Wireless Powered Sensor Networks
Authors:
Zheng Chu,
Pei Xiao,
De Mi,
Wanming Hao,
Mohsen Khalily,
Lie-Liang Yang
Abstract:
This paper proposes a novel transmission policy for an intelligent reflecting surface (IRS) assisted wireless powered sensor network (WPSN). An IRS is deployed to enhance the performance of wireless energy transfer (WET) and wireless information transfer (WIT) by intelligently adjusting the phase shifts of its reflecting elements. To achieve self-sustainability, the IRS needs to collect energy from the energy station to support its control circuit operation. Our proposed policy for the considered WPSN is called IRS assisted harvest-then-transmit time switching, which schedules the transmission time slots by switching between energy collection and energy reflection modes. We study the achievable sum throughput of the proposed transmission policy and investigate a joint design of the transmission time slots, the power allocation, and the discrete phase shifts of the WET and WIT. This formulates the problem as a mixed-integer non-linear program, which is NP-hard and non-convex. We first relax it to one with continuous phase shifts, and then propose a two-step approach that decomposes the original problem into two sub-problems. We solve the first sub-problem with respect to the phase shifts of the WIT in closed form. For the second sub-problem, we first consider a special case without the circuit power of each sensor node, in which the Lagrange dual method and the KKT conditions are applied to derive the optimal closed-form transmission time slots, power allocation, and phase shift of the WET. We then generalise to the case with the circuit power of each sensor node, which can be solved by employing a semi-definite programming relaxation. The optimal discrete phase shifts can be obtained by quantizing the continuous values. Numerical results demonstrate the effectiveness of the proposed policy and validate the beneficial role of the IRS in comparison to the benchmark schemes.
Submitted 28 April, 2021;
originally announced April 2021.
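The final step described in the abstract above, obtaining discrete phase shifts by quantizing the continuous ones, can be illustrated with a minimal sketch. It assumes uniformly spaced phase levels in [0, 2π) and nearest-level rounding; the abstract does not state the quantizer used, and the names `quantize_phase`, `quantize_phases`, and `bits` are hypothetical.

```python
import math


def quantize_phase(theta, bits):
    """Round a continuous phase (radians) to the nearest of 2**bits uniform levels.

    Assumption: levels are k * 2*pi / 2**bits for k = 0 .. 2**bits - 1.
    """
    levels = 2 ** bits
    step = 2 * math.pi / levels
    # Wrap into [0, 2*pi), round to the nearest level index, and wrap the index
    # again so that phases just below 2*pi map back to level 0.
    index = round((theta % (2 * math.pi)) / step) % levels
    return index * step


def quantize_phases(thetas, bits):
    """Quantize every element's phase shift independently."""
    return [quantize_phase(t, bits) for t in thetas]
```

With `bits = 1` each element can only choose between 0 and π, so a relaxed solution of 3.0 rad is mapped to π.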
-
NTIRE 2021 Challenge on Quality Enhancement of Compressed Video: Methods and Results
Authors:
Ren Yang,
Radu Timofte,
Jing Liu,
Yi Xu,
Xinjian Zhang,
Minyi Zhao,
Shuigeng Zhou,
Kelvin C. K. Chan,
Shangchen Zhou,
Xiangyu Xu,
Chen Change Loy,
Xin Li,
Fanglong Liu,
He Zheng,
Lielin Jiang,
Qi Zhang,
Dongliang He,
Fu Li,
Qingqing Dang,
Yibin Huang,
Matteo Maggioni,
Zhongqian Fu,
Shuai Xiao,
Cheng li,
Thomas Tanay
, et al. (47 additional authors not shown)
Abstract:
This paper reviews the first NTIRE challenge on quality enhancement of compressed video, with a focus on the proposed methods and results. In this challenge, the new Large-scale Diverse Video (LDV) dataset is employed. The challenge has three tracks. Tracks 1 and 2 aim at enhancing the videos compressed by HEVC at a fixed QP, while Track 3 is designed for enhancing the videos compressed by x265 at a fixed bit-rate. In addition, Tracks 1 and 3 target improving the fidelity (PSNR), and Track 2 targets enhancing the perceptual quality. The three tracks attracted a total of 482 registrations. In the test phase, 12 teams, 8 teams and 11 teams submitted the final results of Tracks 1, 2 and 3, respectively. The proposed methods and solutions gauge the state-of-the-art of video quality enhancement. The homepage of the challenge: https://github.com/RenYang-home/NTIRE21_VEnh
Submitted 31 August, 2022; v1 submitted 21 April, 2021;
originally announced April 2021.
-
Improving Zero-shot Voice Style Transfer via Disentangled Representation Learning
Authors:
Siyang Yuan,
Pengyu Cheng,
Ruiyi Zhang,
Weituo Hao,
Zhe Gan,
Lawrence Carin
Abstract:
Voice style transfer, also called voice conversion, seeks to modify one speaker's voice to generate speech as if it came from another (target) speaker. Previous works have made progress on voice conversion with parallel training data and pre-known speakers. However, zero-shot voice style transfer, which learns from non-parallel data and generates voices for previously unseen speakers, remains a challenging problem. We propose a novel zero-shot voice transfer method via disentangled representation learning. The proposed method first encodes speaker-related style and voice content of each input voice into separated low-dimensional embedding spaces, and then transfers to a new voice by combining the source content embedding and target style embedding through a decoder. With information-theoretic guidance, the style and content embedding spaces are representative and (ideally) independent of each other. On real-world VCTK datasets, our method outperforms other baselines and obtains state-of-the-art results in terms of transfer accuracy and voice naturalness for voice style transfer experiments under both many-to-many and zero-shot setups.
Submitted 16 March, 2021;
originally announced March 2021.
-
Deep Learning of Koopman Representation for Control
Authors:
Yiqiang Han,
Wenjian Hao,
Umesh Vaidya
Abstract:
We develop a data-driven, model-free approach for the optimal control of dynamical systems. The proposed approach relies on Deep Neural Network (DNN) based learning of the Koopman operator for the purpose of control. In particular, a DNN is employed for the data-driven identification of the basis functions used in the linear lifting of nonlinear control system dynamics. The controller synthesis is purely data-driven and does not rely on a priori domain knowledge. The OpenAI Gym environment, employed for Reinforcement Learning-based control design, is used for data generation and learning of the Koopman operator in the control setting. The method is applied to two classic dynamical systems in the OpenAI Gym environment to demonstrate its capability.
Submitted 15 October, 2020;
originally announced October 2020.
-
Cell A* for Navigation of Unmanned Aerial Vehicles in Partially-known Environments
Authors:
Wenjian Hao,
Rongyao Wang,
Alexander Krolicki,
Yiqiang Han
Abstract:
Proper path planning is the first step of robust and efficient autonomous navigation for mobile robots. Meanwhile, it is still challenging for robots to work in a complex environment without complete prior information. This paper presents an extension to the A* search algorithm and its variants to make the path planning stable with less computational burden while handling long-distance tasks. The implemented algorithm is capable of online searching for a collision-free and smooth path when heading to the defined goal position. This paper deploys the algorithm on the autonomous drone platform and implements it on a remote control car for algorithm efficiency validation.
Submitted 15 September, 2020;
originally announced September 2020.
-
Power Minimization for Multi-cell Uplink NOMA with Imperfect SIC
Authors:
M. Zeng,
W. Hao,
O. A. Dobre,
Z. Ding,
H. V. Poor
Abstract:
In this paper, we investigate a multi-cell uplink non-orthogonal multiple access (NOMA) system with imperfect successive interference cancellation (SIC). The objective of the formulated optimization problem is to minimize the total power consumption under users' quality-of-service constraints. The considered problem is first transformed into a linear programming problem, upon which centralized and distributed optimal solutions are proposed. Numerical results are presented to verify the performance of the proposed solutions and evaluate the impact of imperfect SIC on the system performance.
Submitted 20 July, 2020;
originally announced July 2020.
-
Data Driven Control with Learned Dynamics: Model-Based versus Model-Free Approach
Authors:
Wenjian Hao,
Yiqiang Han
Abstract:
This paper compares two different types of data-driven control methods, representing model-based and model-free approaches. One is a recently proposed method - Deep Koopman Representation for Control (DKRC), which utilizes a deep neural network to map an unknown nonlinear dynamical system to a high-dimensional linear system, which allows for employing state-of-the-art control strategy. The other one is a classic model-free control method based on an actor-critic architecture - Deep Deterministic Policy Gradient (DDPG), which has been proved to be effective in various dynamical systems. The comparison is carried out in OpenAI Gym, which provides multiple control environments for benchmark purposes. Two examples are provided for comparison, i.e., classic Inverted Pendulum and Lunar Lander Continuous Control. From the results of the experiments, we compare these two methods in terms of control strategies and the effectiveness under various initialization conditions. We also examine the learned dynamic model from DKRC with the analytical model derived from the Euler-Lagrange Linearization method, which demonstrates the accuracy in the learned model for unknown dynamics from a data-driven sample-efficient approach.
Submitted 16 June, 2020;
originally announced June 2020.
-
Sum Rate Maximization for IRS-assisted Uplink NOMA
Authors:
M. Zeng,
X. Li,
G. Li,
W. Hao,
O. A. Dobre
Abstract:
An intelligent reflecting surface (IRS) consists of a large number of low-cost reflecting elements, which can steer the incident signal collaboratively by passive beamforming. This way, the IRS reconfigures the wireless environment to boost the system performance. In this paper, we consider an IRS-assisted uplink non-orthogonal multiple access (NOMA) system. The objective is to maximize the sum rate of all users under individual power constraints. The considered problem requires joint power control at the users and beamforming design at the IRS, and is nonconvex. To handle it, semidefinite relaxation is employed, which provides a near-optimal solution. Numerical results show that the proposed NOMA-based scheme achieves a larger sum rate than the orthogonal multiple access (OMA)-based one. Moreover, the impact of the number of reflecting elements on the sum rate is revealed.
Submitted 22 April, 2020;
originally announced April 2020.
-
Energy-Efficient Hybrid Precoding Design for Integrated Multicast-Unicast Millimeter Wave Communications with SWIPT
Authors:
Wanming Hao,
Gangcan Sun,
Fuhui Zhou,
De Mi,
Jia Shi,
Pei Xiao,
Victor C. M. Leung
Abstract:
In this paper, we investigate the energy-efficient hybrid precoding design for an integrated multicast-unicast millimeter wave (mmWave) system, where simultaneous wireless information and power transfer (SWIPT) is considered at the receivers. We adopt two sparse radio frequency chain antenna structures at the base station (BS), i.e., fully-connected and subarray structures, and design the codebook-based analog precoding according to the different structures. Then, we formulate a joint digital multicast, unicast precoding and power splitting ratio optimization problem to maximize the energy efficiency of the system, subject to the maximum transmit power at the BS and the minimum harvested energy at the receivers. Because the formulated problem is difficult to solve directly, we equivalently transform the fractional objective function into a subtractive-form one and propose a two-loop iterative algorithm to solve it. For the outer loop, the classic bisection algorithm is applied. For the inner loop, we transform the formulated problem into a convex one by successive convex approximation techniques and propose an iterative algorithm to solve it. Meanwhile, to reduce the complexity of the inner loop, we develop a zero forcing (ZF) technique-based low complexity iterative algorithm. Specifically, the ZF technique is applied to cancel the inter-unicast interference and the first order Taylor approximation is used for the convexification of the non-convex constraints in the original problem. Finally, simulation results are provided to compare the performance of the proposed algorithms under different schemes.
Submitted 10 February, 2020;
originally announced February 2020.
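The two-loop structure mentioned in the abstract above (a fractional objective rewritten in subtractive form, with bisection as the outer loop) can be illustrated on a toy problem. This is a minimal sketch, not the paper's algorithm: it assumes a single link with rate log2(1 + gain·p), total consumed power p + p_circuit with p_circuit > 0, and an inner problem that happens to have a closed-form solution. All names (`best_power_for`, `max_energy_efficiency`, `gain`, `p_circuit`, `p_max`) are hypothetical.

```python
import math


def best_power_for(lam, gain, p_circuit, p_max):
    """Inner loop: maximize log2(1 + gain*p) - lam*(p + p_circuit) over 0 <= p <= p_max."""
    # The subtractive objective is concave in p. Setting its derivative to zero
    # gives p = 1/(lam*ln 2) - 1/gain, which is clipped to the feasible interval.
    p = 1.0 / (lam * math.log(2)) - 1.0 / gain
    p = min(max(p, 0.0), p_max)
    value = math.log2(1.0 + gain * p) - lam * (p + p_circuit)
    return value, p


def max_energy_efficiency(gain, p_circuit, p_max, tol=1e-9):
    """Outer loop: bisection on lam until the inner optimum is (numerically) zero."""
    # log2(1 + gain*p) <= gain*p/ln 2, so the ratio never exceeds gain/ln 2.
    lo, hi = 0.0, gain / math.log(2)
    while hi - lo > tol:
        lam = 0.5 * (lo + hi)
        value, _ = best_power_for(lam, gain, p_circuit, p_max)
        if value > 0.0:
            lo = lam  # ratio lam is still achievable, so search higher
        else:
            hi = lam  # ratio lam is not achievable, so search lower
    lam = 0.5 * (lo + hi)
    _, p = best_power_for(lam, gain, p_circuit, p_max)
    return lam, p
```

The inner optimum is strictly decreasing in lam, which is what makes bisection applicable: the value of lam at which it crosses zero equals the maximum of the original ratio.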
-
Edge Cache-assisted Secure Low-Latency Millimeter Wave Transmission
Authors:
Wanming Hao,
Ming Zeng,
Gangcan Sun,
Pei Xiao
Abstract:
In this paper, we consider an edge cache-assisted millimeter wave cloud radio access network (C-RAN). Each remote radio head (RRH) in the C-RAN has a local cache, which can pre-fetch and store the files requested by the actuators. Multiple RRHs form a cluster to cooperatively serve the actuators, which acquire their required files either from the local caches or from the central processor via multicast fronthaul links. For such a scenario, we formulate a beamforming design problem to minimize the secure transmission delay under the transmit power constraint of each RRH. Due to the difficulty of directly solving the formulated problem, we divide it into two independent ones: i) minimizing the fronthaul transmission delay by jointly optimizing the transmit and receive beamforming; ii) minimizing the maximum access transmission delay by jointly designing cooperative beamforming among RRHs. An alternating iterative algorithm is proposed to solve the first optimization problem. For the latter, we first design the analog beamforming based on the channel state information of the actuators. Then, with the aid of successive convex approximation and S-procedure techniques, a semidefinite program (SDP) is formulated, and an iterative algorithm is proposed through SDP relaxation. Finally, simulation results are provided to verify the performance of the proposed schemes.
Submitted 10 February, 2020;
originally announced February 2020.
-
A machine learning method correlating pulse pressure wave data with pregnancy
Authors:
Jianhong Chen,
Huang Huang,
Wenrui Hao,
Jinchao Xu
Abstract:
Pulse feeling, representing the tactile arterial palpation of the heartbeat, has been widely used in traditional Chinese medicine (TCM) to diagnose various diseases. The quantitative relationship between the pulse wave and health conditions however has not been investigated in modern medicine. In this paper, we explored the correlation between pulse pressure wave (PPW), rather than the pulse key features in TCM, and pregnancy by using deep learning technology. This computational approach shows that the accuracy of pregnancy detection by the PPW is 84% with an AUC of 91%. Our study is a proof of concept of pulse diagnosis and will also motivate further sophisticated investigations on pulse waves.
Submitted 3 October, 2019;
originally announced October 2019.
-
Codebook-Based Max-Min Energy-Efficient Resource Allocation for Uplink mmWave MIMO-NOMA Systems
Authors:
Wanming Hao,
Ming Zeng,
Gangcan Sun,
Osamu Muta,
Octavia A. Dobre,
Shouyi Yang,
Haris Gacanin
Abstract:
In this paper, we investigate the energy-efficient resource allocation problem in an uplink non-orthogonal multiple access (NOMA) millimeter wave system, where the fully-connected-based sparse radio frequency chain antenna structure is applied at the base station (BS). To relieve the pilot overhead for channel estimation, we propose a codebook-based analog beam design scheme, which only requires obtaining the equivalent channel gain. On this basis, users belonging to the same analog beam are served via NOMA. Meanwhile, an advanced NOMA decoding scheme is proposed by exploiting the global information available at the BS. Under predefined minimum rate and maximum transmit power constraints for each user, we formulate a max-min user energy efficiency (EE) optimization problem by jointly optimizing the detection matrix at the BS and transmit power at the users. We first transform the original fractional objective function into a subtractive one. Then, we propose a two-loop iterative algorithm to solve the reformulated problem. Specifically, the inner loop updates the detection matrix and transmit power iteratively, while the outer loop adopts the bisection method. Meanwhile, to decrease the complexity of the inner loop, we propose a zero-forcing (ZF)-based iterative algorithm, where the detection matrix is designed via the ZF technique. Finally, simulation results show that the proposed schemes obtain better performance in terms of spectral efficiency and EE than the conventional schemes.
Submitted 13 September, 2019;
originally announced September 2019.
-
An Output Feedback Stabilizer for MIMO Nonlinear Systems with Uncertain Input Gain: Nonlinear Nominal Model
Authors:
Wonseok Ha,
Juhoon Back
Abstract:
This paper deals with the output feedback stabilization problem of nonlinear multi-input multi-output systems having an uncertain input gain matrix. It is assumed that the system has a well-defined vector relative degree and that the zero dynamics is input-to-state stable. Based on the assumption that there exists a state feedback controller which globally asymptotically stabilizes the origin of the nominal closed-loop system, we present an output feedback stabilizer which recovers the stability of the nominal closed-loop system in the semi-global practical sense. Compared to previous results, we allow that the nominal system can have a nonlinear input gain matrix that is a function of state and this is done by modifying the structure of the disturbance observer-based robust output feedback controller. It is expected that the proposed controller can be well applied to the case when the system's nonlinearity is to be exploited rather than canceled.
Submitted 18 March, 2019;
originally announced March 2019.
-
Adaptive Power System Emergency Control using Deep Reinforcement Learning
Authors:
Qiuhua Huang,
Renke Huang,
Weituo Hao,
Jie Tan,
Rui Fan,
Zhenyu Huang
Abstract:
Power system emergency control is generally regarded as the last safety net for grid security and resiliency. Existing emergency control schemes are usually designed off-line based on either the conceived "worst" case scenario or a few typical operation scenarios. These schemes face significant adaptiveness and robustness issues as increasing uncertainties and variations occur in modern electrical grids. To address these challenges, this paper develops, for the first time, novel adaptive emergency control schemes using deep reinforcement learning (DRL), leveraging the high-dimensional feature extraction and non-linear generalization capabilities of DRL for complex power systems. Furthermore, an open-source platform named RLGC has been designed for the first time to assist the development and benchmarking of DRL algorithms for power system control. Details of the platform and DRL-based emergency control schemes for generator dynamic braking and under-voltage load shedding are presented. Extensive case studies performed on both the two-area four-machine system and the IEEE 39-bus system demonstrate the excellent performance and robustness of the proposed schemes.
Submitted 22 April, 2019; v1 submitted 8 March, 2019;
originally announced March 2019.