-
UDF-GMA: Uncertainty Disentanglement and Fusion for General Movement Assessment
Authors:
Zeqi Luo,
Ali Gooya,
Edmond S. L. Ho
Abstract:
General movement assessment (GMA) is a non-invasive tool for the early detection of brain dysfunction through the qualitative assessment of general movements, and the development of automated methods can broaden its application. However, mainstream pose-based automated GMA methods are prone to uncertainty due to limited high-quality data and noisy pose estimation, which hinders clinical reliability in the absence of reliable uncertainty measures. In this work, we introduce UDF-GMA, which explicitly models epistemic uncertainty in model parameters and aleatoric uncertainty from data noise for pose-based automated GMA. UDF-GMA effectively disentangles uncertainties by directly modelling aleatoric uncertainty and estimating epistemic uncertainty through Bayesian approximation. We further propose fusing these uncertainties with the embedded motion representation to enhance class separation. Extensive experiments on the Pmi-GMA benchmark dataset demonstrate the effectiveness and generalisability of the proposed approach in predicting poor repertoire.
Submitted 7 July, 2025;
originally announced July 2025.
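A minimal sketch of one common way to separate the two uncertainty types named in the UDF-GMA abstract, assuming several stochastic forward passes (for example with dropout left active) that each return a predicted mean and a predicted variance. The abstract does not state the estimator, so the decomposition used here (mean of predicted variances as aleatoric, variance of predicted means as epistemic) and the function name `disentangle` are illustrative assumptions, not the authors' method.

```python
from statistics import fmean, pvariance


def disentangle(means, variances):
    """Split predictive uncertainty from T stochastic forward passes.

    means[t]     : mean predicted by pass t (hypothetical model output)
    variances[t] : data-noise variance predicted by pass t

    Assumed decomposition (not taken from the paper):
      aleatoric = average of the predicted variances
      epistemic = variance of the predicted means across passes
    """
    if len(means) != len(variances) or not means:
        raise ValueError("need the same non-zero number of means and variances")
    aleatoric = fmean(variances)
    epistemic = pvariance(means)
    return aleatoric, epistemic


if __name__ == "__main__":
    # Three made-up passes: the means disagree, the noise estimates agree.
    alea, epi = disentangle([0.2, 0.4, 0.6], [0.1, 0.1, 0.1])
    print(f"aleatoric={alea:.4f} epistemic={epi:.4f}")
```

If the passes agree on the mean, the epistemic term goes to zero and only the data-noise term remains, which is the behaviour one would expect from a model that is confident in its parameters.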
-
Relaxation-Free Min-k-Partition for PCI Assignment in 5G Networks
Authors:
Yeqing Qiu,
Chengpiao Huang,
Ye Xue,
Zhipeng Jiang,
Qingjiang Shi,
Dong Zhang,
Zhi-Quan Luo
Abstract:
Physical Cell Identity (PCI) is a critical parameter in 5G networks. Efficient and accurate PCI assignment is essential for mitigating mod-3 interference, mod-30 interference, collisions, and confusions among cells, which directly affect network reliability and user experience. In this paper, we propose a novel framework for PCI assignment by decomposing the problem into Min-3-Partition, Min-10-Partition, and a graph coloring problem, leveraging the Chinese Remainder Theorem (CRT). Furthermore, we develop a relaxation-free approach to the general Min-k-Partition problem by reformulating it as a quadratic program with a norm-equality constraint and solving it using a penalized mirror descent (PMD) algorithm. The proposed method demonstrates superior computational efficiency and scalability, significantly reducing interference while eliminating collisions and confusions in large-scale 5G networks. Numerical evaluations on real-world datasets show that our approach reduces computational time by up to 20 times compared to state-of-the-art methods, making it highly practical for real-time PCI optimization in large-scale networks. These results highlight the potential of our method to improve network performance and reduce deployment costs in modern 5G systems.
Submitted 13 June, 2025; v1 submitted 12 June, 2025;
originally announced June 2025.
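A small sketch of why the Chinese Remainder Theorem lets the PCI abstract split a mod-30 residue into a mod-3 part and a mod-10 part: because 3 and 10 are coprime, each pair of residues corresponds to exactly one value modulo 30. The function names and the way a full PCI is assembled from a hypothetical `group` index are assumptions for illustration; the abstract does not describe the authors' actual encoding.

```python
NUM_PCI = 1008  # 5G NR defines PCI values 0..1007


def combine_residues(r3: int, r10: int) -> int:
    """Return the unique x in [0, 30) with x % 3 == r3 and x % 10 == r10."""
    if not (0 <= r3 < 3 and 0 <= r10 < 10):
        raise ValueError("residues out of range")
    # CRT coefficients: 10 is 1 mod 3 and 0 mod 10; 21 is 0 mod 3 and 1 mod 10.
    return (10 * r3 + 21 * r10) % 30


def build_pci(group: int, r3: int, r10: int) -> int:
    """Assemble a PCI from a hypothetical group index plus the two residues."""
    pci = 30 * group + combine_residues(r3, r10)
    if not 0 <= pci < NUM_PCI:
        raise ValueError("PCI out of range")
    return pci


if __name__ == "__main__":
    print(combine_residues(2, 7))  # 17
    print(build_pci(4, 2, 7))      # 137
```

Under this reading, the mod-3 and mod-10 residues can be chosen by two separate partition problems and then recombined without conflict.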
-
JALMBench: Benchmarking Jailbreak Vulnerabilities in Audio Language Models
Authors:
Zifan Peng,
Yule Liu,
Zhen Sun,
Mingchen Li,
Zeren Luo,
Jingyi Zheng,
Wenhan Dong,
Xinlei He,
Xuechao Wang,
Yingjie Xue,
Shengmin Xu,
Xinyi Huang
Abstract:
Audio Language Models (ALMs) have made significant progress recently. These models integrate the audio modality directly into the model, rather than converting speech into text and inputting text to Large Language Models (LLMs). While jailbreak attacks on LLMs have been extensively studied, the security of ALMs with audio modalities remains largely unexplored. Currently, there is a lack of an adversarial audio dataset and a unified framework specifically designed to evaluate and compare attacks and ALMs. In this paper, we present JALMBench, the first comprehensive benchmark to assess the safety of ALMs against jailbreak attacks. JALMBench includes a dataset containing 2,200 text samples and 51,381 audio samples totalling over 268 hours. It supports 12 mainstream ALMs, 4 text-transferred and 4 audio-originated attack methods, and 5 defense methods. Using JALMBench, we provide an in-depth analysis of attack efficiency, topic sensitivity, voice diversity, and attack representations. Additionally, we explore mitigation strategies for the attacks at both the prompt level and the response level.
Submitted 23 May, 2025;
originally announced May 2025.
-
NTIRE 2025 Challenge on Day and Night Raindrop Removal for Dual-Focused Images: Methods and Results
Authors:
Xin Li,
Yeying Jin,
Xin Jin,
Zongwei Wu,
Bingchen Li,
Yufei Wang,
Wenhan Yang,
Yu Li,
Zhibo Chen,
Bihan Wen,
Robby T. Tan,
Radu Timofte,
Qiyu Rong,
Hongyuan Jing,
Mengmeng Zhang,
Jinglong Li,
Xiangyu Lu,
Yi Ren,
Yuting Liu,
Meng Zhang,
Xiang Chen,
Qiyuan Guan,
Jiangxin Dong,
Jinshan Pan,
Conglin Gou
, et al. (112 additional authors not shown)
Abstract:
This paper reviews the NTIRE 2025 Challenge on Day and Night Raindrop Removal for Dual-Focused Images. This challenge received a wide range of impressive solutions, which were developed and evaluated using our collected real-world Raindrop Clarity dataset. Unlike existing deraining datasets, our Raindrop Clarity dataset is more diverse and challenging in degradation types and contents, including day raindrop-focused, day background-focused, night raindrop-focused, and night background-focused degradations. This dataset is divided into three subsets for competition: 14,139 images for training, 240 images for validation, and 731 images for testing. The primary objective of this challenge is to establish a new and powerful benchmark for the task of removing raindrops under varying lighting and focus conditions. The competition had 361 participants in total, and 32 teams submitted valid solutions and fact sheets for the final testing phase. These submissions achieved state-of-the-art (SOTA) performance on the Raindrop Clarity dataset. The project can be found at https://lixinustc.github.io/CVPR-NTIRE2025-RainDrop-Competition.github.io/.
Submitted 19 April, 2025; v1 submitted 17 April, 2025;
originally announced April 2025.
-
AUTOBargeSim: MATLAB(R) toolbox for the design and analysis of the guidance and control system for autonomous inland vessels
Authors:
Abhishek Dhyani,
Amirreza Haqshenas Mojaveri,
Chengqian Zhang,
Dhanika Mahipala,
Hoang Anh Tran,
Yan-Yun Zhang,
Zhongbi Luo,
Vasso Reppa
Abstract:
This paper introduces AUTOBargeSim, a simulation toolbox for autonomous inland vessel guidance and control system design. AUTOBargeSim is developed using MATLAB and provides an easy-to-use introduction to various aspects of autonomous inland navigation, including mapping, modelling, control design, and collision avoidance, through examples and extensively documented code. The simulator follows modular design principles, so it can be easily modified according to the user's requirements. Furthermore, a GUI facilitates simple and quick execution. Key performance indices for evaluating the controller and the collision avoidance method in confined space are also provided. The current version of AUTOBargeSim attempts to improve reproducibility in the design and simulation of marine systems while serving as a foundation for simulating and evaluating vessel behaviour considering operational, system, and environmental constraints.
Submitted 18 June, 2025; v1 submitted 27 March, 2025;
originally announced March 2025.
-
A Note on Structural Controllability and Observability Indices
Authors:
Yuan Zhang,
Ranbo Cheng,
Ziyuan Luo,
Yuanqing Xia
Abstract:
In this note, we investigate the structural controllability and observability indices of structured systems. We provide counter-examples showing that an existing graph-theoretic characterization for the structural controllability index (SCOI) may not hold, even for systems with self-loop at every state node. We further demonstrate that this characterization actually provides upper bounds, and extend them to new graph-theoretic characterizations applicable to systems that are not necessarily structurally controllable. Additionally, we reveal that an existing method may fail to obtain the exact SCOI. Consequently, complete graph-theoretic characterizations and polynomial-time computation of SCOI remain open. Given this, we present an efficiently computable tight lower bound, whose tightness is validated by numerical simulations. All these results apply to the structural observability index by the duality between controllability and observability.
Submitted 18 February, 2025;
originally announced February 2025.
-
SegX: Improving Interpretability of Clinical Image Diagnosis with Segmentation-based Enhancement
Authors:
Yuhao Zhang,
Mingcheng Zhu,
Zhiyao Luo
Abstract:
Deep learning-based medical image analysis faces a significant barrier due to the lack of interpretability. Conventional explainable AI (XAI) techniques, such as Grad-CAM and SHAP, often highlight regions outside the areas of clinical interest. To address this issue, we propose Segmentation-based Explanation (SegX), a plug-and-play approach that enhances interpretability by aligning the model's explanation map with clinically relevant areas, leveraging the power of segmentation models. Furthermore, we introduce Segmentation-based Uncertainty Assessment (SegU), a method to quantify the uncertainty of the prediction model by measuring the 'distance' between interpretation maps and clinically significant regions. Our experiments on dermoscopic and chest X-ray datasets show that SegX improves interpretability consistently across modalities, and the certainty score provided by SegU reliably reflects the correctness of the model's predictions. Our approach offers a model-agnostic enhancement to medical image diagnosis towards reliable and interpretable AI in clinical decision-making.
Submitted 14 February, 2025;
originally announced February 2025.
-
ASAP: Aligning Simulation and Real-World Physics for Learning Agile Humanoid Whole-Body Skills
Authors:
Tairan He,
Jiawei Gao,
Wenli Xiao,
Yuanhang Zhang,
Zi Wang,
Jiashun Wang,
Zhengyi Luo,
Guanqi He,
Nikhil Sobanbab,
Chaoyi Pan,
Zeji Yi,
Guannan Qu,
Kris Kitani,
Jessica Hodgins,
Linxi "Jim" Fan,
Yuke Zhu,
Changliu Liu,
Guanya Shi
Abstract:
Humanoid robots hold the potential for unparalleled versatility in performing human-like, whole-body skills. However, achieving agile and coordinated whole-body motions remains a significant challenge due to the dynamics mismatch between simulation and the real world. Existing approaches, such as system identification (SysID) and domain randomization (DR) methods, often rely on labor-intensive parameter tuning or result in overly conservative policies that sacrifice agility. In this paper, we present ASAP (Aligning Simulation and Real-World Physics), a two-stage framework designed to tackle the dynamics mismatch and enable agile humanoid whole-body skills. In the first stage, we pre-train motion tracking policies in simulation using retargeted human motion data. In the second stage, we deploy the policies in the real world and collect real-world data to train a delta (residual) action model that compensates for the dynamics mismatch. Then, ASAP fine-tunes pre-trained policies with the delta action model integrated into the simulator to align effectively with real-world dynamics. We evaluate ASAP across three transfer scenarios: IsaacGym to IsaacSim, IsaacGym to Genesis, and IsaacGym to the real-world Unitree G1 humanoid robot. Our approach significantly improves agility and whole-body coordination across various dynamic motions, reducing tracking error compared to SysID, DR, and delta dynamics learning baselines. ASAP enables highly agile motions that were previously difficult to achieve, demonstrating the potential of delta action learning in bridging simulation and real-world dynamics. These results suggest a promising sim-to-real direction for developing more expressive and agile humanoids.
Submitted 25 April, 2025; v1 submitted 3 February, 2025;
originally announced February 2025.
-
Preventing output saturation in active noise control: An output-constrained Kalman filter approach
Authors:
Junwei Ji,
Dongyuan Shi,
Boxiang Wang,
Xiaoyi Shen,
Zhengding Luo,
Woon-Seng Gan
Abstract:
The Kalman filter (KF)-based active noise control (ANC) system demonstrates superior tracking and faster convergence compared to the least mean square (LMS) method, particularly in dynamic noise cancellation scenarios. However, in environments with extremely high noise levels, the power of the control signal can exceed the system's rated output power due to hardware limitations, leading to output saturation and subsequent non-linearity. To mitigate this issue, a modified KF with an output constraint is proposed. In this approach, the disturbance, treated as a measurement, is re-scaled by a constraint factor, which is determined by the system's rated power, the secondary path gain, and the disturbance power. As a result, the output power of the system, i.e. the control signal, is indirectly constrained within the maximum output of the system, ensuring stability. Simulation results indicate that the proposed algorithm not only achieves rapid suppression of dynamic noise but also effectively prevents non-linearity due to output saturation, highlighting its practical significance.
Submitted 25 December, 2024;
originally announced December 2024.
-
CBraMod: A Criss-Cross Brain Foundation Model for EEG Decoding
Authors:
Jiquan Wang,
Sha Zhao,
Zhiling Luo,
Yangxuan Zhou,
Haiteng Jiang,
Shijian Li,
Tao Li,
Gang Pan
Abstract:
Electroencephalography (EEG) is a non-invasive technique to measure and record brain electrical activity, widely used in various BCI and healthcare applications. Early EEG decoding methods rely on supervised learning and are limited to specific tasks and datasets, which hinders model performance and generalizability. With the success of large language models, there is a growing body of studies focusing on EEG foundation models. However, these studies still leave challenges. First, most existing EEG foundation models employ a full EEG modeling strategy, which models the spatial and temporal dependencies between all EEG patches together but ignores that these dependencies are heterogeneous due to the unique structural characteristics of EEG signals. Second, existing EEG foundation models have limited generalizability across a wide range of downstream BCI tasks because the formats of EEG data vary, which makes adaptation challenging. To address these challenges, we propose a novel foundation model called CBraMod. Specifically, we devise a criss-cross transformer as the backbone to thoroughly leverage the structural characteristics of EEG signals, which can model spatial and temporal dependencies separately through two parallel attention mechanisms. We also utilize an asymmetric conditional positional encoding scheme, which can encode positional information of EEG patches and be easily adapted to EEG with diverse formats. CBraMod is pre-trained on a very large corpus of EEG through patch-based masked EEG reconstruction. We evaluate CBraMod on up to 10 downstream BCI tasks (12 public datasets). CBraMod achieves state-of-the-art performance across this wide range of tasks, demonstrating its strong capability and generalizability. The source code is publicly available at https://github.com/wjq-learning/CBraMod.
Submitted 13 April, 2025; v1 submitted 10 December, 2024;
originally announced December 2024.
-
QoS-Aware and Routing-Flexible Network Slicing for Service-Oriented Networks
Authors:
Wei-Kun Chen,
Ya-Feng Liu,
Yu-Hong Dai,
Zhi-Quan Luo
Abstract:
In this paper, we consider the network slicing (NS) problem which attempts to map multiple customized virtual network requests (also called services) to a common shared network infrastructure and manage network resources to meet diverse quality of service (QoS) requirements. We propose a mixed-integer nonlinear programming (MINLP) formulation for the considered NS problem that can flexibly route the traffic flow of the services on multiple paths and provide end-to-end delay and reliability guarantees for all services. To overcome the computational difficulty due to the intrinsic nonlinearity in the MINLP formulation, we transform the MINLP formulation into an equivalent mixed-integer linear programming (MILP) formulation and further show that their continuous relaxations are equivalent. In sharp contrast to the continuous relaxation of the MINLP formulation which is a nonconvex nonlinear programming problem, the continuous relaxation of the MILP formulation is a polynomial-time solvable linear programming problem, which significantly facilitates the algorithmic design. Based on the newly proposed MILP formulation, we develop a customized column generation (cCG) algorithm for solving the NS problem. The proposed cCG algorithm is a decomposition-based algorithm and is particularly suitable for solving large-scale NS problems. Numerical results demonstrate the efficacy of the proposed formulations and the proposed cCG algorithm.
Submitted 2 July, 2025; v1 submitted 20 September, 2024;
originally announced September 2024.
-
Transferable Selective Virtual Sensing Active Noise Control Technique Based on Metric Learning
Authors:
Boxiang Wang,
Dongyuan Shi,
Zhengding Luo,
Xiaoyi Shen,
Junwei Ji,
Woon-Seng Gan
Abstract:
Virtual sensing (VS) technology enables active noise control (ANC) systems to attenuate noise at virtual locations distant from the physical error microphones. Appropriate auxiliary filters (AF) can significantly enhance the effectiveness of VS approaches. The selection of appropriate AF for various types of noise can be automatically achieved using convolutional neural networks (CNNs). However, training the CNN model for different ANC systems is often labour-intensive and time-consuming. To tackle this problem, we propose a novel method, Transferable Selective VS, by integrating metric-learning technology into CNN-based VS approaches. The Transferable Selective VS method allows a pre-trained CNN to be applied directly to new ANC systems without requiring retraining, and it can handle unseen noise types. Numerical simulations demonstrate the effectiveness of the proposed method in attenuating sudden-varying broadband noises and real-world noises.
Submitted 9 September, 2024;
originally announced September 2024.
-
Enhancing Multi-Stream Beamforming Through CQIs For 5G NR FDD Massive MIMO Communications: A Tuning-Free Scheme
Authors:
Kai Li,
Ying Li,
Lei Cheng,
Zhi-Quan Luo
Abstract:
In the fifth-generation new radio (5G NR) frequency division duplex (FDD) massive multiple-input and multiple-output (MIMO) systems, downlink beamforming relies on the acquisition of downlink channel state information (CSI). Codebook-based limited feedback schemes have been proposed and widely used in practice to recover the downlink CSI with low communication overhead. In such schemes, the performance of downlink beamforming is determined by the codebook design and the codebook indicator feedback. However, limited by the quantization quality of the codebook, directly utilizing the codeword indicated by the feedback as the beamforming vector cannot achieve high performance. Therefore, other feedback values, such as the channel quality indicator (CQI), should be considered to enhance beamforming. In this paper, we present the relation between CQI and the optimal beamforming vectors, based on which an empirical Bayes based intelligent tuning-free algorithm is devised to learn the optimal beamforming vector and the associated regularization parameter. The proposed algorithm can handle different communication scenarios of MIMO systems, including single-stream and multi-stream data transmission scenarios. Numerical results show the excellent performance of the proposed algorithm in terms of both beamforming vector acquisition and regularization parameter learning.
Submitted 1 September, 2024;
originally announced September 2024.
-
Real-time Event Recognition of Long-distance Distributed Vibration Sensing with Knowledge Distillation and Hardware Acceleration
Authors:
Zhongyao Luo,
Hao Wu,
Zhao Ge,
Ming Tang
Abstract:
Fiber-optic sensing, especially distributed optical fiber vibration sensing (DVS), is gaining importance in Internet of Things (IoT) applications, such as industrial safety monitoring and intrusion detection. Despite their wide application, existing post-processing methods that rely on deep learning models for event recognition in DVS systems face challenges with real-time processing of large sample data volumes, particularly in long-distance applications. To address this issue, we propose to use a four-layer convolutional neural network (CNN) with ResNet as the teacher model for knowledge distillation. This results in a significant improvement in accuracy, from 83.41% to 95.39%, on data from previously untrained environments. Additionally, we propose a novel hardware design based on field-programmable gate arrays (FPGA) to further accelerate model inference. This design replaces multiplication with binary shift operations and quantizes model weights, enabling high parallelism and low latency. Our implementation achieves an inference time of 0.083 ms for a spatial-temporal sample covering a 12.5 m fiber length and 0.256 s time frame. This performance enables real-time signal processing over approximately 38.55 km of fiber, about $2.14\times$ the capability of an Nvidia RTX 4090 GPU. The proposed method greatly enhances the efficiency of vibration pattern recognition, promoting the use of DVS as a smart IoT system. The data and code are available at https://github.com/HUST-IOF/Efficient-DVS.
Submitted 22 August, 2024; v1 submitted 7 August, 2024;
originally announced August 2024.
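The 38.55 km figure in the DVS abstract follows from the three numbers quoted there, under the assumption that samples are processed one after another and that "real time" means every 12.5 m segment must be classified once per 0.256 s window. The function name and that sequential-processing assumption are illustrative and not taken from the paper.

```python
def realtime_fiber_km(inference_ms: float, window_s: float, span_m: float) -> float:
    """Fiber length that can be covered in real time.

    inference_ms : time to classify one spatial-temporal sample
    window_s     : time frame covered by one sample
    span_m       : fiber length covered by one sample
    Assumes samples are processed sequentially (an illustrative assumption).
    """
    # Number of samples that fit into one time window.
    samples_per_window = (window_s * 1000.0) / inference_ms
    # Each sample covers span_m metres of fiber; convert to kilometres.
    return samples_per_window * span_m / 1000.0


if __name__ == "__main__":
    km = realtime_fiber_km(inference_ms=0.083, window_s=0.256, span_m=12.5)
    print(round(km, 2))  # 38.55
```

About 3,084 samples fit into each 0.256 s window, and 3,084 segments of 12.5 m each give roughly 38.55 km.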
-
Blind Beamforming for Coverage Enhancement with Intelligent Reflecting Surface
Authors:
Fan Xu,
Jiawei Yao,
Wenhai Lai,
Kaiming Shen,
Xin Li,
Xin Chen,
Zhi-Quan Luo
Abstract:
Conventional policies for configuring an intelligent reflecting surface (IRS) typically require channel state information (CSI), thus incurring substantial overhead costs and facing incompatibility with the current network protocols. This paper proposes a blind beamforming strategy in the absence of CSI, aiming to boost the minimum signal-to-noise ratio (SNR) among all the receiver positions, namely the coverage enhancement. Although some existing works already consider the IRS-assisted coverage enhancement without CSI, they assume certain position-channel models through which the channels can be recovered from the geographic locations. In contrast, our approach solely relies on the received signal power data, not assuming any position-channel model. We examine the achievability and converse of the proposed blind beamforming method. If the IRS has $N$ reflective elements and there are $U$ receiver positions, then our method guarantees a minimum SNR of $\Omega(N^2/U)$, which is fairly close to the upper bound $O(N+N^2\sqrt{\ln (NU)}/\sqrt[4]{U})$. Aside from the simulation results, we justify the practical use of blind beamforming in a field test at 2.6 GHz. According to the real-world experiment, the proposed blind beamforming method boosts the minimum SNR across seven random positions in a conference room by 18.22 dB, while the position-based method yields a boost of 12.08 dB.
Submitted 17 July, 2024;
originally announced July 2024.
-
Imaging Interiors: An Implicit Solution to Electromagnetic Inverse Scattering Problems
Authors:
Ziyuan Luo,
Boxin Shi,
Haoliang Li,
Renjie Wan
Abstract:
Electromagnetic Inverse Scattering Problems (EISP) have gained wide applications in computational imaging. By solving EISP, the internal relative permittivity of the scatterer can be non-invasively determined based on the scattered electromagnetic fields. Despite previous efforts to address EISP, achieving better solutions to this problem has remained elusive, due to the challenges posed by inversion and discretization. This paper tackles those challenges in EISP via an implicit approach. By representing the scatterer's relative permittivity as a continuous implicit representation, our method is able to address the low-resolution problems arising from discretization. Further, optimizing this implicit representation within a forward framework allows us to conveniently circumvent the challenges posed by inverse estimation. Our approach outperforms existing methods on standard benchmark datasets. Project page: https://luo-ziyuan.github.io/Imaging-Interiors
Submitted 12 July, 2024;
originally announced July 2024.
-
OmniH2O: Universal and Dexterous Human-to-Humanoid Whole-Body Teleoperation and Learning
Authors:
Tairan He,
Zhengyi Luo,
Xialin He,
Wenli Xiao,
Chong Zhang,
Weinan Zhang,
Kris Kitani,
Changliu Liu,
Guanya Shi
Abstract:
We present OmniH2O (Omni Human-to-Humanoid), a learning-based system for whole-body humanoid teleoperation and autonomy. Using kinematic pose as a universal control interface, OmniH2O enables various ways for a human to control a full-sized humanoid with dexterous hands, including using real-time teleoperation through VR headset, verbal instruction, and RGB camera. OmniH2O also enables full autonomy by learning from teleoperated demonstrations or integrating with frontier models such as GPT-4. OmniH2O demonstrates versatility and dexterity in various real-world whole-body tasks through teleoperation or autonomy, such as playing multiple sports, moving and manipulating objects, and interacting with humans. We develop an RL-based sim-to-real pipeline, which involves large-scale retargeting and augmentation of human motion datasets, learning a real-world deployable policy with sparse sensor input by imitating a privileged teacher policy, and reward designs to enhance robustness and stability. We release the first humanoid whole-body control dataset, OmniH2O-6, containing six everyday tasks, and demonstrate humanoid whole-body skill learning from teleoperated datasets.
Submitted 13 June, 2024;
originally announced June 2024.
-
A Survey of Integrating Wireless Technology into Active Noise Control
Authors:
Xiaoyi Shen,
Dongyuan Shi,
Zhengding Luo,
Junwei Ji,
Woon-Seng Gan
Abstract:
Active Noise Control (ANC) is a widely adopted technology for reducing environmental noise across various scenarios. This paper focuses on enhancing noise reduction performance, particularly through the refinement of signal quality fed into ANC systems. We discuss the main wireless technique integrated into the ANC system, equipped with some innovative algorithms, in diverse environments. Instead of using microphone arrays, which increase the computational complexity of the ANC system, to isolate multiple noise sources to improve noise reduction performance, the application of the wireless technique avoids extra computational demand. Wireless transmissions of reference, error, and control signals are also applied to improve the convergence performance of the ANC system. Furthermore, this paper lists some wireless ANC applications, such as earbuds, headphones, windows, and headrests, underscoring their adaptability and efficiency in various settings.
Submitted 21 May, 2024;
originally announced May 2024.
-
POPDG: Popular 3D Dance Generation with PopDanceSet
Authors:
Zhenye Luo,
Min Ren,
Xuecai Hu,
Yongzhen Huang,
Li Yao
Abstract:
Generating dances that are both lifelike and well-aligned with music continues to be a challenging task in the cross-modal domain. This paper introduces PopDanceSet, the first dataset tailored to the preferences of young audiences, enabling the generation of aesthetically oriented dances. It surpasses the AIST++ dataset in music genre diversity and in the intricacy and depth of dance movements. Moreover, the proposed POPDG model within the iDDPM framework enhances dance diversity and, through the Space Augmentation Algorithm, strengthens spatial physical connections between human body joints, ensuring that increased diversity does not compromise generation quality. A streamlined Alignment Module is also designed to improve the temporal alignment between dance and music. Extensive experiments show that POPDG achieves SOTA results on two datasets. The paper also expands on current evaluation metrics. The dataset and code are available at https://github.com/Luke-Luo1/POPDG.
Submitted 27 December, 2024; v1 submitted 6 May, 2024;
originally announced May 2024.
-
DPER: Diffusion Prior Driven Neural Representation for Limited Angle and Sparse View CT Reconstruction
Authors:
Chenhe Du,
Xiyue Lin,
Qing Wu,
Xuanyu Tian,
Ying Su,
Zhe Luo,
Rui Zheng,
Yang Chen,
Hongjiang Wei,
S. Kevin Zhou,
Jingyi Yu,
Yuyao Zhang
Abstract:
Limited-angle and sparse-view computed tomography (LACT and SVCT) are crucial for expanding the scope of X-ray CT applications. However, they face challenges due to incomplete data acquisition, resulting in diverse artifacts in the reconstructed CT images. Emerging implicit neural representation (INR) techniques, such as NeRF, NeAT, and NeRP, have shown promise in under-determined CT imaging reconstruction tasks. However, the unsupervised nature of INR architecture imposes limited constraints on the solution space, particularly for the highly ill-posed reconstruction task posed by LACT and ultra-SVCT. In this study, we introduce the Diffusion Prior Driven Neural Representation (DPER), an advanced unsupervised framework designed to address the exceptionally ill-posed CT reconstruction inverse problems. DPER adopts the Half Quadratic Splitting (HQS) algorithm to decompose the inverse problem into data fidelity and distribution prior sub-problems. The two sub-problems are addressed by an INR reconstruction scheme and a pre-trained score-based diffusion model, respectively. This combination first injects the implicit image local consistency prior from INR. Additionally, it effectively augments the feasibility of the solution space for the inverse problem through the generative diffusion model, resulting in increased stability and precision in the solutions. We conduct comprehensive experiments to evaluate the performance of DPER on LACT and ultra-SVCT reconstruction with two public datasets (AAPM and LIDC), an in-house clinical COVID-19 dataset and a public raw projection dataset created by Mayo Clinic. The results show that our method outperforms the state-of-the-art reconstruction methods on in-domain datasets, while achieving significant performance improvements on out-of-domain (OOD) datasets.
Submitted 19 July, 2024; v1 submitted 27 April, 2024;
originally announced April 2024.
-
Pneumonia App: a mobile application for efficient pediatric pneumonia diagnosis using explainable convolutional neural networks (CNN)
Authors:
Jiaming Deng,
Zhenglin Chen,
Minjiang Chen,
Lulu Xu,
Jiaqi Yang,
Zhendong Luo,
Peiwu Qin
Abstract:
Mycoplasma pneumoniae pneumonia (MPP) poses significant diagnostic challenges in pediatric healthcare, especially in regions like China where it's prevalent. We introduce PneumoniaAPP, a mobile application leveraging deep learning techniques for rapid MPP detection. Our approach capitalizes on convolutional neural networks (CNNs) trained on a comprehensive dataset comprising 3345 chest X-ray (CXR) images, which includes 833 CXR images revealing MPP and additionally augmented with samples from a public dataset. The CNN model achieved an accuracy of 88.20% and an AUROC of 0.9218 across all classes, with a specific accuracy of 97.64% for the mycoplasma class, as demonstrated on the testing dataset. Furthermore, we integrated explainability techniques into PneumoniaAPP to aid respiratory physicians in lung opacity localization. Our contribution extends beyond existing research by targeting pediatric MPP, emphasizing the age group of 0-12 years, and prioritizing deployment on mobile devices. This work signifies a significant advancement in pediatric pneumonia diagnosis, offering a reliable and accessible tool to alleviate diagnostic burdens in healthcare settings.
Submitted 30 March, 2024;
originally announced April 2024.
-
Learning Human-to-Humanoid Real-Time Whole-Body Teleoperation
Authors:
Tairan He,
Zhengyi Luo,
Wenli Xiao,
Chong Zhang,
Kris Kitani,
Changliu Liu,
Guanya Shi
Abstract:
We present Human to Humanoid (H2O), a reinforcement learning (RL) based framework that enables real-time whole-body teleoperation of a full-sized humanoid robot with only an RGB camera. To create a large-scale retargeted motion dataset of human movements for humanoid robots, we propose a scalable "sim-to-data" process to filter and pick feasible motions using a privileged motion imitator. Afterwards, we train a robust real-time humanoid motion imitator in simulation using these refined motions and transfer it to the real humanoid robot in a zero-shot manner. We successfully achieve teleoperation of dynamic whole-body motions in real-world scenarios, including walking, back jumping, kicking, turning, waving, pushing, boxing, etc. To the best of our knowledge, this is the first demonstration to achieve learning-based real-time whole-body humanoid teleoperation.
Submitted 7 March, 2024;
originally announced March 2024.
-
Radar Anti-jamming Strategy Learning via Domain-knowledge Enhanced Online Convex Optimization
Authors:
Liangqi Liu,
Wenqiang Pu,
Yingru Li,
Bo Jiu,
Zhi-Quan Luo
Abstract:
The dynamic competition between radar and jammer systems presents a significant challenge for modern Electronic Warfare (EW), as current active learning approaches still lack sample efficiency and fail to exploit the jammer's characteristics. In this paper, the competition between a frequency agile radar and a Digital Radio Frequency Memory (DRFM)-based intelligent jammer is considered. We introduce an Online Convex Optimization (OCO) framework designed to illustrate this adversarial interaction. Notably, traditional OCO algorithms exhibit suboptimal sample efficiency due to the limited information obtained per round. To address these limitations, two refined algorithms are proposed, utilizing unbiased gradient estimators that leverage the unique attributes of the jammer system. Sub-linear theoretical results on both static regret and universal regret are provided, marking a significant improvement in OCO performance. Furthermore, simulation results reveal that the proposed algorithms outperform common OCO baselines, suggesting the potential for effective deployment in real-world scenarios.
Submitted 11 July, 2024; v1 submitted 25 February, 2024;
originally announced February 2024.
-
Calibration of Deep Learning Classification Models in fNIRS
Authors:
Zhihao Cao,
Zizhou Luo
Abstract:
Functional near-infrared spectroscopy (fNIRS) is a valuable non-invasive tool for monitoring brain activity. The classification of fNIRS data in relation to conscious activity holds significance for advancing our understanding of the brain and facilitating the development of brain-computer interfaces (BCI). Many researchers have turned to deep learning to tackle the classification challenges inherent in fNIRS data due to its strong generalization and robustness. In the application of fNIRS, reliability is essential, and one mathematical formulation of the reliability of confidence is calibration. However, many researchers overlook the important issue of calibration. To address this gap, we propose integrating calibration into the fNIRS field and assessing the reliability of existing models. Surprisingly, our results indicate poor calibration performance in many proposed models. To advance calibration development in the fNIRS field, we summarize three practical tips. Through this letter, we hope to emphasize the critical role of calibration in fNIRS research and argue for enhancing the reliability of deep learning-based predictions in fNIRS classification tasks. All data from our experimental process are openly available on GitHub.
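As an illustration of what "calibration" measures, the sketch below computes the expected calibration error (ECE), a common calibration metric. The abstract does not state which metric the authors use, so the choice of ECE, the function name, and the equal-width binning are assumptions made only for this example.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Illustrative ECE: weighted mean gap between accuracy and confidence per bin.

    confidences: predicted probability of the chosen class, in (0, 1].
    correct: 1 if the prediction was right, 0 otherwise.
    n_bins: number of equal-width confidence bins (an assumption of this sketch).
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            # Gap between empirical accuracy and mean confidence in this bin.
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            # Weight the gap by the fraction of samples that fall in the bin.
            ece += in_bin.mean() * gap
    return ece
```

A model whose stated confidence matches its empirical accuracy in every bin scores 0; larger values indicate over- or under-confidence.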
Submitted 20 March, 2024; v1 submitted 23 February, 2024;
originally announced February 2024.
-
Unsupervised learning based end-to-end delayless generative fixed-filter active noise control
Authors:
Zhengding Luo,
Dongyuan Shi,
Xiaoyi Shen,
Woon-Seng Gan
Abstract:
Delayless noise control is achieved by our earlier generative fixed-filter active noise control (GFANC) framework through efficient coordination between the co-processor and real-time controller. However, the one-dimensional convolutional neural network (1D CNN) in the co-processor requires initial training using labelled noise datasets. Labelling noise data can be resource-intensive and may introduce some biases. In this paper, we propose an unsupervised-GFANC approach to simplify the 1D CNN training process and enhance its practicality. During training, the co-processor and real-time controller are integrated into an end-to-end differentiable ANC system. This enables us to use the accumulated squared error signal as the loss for training the 1D CNN. With this unsupervised learning paradigm, the unsupervised-GFANC method not only omits the labelling process but also exhibits better noise reduction performance compared to the supervised GFANC method in real noise experiments.
△ Less
Submitted 8 February, 2024;
originally announced February 2024.
-
Entropy-regularized Diffusion Policy with Q-Ensembles for Offline Reinforcement Learning
Authors:
Ruoqi Zhang,
Ziwei Luo,
Jens Sjölund,
Thomas B. Schön,
Per Mattsson
Abstract:
This paper presents advanced techniques of training diffusion policies for offline reinforcement learning (RL). At the core is a mean-reverting stochastic differential equation (SDE) that transfers a complex action distribution into a standard Gaussian and then samples actions conditioned on the environment state with a corresponding reverse-time SDE, like a typical diffusion policy. We show that such an SDE has a solution that we can use to calculate the log probability of the policy, yielding an entropy regularizer that improves the exploration of offline datasets. To mitigate the impact of inaccurate value functions from out-of-distribution data points, we further propose to learn the lower confidence bound of Q-ensembles for more robust policy improvement. By combining the entropy-regularized diffusion policy with Q-ensembles in offline RL, our method achieves state-of-the-art performance on most tasks in D4RL benchmarks. Code is available at https://github.com/ruoqizzz/Entropy-Regularized-Diffusion-Policy-with-QEnsemble.
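The lower confidence bound over a Q-ensemble mentioned above can be sketched as the ensemble mean minus a multiple of the ensemble standard deviation. This is a generic formulation, not necessarily the authors' exact estimator; the function name and the coefficient `beta` are hypothetical.

```python
import numpy as np

def lower_confidence_bound(q_values, beta=1.0):
    """Pessimistic value estimate from an ensemble of Q-functions.

    q_values: array of shape (n_ensemble, batch), one row per ensemble member.
    beta: pessimism coefficient (hypothetical; the abstract does not give one).
    """
    q_values = np.asarray(q_values, dtype=float)
    # Disagreement between members lowers the estimate, penalising
    # state-action pairs the ensemble is uncertain about.
    return q_values.mean(axis=0) - beta * q_values.std(axis=0)
```

Where the ensemble members agree, the bound equals their shared estimate; where they disagree, as is typical for out-of-distribution actions, the estimate is pushed down.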
Submitted 8 January, 2025; v1 submitted 6 February, 2024;
originally announced February 2024.
-
Mapping Information in Feature Extraction Transformation for Chirp Signal
Authors:
Shuyi Gu,
Zhenghua Luo,
Lin Hu,
Yilin Zhang,
Junxiong Guo
Abstract:
Chirp signals have found diverse applications because they are capable of producing time-dependent linear frequencies. Most feature extraction transformation methods for chirp signals focus on enhancing the performance of the transform methods but neglect the information derived from the transformation process. Consequently, they may fail to fully exploit the information from observations, resulting in decreased performance under conditions of low signal-to-noise ratio (SNR) and limited observations. In this work, we develop a novel post-processing method, called the mapping information model, to address this challenge. The model establishes a link between the observation space and the feature space in the feature extraction transform, enabling interference suppression and more accurate information by iteratively resampling and assigning weights in both spaces. Analysis of the iteration process reveals a continual increase in the weight of signal samples and a gradual stabilization in the weight of noise samples. The demonstrated noise suppression and feature enhancement during the iteration process support the effectiveness of the mapping information model. Furthermore, numerical simulations affirm the high efficiency of the proposed model by showing enhanced signal detection and estimation performance without requiring additional observations. The model improves the performance of feature extraction transformation for chirp signal processing under low SNR and limited observation conditions, opening up new opportunities in areas such as communication, biomedicine, and remote sensing.
Submitted 10 January, 2024;
originally announced January 2024.
-
Intelligent Surfaces Empowered Wireless Network: Recent Advances and The Road to 6G
Authors:
Qingqing Wu,
Beixiong Zheng,
Changsheng You,
Lipeng Zhu,
Kaiming Shen,
Xiaodan Shao,
Weidong Mei,
Boya Di,
Hongliang Zhang,
Ertugrul Basar,
Lingyang Song,
Marco Di Renzo,
Zhi-Quan Luo,
Rui Zhang
Abstract:
Intelligent surfaces (ISs) have emerged as a key technology to empower a wide range of appealing applications for wireless networks, due to their low cost, high energy efficiency, flexibility of deployment and capability of constructing favorable wireless channels/radio environments. Moreover, the recent advent of several new IS architectures further expanded their electromagnetic functionalities from passive reflection to active amplification, simultaneous reflection and refraction, as well as holographic beamforming. However, the research on ISs is still in rapid progress and there have been recent technological advances in ISs and their emerging applications that are worthy of a timely review. Thus, we provide in this paper a comprehensive survey on the recent development and advances of IS-aided wireless networks. Specifically, we start with an overview of the anticipated use cases of ISs in future wireless networks such as 6G, followed by a summary of the recent standardization activities related to ISs. Then, the main design issues of the commonly adopted reflection-based IS and their state-of-the-art solutions are presented in detail, including reflection optimization, deployment, signal modulation, wireless sensing, and integrated sensing and communications. Finally, recent progress and new challenges in advanced IS architectures are discussed to inspire future research.
Submitted 24 March, 2024; v1 submitted 28 December, 2023;
originally announced December 2023.
-
End-to-end Alternating Optimization for Real-World Blind Super Resolution
Authors:
Zhengxiong Luo,
Yan Huang,
Shang Li,
Liang Wang,
Tieniu Tan
Abstract:
Blind Super-Resolution (SR) usually involves two sub-problems: 1) estimating the degradation of the given low-resolution (LR) image; 2) super-resolving the LR image to its high-resolution (HR) counterpart. Both problems are ill-posed due to the information loss in the degrading process. Most previous methods try to solve the two problems independently, but often fall into a dilemma: a good super-resolved HR result requires an accurate degradation estimation, which, however, is difficult to obtain without the help of original HR information. To address this issue, instead of considering these two problems independently, we adopt an alternating optimization algorithm, which can estimate the degradation and restore the SR image in a single model. Specifically, we design two convolutional neural modules, namely Restorer and Estimator. Restorer restores the SR image based on the estimated degradation, and Estimator estimates the degradation with the help of the restored SR image. We alternate these two modules repeatedly and unfold this process to form an end-to-end trainable network. In this way, both Restorer and Estimator benefit from the intermediate results of each other, making each sub-problem easier. Moreover, Restorer and Estimator are optimized in an end-to-end manner, so they become more tolerant of each other's estimation deviations and cooperate better to achieve more robust and accurate final results. Extensive experiments on both synthetic datasets and real-world images show that the proposed method can largely outperform state-of-the-art methods and produce more visually favorable results. The code is released at https://github.com/greatlog/RealDAN.git.
Submitted 17 August, 2023;
originally announced August 2023.
-
Real-time FPGA Implementation of CNN-based Distributed Fiber Optic Vibration Event Recognition Method
Authors:
Zhongyao Luo,
Zhao Ge,
Hao Wu,
Ming Tang
Abstract:
Utilizing optical fibers to detect and pinpoint vibrations, Distributed Optical Fiber Vibration Sensing (DVS) technology provides real-time monitoring and surveillance of wide-reaching areas. This field has been leveraging Convolutional Neural Networks (CNN). Recently, a study has accomplished end-to-end vibration event recognition, enabling utilization of CNN-based DVS algorithms as a real-time embedded system for edge computing in practical application situations. Considering the power consumption of the central processing unit (CPU) and graphics processing unit (GPU), and the inflexibility of the application-specific integrated circuit (ASIC), the field-programmable gate array (FPGA) is the optimal computing platform for the system. This paper proposes to compress the pre-trained network and adopt a novel hardware structure to design a fully on-chip, pipelined inference accelerator for the CNN-based DVS algorithm, without fine-tuning or re-training. This design allows for real-time processing with low power consumption and system requirements. The design has been evaluated on an existing DVS algorithm based on a 40-layer CNN model comprising 2.7 million parameters. It is implemented completely on-chip and pipelined, with no reduction in accuracy.
Submitted 8 August, 2023;
originally announced August 2023.
-
Boundary Difference Over Union Loss For Medical Image Segmentation
Authors:
Fan Sun,
Zhiming Luo,
Shaozi Li
Abstract:
Medical image segmentation is crucial for clinical diagnosis. However, current losses for medical image segmentation mainly focus on overall segmentation results, with fewer losses proposed to guide boundary segmentation. Those that do exist often need to be used in combination with other losses and produce ineffective results. To address this issue, we have developed a simple and effective loss called the Boundary Difference over Union Loss (Boundary DoU Loss) to guide boundary region segmentation. It is obtained by calculating the ratio of the difference set of prediction and ground truth to the union of the difference set and the partial intersection set. Our loss relies only on region calculation, making it easy to implement and stable to train without needing any additional losses. Additionally, we use the target size to adaptively adjust the attention applied to the boundary regions. Experimental results using UNet, TransUNet, and Swin-UNet on two datasets (ACDC and Synapse) demonstrate the effectiveness of our proposed loss function. Code is available at https://github.com/sunfan-bvb/BoundaryDoULoss.
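The ratio described in the abstract can be sketched on binary masks as follows. This is a minimal illustration and not the released implementation: the function name is hypothetical, and the weight `alpha` that selects the "partial" intersection is held fixed here, whereas the abstract says it is adapted to the target size.

```python
import numpy as np

def difference_over_union(pred, target, alpha=0.5):
    """Illustrative difference-over-union ratio on binary masks.

    pred, target: arrays of 0/1 (or bool) with the same shape.
    alpha: fraction of the intersection excluded from the denominator,
           in [0, 1); a fixed, assumed value in this sketch.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    diff = union - inter  # pixels where prediction and ground truth disagree
    # Union of the difference set and the remaining (1 - alpha) share of the intersection.
    denom = diff + (1.0 - alpha) * inter
    return float(diff / denom) if denom > 0 else 0.0
```

The value is 0 for a perfect match and approaches 1 as the overlap vanishes; a larger `alpha` shrinks the denominator and so penalises the same disagreement more heavily.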
Submitted 31 July, 2023;
originally announced August 2023.
-
Perturbing a Neural Network to Infer Effective Connectivity: Evidence from Synthetic EEG Data
Authors:
Peizhen Yang,
Xinke Shen,
Zongsheng Li,
Zixiang Luo,
Kexin Lou,
Quanying Liu
Abstract:
Identifying causal relationships among distinct brain areas, known as effective connectivity, holds key insights into the brain's information processing and cognitive functions. Electroencephalogram (EEG) signals exhibit intricate dynamics and inter-areal interactions within the brain. However, methods for characterizing nonlinear causal interactions among multiple brain regions remain relatively underdeveloped. In this study, we proposed a data-driven framework to infer effective connectivity by perturbing the trained neural networks. Specifically, we trained neural networks (i.e., CNN, vanilla RNN, GRU, LSTM, and Transformer) to predict future EEG signals according to historical data and perturbed the networks' input to obtain effective connectivity (EC) between the perturbed EEG channel and the rest of the channels. The EC reflects the causal impact of perturbing one node on others. The performance was tested on synthetic EEG generated by a biologically plausible Jansen-Rit model. CNN and Transformer obtained the best performance on both 3-channel and 90-channel synthetic EEG data, outperforming the classical Granger causality method. Our work demonstrated the potential of perturbing an artificial neural network, trained to predict future system dynamics, to uncover the underlying causal structure.
Submitted 19 July, 2023;
originally announced July 2023.
-
An Efficient Benders Decomposition Approach for Optimal Large-Scale Network Slicing
Authors:
Wei-Kun Chen,
Zheyu Wu,
Rui-Jin Zhang,
Ya-Feng Liu,
Yu-Hong Dai,
Zhi-Quan Luo
Abstract:
This paper considers the network slicing (NS) problem, which attempts to map multiple customized virtual network requests to a common shared network infrastructure and allocate network resources to meet diverse service requirements. This paper proposes an efficient customized Benders decomposition algorithm for globally solving the large-scale NP-hard NS problem. The proposed algorithm decomposes the hard NS problem into two relatively easy function placement (FP) and traffic routing (TR) subproblems and solves them iteratively with information feedback between them, which makes it particularly suitable for solving large-scale problems. Specifically, the FP subproblem is to place service functions into cloud nodes in the network, and solving it returns a function placement strategy based on which the TR subproblem is defined; the TR subproblem is to find paths connecting two nodes hosting two adjacent functions in the network, and solving it can either verify that the solution of the FP subproblem is an optimal solution of the original problem, or return a valid inequality to the FP subproblem that cuts off the current infeasible solution. The proposed algorithm is guaranteed to find the globally optimal solution of the NS problem. By taking the special structure of the NS problem into consideration, we develop two families of valid inequalities that make the proposed algorithm converge much more quickly and thus run much more efficiently. Numerical results demonstrate that the proposed valid inequalities effectively accelerate the convergence of the decomposition algorithm, and the proposed algorithm significantly outperforms the existing algorithms in terms of both solution efficiency and quality.
Submitted 25 September, 2024; v1 submitted 27 June, 2023;
originally announced June 2023.
-
A Computation-efficient Online Secondary Path Modeling Technique for Modified FXLMS Algorithm
Authors:
Junwei Ji,
Dongyuan Shi,
Woon-Seng Gan,
Xiaoyi Shen,
Zhengding Luo
Abstract:
This paper proposes an online secondary path modelling (SPM) technique to improve the performance of the modified filtered reference Least Mean Square (FXLMS) algorithm. It can effectively respond to a time-varying secondary path, which refers to the path from a secondary source to an error sensor. Unlike traditional methods, the proposed approach switches modes between adaptive ANC and online SPM, eliminating the use of destabilizing components such as auxiliary white noise or additional filters, which can negatively impact the complexity, stability, and noise reduction performance of the ANC system. The system operates in adaptive ANC mode until divergence is detected due to secondary path changes. At this moment, it switches to SPM mode until the path is remodeled and then returns to ANC mode. Furthermore, numerical simulations in the paper demonstrate that the proposed online technique effectively copes with the secondary path variations.
Submitted 20 June, 2023;
originally announced June 2023.
-
In-depth analysis of music structure as a text network
Authors:
Ping-Rui Tsai,
Yen-Ting Chou,
Nathan-Christopher Wang,
Hui-Ling Chen,
Hong-Yue Huang,
Zih-Jia Luo,
Tzay-Ming Hong
Abstract:
Music, enchanting and poetic, permeates every corner of human civilization. Although music is not unfamiliar to people, our understanding of its essence remains limited, and there is still no universally accepted scientific description. This is primarily due to music being regarded as a product of both reason and emotion, making it difficult to define. In this article, we focus on the fundamental elements of music and construct an evolutionary network from the perspective of music as a natural language, aligning with the statistical characteristics of texts. Through this approach, we aim to comprehend the structural differences in music across different periods, enabling a more scientific exploration of music. Relying on the advantages of structuralism, we can concentrate on the relationships and order between the physical elements of music, rather than getting entangled in the blurred boundaries of science and philosophy. The scientific framework we present not only conforms to past conclusions in music, but also serves as a bridge that connects music to natural language processing and knowledge graphs.
Submitted 2 January, 2024; v1 submitted 21 March, 2023;
originally announced March 2023.
-
A practical distributed active noise control algorithm overcoming communication restrictions
Authors:
Junwei Ji,
Dongyuan Shi,
Zhengding Luo,
Xiaoyi Shen,
Woon-Seng Gan
Abstract:
By assigning the massive computing tasks of the traditional multichannel active noise control (MCANC) system to several distributed control nodes, distributed multichannel active noise control (DMCANC) techniques have become effective global noise reduction solutions with low computational costs. However, existing DMCANC algorithms simply complete the distribution of traditional centralized algorithms by combining neighbour nodes' information but rarely consider the degraded control performance and system stability of distributed units caused by delays and interruptions in communication. Hence, this paper develops a novel DMCANC algorithm that utilizes the compensation filters and neighbour nodes' information to counterbalance the cross-talk effect between channels while maintaining independent weight updating. Since the neighbours' information required barely affects the local control filter updating in each node, this approach can tolerate communication delay and interruption to some extent. Numerical simulations demonstrate that the proposed algorithm can achieve satisfactory noise reduction performance and high robustness to real-world communication challenges.
Submitted 15 March, 2023;
originally announced March 2023.
-
A Momentum Two-gradient Direction Algorithm with Variable Step Size Applied to Solve Practical Output Constraint Issue for Active Noise Control
Authors:
Xiaoyi Shen,
Dongyuan Shi,
Zhengding Luo,
Junwei Ji,
Woon-Seng Gan
Abstract:
Active noise control (ANC) has been widely utilized to reduce unwanted environmental noise. The primary objective of ANC is to generate an anti-noise with the same amplitude as, but the opposite phase to, the primary noise using the secondary source. However, the effectiveness of the ANC application is impacted by the speaker's output saturation. This paper proposes a two-gradient direction ANC algorithm with a momentum factor to address the saturation with faster convergence. To make it implementable in real time, a computationally efficient variable step size approach is applied to further reduce the steady-state error brought on by the changing gradient directions. The time constant and step size bound for the momentum two-gradient direction algorithm are analyzed. Simulation results show that the proposed algorithm performs effectively in both time-invariant and time-varying environments.
Submitted 15 March, 2023;
originally announced March 2023.
-
Learning Deep Intensity Field for Extremely Sparse-View CBCT Reconstruction
Authors:
Yiqun Lin,
Zhongjin Luo,
Wei Zhao,
Xiaomeng Li
Abstract:
Sparse-view cone-beam CT (CBCT) reconstruction is an important direction to reduce radiation dose and benefit clinical applications. Previous voxel-based generation methods represent the CT as discrete voxels, resulting in high memory requirements and limited spatial resolution due to the use of 3D decoders. In this paper, we formulate the CT volume as a continuous intensity field and develop a novel DIF-Net to perform high-quality CBCT reconstruction from extremely sparse (fewer than 10) projection views at an ultrafast speed. The intensity field of a CT can be regarded as a continuous function of 3D spatial points. Therefore, the reconstruction can be reformulated as regressing the intensity value of an arbitrary 3D point from given sparse projections. Specifically, for a point, DIF-Net extracts its view-specific features from different 2D projection views. These features are subsequently aggregated by a fusion module for intensity estimation. Notably, thousands of points can be processed in parallel to improve efficiency during training and testing. In practice, we collect a knee CBCT dataset to train and evaluate DIF-Net. Extensive experiments show that our approach can reconstruct CBCT with high image quality and high spatial resolution from extremely sparse views within 1.6 seconds, significantly outperforming state-of-the-art methods. Our code will be available at https://github.com/xmed-lab/DIF-Net.
Submitted 31 August, 2023; v1 submitted 12 March, 2023;
originally announced March 2023.
-
Deep Generative Fixed-filter Active Noise Control
Authors:
Zhengding Luo,
Dongyuan Shi,
Xiaoyi Shen,
Junwei Ji,
Woon-Seng Gan
Abstract:
Due to their slow convergence and poor tracking ability, conventional LMS-based adaptive algorithms are less capable of handling dynamic noises. Selective fixed-filter active noise control (SFANC) can significantly reduce response time by selecting appropriate pre-trained control filters for different noises. Nonetheless, the limited number of pre-trained control filters may affect noise reduction performance, especially when the incoming noise differs substantially from the initial noises used during pre-training. Therefore, a generative fixed-filter active noise control (GFANC) method is proposed in this paper to overcome this limitation. Based on deep learning and a perfect-reconstruction filter bank, the GFANC method requires only a small amount of prior data (one pre-trained broadband control filter) to automatically generate suitable control filters for various noises. The efficacy of the GFANC method is demonstrated by numerical simulations on real-recorded noises.
Submitted 10 March, 2023;
originally announced March 2023.
-
A Physics-based and Data-driven Approach for Localized Statistical Channel Modeling
Authors:
Shutao Zhang,
Xinzhi Ning,
Xi Zheng,
Qingjiang Shi,
Tsung-Hui Chang,
Zhi-Quan Luo
Abstract:
Localized channel modeling is crucial for offline performance optimization of 5G cellular networks, but the existing channel models are for general scenarios and do not capture local geographical structures. In this paper, we propose a novel physics-based and data-driven localized statistical channel modeling (LSCM), which is capable of sensing the physical geographical structures of the targeted cellular environment. The proposed channel modeling solely relies on the reference signal receiving power (RSRP) of the user equipment, unlike the traditional methods which use full channel impulse response matrices. The key is to build the relationship between the RSRP and the channel's angular power spectrum. Based on it, we formulate the task of channel modeling as a sparse recovery problem where the non-zero entries of the sparse vector indicate the channel paths' powers and angles of departure. A computationally efficient weighted non-negative orthogonal matching pursuit (WNOMP) algorithm is devised for solving the formulated problem. Finally, experiments based on synthetic and real RSRP measurements are presented to examine the performance of the proposed method.
Submitted 3 March, 2023;
originally announced March 2023.
-
Coordinating Multiple Intelligent Reflecting Surfaces without Channel Information
Authors:
Fan Xu,
Jiawei Yao,
Wenhai Lai,
Kaiming Shen,
Xin Li,
Xin Chen,
Zhi-Quan Luo
Abstract:
Conventional beamforming methods for intelligent reflecting surfaces (IRSs) or reconfigurable intelligent surfaces (RISs) typically entail the full channel state information (CSI). However, the computational cost of channel acquisition soars exponentially with the number of IRSs. To bypass this difficulty, we propose a novel strategy called blind beamforming that coordinates multiple IRSs by means of statistics without knowing CSI. Blind beamforming only requires measuring the received signal power at the user terminal for a sequence of randomly generated phase shifts across all IRSs. The main idea is to extract the key statistical quantity for beamforming by exploring only a small portion of the whole solution space of phase shifts. We show that blind beamforming guarantees a signal-to-noise ratio (SNR) boost of Theta(N^{2L}) under certain conditions, where L is the number of IRSs and N is the number of reflecting elements per IRS. The proposed conditions for achieving the optimal SNR boost of Theta(N^{4}) in a double-IRS system are much easier to satisfy than the existing ones in the literature. Most importantly, the proposed conditions can be extended to a fully general L-IRS system. The above result significantly improves upon the state of the art in the area of multi-IRS-assisted communication. Moreover, blind beamforming is justified via field tests and simulations. In particular, as shown in our field tests at 2.6 GHz, our method yields up to 17 dB SNR boost; to the best of our knowledge, this is the first time that the use of multiple IRSs gets verified in the real world.
Submitted 8 January, 2024; v1 submitted 19 February, 2023;
originally announced February 2023.
-
ABODE-Net: An Attention-based Deep Learning Model for Non-intrusive Building Occupancy Detection Using Smart Meter Data
Authors:
Zhirui Luo,
Ruobin Qi,
Qingqing Li,
Jun Zheng,
Sihua Shao
Abstract:
Occupancy information is useful for efficient energy management in the building sector. The massive high-resolution electrical power consumption data collected by smart meters in the advanced metering infrastructure (AMI) network make it possible to infer buildings' occupancy status in a non-intrusive way. In this paper, we propose a deep learning model called ABODE-Net, which employs a novel Parallel Attention (PA) block for building occupancy detection using smart meter data. The PA block combines the temporal, variable, and channel attention modules in parallel to emphasize important features for occupancy detection. We adopt two smart meter datasets widely used for building occupancy detection in our performance evaluation. A set of state-of-the-art shallow machine learning and deep learning models are included for performance comparison. The results show that ABODE-Net significantly outperforms other models in all experimental cases, which proves its validity as a solution for non-intrusive building occupancy detection.
Submitted 21 December, 2022;
originally announced December 2022.
-
A Data Quality Assessment Framework for AI-enabled Wireless Communication
Authors:
Hanning Tang,
Liusha Yang,
Rui Zhou,
Jing Liang,
Hong Wei,
Xuan Wang,
Qingjiang Shi,
Zhi-Quan Luo
Abstract:
Using artificial intelligence (AI) to re-design and enhance the current wireless communication system is a promising pathway for the future sixth-generation (6G) wireless network. The performance of AI-enabled wireless communication depends heavily on the quality of wireless air-interface data. Although there are various approaches to data quality assessment (DQA) for different applications, none has been designed for wireless air-interface data. In this paper, we propose a DQA framework to measure the quality of wireless air-interface data from three aspects: similarity, diversity, and completeness. The similarity measures how close the considered datasets are in terms of their statistical distributions; the diversity measures how well-rounded a dataset is; and the completeness measures to what degree the considered dataset satisfies the required performance metrics in an application scenario. The proposed framework can be applied to various types of wireless air-interface data, such as channel state information (CSI), signal-to-interference-plus-noise ratio (SINR), reference signal received power (RSRP), etc. For simplicity, the validity of our proposed DQA framework is corroborated by applying it to CSI data and using similarity and diversity metrics to improve CSI compression and recovery in Massive MIMO systems.
Submitted 13 December, 2022;
originally announced December 2022.
-
Efficient and Accurate Quantized Image Super-Resolution on Mobile NPUs, Mobile AI & AIM 2022 challenge: Report
Authors:
Andrey Ignatov,
Radu Timofte,
Maurizio Denna,
Abdel Younes,
Ganzorig Gankhuyag,
Jingang Huh,
Myeong Kyun Kim,
Kihwan Yoon,
Hyeon-Cheol Moon,
Seungho Lee,
Yoonsik Choe,
Jinwoo Jeong,
Sungjei Kim,
Maciej Smyl,
Tomasz Latkowski,
Pawel Kubik,
Michal Sokolski,
Yujie Ma,
Jiahao Chao,
Zhou Zhou,
Hongfan Gao,
Zhengfeng Yang,
Zhenbing Zeng,
Zhengyang Zhuge,
Chenghua Li
, et al. (71 additional authors not shown)
Abstract:
Image super-resolution is a common task on mobile and IoT devices, where one often needs to upscale and enhance low-resolution images and video frames. While numerous solutions have been proposed for this problem in the past, they are usually not compatible with low-power mobile NPUs, which have many computational and memory constraints. In this Mobile AI challenge, we address this problem and ask the participants to design an efficient quantized image super-resolution solution that can demonstrate real-time performance on mobile NPUs. The participants were provided with the DIV2K dataset and trained INT8 models to do high-quality 3X image upscaling. The runtime of all models was evaluated on the Synaptics VS680 Smart Home board with a dedicated edge NPU capable of accelerating quantized neural networks. All proposed solutions are fully compatible with the above NPU, demonstrating a rate of up to 60 FPS when reconstructing Full HD resolution images. A detailed description of all models developed in the challenge is provided in this paper.
Submitted 7 November, 2022;
originally announced November 2022.
-
A Linear Time Algorithm for the Optimal Discrete IRS Beamforming
Authors:
Shuyi Ren,
Kaiming Shen,
Xin Li,
Xin Chen,
Zhi-Quan Luo
Abstract:
It remains an open problem to find the optimal configuration of phase shifts under the discrete constraint for intelligent reflecting surface (IRS) in polynomial time. The above problem is widely believed to be difficult because it is not linked to any known combinatorial problems that can be solved efficiently. The branch-and-bound algorithms and the approximation algorithms constitute the best results in this area. Nevertheless, this work shows that the global optimum can actually be reached in linear time on average in terms of the number of reflective elements (REs) of IRS. The main idea is to geometrically interpret the discrete beamforming problem as choosing the optimal point on the unit circle. Although the number of possible combinations of phase shifts grows exponentially with the number of REs, it turns out that there are only a linear number of circular arcs that possibly contain the optimal point. Furthermore, the proposed algorithm can be viewed as a novel approach to a special case of the discrete quadratic program (QP).
Submitted 7 September, 2023; v1 submitted 9 November, 2022;
originally announced November 2022.
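The geometric idea in the abstract above (pick a target direction on the unit circle, let each reflective element choose its best discrete phase for that direction, and note that the choice only changes at a limited number of boundary angles) can be illustrated with a small sketch. This is not the paper's linear-time algorithm: it re-evaluates every arc from scratch and sorts the boundaries, so it is a naive illustration only. The signal model `|h0 + sum_n h_n e^{j theta_n}|`, the uniformly spaced phase set, and all function names are assumptions made for the example.

```python
import cmath
import itertools
import math
import random


def phase_set(k):
    """The K uniformly spaced discrete phase shifts (assumed phase set)."""
    return [2 * math.pi * i / k for i in range(k)]


def gain(h0, h, thetas):
    """|h0 + sum_n h_n e^{j theta_n}| for one phase configuration."""
    return abs(h0 + sum(hn * cmath.exp(1j * t) for hn, t in zip(h, thetas)))


def best_for_direction(h, k, mu):
    """For a fixed direction mu on the unit circle, each element independently
    picks the phase that maximizes its projection onto e^{j mu}."""
    rot = cmath.exp(-1j * mu)
    return [
        max(phase_set(k), key=lambda t, hn=hn: (rot * hn * cmath.exp(1j * t)).real)
        for hn in h
    ]


def arc_search(h0, h, k):
    """Try one direction per arc; only N*K arcs exist, not K**N combinations."""
    two_pi = 2 * math.pi
    # An element's preferred phase switches where mu is halfway between two phases.
    bounds = sorted(
        (cmath.phase(hn) + (i + 0.5) * two_pi / k) % two_pi
        for hn in h
        for i in range(k)
    )
    best = 0.0
    for a, b in zip(bounds, bounds[1:] + [bounds[0] + two_pi]):
        mu = 0.5 * (a + b)  # any interior point of the arc gives the same choice
        best = max(best, gain(h0, h, best_for_direction(h, k, mu)))
    return best


def brute_force(h0, h, k):
    """Exhaustive search over all K**N configurations, for comparison only."""
    return max(gain(h0, h, t) for t in itertools.product(phase_set(k), repeat=len(h)))


rng = random.Random(0)
h0 = complex(rng.gauss(0, 1), rng.gauss(0, 1))
h = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(5)]
print(arc_search(h0, h, 4), brute_force(h0, h, 4))
```

The arc search examines N*K candidate configurations while the exhaustive search examines K**N, which is the gap the abstract refers to when it says only a linear number of circular arcs can contain the optimal point.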
-
Fast Nearest Convolution for Real-Time Efficient Image Super-Resolution
Authors:
Ziwei Luo,
Youwei Li,
Lei Yu,
Qi Wu,
Zhihong Wen,
Haoqiang Fan,
Shuaicheng Liu
Abstract:
Deep learning-based single image super-resolution (SISR) approaches have drawn much attention and achieved remarkable success on modern advanced GPUs. However, most state-of-the-art methods require a huge number of parameters and large amounts of memory and computational resources, and usually show inferior inference times when applied to current mobile device CPUs/NPUs. In this paper, we propose a simple plain convolution network with a fast nearest convolution module (NCNet), which is NPU-friendly and can perform reliable super-resolution in real time. The proposed nearest convolution has the same performance as nearest upsampling but is much faster and more suitable for Android NNAPI. Our model can be easily deployed on mobile devices with 8-bit quantization and is fully compatible with all major mobile AI accelerators. Moreover, we conduct comprehensive experiments on different tensor operations on a mobile device to illustrate the efficiency of our network architecture. Our NCNet is trained and validated on the DIV2K 3x dataset, and the comparison with other efficient SR methods demonstrates that NCNet can achieve high-fidelity SR results with less inference time. Our code and pretrained models are publicly available at https://github.com/Algolzw/NCNet.
Submitted 24 August, 2022;
originally announced August 2022.
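The abstract above says the nearest convolution matches nearest upsampling without giving the construction. One plausible reading, stated here as an assumption rather than as the authors' implementation, is that a 1x1 convolution with a fixed 0/1 kernel copies every channel s*s times and a depth-to-space rearrangement then spreads those copies over an s x s pixel block. The pure-Python sketch below checks that this composition reproduces nearest-neighbour upsampling on nested lists; the function names and the channel layout are illustrative choices.

```python
import random


def nearest_upsample(img, s):
    """Reference: nearest-neighbour upsampling of an H x W x C nested list."""
    h, w = len(img), len(img[0])
    return [[img[i // s][j // s][:] for j in range(w * s)] for i in range(h * s)]


def replicate_channels(img, s):
    """What a 1x1 convolution with a fixed 0/1 kernel would compute:
    every pixel's C channels are copied s*s times (C -> C*s*s channels)."""
    return [[px * (s * s) for px in row] for row in img]


def depth_to_space(img, s):
    """Move groups of C channels into an s x s spatial block per input pixel.
    Assumed channel layout: block index (dy*s + dx), then the C channels."""
    h, w = len(img), len(img[0])
    c = len(img[0][0]) // (s * s)
    out = [[None] * (w * s) for _ in range(h * s)]
    for i in range(h):
        for j in range(w):
            for dy in range(s):
                for dx in range(s):
                    k = (dy * s + dx) * c
                    out[i * s + dy][j * s + dx] = img[i][j][k:k + c]
    return out


def nearest_conv_upsample(img, s):
    """Channel replication followed by depth-to-space."""
    return depth_to_space(replicate_channels(img, s), s)


rng = random.Random(0)
image = [[[rng.randint(0, 255) for _ in range(3)] for _ in range(4)] for _ in range(2)]
print(nearest_conv_upsample(image, 3) == nearest_upsample(image, 3))
```

Because the replication step is an ordinary convolution with constant weights, it can run on accelerators that handle convolutions well, which is consistent with the abstract's claim about NPU friendliness.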
-
Performance Evaluation of Selective Fixed-filter Active Noise Control based on Different Convolutional Neural Networks
Authors:
Zhengding Luo,
Dongyuan Shi,
Woon-Seng Gan
Abstract:
Due to its rapid response time and a high degree of robustness, the selective fixed-filter active noise control (SFANC) method appears to be a viable candidate for widespread use in a variety of practical active noise control (ANC) systems. In comparison to conventional fixed-filter ANC methods, SFANC can select the pre-trained control filters for different types of noise. Deep learning technologies, thus, can be used in SFANC methods to enable a more flexible selection of the most appropriate control filters for attenuating various noises. Furthermore, with the assistance of a deep neural network, the selecting strategy can be learned automatically from noise data rather than through trial and error, which significantly simplifies and improves the practicability of ANC design. Therefore, this paper investigates the performance of SFANC based on different one-dimensional and two-dimensional convolutional neural networks. Additionally, we conducted comparative analyses of several network training strategies and discovered that fine-tuning could improve selection performance.
Submitted 17 August, 2022;
originally announced August 2022.
-
Implementation of Multi-channel Active Noise Control based on Back-propagation Mechanism
Authors:
Zhengding Luo,
Dongyuan Shi,
Junwei Ji,
Woon-seng Gan
Abstract:
Active noise control (ANC) systems can efficiently attenuate low-frequency noises by introducing anti-noises to combine with the unwanted noises. In ANC systems, the filtered-x least mean square (FxLMS) and filtered-x normalized least-mean-square (FxNLMS) algorithms are well-known algorithms for adaptively adjusting control filters. Multi-channel ANC systems are typically required to attenuate unwanted noises in a large space. However, open-source implementations of the multi-channel FxLMS (McFxLMS) and multi-channel FxNLMS (McFxNLMS) algorithms continue to be scarce. Therefore, this paper proposes a simple and effective implementation approach for the McFxLMS and McFxNLMS algorithms. Motivated by the back-propagation process during neural network training, the McFxLMS and McFxNLMS algorithms can be implemented via an automatic differentiation mechanism. We implemented the two algorithms using the automatic differentiation mechanism in PyTorch and made the source code available on GitHub. This implementation method can improve the practicality of multi-channel ANC systems and is expected to be widely used in ANC applications.
Submitted 17 August, 2022;
originally announced August 2022.
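For orientation, the weight update that FxLMS performs is a gradient step on the squared error, which is why a back-propagation framework can produce it automatically. The sketch below is a single-channel, pure-Python version with the gradient written by hand; it is not the paper's PyTorch multi-channel code. The path coefficients, the step size `mu`, the white-noise reference, and the assumption that the secondary path is known exactly are all hypothetical choices made so the example is self-contained.

```python
import random


def fir(coeffs, history):
    """FIR filter output; history[0] is the newest sample."""
    return sum(c * h for c, h in zip(coeffs, history))


def run_fxlms(n_samples=5000, mu=0.05, seed=0):
    rng = random.Random(seed)
    s = [0.0, 0.8, 0.3]              # secondary path (hypothetical, assumed known)
    w_true = [0.5, -0.3, 0.2, 0.0]   # hypothetical ideal control filter
    # Build the primary path as s convolved with w_true so a perfect filter exists.
    p = [0.0] * (len(s) + len(w_true) - 1)
    for i, si in enumerate(s):
        for j, wj in enumerate(w_true):
            p[i + j] += si * wj

    w = [0.0] * len(w_true)          # adaptive control filter
    x_hist = [0.0] * len(p)          # reference signal history
    y_hist = [0.0] * len(s)          # control output history
    fx_hist = [0.0] * len(w)         # filtered-reference history
    errors = []
    for _ in range(n_samples):
        x = rng.uniform(-1.0, 1.0)   # white-noise reference
        x_hist = [x] + x_hist[:-1]
        d = fir(p, x_hist)           # disturbance at the error sensor
        y = fir(w, x_hist)           # control signal
        y_hist = [y] + y_hist[:-1]
        e = d - fir(s, y_hist)       # residual error after the secondary path
        fx = fir(s, x_hist)          # reference filtered by the secondary path
        fx_hist = [fx] + fx_hist[:-1]
        # Gradient step on e**2: d(e)/d(w_k) is approximately -fx(n - k).
        w = [wi + mu * e * fxi for wi, fxi in zip(w, fx_hist)]
        errors.append(e)
    return w, errors


weights, err = run_fxlms()
print([round(wi, 3) for wi in weights])
```

In an automatic-differentiation framework the line that updates `w` would be replaced by computing the loss `e**2` and letting the framework supply the gradient, which is the substitution the abstract describes.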
-
A Hybrid SFANC-FxNLMS Algorithm for Active Noise Control based on Deep Learning
Authors:
Zhengding Luo,
Dongyuan Shi,
Woon-Seng Gan
Abstract:
The selective fixed-filter active noise control (SFANC) method selecting the best pre-trained control filters for various types of noise can achieve a fast response time. However, it may lead to large steady-state errors due to inaccurate filter selection and the lack of adaptability. In comparison, the filtered-X normalized least-mean-square (FxNLMS) algorithm can obtain lower steady-state errors through adaptive optimization. Nonetheless, its slow convergence has a detrimental effect on dynamic noise attenuation. Therefore, this paper proposes a hybrid SFANC-FxNLMS approach to overcome the adaptive algorithm's slow convergence and provide a better noise reduction level than the SFANC method. A lightweight one-dimensional convolutional neural network (1D CNN) is designed to automatically select the most suitable pre-trained control filter for each frame of the primary noise. Meanwhile, the FxNLMS algorithm continues to update the coefficients of the chosen pre-trained control filter at the sampling rate. Owing to the effective combination of the two algorithms, experimental results show that the hybrid SFANC-FxNLMS algorithm can achieve a rapid response time, a low noise reduction error, and a high degree of robustness.
Submitted 17 August, 2022;
originally announced August 2022.
-
Robust Adaptive Beamforming via Worst-Case SINR Maximization with Nonconvex Uncertainty Sets
Authors:
Yongwei Huang,
Hao Fu,
Sergiy A. Vorobyov,
Zhi-Quan Luo
Abstract:
This paper considers a formulation of the robust adaptive beamforming (RAB) problem based on worst-case signal-to-interference-plus-noise ratio (SINR) maximization with a nonconvex uncertainty set for the steering vectors. The uncertainty set consists of a similarity constraint and a (nonconvex) double-sided ball constraint. The worst-case SINR maximization problem is turned into a quadratic matrix inequality (QMI) problem using the strong duality of semidefinite programming. Then a linear matrix inequality (LMI) relaxation for the QMI problem is proposed, with an additional valid linear constraint. Necessary and sufficient conditions for the tightened LMI relaxation problem to have a rank-one solution are established. When the tightened LMI relaxation problem still has a high-rank solution, the LMI relaxation problem is further restricted to become a bilinear matrix inequality (BLMI) problem. We then propose an iterative algorithm to solve the BLMI problem that finds an optimal/suboptimal solution for the original RAB problem by solving the BLMI formulations. To validate our results, simulation examples are presented to demonstrate the improved array output SINR of the proposed robust beamformer.
Submitted 13 June, 2022;
originally announced June 2022.