-
FEMSN: Frequency-Enhanced Multiscale Network for fault diagnosis of rotating machinery under strong noise environments
Authors:
Yuhan Yuan,
Xiaomo Jiang,
Yanfeng Han,
Ke Xiao
Abstract:
Rolling bearings are critical components of rotating machinery, and their proper functioning is essential for industrial production. Most existing condition monitoring methods focus on extracting discriminative features from time-domain signals to assess bearing health status. However, under complex operating conditions, periodic impulsive characteristics related to fault information are often obscured by noise interference. Consequently, existing approaches struggle to learn distinctive fault-related features in such scenarios. To address this issue, this paper proposes a novel CNN-based model named FEMSN. Specifically, a Fourier Adaptive Denoising Encoder Layer (FADEL) is introduced as an input denoising layer to enhance key features while filtering out irrelevant information. Subsequently, a Multiscale Time-Frequency Fusion (MSTFF) module is employed to extract fused time-frequency features, further improving the model robustness and nonlinear representation capability. Additionally, a distillation layer is incorporated to expand the receptive field. Based on these advancements, a novel deep lightweight CNN model, termed the Frequency-Enhanced Multiscale Network (FEMSN), is developed. The effectiveness of FEMSN and FADEL in machine health monitoring and stability assessment is validated through two case studies.
Submitted 7 May, 2025;
originally announced May 2025.
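As a rough illustration of the frequency-domain denoising idea that the abstract attributes to FADEL, the sketch below keeps only the strongest bins of a naive DFT and zeroes the rest. It is not the authors' layer, which is described as adaptive and is part of a trained CNN. The function names, the fixed `n_keep` rule, and the synthetic test signal are all assumptions made for this example.

```python
import cmath
import math
import random

def dft_half(x):
    """Naive DFT of a real signal; returns bins 0..n//2 only."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n // 2 + 1)]

def inverse_from_half(half, n):
    """Rebuild the real signal from bins 0..n//2 using conjugate symmetry."""
    out = []
    for t in range(n):
        total = half[0].real
        for k in range(1, len(half)):
            term = (half[k] * cmath.exp(2j * math.pi * k * t / n)).real
            # The Nyquist bin (even n) has no mirror partner; all others count twice.
            is_nyquist = (n % 2 == 0 and k == n // 2)
            total += term if is_nyquist else 2 * term
        out.append(total / n)
    return out

def denoise(x, n_keep):
    """Hypothetical fixed filter: keep the n_keep largest-magnitude bins."""
    half = dft_half(x)
    order = sorted(range(len(half)), key=lambda k: abs(half[k]), reverse=True)
    kept = set(order[:n_keep])
    masked = [half[k] if k in kept else 0j for k in range(len(half))]
    return inverse_from_half(masked, len(x))

def mse(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)

# Synthetic example (assumed data): one periodic component buried in noise.
n = 256
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 8 * t / n) for t in range(n)]
noisy = [c + rng.gauss(0.0, 1.0) for c in clean]
denoised = denoise(noisy, n_keep=3)

err_noisy = mse(noisy, clean)
err_denoised = mse(denoised, clean)
```

A fixed top-k mask is the simplest possible choice here; a learned layer would instead adjust which frequencies to keep during training.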
-
Automated Detection of Epileptic Spikes and Seizures Incorporating a Novel Spatial Clustering Prior
Authors:
Hanyang Dong,
Shurong Sheng,
Xiongfei Wang,
Jiahong Gao,
Yi Sun,
Wanli Yang,
Kuntao Xiao,
Pengfei Teng,
Guoming Luan,
Zhao Lv
Abstract:
A Magnetoencephalography (MEG) time-series recording consists of multi-channel signals collected by superconducting sensors, with each signal's intensity reflecting magnetic field changes over time at the sensor location. Automating epileptic MEG spike detection significantly reduces manual assessment time and effort, yielding substantial clinical benefits. Existing research addresses MEG spike detection by encoding neural network inputs with signals from all channels within a time segment, followed by classification. However, these methods overlook simultaneous spiking occurring at nearby sensors. We introduce a simple yet effective paradigm that first clusters MEG channels based on their sensors' spatial positions. Next, a novel convolutional input module is designed to integrate the spatial clustering and temporal changes of the signals. This module is fed into a custom MEEG-ResNet3D developed by the authors, which learns to extract relevant features and classify the input as a spike clip or not. Our method achieves an F1 score of 94.73% on a large real-world MEG dataset Sanbo-CMR collected from two centers, outperforming state-of-the-art approaches by 1.85%. Moreover, it demonstrates efficacy and stability in the Electroencephalographic (EEG) seizure detection task, improving the weighted F1 score by 1.4% over current state-of-the-art techniques evaluated on TUSZ, which is the largest EEG seizure dataset.
Submitted 4 January, 2025;
originally announced January 2025.
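The first step described in the abstract, grouping MEG channels by sensor position, can be sketched with a plain k-means loop. The abstract does not say which clustering algorithm the authors use, so k-means is an assumption here, as are the function name, the six made-up sensor coordinates, and the choice of two clusters.

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal Lloyd's algorithm; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Move each center to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centers

# Hypothetical 3-D sensor positions: two spatially separated groups.
sensor_positions = [
    (0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0),
    (5.0, 5.0, 5.0), (5.1, 5.0, 5.0), (5.0, 5.1, 5.0),
]
labels, centers = kmeans(sensor_positions, k=2)
```

Channels that share a label would then be stacked together before being passed to a convolutional input module, so that nearby sensors are adjacent in the input tensor.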
-
The NPU-HWC System for the ISCSLP 2024 Inspirational and Convincing Audio Generation Challenge
Authors:
Dake Guo,
Jixun Yao,
Xinfa Zhu,
Kangxiang Xia,
Zhao Guo,
Ziyu Zhang,
Yao Wang,
Jie Liu,
Lei Xie
Abstract:
This paper presents the NPU-HWC system submitted to the ISCSLP 2024 Inspirational and Convincing Audio Generation Challenge (ICAGC). Our system consists of two modules: a speech generator for Track 1 and a background audio generator for Track 2. In Track 1, we employ Single-Codec to tokenize the speech into discrete tokens and use a language-model-based approach to achieve zero-shot speaking style cloning. The Single-Codec effectively decouples timbre and speaking style at the token level, reducing the acoustic modeling burden on the autoregressive language model. Additionally, we use DSPGAN to upsample 16 kHz mel-spectrograms to high-fidelity 48 kHz waveforms. In Track 2, we propose a background audio generator based on large language models (LLMs). This system produces scene-appropriate accompaniment descriptions, synthesizes background audio with Tango 2, and integrates it with the speech generated by our Track 1 system. Our submission achieves second place in Track 1 and first place in Track 2.
Submitted 31 October, 2024;
originally announced October 2024.
-
Sliced Maximal Information Coefficient: A Training-Free Approach for Image Quality Assessment Enhancement
Authors:
Kang Xiao,
Xu Wang,
Yulin He,
Baoliang Chen,
Xuelin Shen
Abstract:
Full-reference image quality assessment (FR-IQA) models generally operate by measuring the visual differences between a degraded image and its reference. However, existing FR-IQA models, including both the classical ones (e.g., PSNR and SSIM) and deep-learning-based measures (e.g., LPIPS and DISTS), still exhibit limitations in capturing the full perception characteristics of the human visual system (HVS). In this paper, instead of designing a new FR-IQA measure, we aim to explore a generalized human visual attention estimation strategy to mimic the process of human quality rating and enhance existing IQA models. In particular, we model human attention generation by measuring the statistical dependency between the degraded image and the reference image. The dependency is captured in a training-free manner by our proposed sliced maximal information coefficient and exhibits surprising generalization in different IQA measures. Experimental results verify that the performance of existing IQA models can be consistently improved when our attention module is incorporated. The source code is available at https://github.com/KANGX99/SMIC.
Submitted 19 August, 2024;
originally announced August 2024.
-
MTS-Net: Dual-Enhanced Positional Multi-Head Self-Attention for 3D CT Diagnosis of May-Thurner Syndrome
Authors:
Yixin Huang,
Yiqi Jin,
Ke Tao,
Kaijian Xia,
Jianfeng Gu,
Lei Yu,
Lan Du,
Cunjian Chen
Abstract:
May-Thurner Syndrome (MTS), also known as iliac vein compression syndrome or Cockett's syndrome, is a condition potentially impacting over 20 percent of the population, leading to an increased risk of iliofemoral deep venous thrombosis. In this paper, we present a 3D-based deep learning approach called MTS-Net for diagnosing May-Thurner Syndrome using CT scans. To effectively capture the spatial-temporal relationship among CT scans and emulate the clinical process of diagnosing MTS, we propose a novel attention module called the dual-enhanced positional multi-head self-attention (DEP-MHSA). The proposed DEP-MHSA reconsiders the role of positional embedding and incorporates a dual-enhanced positional embedding in both attention weights and residual connections. Further, we establish a new dataset, termed MTS-CT, consisting of 747 subjects. Experimental results demonstrate that our proposed approach achieves state-of-the-art MTS diagnosis results, and our self-attention design facilitates the spatial-temporal modeling. We believe that our DEP-MHSA is more suitable to handle CT image sequence modeling and the proposed dataset enables future research on MTS diagnosis. We make our code and dataset publicly available at: https://github.com/Nutingnon/MTS_dep_mhsa.
Submitted 7 June, 2024;
originally announced June 2024.
-
Frequency-Reactive Power Optimization Strategy of Grid-forming Offshore Wind Farm Using DRU-HVDC Transmission
Authors:
Zhekai Li,
Kun Han,
Xu Cai,
Renxin Yang,
Haotian Yu,
Kepeng Xia,
Lulu Liu
Abstract:
The diode rectifier unit-based high voltage direct current (DRU-HVDC) transmission with grid-forming (GFM) wind turbines is becoming a promising scheme for offshore wind farm (OWF) integration due to its high reliability and low cost. In this scheme, the AC network of the OWF and the DRU has completely different synchronization mechanisms and power flow characteristics from the traditional power system. To optimize the power flow and reduce the net loss, this paper carries out the power flow modeling and optimization analysis for the DRU-HVDC transmission system with grid-forming OWFs. The influence of the DRU and the GFM wind turbines on the power flow of the system is analyzed. On this basis, improved constraint conditions are proposed and an optimal power flow (OPF) method is established. This method can minimize the power loss by adjusting the reactive power output of each wind turbine and the internal network frequency. Finally, the proposed OPF model is implemented and solved in MATLAB using the YALMIP toolkit and the CPLEX mathematical solver. The results show that the proposed optimization strategy can effectively reduce the power loss of the entire OWF and the transmission system with an optimization ratio of network losses exceeding 25.3%.
Submitted 16 March, 2024;
originally announced March 2024.
-
LightBTSeg: A lightweight breast tumor segmentation model using ultrasound images via dual-path joint knowledge distillation
Authors:
Hongjiang Guo,
Shengwen Wang,
Hao Dang,
Kangle Xiao,
Yaru Yang,
Wenpei Liu,
Tongtong Liu,
Yiying Wan
Abstract:
The accurate segmentation of breast tumors is an important prerequisite for lesion detection, which has significant clinical value for breast tumor research. The mainstream deep learning-based methods have achieved a breakthrough. However, these high-performance segmentation methods are difficult to deploy in clinical scenarios since they typically entail high computational complexity, massive parameters, slow inference speed, and huge memory consumption. To tackle this problem, we propose LightBTSeg, a dual-path joint knowledge distillation framework, for lightweight breast tumor segmentation. Concretely, we design a double-teacher model to represent the fine-grained features of breast ultrasound according to different semantic feature realignments of benign and malignant breast tumors. Specifically, we leverage the bottleneck architecture to reconstruct the original Attention U-Net. It is regarded as a lightweight student model named Simplified U-Net. Then, the prior knowledge of benign and malignant categories is utilized to design the teacher network combined with dual-path joint knowledge distillation, which distills the knowledge from cumbersome benign and malignant teachers to a lightweight student model. Extensive experiments conducted on the breast ultrasound images (Dataset BUSI) and Breast Ultrasound Dataset B (Dataset B) datasets demonstrate that LightBTSeg outperforms various counterparts.
Submitted 18 November, 2023;
originally announced November 2023.
-
Machine Learning for Automated Mitral Regurgitation Detection from Cardiac Imaging
Authors:
Ke Xiao,
Erik Learned-Miller,
Evangelos Kalogerakis,
James Priest,
Madalina Fiterau
Abstract:
Mitral regurgitation (MR) is a heart valve disease with potentially fatal consequences that can only be forestalled through timely diagnosis and treatment. Traditional diagnosis methods are expensive, labor-intensive and require clinical expertise, posing a barrier to screening for MR. To overcome this impediment, we propose a new semi-supervised model for MR classification called CUSSP. CUSSP operates on cardiac imaging slices of the 4-chamber view of the heart. It uses standard computer vision techniques and contrastive models to learn from large amounts of unlabeled data, in conjunction with specialized classifiers to establish the first ever automated MR classification system. Evaluated on a test set of 179 labeled -- 154 non-MR and 25 MR -- sequences, CUSSP attains an F1 score of 0.69 and a ROC-AUC score of 0.88, setting the first benchmark result for this new task.
Submitted 7 October, 2023;
originally announced October 2023.
-
A Benchmark for Multi-UAV Task Assignment of an Extended Team Orienteering Problem
Authors:
Kun Xiao,
Junqi Lu,
Ying Nie,
Lan Ma,
Xiangke Wang,
Guohui Wang
Abstract:
A benchmark for multi-UAV task assignment is presented in order to evaluate different algorithms. An extended Team Orienteering Problem is modeled for a kind of multi-UAV task assignment problem. Three intelligent algorithms, i.e., Genetic Algorithm, Ant Colony Optimization and Particle Swarm Optimization, are implemented to solve the problem. A series of experiments with different settings is conducted to evaluate the three algorithms. The modeled problem and the evaluation results constitute a benchmark, which can be used to evaluate other algorithms used for multi-UAV task assignment problems.
Submitted 1 September, 2020;
originally announced September 2020.
-
Modeling, Simulation and Implementation of a Bird-Inspired Morphing Wing Aircraft
Authors:
Kun Xiao,
Yuxin Chen,
Wuyao Jiang,
Chenyao Wang,
Longfei Zhao
Abstract:
We present a design of a bird-inspired morphing wing aircraft, including bionic research, modeling, simulation and flight experiments. Inspired by birds and activated by a planar linkage, our proposed aircraft has three key states: gliding, descending and high-maneuverability. We build the aerodynamic model of the aircraft and analyze its mechanisms to find out a group of optimized parameters. Furthermore, we validate our design by Computational Fluid Dynamics (CFD) simulation based on Lattice-Boltzmann technology and determine three phases of the planar linkage for the three states. Lastly, we manufacture a prototype and conduct flight experiments to test the performance of the aircraft.
Submitted 7 July, 2020;
originally announced July 2020.
-
A Lifting Wing Fixed on Multirotor UAVs for Long Flight Ranges
Authors:
Kun Xiao,
Yao Meng,
Xunhua Dai,
Haotian Zhang,
Quan Quan
Abstract:
This paper presents a lifting-wing multirotor UAV that allows long-range flight. The UAV features a lifting wing at a special mounting angle that works together with the rotors to supply lift when it flies forward, achieving a reduction in energy consumption and improvement of flight range compared to traditional multirotor UAVs. Its dynamic model is built according to the classical multirotor theory and the fixed-wing theory, as the aerodynamics of its multiple propellers and that of its lifting wing are almost decoupled. Its design takes into consideration aerodynamics, airframe configuration and the mounting angle. The performance of the UAV is verified by experiments, which show that the lifting wing saves 50.14% of the power when the UAV flies at the cruise speed (15 m/s).
Submitted 29 June, 2020; v1 submitted 28 June, 2020;
originally announced June 2020.
-
Implementation of UAV Coordination Based on a Hierarchical Multi-UAV Simulation Platform
Authors:
Kun Xiao,
Lan Ma,
Shaochang Tan,
Yirui Cong,
Xiangke Wang
Abstract:
In this paper, a hierarchical multi-UAV simulation platform, called XTDrone, is designed for UAV swarms, which is completely open-source. There are six layers in XTDrone: communication, simulator, low-level control, high-level control, coordination, and human interaction layers. XTDrone has three advantages. Firstly, the simulation speed can be adjusted to match the computer performance, based on the lock-step mode. Thus, the simulations can be conducted on a workstation or on a personal laptop, for different purposes. Secondly, a simplified simulator is also developed which enables quick algorithm designing so that the approximated behavior of UAV swarms can be observed in advance. Thirdly, XTDrone is based on ROS, Gazebo, and PX4, and hence the codes in simulations can be easily transplanted to embedded systems. Note that XTDrone can support various types of multi-UAV missions, and we provide two important demos in this paper: one is a ground-station-based multi-UAV cooperative search, and the other is a distributed UAV formation flight, including consensus-based formation control, task assignment, and obstacle avoidance.
Submitted 30 May, 2022; v1 submitted 3 May, 2020;
originally announced May 2020.
-
Applications of Generative Adversarial Models in Visual Search Reformulation
Authors:
Kyle Xiao,
Houdong Hu,
Yan Wang
Abstract:
Query reformulation is the process by which an input search query is refined by the user to match documents outside the original top-n results. On average, roughly 50% of text search queries involve some form of reformulation, and term suggestion tools are used 35% of the time when offered to users. As prevalent as text search queries are, however, such a feature has yet to be explored at scale for visual search. This is because reformulation for images presents a novel challenge to seamlessly transform visual features to match user intent within the context of a typical user session. In this paper, we present methods of semantically transforming visual queries, such as utilizing operations in the latent space of a generative adversarial model for the scenarios of fashion and product search.
Submitted 28 October, 2019;
originally announced October 2019.
-
GenerationMania: Learning to Semantically Choreograph
Authors:
Zhiyu Lin,
Kyle Xiao,
Mark Riedl
Abstract:
Beatmania is a rhythm action game where players must reproduce some of the sounds of a song by pressing specific controller buttons at the correct time. In this paper we investigate the use of deep neural networks to automatically create game stages - called charts - for arbitrary pieces of music. Our technique uses a multi-layer feed-forward network trained on sound sequence summary statistics to predict which sounds in the music are to be played by the player and which will play automatically. We use another neural network along with rules to determine which controls should be mapped to which sounds. We evaluated our system on the ability to reconstruct charts in a held-out test set, achieving an $F_1$-score that significantly beats LSTM baselines.
Submitted 12 August, 2019; v1 submitted 28 June, 2018;
originally announced June 2018.