-
Joint Scheduling of DER under Demand Charges: Structure and Approximation
Authors:
Ruixiao Yang,
Gulai Shen,
Ahmed S. Alahmed,
Chuchu Fan
Abstract:
We study the joint scheduling of behind-the-meter distributed energy resources (DERs), including flexible loads, renewable generation, and battery energy storage systems, under net energy metering frameworks with demand charges. The problem is formulated as a stochastic dynamic program aimed at maximizing expected operational surplus while accounting for renewable generation uncertainty. We analytically characterize the structure of the optimal control policy and show that it admits a threshold-based form. However, due to the strong temporal coupling of the storage and demand charge constraints, the number of conditional branches in the policy scales combinatorially with the scheduling horizon, as it requires a look-ahead over future states. To overcome the high computational complexity in the general formulation, an efficient approximation algorithm is proposed, which searches for the peak demand under a mildly relaxed problem. We show that the algorithm scales linearly with the scheduling horizon. Extensive simulations using two open-source datasets validate the proposed algorithm and compare its performance against different DER control strategies, including a reinforcement learning-based one. Under varying storage and tariff parameters, the results show that the proposed algorithm outperforms various benchmarks in achieving a relatively small solution gap compared to the theoretical upper bound.
Submitted 26 June, 2025;
originally announced June 2025.
-
3D Gaussian Splatting for Fine-Detailed Surface Reconstruction in Large-Scale Scene
Authors:
Shihan Chen,
Zhaojin Li,
Zeyu Chen,
Qingsong Yan,
Gaoyang Shen,
Ran Duan
Abstract:
Recent developments in 3D Gaussian Splatting have made significant advances in surface reconstruction. However, scaling these methods to large-scale scenes remains challenging due to high computational demands and the complex dynamic appearances typical of outdoor environments. These challenges hinder the application in aerial surveying and autonomous driving. This paper proposes a novel solution to reconstruct large-scale surfaces with fine details, supervised by full-sized images. Firstly, we introduce a coarse-to-fine strategy to reconstruct a coarse model efficiently, followed by adaptive scene partitioning and sub-scene refining from image segments. Additionally, we integrate a decoupling appearance model to capture global appearance variations and a transient mask model to mitigate interference from moving objects. Finally, we expand the multi-view constraint and introduce a single-view regularization for texture-less areas. Our experiments were conducted on the publicly available dataset GauU-Scene V2, which was captured using unmanned aerial vehicles. To the best of our knowledge, our method outperforms existing NeRF-based and Gaussian-based methods, achieving high-fidelity visual results and accurate surface from full-size image optimization. Open-source code will be available on GitHub.
Submitted 21 June, 2025;
originally announced June 2025.
-
Physical Layer-Based Device Fingerprinting for Wireless Security: From Theory to Practice
Authors:
Junqing Zhang,
Francesco Ardizzon,
Mattia Piana,
Guanxiong Shen,
Stefano Tomasin
Abstract:
Identifying the device from which a message is received is part of the security mechanisms that ensure authentication in wireless communications. Conventional authentication approaches are cryptography-based; however, they are usually computationally expensive and inadequate for the Internet of Things (IoT), where devices tend to be low-cost and resource-constrained. This paper provides a comprehensive survey of physical layer-based device fingerprinting, an emerging device authentication technique for wireless security. In particular, this article focuses on hardware impairment-based identity authentication and channel features-based authentication. They are passive techniques that are readily applicable to legacy IoT devices. Their intrinsic hardware and channel features, algorithm design methodologies, application scenarios, and key research questions are extensively reviewed here. The remaining research challenges are discussed, and future work that can further enhance physical layer-based device fingerprinting is suggested.
Submitted 11 June, 2025;
originally announced June 2025.
-
DEKC: Data-Enable Control for Tethered Space Robot Deployment in the Presence of Uncertainty via Koopman Operator Theory
Authors:
Ao Jin,
Qinyi Wang,
Sijie Wen,
Ya Liu,
Ganghui Shen,
Panfeng Huang,
Fan Zhang
Abstract:
This work focuses on the deployment of a tethered space robot in the presence of unknown uncertainty. A data-enable framework called DEKC, which comprises an offline training part and an online execution part, is proposed for this task. The main idea is to model the unknown uncertainty as a dynamical system, which enables the uncertainty to be captured with high accuracy and convergence. The core of the proposed framework is a proxy model of the uncertainty, which is derived from data-driven Koopman theory and is separate from the controller design. In the offline stage, the lifting functions associated with the Koopman operator are parameterized with deep neural networks and then learned from sampled data by solving an optimization problem. In the online execution stage, the proxy model uses the lifting functions learned in the offline phase to capture the unknown uncertainty. The output of the proxy model is then supplied to the baseline controller as compensation, so that the effect of the uncertainty can be attenuated or even eliminated. Furthermore, for scenarios in which the performance of the proxy model may weaken, a receding-horizon scheme is proposed to update the proxy model online. Finally, extensive numerical simulations demonstrate the effectiveness of the proposed framework. The implementation of the proposed DEKC framework is publicly available at https://github.com/NPU-RCIR/DEKC.git.
Submitted 9 June, 2025;
originally announced June 2025.
-
What do self-supervised speech models know about Dutch? Analyzing advantages of language-specific pre-training
Authors:
Marianne de Heer Kloots,
Hosein Mohebbi,
Charlotte Pouw,
Gaofei Shen,
Willem Zuidema,
Martijn Bentum
Abstract:
How language-specific are speech representations learned by self-supervised models? Existing work has shown that a range of linguistic features can be successfully decoded from end-to-end models trained only on speech recordings. However, it's less clear to what extent pre-training on specific languages improves language-specific linguistic information. Here we test the encoding of Dutch phonetic and lexical information in internal representations of self-supervised Wav2Vec2 models. Pre-training exclusively on Dutch improves the representation of Dutch linguistic features as compared to pre-training on similar amounts of English or larger amounts of multilingual data. This language-specific advantage is well-detected by trained clustering or classification probes, and partially observable using zero-shot metrics. Furthermore, the language-specific benefit on linguistic feature encoding aligns with downstream performance on Automatic Speech Recognition.
Submitted 10 July, 2025; v1 submitted 1 June, 2025;
originally announced June 2025.
-
On the reliability of feature attribution methods for speech classification
Authors:
Gaofei Shen,
Hosein Mohebbi,
Arianna Bisazza,
Afra Alishahi,
Grzegorz Chrupała
Abstract:
As the capabilities of large-scale pre-trained models evolve, understanding the determinants of their outputs becomes more important. Feature attribution aims to reveal which parts of the input elements contribute the most to model outputs. In speech processing, the unique characteristics of the input signal make the application of feature attribution methods challenging. We study how factors such as input type and aggregation and perturbation timespan impact the reliability of standard feature attribution methods, and how these factors interact with characteristics of each classification task. We find that standard approaches to feature attribution are generally unreliable when applied to the speech domain, with the exception of word-aligned perturbation methods when applied to word-based classification tasks.
Submitted 22 May, 2025;
originally announced May 2025.
-
Co-optimize condenser water temperature and cooling tower fan using high-fidelity synthetic data
Authors:
Gulai Shen,
Gurpreet Singh,
Ali Mehmani
Abstract:
This paper introduces a novel method for optimizing HVAC systems in buildings by integrating a high-fidelity physics-based simulation model with machine learning and measured data. The method enables a real-time building advisory system that provides optimized settings for condenser water loop operation, assisting building operators in decision-making. The building and its HVAC system are first modeled using eQuest. Synthetic data are then generated by running the simulation multiple times. The data are then processed, cleaned, and used to train the machine learning model. The machine learning model enables real-time optimization of the condenser water loop using particle swarm optimization. The results deliver both a real-time online optimizer and an offline operation look-up table, providing optimized condenser water temperature settings and the optimal number of cooling tower fans at a given cooling load. Potential savings are calculated by comparing measured data from two summer months with the energy costs the building would have experienced under optimized settings. Adaptive model refinement is applied to further improve accuracy and effectiveness by utilizing available measured data. The method bridges the gap between simulation and real-time control. It has the potential to be applied to other building systems, including the chilled water loop, heating systems, ventilation systems, and other related processes. Combining physics models, data models, and measured data also enables performance analysis, tracking, and retrofit recommendations.
Submitted 20 May, 2025;
originally announced May 2025.
-
Advancing Video Anomaly Detection: A Bi-Directional Hybrid Framework for Enhanced Single- and Multi-Task Approaches
Authors:
Guodong Shen,
Yuqi Ouyang,
Junru Lu,
Yixuan Yang,
Victor Sanchez
Abstract:
Despite the prevailing transition from single-task to multi-task approaches in video anomaly detection, we observe that many adopt sub-optimal frameworks for individual proxy tasks. Motivated by this, we contend that optimizing single-task frameworks can advance both single- and multi-task approaches. Accordingly, we leverage middle-frame prediction as the primary proxy task, and introduce an effective hybrid framework designed to generate accurate predictions for normal frames and flawed predictions for abnormal frames. This hybrid framework is built upon a bi-directional structure that seamlessly integrates both vision transformers and ConvLSTMs. Specifically, we utilize this bi-directional structure to fully analyze the temporal dimension by predicting frames in both forward and backward directions, significantly boosting the detection stability. Given the transformer's capacity to model long-range contextual dependencies, we develop a convolutional temporal transformer that efficiently associates feature maps from all context frames to generate attention-based predictions for target frames. Furthermore, we devise a layer-interactive ConvLSTM bridge that facilitates the smooth flow of low-level features across layers and time-steps, thereby strengthening predictions with fine details. Anomalies are eventually identified by scrutinizing the discrepancies between target frames and their corresponding predictions. Several experiments conducted on public benchmarks affirm the efficacy of our hybrid framework, whether used as a standalone single-task approach or integrated as a branch in a multi-task approach. These experiments also underscore the advantages of merging vision transformers and ConvLSTMs for video anomaly detection.
Submitted 20 April, 2025;
originally announced April 2025.
-
Co-Optimizing Distributed Energy Resources under Demand Charges and Bi-Directional Power Flow
Authors:
Ruixiao Yang,
Gulai Shen,
Ahmed S. Alahmed,
Chuchu Fan
Abstract:
We address the co-optimization of behind-the-meter (BTM) distributed energy resources (DER), including flexible demands, renewable distributed generation (DG), and battery energy storage systems (BESS) under net energy metering (NEM) frameworks with demand charges. We formulate the problem as a stochastic dynamic program that accounts for renewable generation uncertainty and operational surplus maximization. Our theoretical analysis reveals that the optimal policy follows a threshold structure. Finally, we show that even a simple algorithm leveraging this threshold structure performs well in simulation, emphasizing its importance in developing near-optimal algorithms. These findings provide crucial insights for implementing prosumer energy management systems under complex tariff structures.
Submitted 10 March, 2025;
originally announced March 2025.
-
Residual Channel Boosts Contrastive Learning for Radio Frequency Fingerprint Identification
Authors:
Rui Pan,
Hui Chen,
Guanxiong Shen,
Hongyang Chen
Abstract:
In order to address the issue of limited data samples for the deployment of pre-trained models in unseen environments, this paper proposes a residual channel-based data augmentation strategy for Radio Frequency Fingerprint Identification (RFFI), coupled with a lightweight SimSiam contrastive learning framework. By applying least square (LS) and minimum mean square error (MMSE) channel estimations followed by equalization, signals with different residual channel effects are generated. These residual channels enable the model to learn more effective representations. Then the pre-trained model is fine-tuned with 1% samples in a novel environment for RFFI. Experimental results demonstrate that our method significantly enhances both feature extraction ability and generalization while requiring fewer samples and less time, making it suitable for practical wireless security applications.
Submitted 11 December, 2024;
originally announced December 2024.
-
Experimental Demonstration of 16D Voronoi Constellation with Two-Level Coding over 50km Four-Core Fiber
Authors:
Can Zhao,
Bin Chen,
Jiaqi Cai,
Zhiwei Liang,
Yi Lei,
Junjie Xiong,
Lin Ma,
Daohui Hu,
Lin Sun,
Gangxiang Shen
Abstract:
A 16-dimensional Voronoi constellation concatenated with multilevel coding is experimentally demonstrated over a 50km four-core fiber transmission system. The proposed scheme reduces the required launch power by 6dB and provides a 17dB larger operating range than 16QAM with BICM at the outer HD-FEC BER threshold.
Submitted 9 July, 2024;
originally announced July 2024.
-
Accelerated Proton Resonance Frequency-based Magnetic Resonance Thermometry by Optimized Deep Learning Method
Authors:
Sijie Xu,
Shenyan Zong,
Chang-Sheng Mei,
Guofeng Shen,
Yueran Zhao,
He Wang
Abstract:
Proton resonance frequency (PRF) based MR thermometry is essential for focused ultrasound (FUS) thermal ablation therapies. This work aims to enhance temporal resolution in dynamic MR temperature map reconstruction using an improved deep learning method. The training-optimized methods and five classical neural networks were applied to 2-fold and 4-fold under-sampled k-space data to reconstruct the temperature maps. The enhanced training modules included offline/online data augmentations, knowledge distillation, and the amplitude-phase decoupling loss function. The heating experiments were performed with a FUS transducer on phantom and ex vivo tissues. These data were manually under-sampled to imitate acceleration procedures and used to train the reconstruction model with our method. About a dozen additional testing datasets were obtained separately to evaluate real-time performance and temperature accuracy. Acceleration factors of 1.9 and 3.7 were found for the 2-fold and 4-fold k-space under-sampling strategies, and the ResUNet-based deep learning reconstruction performed exceptionally well. In the 2-fold acceleration scenario, the RMSE of the temperature map patches was 0.888 °C and 1.145 °C on the phantom and ex vivo testing datasets, respectively. The DICE value of the temperature areas enclosed by the 43 °C isotherm was 0.809, and the Bland-Altman analysis showed a bias of -0.253 °C with a spread of ±2.16 °C. In the 4-fold under-sampling case, these evaluation values decreased by approximately 10%. This study demonstrates that deep learning-based reconstruction can significantly enhance the accuracy and efficiency of MR thermometry for clinical FUS thermal therapies.
Submitted 3 July, 2024;
originally announced July 2024.
-
Magnetic Resonance Image Processing Transformer for General Accelerated Image Reconstruction
Authors:
Guoyao Shen,
Mengyu Li,
Stephan Anderson,
Chad W. Farris,
Xin Zhang
Abstract:
Recent advancements in deep learning have enabled the development of generalizable models that achieve state-of-the-art performance across various imaging tasks. Vision Transformer (ViT)-based architectures, in particular, have demonstrated strong feature extraction capabilities when pre-trained on large-scale datasets. In this work, we introduce the Magnetic Resonance Image Processing Transformer (MR-IPT), a ViT-based framework designed to enhance the generalizability and robustness of accelerated MRI reconstruction. Unlike conventional deep learning models that require separate training for different acceleration factors, MR-IPT is pre-trained on a large-scale dataset encompassing multiple undersampling patterns and acceleration settings, enabling a unified reconstruction framework. By leveraging a shared transformer backbone, MR-IPT effectively learns universal feature representations, allowing it to generalize across diverse reconstruction tasks. Extensive experiments demonstrate that MR-IPT outperforms both CNN-based and existing transformer-based methods, achieving superior reconstruction quality across varying acceleration factors and sampling masks. Moreover, MR-IPT exhibits strong robustness, maintaining high performance even under unseen acquisition setups, highlighting its potential as a scalable and efficient solution for accelerated MRI. Our findings suggest that transformer-based general models can significantly advance MRI reconstruction, offering improved adaptability and stability compared to traditional deep learning approaches.
Submitted 7 February, 2025; v1 submitted 23 May, 2024;
originally announced May 2024.
-
Encoding of lexical tone in self-supervised models of spoken language
Authors:
Gaofei Shen,
Michaela Watkins,
Afra Alishahi,
Arianna Bisazza,
Grzegorz Chrupała
Abstract:
Interpretability research has shown that self-supervised Spoken Language Models (SLMs) encode a wide variety of features in human speech from the acoustic, phonetic, phonological, syntactic and semantic levels, to speaker characteristics. The bulk of prior research on representations of phonology has focused on segmental features such as phonemes; the encoding of suprasegmental phonology (such as tone and stress patterns) in SLMs is not yet well understood. Tone is a suprasegmental feature that is present in more than half of the world's languages. This paper aims to analyze the tone encoding capabilities of SLMs, using Mandarin and Vietnamese as case studies. We show that SLMs encode lexical tone to a significant degree even when they are trained on data from non-tonal languages. We further find that SLMs behave similarly to native and non-native human participants in tone and consonant perception studies, but they do not follow the same developmental trajectory.
Submitted 3 April, 2024; v1 submitted 25 March, 2024;
originally announced March 2024.
-
Learning to Reconstruct Accelerated MRI Through K-space Cold Diffusion without Noise
Authors:
Guoyao Shen,
Mengyu Li,
Chad W. Farris,
Stephan Anderson,
Xin Zhang
Abstract:
Deep learning-based MRI reconstruction models have achieved superior performance these days. Most recently, diffusion models have shown remarkable performance in image generation, in-painting, super-resolution, image editing and more. As a generalized diffusion model, cold diffusion further broadens the scope and considers models built around arbitrary image transformations such as blurring, down-sampling, etc. In this paper, we propose a k-space cold diffusion model that performs image degradation and restoration in k-space without the need for Gaussian noise. We provide comparisons with multiple deep learning-based MRI reconstruction models and perform tests on a well-known large open-source MRI dataset. Our results show that this novel way of performing degradation can generate high-quality reconstruction images for accelerated MRI.
Submitted 5 December, 2024; v1 submitted 16 November, 2023;
originally announced November 2023.
-
Attention Hybrid Variational Net for Accelerated MRI Reconstruction
Authors:
Guoyao Shen,
Boran Hao,
Mengyu Li,
Chad W. Farris,
Ioannis Ch. Paschalidis,
Stephan W. Anderson,
Xin Zhang
Abstract:
The application of compressed sensing (CS)-enabled data reconstruction for accelerating magnetic resonance imaging (MRI) remains a challenging problem. This is due to the fact that the information lost in k-space from the acceleration mask makes it difficult to reconstruct an image similar to the quality of a fully sampled image. Multiple deep learning-based structures have been proposed for MRI reconstruction using CS, both in the k-space and image domains as well as using unrolled optimization methods. However, the drawback of these structures is that they are not fully utilizing the information from both domains (k-space and image). Herein, we propose a deep learning-based attention hybrid variational network that performs learning in both the k-space and image domain. We evaluate our method on a well-known open-source MRI dataset and a clinical MRI dataset of patients diagnosed with strokes from our institution to demonstrate the performance of our network. In addition to quantitative evaluation, we undertook a blinded comparison of image quality across networks performed by a subspecialty trained radiologist. Overall, we demonstrate that our network achieves a superior performance among others under multiple reconstruction tasks.
Submitted 21 June, 2023;
originally announced June 2023.
-
Towards Length-Versatile and Noise-Robust Radio Frequency Fingerprint Identification
Authors:
Guanxiong Shen,
Junqing Zhang,
Alan Marshall,
Mikko Valkama,
Joseph Cavallaro
Abstract:
Radio frequency fingerprint identification (RFFI) can classify wireless devices by analyzing the signal distortions caused by the intrinsic hardware impairments. State-of-the-art neural networks have been adopted for RFFI. However, many neural networks, e.g., multilayer perceptron (MLP) and convolutional neural network (CNN), require fixed-size input data. In addition, many IoT devices work in low signal-to-noise ratio (SNR) scenarios but the RFFI performance in such scenarios is rarely investigated. In this paper, we analyze the reason why MLP- and CNN-based RFFI systems are constrained by the input size. To overcome this, we propose four neural networks that can process signals of variable lengths, namely flatten-free CNN, long short-term memory (LSTM) network, gated recurrent unit (GRU) network and transformer. We adopt data augmentation during training which can significantly improve the model's robustness to noise. We compare two augmentation schemes, namely offline and online augmentation. The results show the online one performs better. During the inference, a multi-packet inference approach is further leveraged to improve the classification accuracy in low SNR scenarios. We take LoRa as a case study and evaluate the system by classifying 10 commercial-off-the-shelf LoRa devices in various SNR conditions. The online augmentation can boost the low-SNR classification accuracy by up to 50% and the multi-packet inference approach can further increase the accuracy by over 20%.
Submitted 6 July, 2022;
originally announced July 2022.
-
Towards Receiver-Agnostic and Collaborative Radio Frequency Fingerprint Identification
Authors:
Guanxiong Shen,
Junqing Zhang,
Alan Marshall,
Roger Woods,
Joseph Cavallaro,
Liquan Chen
Abstract:
Radio frequency fingerprint identification (RFFI) is an emerging device authentication technique, which exploits the hardware characteristics of the RF front-end as device identifiers. RFFI is implemented in the wireless receiver and acts to extract the transmitter impairments and then perform classification. The receiver hardware impairments will actually interfere with the feature extraction process, but their effect and mitigation have not been comprehensively studied. In this paper, we propose a receiver-agnostic RFFI system that is not sensitive to the changes in receiver characteristics; it is implemented by employing adversarial training to learn the receiver-independent features. Moreover, when there are multiple receivers, this functionality can perform collaborative inference to enhance classification accuracy. Finally, we show how it is possible to leverage fine-tuning for further improvement with fewer collected signals. To validate the approach, we have conducted extensive experimental evaluation by applying the approach to a LoRaWAN case study involving ten LoRa devices and 20 software-defined radio (SDR) receivers. The results show that receiver-agnostic training enables the trained neural network to become robust to changes in receiver characteristics. The collaborative inference improves classification accuracy by up to 20% beyond a single-receiver RFFI system and fine-tuning can bring a 40% improvement for under-performing receivers.
Submitted 6 July, 2022;
originally announced July 2022.
-
Video Anomaly Detection via Prediction Network with Enhanced Spatio-Temporal Memory Exchange
Authors:
Guodong Shen,
Yuqi Ouyang,
Victor Sanchez
Abstract:
Video anomaly detection is a challenging task because most anomalies are scarce and non-deterministic. Many approaches investigate the reconstruction difference between normal and abnormal patterns, but neglect that anomalies do not necessarily correspond to large reconstruction errors. To address this issue, we design a Convolutional LSTM Auto-Encoder prediction framework with enhanced spatio-temporal memory exchange using bi-directionality and a higher-order mechanism. The bi-directional structure promotes learning the temporal regularity through forward and backward predictions. The unique higher-order mechanism further strengthens spatial information interaction between the encoder and the decoder. Considering the limited receptive fields in Convolutional LSTMs, we also introduce an attention module to highlight informative features for prediction. Anomalies are eventually identified by comparing the frames with their corresponding predictions. Evaluations on three popular benchmarks show that our framework outperforms most existing prediction-based anomaly detection methods.
Submitted 26 June, 2022;
originally announced June 2022.
-
FewSense, Towards a Scalable and Cross-Domain Wi-Fi Sensing System Using Few-Shot Learning
Authors:
Guolin Yin,
Junqing Zhang,
Guanxiong Shen,
Yingying Chen
Abstract:
Wi-Fi sensing can classify human activities because each activity causes unique changes to the channel state information (CSI). Existing Wi-Fi sensing suffers from limited scalability as the system needs to be retrained whenever new activities are added, which causes overheads of data collection and retraining. Cross-domain sensing may fail because the mapping between activities and CSI variations is destroyed when a different environment or user (domain) is involved. This paper proposed a few-shot learning-based Wi-Fi sensing system, named FewSense, which can recognise novel classes in unseen domains with only a few samples. Specifically, a feature extractor was pre-trained offline using the source domain data. When the system was applied in the target domain, a few samples were used to fine-tune the feature extractor for domain adaptation. Inference was made by computing the cosine similarity. FewSense can further boost the classification accuracy by collaboratively fusing inference from multiple receivers. We evaluated the performance using three public datasets, i.e., SignFi, Widar, and Wiar. The results show that FewSense with five-shot learning recognised novel classes in unseen domains with an accuracy of 90.3%, 96.5%, and 82.7% on the SignFi, Widar, and Wiar datasets, respectively. Our collaborative sensing model improved system performance by an average of 30%.
Submitted 3 March, 2022;
originally announced March 2022.
-
Radio Frequency Fingerprint Identification for Security in Low-Cost IoT Devices
Authors:
Guanxiong Shen,
Junqing Zhang,
Alan Marshall,
Mikko Valkama,
Joseph Cavallaro
Abstract:
Radio frequency fingerprint identification (RFFI) can uniquely classify wireless devices by analyzing the received signal distortions caused by the intrinsic hardware impairments. The state-of-the-art deep learning techniques such as convolutional neural network (CNN) have been adopted to classify IoT devices with high accuracy. However, deep learning-based RFFI requires input data of a fixed size. In addition, many IoT devices work in low signal-to-noise ratio (SNR) scenarios but the low SNR RFFI is rarely investigated. In this paper, the state-of-the-art transformer model is used as the classifier, which can process signals of variable length. Data augmentation is adopted to improve low SNR RFFI performance. A multi-packet inference approach is further proposed to improve the classification accuracy in low SNR scenarios. We take LoRa as a case study and evaluate the system by classifying 10 commercial-off-the-shelf LoRa devices in various SNR conditions. The online augmentation can boost the low SNR RFFI performance by up to 50% and multi-packet inference can further increase it by over 20%.
Submitted 28 November, 2021;
originally announced November 2021.
-
Towards Scalable and Channel-Robust Radio Frequency Fingerprint Identification for LoRa
Authors:
Guanxiong Shen,
Junqing Zhang,
Alan Marshall,
Joseph Cavallaro
Abstract:
Radio frequency fingerprint identification (RFFI) is a promising device authentication technique based on the transmitter hardware impairments. In this paper, we propose a scalable and robust RFFI framework achieved by a deep learning-powered radio frequency fingerprint (RFF) extractor. Specifically, we leverage deep metric learning to train an RFF extractor, which has excellent generalization ability and can extract RFFs from previously unseen devices. Any device can be enrolled via the pre-trained RFF extractor and the RFF database can be maintained efficiently for allowing devices to join and leave. The wireless channel impacts the RFF extraction and is tackled by exploiting a channel-independent feature and data augmentation. We carried out extensive experimental evaluation involving 60 commercial off-the-shelf LoRa devices and a USRP N210 software defined radio platform. The results have successfully demonstrated that our framework can achieve excellent generalization abilities for device classification and rogue device detection as well as effective channel mitigation.
Submitted 6 July, 2021;
originally announced July 2021.
-
Radio Frequency Fingerprint Identification for LoRa Using Spectrogram and CNN
Authors:
Guanxiong Shen,
Junqing Zhang,
Alan Marshall,
Linning Peng,
Xianbin Wang
Abstract:
Radio frequency fingerprint identification (RFFI) is an emerging device authentication technique that relies on intrinsic hardware characteristics of wireless devices. We designed an RFFI scheme for Long Range (LoRa) systems based on spectrogram and convolutional neural network (CNN). Specifically, we used spectrogram to represent the fine-grained time-frequency characteristics of LoRa signals. In addition, we revealed that the instantaneous carrier frequency offset (CFO) is drifting, which will result in misclassification and significantly compromise the system stability; we demonstrated CFO compensation is an effective mitigation. Finally, we designed a hybrid classifier that can adjust CNN outputs with the estimated CFO. The mean value of CFO remains relatively stable, hence it can be used to rule out CNN predictions whose estimated CFO falls out of the range. We performed experiments in real wireless environments using 20 LoRa devices under test (DUTs) and a Universal Software Radio Peripheral (USRP) N210 receiver. By comparing with the IQ-based and FFT-based RFFI schemes, our spectrogram-based scheme can reach the best classification accuracy, i.e., 97.61% for 20 LoRa DUTs.
Submitted 30 December, 2020;
originally announced January 2021.
-
Deep Deterministic Policy Gradient for Relay Selection and Power Allocation in Cooperative Communication Network
Authors:
Yuanzhe Geng,
Erwu Liu,
Rui Wang,
Yiming Liu,
Jie Wang,
Gang Shen,
Zhao Dong
Abstract:
Perfect channel state information (CSI) is usually required when considering relay selection and power allocation in cooperative communication. However, it is difficult to get accurate CSI in practical situations. In this letter, we study the outage probability minimization problem based on optimizing relay selection and transmission power. We propose a prioritized experience replay aided deep deterministic policy gradient learning framework, which can find an optimal solution by dealing with continuous action space, without any prior knowledge of CSI. Simulation results reveal that our approach outperforms reinforcement learning based methods in the existing literature, and improves the communication success rate by about 4%.
Submitted 14 March, 2021; v1 submitted 11 December, 2020;
originally announced December 2020.
-
PENet: Object Detection using Points Estimation in Aerial Images
Authors:
Ziyang Tang,
Xiang Liu,
Guangyu Shen,
Baijian Yang
Abstract:
Aerial imagery has been increasingly adopted in mission-critical tasks, such as traffic surveillance, smart cities, and disaster assistance. However, identifying objects from aerial images faces the following challenges: 1) objects of interest are often too small and too dense relative to the images; 2) objects of interest are often in different relative sizes; and 3) the number of objects in each category is imbalanced. A novel network structure, Points Estimated Network (PENet), is proposed in this work to answer these challenges. PENet uses a Mask Resampling Module (MRM) to augment the imbalanced datasets, a coarse anchor-free detector (CPEN) to effectively predict the center points of the small object clusters, and a fine anchor-free detector (FPEN) to locate the precise positions of the small objects. An adaptive merge algorithm, Non-maximum Merge (NMM), is implemented in CPEN to address the issue of detecting dense small objects, and a hierarchical loss is defined in FPEN to further improve the classification accuracy. Our extensive experiments on aerial datasets visDrone and UAVDT showed that PENet achieved higher precision results than existing state-of-the-art approaches. Our best model achieved 8.7% improvement on visDrone and 20.3% on UAVDT.
Submitted 22 January, 2020;
originally announced January 2020.
-
AdvSPADE: Realistic Unrestricted Attacks for Semantic Segmentation
Authors:
Guangyu Shen,
Chengzhi Mao,
Junfeng Yang,
Baishakhi Ray
Abstract:
Due to the inherent robustness of segmentation models, traditional norm-bounded attack methods show limited effect on this type of model. In this paper, we focus on generating unrestricted adversarial examples for semantic segmentation models. We demonstrate a simple and effective method to generate unrestricted adversarial examples using conditional generative adversarial networks (CGAN) without any hand-crafted metric. The naïve implementation of CGAN, however, yields inferior image quality and low attack success rate. Instead, we leverage the SPADE (Spatially-adaptive denormalization) structure with an additional loss item to generate effective adversarial attacks in a single step. We validate our approach on the popular Cityscapes and ADE20K datasets, and demonstrate that our synthetic adversarial examples are not only realistic, but also improve the attack success rate by up to 41.0% compared with state-of-the-art adversarial attack methods including PGD.
Submitted 18 November, 2019; v1 submitted 5 October, 2019;
originally announced October 2019.