-
MetaH2: A Snapshot Metasurface HDR Hyperspectral Camera
Authors:
Yuxuan Liu,
Qi Guo
Abstract:
We present a metasurface camera that jointly performs high-dynamic range (HDR) and hyperspectral imaging in a snapshot. The system integrates exposure bracketing and computed tomography imaging spectrometry (CTIS) by simultaneously forming multiple spatially multiplexed projections with unique power ratios and chromatic aberrations on a photosensor. The measurements are subsequently processed through a deep reconstruction model to generate an HDR image and a hyperspectral datacube. Our simulation studies show that the proposed system achieves higher reconstruction accuracy than previous snapshot hyperspectral imaging methods on benchmark datasets. We assemble a working prototype and demonstrate snapshot reconstruction of 60 dB dynamic range and 10 nm spectral resolution from 600 nm to 700 nm on real-world scenes from a monochrome photosensor.
Submitted 10 July, 2025;
originally announced July 2025.
-
Polyadic encryption
Authors:
Steven Duplij,
Qiang Guo
Abstract:
A novel procedure for encryption/decryption based on polyadic algebraic structures and signal processing methods is proposed. First, we use signals with integer amplitudes to send information. Then we use polyadic techniques to transform the plaintext into a series of special integers. The receiver restores the plaintext using special rules and systems of equations.
Submitted 8 July, 2025;
originally announced July 2025.
-
Holographic Communication via Recordable and Reconfigurable Metasurface
Authors:
Jinzhe Wang,
Qinghua Guo,
Xiaojun Yuan
Abstract:
Holographic surface based communication technologies are anticipated to play a significant role in the next generation of wireless networks. The existing reconfigurable holographic surface (RHS)-based scheme only utilizes the reconstruction process of the holographic principle for beamforming, where the channel state information (CSI) is needed. However, channel estimation for CSI acquisition is a challenging task in metasurface based communications. In this study, inspired by both the recording and reconstruction processes of holography, we develop a novel holographic communication scheme by introducing recordable and reconfigurable metasurfaces (RRMs), where channel estimation is not needed thanks to the recording process. Then we analyze the input-output mutual information of the RRM-based communication system and compare it with the existing RHS based system. Our results show that, without channel estimation, the proposed scheme achieves performance comparable to that of the RHS scheme with perfect CSI, suggesting a promising alternative for future wireless communication networks.
Submitted 24 June, 2025;
originally announced June 2025.
-
Ming-Omni: A Unified Multimodal Model for Perception and Generation
Authors:
Inclusion AI,
Biao Gong,
Cheng Zou,
Chuanyang Zheng,
Chunluan Zhou,
Canxiang Yan,
Chunxiang Jin,
Chunjie Shen,
Dandan Zheng,
Fudong Wang,
Furong Xu,
GuangMing Yao,
Jun Zhou,
Jingdong Chen,
Jianxin Sun,
Jiajia Liu,
Jianjiang Zhu,
Jun Peng,
Kaixiang Ji,
Kaiyou Song,
Kaimeng Ren,
Libin Wang,
Lixiang Ru,
Lele Xie,
Longhua Tan
, et al. (33 additional authors not shown)
Abstract:
We propose Ming-Omni, a unified multimodal model capable of processing images, text, audio, and video, while demonstrating strong proficiency in both speech and image generation. Ming-Omni employs dedicated encoders to extract tokens from different modalities, which are then processed by Ling, an MoE architecture equipped with newly proposed modality-specific routers. This design enables a single model to efficiently process and fuse multimodal inputs within a unified framework, thereby facilitating diverse tasks without requiring separate models, task-specific fine-tuning, or structural redesign. Importantly, Ming-Omni extends beyond conventional multimodal models by supporting audio and image generation. This is achieved through the integration of an advanced audio decoder for natural-sounding speech and Ming-Lite-Uni for high-quality image generation, which also allow the model to engage in context-aware chatting, perform text-to-speech conversion, and conduct versatile image editing. Our experimental results show that Ming-Omni offers a powerful solution for unified perception and generation across all modalities. Notably, our proposed Ming-Omni is the first open-source model we are aware of to match GPT-4o in modality support, and we release all code and model weights to encourage further research and development in the community.
Submitted 10 June, 2025;
originally announced June 2025.
-
Double Low-Rank 4D Tensor Decomposition for Circular RIS-Aided mmWave MIMO-NOMA System Channel Estimation in Mobility Scenarios
Authors:
Wanyuan Cai,
Xiaoping Jin,
Youming Li,
Menglei Sheng,
Mingjun Huang,
Qinke Qi,
Qiang Guo
Abstract:
Channel estimation is not only essential to highly reliable data transmission and massive device access but also an important component of the integrated sensing and communication (ISAC) in the sixth-generation (6G) mobile communication systems. In this paper, we consider a downlink channel estimation problem for circular reconfigurable intelligent surface (RIS)-aided millimeter-wave (mmWave) multiple-input multiple-output non-orthogonal multiple access (MIMO-NOMA) system in mobility scenarios. First, we propose a subframe partitioning scheme to facilitate the modeling of the received signal as a fourth-order tensor satisfying a canonical polyadic decomposition (CPD) form, thereby formulating the channel estimation problem as tensor decomposition and parameter extraction problems. Then, by exploiting both the global and local low-rank properties of the received signal, we propose a double low-rank 4D tensor decomposition model to decompose the received signal into four factor matrices, which is efficiently solved via alternating direction method of multipliers (ADMM). Subsequently, we propose a two-stage parameter estimation method based on the Jacobi-Anger expansion and the special structure of circular RIS to uniquely decouple the angle parameters. Furthermore, the time delay, Doppler shift, and channel gain parameters can also be estimated without ambiguities, and their estimation accuracy can be efficiently improved, especially at low signal-to-noise ratio (SNR). Finally, a concise closed-form expression for the Cramér-Rao bound (CRB) is derived as a performance benchmark. Numerical experiments are conducted to demonstrate the effectiveness of the proposed method compared with the other discussed methods.
Submitted 9 June, 2025;
originally announced June 2025.
-
MedITok: A Unified Tokenizer for Medical Image Synthesis and Interpretation
Authors:
Chenglong Ma,
Yuanfeng Ji,
Jin Ye,
Zilong Li,
Chenhui Wang,
Junzhi Ning,
Wei Li,
Lihao Liu,
Qiushan Guo,
Tianbin Li,
Junjun He,
Hongming Shan
Abstract:
Advanced autoregressive models have reshaped multimodal AI. However, their transformative potential in medical imaging remains largely untapped due to the absence of a unified visual tokenizer -- one capable of capturing fine-grained visual structures for faithful image reconstruction and realistic image synthesis, as well as rich semantics for accurate diagnosis and image interpretation. To this end, we present MedITok, the first unified tokenizer tailored for medical images, encoding both low-level structural details and high-level clinical semantics within a unified latent space. To balance these competing objectives, we introduce a novel two-stage training framework: a visual representation alignment stage that cold-starts the tokenizer reconstruction learning with a visual semantic constraint, followed by a textual semantic representation alignment stage that infuses detailed clinical semantics into the latent space. Trained on the meticulously collected large-scale dataset with over 30 million medical images and 2 million image-caption pairs, MedITok achieves state-of-the-art performance on more than 30 datasets across 9 imaging modalities and 4 different tasks. By providing a unified token space for autoregressive modeling, MedITok supports a wide range of tasks in clinical diagnostics and generative healthcare applications. Model and code will be made publicly available at: https://github.com/Masaaki-75/meditok.
Submitted 25 May, 2025;
originally announced May 2025.
-
Focal Split: Untethered Snapshot Depth from Differential Defocus
Authors:
Junjie Luo,
John Mamish,
Alan Fu,
Thomas Concannon,
Josiah Hester,
Emma Alexander,
Qi Guo
Abstract:
We introduce Focal Split, a handheld, snapshot depth camera with fully onboard power and computing based on depth-from-differential-defocus (DfDD). Focal Split is passive, avoiding power consumption of light sources. Its achromatic optical system simultaneously forms two differentially defocused images of the scene, which can be independently captured using two photosensors in a snapshot. The data processing is based on the DfDD theory, which efficiently computes a depth and a confidence value for each pixel with only 500 floating point operations (FLOPs) per pixel from the camera measurements. We demonstrate a Focal Split prototype, which comprises a handheld custom camera system connected to a Raspberry Pi 5 for real-time data processing. The system consumes 4.9 W and is powered on a 5 V, 10,000 mAh battery. The prototype can measure objects with distances from 0.4 m to 1.2 m, outputting 480$\times$360 sparse depth maps at 2.1 frames per second (FPS) using unoptimized Python scripts. Focal Split is DIY friendly. A comprehensive guide to building your own Focal Split depth camera, code, and additional data can be found at https://focal-split.qiguo.org.
Submitted 15 April, 2025;
originally announced April 2025.
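The figures quoted in the Focal Split abstract can be combined into a rough compute and battery budget. The sketch below is only back-of-the-envelope arithmetic: it assumes every pixel of a 480×360 frame costs the stated 500 FLOPs (the maps are described as sparse, so the real cost may be lower) and that the battery delivers its full nominal 5 V × 10,000 mAh with no conversion losses. The function names are illustrative and do not come from the authors' code.

```python
# Numbers quoted in the abstract.
WIDTH, HEIGHT = 480, 360        # depth-map resolution
FLOPS_PER_PIXEL = 500           # stated per-pixel cost
FPS = 2.1                       # reported frame rate
POWER_W = 4.9                   # reported system power draw
BATTERY_V = 5.0                 # nominal battery voltage
BATTERY_MAH = 10_000            # nominal battery capacity


def flops_per_frame(width, height, flops_per_pixel):
    """Total FLOPs for one frame, assuming every pixel is processed."""
    return width * height * flops_per_pixel


def ideal_runtime_hours(voltage_v, capacity_mah, power_w):
    """Idealized runtime: nominal energy (Wh) divided by power (W), no losses."""
    energy_wh = voltage_v * capacity_mah / 1000.0
    return energy_wh / power_w


per_frame = flops_per_frame(WIDTH, HEIGHT, FLOPS_PER_PIXEL)
per_second = per_frame * FPS
runtime_h = ideal_runtime_hours(BATTERY_V, BATTERY_MAH, POWER_W)

print(f"FLOPs per frame:  {per_frame:,}")
print(f"FLOPs per second: {per_second:,.0f}")
print(f"Ideal runtime:    {runtime_h:.1f} h")
```

Under these assumptions the workload is on the order of 10^8 FLOPs per frame and the idealized runtime is roughly ten hours; real-world figures would depend on converter efficiency and usable battery capacity.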
-
Wavefront Estimation From a Single Measurement: Uniqueness and Algorithms
Authors:
Nicholas Chimitt,
Ali Almuallem,
Qi Guo,
Stanley H. Chan
Abstract:
Wavefront estimation is an essential component of adaptive optics where the goal is to recover the underlying phase from its Fourier magnitude. While this may sound identical to classical phase retrieval, wavefront estimation faces more strict requirements regarding uniqueness as adaptive optics systems need a unique phase to compensate for the distorted wavefront. Existing real-time wavefront estimation methodologies are dominated by sensing via specialized optical hardware due to their high speed, but they often have a low spatial resolution. A computational method that can perform both fast and accurate wavefront estimation with a single measurement can improve resolution and bring new applications such as real-time passive wavefront estimation, opening the door to a new generation of medical and defense applications.
In this paper, we tackle the wavefront estimation problem by observing that the non-uniqueness is related to the geometry of the pupil shape. By analyzing the source of ambiguities and breaking the symmetry, we present a joint optics-algorithm approach by co-designing the shape of the pupil and the reconstruction neural network. Using our proposed lightweight neural network, we demonstrate wavefront estimation of a phase of size $128\times 128$ at $5,200$ frames per second on a CPU computer, achieving an average Strehl ratio up to $0.98$ in the noiseless case. We additionally test our method on real measurements using a spatial light modulator. Code is available at https://pages.github.itap.purdue.edu/StanleyChanGroup/wavefront-estimation/.
Submitted 12 April, 2025;
originally announced April 2025.
-
1-Tb/s/λ Transmission over Record 10714-km AR-HCF
Authors:
Dawei Ge,
Siyuan Liu,
Qiang Qiu,
Peng Li,
Qiang Guo,
Yiqi Li,
Dong Wang,
Baoluo Yan,
Mingqing Zuo,
Lei Zhang,
Dechao Zhang,
Hu Shi,
Jie Luo,
Han Li,
Zhangyuan Chen
Abstract:
We present the first single-channel 1.001-Tb/s DP-36QAM-PCS recirculating transmission over 73 loops of 146.77-km ultra-low-loss & low-IMI DNANF-5 fiber, achieving a record transmission distance of 10,714.28 km.
Submitted 2 April, 2025; v1 submitted 31 March, 2025;
originally announced March 2025.
-
Spectrum from Defocus: Fast Spectral Imaging with Chromatic Focal Stack
Authors:
M. Kerem Aydin,
Yi-Chun Hung,
Jaclyn Pytlarz,
Qi Guo,
Emma Alexander
Abstract:
Hyperspectral cameras face harsh trade-offs between spatial, spectral, and temporal resolution in an inherently low-photon regime. Computational imaging systems break through these trade-offs with compressive sensing, but require complex optics and/or extensive compute. We present Spectrum from Defocus (SfD), a chromatic focal sweep method that recovers state-of-the-art hyperspectral images with a small system of off-the-shelf optics and < 1 second of compute. Our camera uses two lenses and a grayscale sensor to preserve nearly all incident light in a chromatically-aberrated focal stack. Our physics-based iterative algorithm efficiently demixes, deconvolves, and denoises the blurry grayscale focal stack into a sharp spectral image. The combination of photon efficiency, optical simplicity, and physical modeling makes SfD a promising solution for fast, compact, interpretable hyperspectral imaging.
Submitted 25 March, 2025;
originally announced March 2025.
-
$\mathbf{\Phi}$-GAN: Physics-Inspired GAN for Generating SAR Images Under Limited Data
Authors:
Xidan Zhang,
Yihan Zhuang,
Qian Guo,
Haodong Yang,
Xuelin Qian,
Gong Cheng,
Junwei Han,
Zhongling Huang
Abstract:
Approaches for improving generative adversarial networks (GANs) training under a few samples have been explored for natural images. However, these methods have limited effectiveness for synthetic aperture radar (SAR) images, as they do not account for the unique electromagnetic scattering properties of SAR. To remedy this, we propose a physics-inspired regularization method dubbed $Φ$-GAN, which incorporates the ideal point scattering center (PSC) model of SAR with two physical consistency losses. The PSC model approximates SAR targets using physical parameters, ensuring that $Φ$-GAN generates SAR images consistent with real physical properties while preventing discriminator overfitting by focusing on PSC-based decision cues. To embed the PSC model into GANs for end-to-end training, we introduce a physics-inspired neural module capable of estimating the physical parameters of SAR targets efficiently. This module retains the interpretability of the physical model and can be trained with limited data. We propose two physical loss functions: one for the generator, guiding it to produce SAR images with physical parameters consistent with real ones, and one for the discriminator, enhancing its robustness by basing decisions on PSC attributes. We evaluate $Φ$-GAN across several conditional GAN (cGAN) models, demonstrating state-of-the-art performance in data-scarce scenarios on three SAR image datasets.
Submitted 3 March, 2025;
originally announced March 2025.
-
Exploiting Vulnerabilities in Speech Translation Systems through Targeted Adversarial Attacks
Authors:
Chang Liu,
Haolin Wu,
Xi Yang,
Kui Zhang,
Cong Wu,
Weiming Zhang,
Nenghai Yu,
Tianwei Zhang,
Qing Guo,
Jie Zhang
Abstract:
As speech translation (ST) systems become increasingly prevalent, understanding their vulnerabilities is crucial for ensuring robust and reliable communication. However, limited work has explored this issue in depth. This paper explores methods of compromising these systems through imperceptible audio manipulations. Specifically, we present two innovative approaches: (1) the injection of perturbation into source audio, and (2) the generation of adversarial music designed to guide targeted translation, while also conducting more practical over-the-air attacks in the physical world. Our experiments reveal that carefully crafted audio perturbations can mislead translation models to produce targeted, harmful outputs, while adversarial music achieves this goal more covertly, exploiting the natural imperceptibility of music. These attacks prove effective across multiple languages and translation models, highlighting a systemic vulnerability in current ST architectures. The implications of this research extend beyond immediate security concerns, shedding light on the interpretability and robustness of neural speech processing systems. Our findings underscore the need for advanced defense mechanisms and more resilient architectures in the realm of audio systems. More details and samples can be found at https://adv-st.github.io.
Submitted 4 March, 2025; v1 submitted 2 March, 2025;
originally announced March 2025.
-
Equity-aware Design and Timing of Fare-free Transit Zoning under Demand Uncertainty
Authors:
Qianwen Guo,
Jiaqing Lu,
Joseph Y. J. Chow,
Paul Schonfeld
Abstract:
We propose the first analytical stochastic model for optimizing the configuration and implementation policies of fare-free transit. The model focuses on a transportation corridor with two transportation modes: automobiles and buses. The corridor is divided into two sections, an inner one with fare-free transit service and an outer one with fare-based transit service. Under the static version of the model, the optimized length and frequency of the fare-free transit zone can be determined by maximizing total social welfare. The findings indicate that implementing fare-free transit can increase transit ridership and reduce automobile use within the fare-free zone while social equity among the demand groups can be enhanced by lengthening the fare-free zone. Notably, the optimal zone length increases when both social welfare and equity are considered jointly, compared to only prioritizing social welfare. The dynamic model, framed within a market entry and exit real options approach, solves the fare policy switching problem, establishing optimal timing policies for activating or terminating fare-free service. The results from dynamic models reveal earlier implementation and extended durations of fare-free transit in the social welfare-aware regime, driven by lower thresholds compared to the social equity-aware regime.
Submitted 12 February, 2025;
originally announced February 2025.
-
Policy Selection and Schedules for Exclusive Bus Lane and High Occupancy Vehicle Lane in a Bi-modal Transportation Corridor
Authors:
Jiaqing Lu,
Qianwen Guo,
Paul Schonfeld
Abstract:
Efficient management of transportation corridors is critical for sustaining urban mobility, directly influencing transportation efficiency. Two prominent strategies for enhancing public transit services and alleviating congestion, Exclusive Bus Lane (EBL) and High Occupancy Vehicle Lane (HOVL), are gaining increasing attention. EBLs prioritize bus transit by providing dedicated lanes for faster travel times, while HOVLs encourage carpooling by reserving lanes for high-occupancy vehicles. However, static implementations of these policies may underutilize road resources and disrupt general-purpose lanes. Dynamic control of these policies, based on real-time demand, can potentially maximize road efficiency and minimize negative impacts. This study develops cost functions for Mixed Traffic Policy (MTP), Exclusive Bus Lane Policy (EBLP), and High Occupancy Vehicle Lane Policy (HOVLP), incorporating optimized bus frequency and demand split under equilibrium conditions. Switching thresholds for policy selection are derived to identify optimal periods for implementing each policy based on dynamic demand simulated using an Ornstein-Uhlenbeck (O-U) process. Results reveal significant reductions in total system costs with the proposed dynamic policy integration. Compared to static implementations, the combined policy achieves cost reductions of 12.0%, 5.3% and 42.5% relative to MTP-only, EBLP-only, and HOVLP-only scenarios, respectively. Additionally, in two real case studies of existing EBL and HOVL operations, the proposed dynamic policy reduces total costs by 32.2% and 27.9%, respectively. The findings provide valuable insights for policymakers and transit planners, offering a robust framework for dynamically scheduling and integrating EBL and HOVL policies to optimize urban corridor efficiency and reduce overall system costs.
Submitted 12 February, 2025;
originally announced February 2025.
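The bus-lane entry above mentions demand simulated with an Ornstein-Uhlenbeck process and threshold-based policy switching. As a minimal sketch of that general idea (not the authors' model), the code below simulates a mean-reverting demand path with an Euler-Maruyama step and maps each demand value to a regime index using two thresholds. All parameter values, the threshold values, and the function names `simulate_ou` and `select_regime` are hypothetical; the abstract does not state which policy applies in which demand range, so the regimes are left as plain indices.

```python
import math
import random


def simulate_ou(x0, mu, theta, sigma, dt, n_steps, seed=0):
    """Euler-Maruyama path of dX = theta*(mu - X) dt + sigma dW."""
    rng = random.Random(seed)
    path = [x0]
    for _ in range(n_steps):
        x = path[-1]
        drift = theta * (mu - x) * dt
        shock = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(x + drift + shock)
    return path


def select_regime(demand, thresholds):
    """Index of the first threshold that demand is below; len(thresholds) if none."""
    for i, t in enumerate(thresholds):
        if demand < t:
            return i
    return len(thresholds)


# Hypothetical numbers, chosen only for illustration.
demand = simulate_ou(x0=1000.0, mu=1200.0, theta=0.5, sigma=150.0,
                     dt=0.01, n_steps=500)
regimes = [select_regime(d, thresholds=(1100.0, 1300.0)) for d in demand]

# Count how many simulated steps fall in each demand regime.
counts = {r: regimes.count(r) for r in sorted(set(regimes))}
print(counts)
```

In a full model, each regime index would correspond to the policy with the lowest cost in that demand range, with thresholds derived from the cost functions rather than picked by hand.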
-
Runway capacity expansion planning for public airports under demand uncertainty
Authors:
Ziyue Li,
Joseph Y. J. Chow,
Qianwen Guo
Abstract:
Flight delay is a significant issue affecting air travel. The runway system, frequently falling short of demand, serves as a bottleneck. As demand increases, runway capacity expansion becomes imperative to mitigate congestion. However, the decision to expand runway capacity is challenging due to inherent uncertainties in demand forecasts. This paper presents a novel approach to modeling air traffic demand growth as a jump diffusion process, incorporating two layers of uncertainty: Geometric Brownian Motion (GBM) for continuous variability and a Poisson process to capture the impact of crisis events, such as natural disasters or public health emergencies, on decision-making. We propose a real options model to jointly evaluate the interrelated factors of optimal runway capacity and investment timing under uncertainty, with investment timing linked to trigger demand. The findings suggest that increased uncertainty indicates more conservative decision-making. Furthermore, the relationship between optimal investment timing and expansion size is complex: if the expansion size remains unchanged, the trigger demand decreases as the demand growth rate increases; if the expansion size experiences a jump, the trigger demand also exhibits a sharp rise. This work provides valuable insights for airport authorities for informed capacity expansion decision-making.
Submitted 4 February, 2025;
originally announced February 2025.
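The runway entry above describes demand growth as a jump diffusion: Geometric Brownian Motion for continuous variability plus a Poisson process for crisis events, with investment triggered at a demand level. The sketch below illustrates that structure under simplifying assumptions that are not taken from the paper: jumps are approximated by a per-step Bernoulli draw with probability `jump_rate * dt` (reasonable only when that product is small), each jump multiplies demand by a fixed `jump_factor`, and the trigger level is an arbitrary number rather than the output of a real options model. All names and values are hypothetical.

```python
import math
import random


def simulate_jump_diffusion(d0, mu, sigma, jump_rate, jump_factor,
                            dt, n_steps, seed=0):
    """GBM with multiplicative jumps; jumps use a Bernoulli approximation."""
    rng = random.Random(seed)
    path = [d0]
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)
        # Exact log-normal GBM step over dt.
        growth = math.exp((mu - 0.5 * sigma ** 2) * dt
                          + sigma * math.sqrt(dt) * z)
        # A crisis event occurs with probability jump_rate * dt in this step.
        jump = jump_factor if rng.random() < jump_rate * dt else 1.0
        path.append(path[-1] * growth * jump)
    return path


def first_passage_index(path, trigger):
    """First step at which demand reaches the trigger, or None if it never does."""
    for i, d in enumerate(path):
        if d >= trigger:
            return i
    return None


# Hypothetical numbers, chosen only for illustration.
path = simulate_jump_diffusion(d0=100.0, mu=0.04, sigma=0.15,
                               jump_rate=0.2, jump_factor=0.7,
                               dt=1.0 / 12.0, n_steps=240)
hit = first_passage_index(path, trigger=150.0)
print("trigger first reached at step:", hit)
```

Repeating the simulation over many seeds would give a distribution of trigger times, which is one simple way to see how higher volatility or more frequent jumps shifts the timing of expansion.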
-
Vehicle occupancy estimation in Automated Guideway Transit via deep learning with Wi-Fi probe requests
Authors:
Ziyue Li,
Qianwen Guo
Abstract:
This study contributes to the advancement of vehicle occupancy estimation in Automated Guideway Transit (AGT) systems using Wi-Fi probe requests and deep learning models. We propose a comprehensive framework for evaluating various approaches to occupancy estimation, particularly in the context of MAC address randomization. While many methods proposed in the literature claim effectiveness in simpler experimental settings, our research reveals that those methods are unreliable in the complex environment of AGT systems. Specifically, techniques for handling randomized MAC addresses and distinguishing between passenger and non-passenger data do not perform well in AGT systems. Despite challenges in tracking individual devices, our study demonstrates that accurate occupancy estimation using Wi-Fi probe requests remains feasible. A pilot study conducted on the Miami-Dade Metromover, an AGT system characterized by frequent stops, significant occupancy fluctuations, and absence of fare collection devices, provides a robust testing ground for the framework. Additionally, our findings show that deep learning models significantly outperform machine learning models in this context. The insights from this study can significantly enhance decision-making for transit agencies to optimize operations and elevate service quality.
Submitted 16 May, 2025; v1 submitted 27 January, 2025;
originally announced January 2025.
-
Tumor Detection, Segmentation and Classification Challenge on Automated 3D Breast Ultrasound: The TDSC-ABUS Challenge
Authors:
Gongning Luo,
Mingwang Xu,
Hongyu Chen,
Xinjie Liang,
Xing Tao,
Dong Ni,
Hyunsu Jeong,
Chulhong Kim,
Raphael Stock,
Michael Baumgartner,
Yannick Kirchhoff,
Maximilian Rokuss,
Klaus Maier-Hein,
Zhikai Yang,
Tianyu Fan,
Nicolas Boutry,
Dmitry Tereshchenko,
Arthur Moine,
Maximilien Charmetant,
Jan Sauer,
Hao Du,
Xiang-Hui Bai,
Vipul Pai Raikar,
Ricardo Montoya-del-Angel,
Robert Marti
, et al. (12 additional authors not shown)
Abstract:
Breast cancer is one of the most common causes of death among women worldwide. Early detection helps in reducing the number of deaths. Automated 3D Breast Ultrasound (ABUS) is a newer approach for breast screening, which has many advantages over handheld mammography such as safety, speed, and higher detection rate of breast cancer. Tumor detection, segmentation, and classification are key components in the analysis of medical images, especially challenging in the context of 3D ABUS due to the significant variability in tumor size and shape, unclear tumor boundaries, and a low signal-to-noise ratio. The lack of publicly accessible, well-labeled ABUS datasets further hinders the advancement of systems for breast tumor analysis. Addressing this gap, we have organized the inaugural Tumor Detection, Segmentation, and Classification Challenge on Automated 3D Breast Ultrasound 2023 (TDSC-ABUS2023). This initiative aims to spearhead research in this field and create a definitive benchmark for tasks associated with 3D ABUS image analysis. In this paper, we summarize the top-performing algorithms from the challenge and provide critical analysis for ABUS image examination. We offer the TDSC-ABUS challenge as an open-access platform at https://tdsc-abus2023.grand-challenge.org/ to benchmark and inspire future developments in algorithmic research.
Submitted 26 January, 2025;
originally announced January 2025.
-
Risk and Vulnerability Assessment of Energy-Transportation Infrastructure Systems to Extreme Weather
Authors:
Jiawei Wang,
Qinglai Guo,
Hongbin Sun
Abstract:
The interaction between extreme weather events and interdependent critical infrastructure systems involves complex spatiotemporal dynamics. Multi-type emergency decisions within energy-transportation infrastructures significantly influence system performance throughout the extreme weather process. A comprehensive assessment of these factors faces challenges in model complexity and heterogeneity between energy and transportation systems. This paper proposes an assessment framework that accommodates multiple types of emergency decisions. It integrates the heterogeneous energy and transportation infrastructures in the form of a network flow model to simulate and quantify the impact of extreme weather events on the energy-transportation infrastructure system. Based on this framework, a targeted method for identifying system vulnerabilities is further introduced, utilizing a neural network surrogate that achieves privacy protection and evaluation acceleration while maintaining consideration of system interdependencies. Numerical experiments demonstrate that the proposed framework and method can reveal the risk levels faced by urban infrastructure systems, identify weak points that should be prioritized for reinforcement, and strike a balance between accuracy and evaluation speed.
Submitted 23 January, 2025;
originally announced January 2025.
-
Bright-NeRF: Brightening Neural Radiance Field with Color Restoration from Low-light Raw Images
Authors:
Min Wang,
Xin Huang,
Guoqing Zhou,
Qifeng Guo,
Qing Wang
Abstract:
Neural Radiance Fields (NeRFs) have demonstrated prominent performance in novel view synthesis. However, their input heavily relies on image acquisition under normal light conditions, making it challenging to learn accurate scene representation in low-light environments where images typically exhibit significant noise and severe color distortion. To address these challenges, we propose a novel approach, Bright-NeRF, which learns enhanced and high-quality radiance fields from multi-view low-light raw images in an unsupervised manner. Our method simultaneously achieves color restoration, denoising, and enhanced novel view synthesis. Specifically, we leverage a physically-inspired model of the sensor's response to illumination and introduce a chromatic adaptation loss to constrain the learning of response, enabling consistent color perception of objects regardless of lighting conditions. We further utilize the raw data's properties to expose the scene's intensity automatically. Additionally, we have collected a multi-view low-light raw image dataset to advance research in this field. Experimental results demonstrate that our proposed method significantly outperforms existing 2D and 3D approaches. Our code and dataset will be made publicly available.
Submitted 19 December, 2024;
originally announced December 2024.
-
GLinSAT: The General Linear Satisfiability Neural Network Layer By Accelerated Gradient Descent
Authors:
Hongtai Zeng,
Chao Yang,
Yanzhen Zhou,
Cheng Yang,
Qinglai Guo
Abstract:
Ensuring that the outputs of neural networks satisfy specific constraints is crucial for applying neural networks to real-life decision-making problems. In this paper, we consider making a batch of neural network outputs satisfy bounded and general linear constraints. We first reformulate the neural network output projection problem as an entropy-regularized linear programming problem. We show that such a problem can be equivalently transformed into an unconstrained convex optimization problem with Lipschitz continuous gradient according to the duality theorem. Then, based on an accelerated gradient descent algorithm with numerical performance enhancement, we present our architecture, GLinSAT, to solve the problem. To the best of our knowledge, this is the first general linear satisfiability layer in which all the operations are differentiable and matrix-factorization-free. Despite the fact that we can explicitly perform backpropagation based on automatic differentiation mechanism, we also provide an alternative approach in GLinSAT to calculate the derivatives based on implicit differentiation of the optimality condition. Experimental results on constrained traveling salesman problems, partial graph matching with outliers, predictive portfolio allocation and power system unit commitment demonstrate the advantages of GLinSAT over existing satisfiability layers. Our implementation is available at https://github.com/HunterTracer/GLinSAT.
Submitted 11 November, 2024; v1 submitted 25 September, 2024;
originally announced September 2024.
-
Depth from Coupled Optical Differentiation
Authors:
Junjie Luo,
Yuxuan Liu,
Emma Alexander,
Qi Guo
Abstract:
We propose depth from coupled optical differentiation, a low-computation passive-lighting 3D sensing mechanism. It is based on our discovery that per-pixel object distance can be rigorously determined by a coupled pair of optical derivatives of a defocused image using a simple, closed-form relationship. Unlike previous depth-from-defocus (DfD) methods that leverage spatial derivatives of the image to estimate scene depths, the proposed mechanism's use of only optical derivatives makes it significantly more robust to noise. Furthermore, unlike many previous DfD algorithms with requirements on aperture code, this relationship is proved to be universal to a broad range of aperture codes.
We build the first 3D sensor based on depth from coupled optical differentiation. Its optical assembly includes a deformable lens and a motorized iris, which enables dynamic adjustments to the optical power and aperture radius. The sensor captures two pairs of images: one pair with a differential change of optical power and the other with a differential change of aperture scale. From the four images, a depth and confidence map can be generated with only 36 floating point operations per output pixel (FLOPOP), more than ten times lower than the previous lowest passive-lighting depth sensing solution to our knowledge. Additionally, the depth map generated by the proposed sensor demonstrates more than twice the working range of previous DfD methods while using significantly lower computation.
Submitted 16 September, 2024;
originally announced September 2024.
-
Neural Network-Assisted Hybrid Model Based Message Passing for Parametric Holographic MIMO Near Field Channel Estimation
Authors:
Zhengdao Yuan,
Yabo Guo,
Dawei Gao,
Qinghua Guo,
Zhongyong Wang,
Chongwen Huang,
Ming Jin,
Kai-Kit Wong
Abstract:
Holographic multiple-input and multiple-output (HMIMO) is a promising technology with the potential to achieve high energy and spectral efficiencies, enhance system capacity and diversity, etc. In this work, we address the challenge of HMIMO near field (NF) channel estimation, which is complicated by the intricate model introduced by the dyadic Green's function. Despite its complexity, the channel model is governed by a limited set of parameters. This makes parametric channel estimation highly attractive, offering substantial performance enhancements and enabling the extraction of valuable sensing parameters, such as user locations, which are particularly beneficial in mobile networks. However, the relationship between these parameters and channel gains is nonlinear and compounded by integration, making the estimation a formidable task. To tackle this problem, we propose a novel neural network (NN) assisted hybrid method. With the assistance of NNs, we first develop a novel hybrid channel model with a significantly simplified expression compared to the original one, thereby enabling parametric channel estimation. Using the readily available training data derived from the original channel model, the NNs in the hybrid channel model can be effectively trained offline. Then, building upon this hybrid channel model, we formulate the parametric channel estimation problem with a probabilistic framework and design a factor graph representation for Bayesian estimation. Leveraging the factor graph representation and unitary approximate message passing (UAMP), we develop an effective message passing-based Bayesian channel estimation algorithm. Extensive simulations demonstrate the superior performance of the proposed method.
△ Less
Submitted 29 August, 2024;
originally announced August 2024.
-
Full-Duplex ISAC-Enabled D2D Underlaid Cellular Networks: Joint Transceiver Beamforming and Power Allocation
Authors:
Tao Jiang,
Ming Jin,
Qinghua Guo,
Yinhong Liu,
Yaming Li
Abstract:
Integrating device-to-device (D2D) communication into cellular networks can significantly reduce the transmission burden on base stations (BSs). Besides, integrated sensing and communication (ISAC) is envisioned as a key feature in future wireless networks. In this work, we consider a full-duplex ISAC-based D2D underlaid system, and propose a joint beamforming and power allocation scheme to improve the performance of the coexisting ISAC and D2D networks. To enhance spectral efficiency, a sum rate maximization problem is formulated for the full-duplex ISAC-based D2D underlaid system, which is non-convex. To solve the non-convex optimization problem, we propose a successive convex approximation (SCA)-based iterative algorithm and prove its convergence. Numerical results are provided to validate the effectiveness of the proposed scheme with the iterative algorithm, demonstrating that the proposed scheme outperforms state-of-the-art ones in both communication and sensing performance.
Submitted 21 August, 2024; v1 submitted 21 August, 2024;
originally announced August 2024.
-
Iterative Equalization of CPM With Unitary Approximate Message Passing
Authors:
Zilong Liu,
Yi Song,
Qinghua Guo,
Peng Sun,
Kexian Gong,
Zhongyong Wang
Abstract:
Continuous phase modulation (CPM) has extensive applications in wireless communications due to its high spectral and power efficiency. However, its nonlinear characteristics pose significant challenges for detection in frequency selective fading channels. This paper proposes an iterative receiver tailored for the detection of CPM signals over frequency selective fading channels. This design leverages the factor graph framework to integrate equalization, demodulation, and decoding functions. The equalizer employs the unitary approximate message passing (UAMP) algorithm, while the unitary transformation is implemented using the fast Fourier transform (FFT) with the aid of a cyclic prefix (CP), thereby achieving low computational complexity while maintaining high performance. For CPM demodulation and channel decoding, we design a message passing-based maximum a posteriori (MAP) algorithm using belief propagation (BP), and the message exchange between the demodulator, decoder and equalizer is elaborated. With proper message passing schedules, the receiver can achieve fast convergence. Simulation results show that compared with existing turbo receivers, the proposed receiver delivers significant performance enhancement with low computational complexity.
△ Less
Submitted 14 August, 2024;
originally announced August 2024.
-
Efficient Sensing Parameter Estimation with Direct Clutter Mitigation in Perceptive Mobile Networks
Authors:
Hang Li,
Hongming Yang,
Qinghua Guo,
J. Andrew Zhang,
Yang Xiang,
Yashan Pang
Abstract:
In this work, we investigate sensing parameter estimation in the presence of clutter in perceptive mobile networks (PMNs) that integrate radar sensing into mobile communications. Performing clutter suppression before sensing parameter estimation is generally desirable as the number of sensing parameters can be significantly reduced. However, existing methods require high-complexity clutter mitigation and sensing parameter estimation, where clutter is first identified and then removed. In this correspondence, we propose a much simpler but more effective method by incorporating a clutter cancellation mechanism in formulating a sparse signal model for sensing parameter estimation.
In particular, clutter mitigation is performed directly on the received signals and the unitary approximate message passing (UAMP) is leveraged to exploit the common support for sensing parameter estimation in the formulated sparse signal recovery problem. Simulation results show that, compared to state-of-the-art methods, the proposed method delivers significantly better performance with substantially reduced complexity.
△ Less
Submitted 24 July, 2024;
originally announced July 2024.
-
Prediction-Free Coordinated Dispatch of Microgrid: A Data-Driven Online Optimization Approach
Authors:
Kaidi Huang,
Lin Cheng,
Ning Qi,
David Wenzhong Gao,
Asad Mujeeb,
Qinglai Guo
Abstract:
Traditional prediction-dependent dispatch methods can face challenges when renewable generation and price predictions are unreliable in microgrids. Instead, this paper proposes a novel prediction-free two-stage coordinated dispatch approach for microgrids. Empirical learning is conducted during the offline stage, where we calculate the offline optimal state of charge (SOC) sequences for generic energy storage under different historical scenarios. During the online stage, we synthesize a dynamically updated reference for SOC and a dynamic opportunity price (DOP) based on empirical learning and real-time observations. They provide a global vision for online operation and effectively address the myopic tendencies inherent to online decision-making. The real-time control action, generated by the online optimization algorithm, aims to minimize the operational costs while tracking the reference and considering DOP. Additionally, we develop an adaptive virtual-queue-based online optimization algorithm based on the online convex optimization (OCO) framework. We provide theoretical proof that the proposed algorithm outperforms the existing OCO algorithms and achieves a sublinear dynamic regret bound and a sublinear strict constraint violation bound. Simulation-based studies demonstrate that, compared with model predictive control-based methods, it reduces operational costs and voltage violation rate by 5% and 9%, respectively.
△ Less
Submitted 1 October, 2024; v1 submitted 4 July, 2024;
originally announced July 2024.
-
Benchmarking Neural Decoding Backbones towards Enhanced On-edge iBCI Applications
Authors:
Zhou Zhou,
Guohang He,
Zheng Zhang,
Luziwei Leng,
Qinghai Guo,
Jianxing Liao,
Xuan Song,
Ran Cheng
Abstract:
Traditional invasive Brain-Computer Interfaces (iBCIs) typically depend on neural decoding processes conducted on workstations within laboratory settings, which prevents their everyday usage. Implementing these decoding processes on edge devices, such as the wearables, introduces considerable challenges related to computational demands, processing speed, and maintaining accuracy. This study seeks to identify an optimal neural decoding backbone that boasts robust performance and swift inference capabilities suitable for edge deployment. We executed a series of neural decoding experiments involving nonhuman primates engaged in random reaching tasks, evaluating four prospective models, Gated Recurrent Unit (GRU), Transformer, Receptance Weighted Key Value (RWKV), and Selective State Space model (Mamba), across several metrics: single-session decoding, multi-session decoding, new session fine-tuning, inference speed, calibration speed, and scalability. The findings indicate that although the GRU model delivers sufficient accuracy, the RWKV and Mamba models are preferable due to their superior inference and calibration speeds. Additionally, RWKV and Mamba comply with the scaling law, demonstrating improved performance with larger data sets and increased model sizes, whereas GRU shows less pronounced scalability, and the Transformer model requires computational resources that scale prohibitively. This paper presents a thorough comparative analysis of the four models in various scenarios. The results are pivotal in pinpointing an optimal backbone that can handle increasing data volumes and is viable for edge implementation. This analysis provides essential insights for ongoing research and practical applications in the field.
△ Less
Submitted 7 June, 2024;
originally announced June 2024.
-
Transmission Interface Power Flow Adjustment: A Deep Reinforcement Learning Approach based on Multi-task Attribution Map
Authors:
Shunyu Liu,
Wei Luo,
Yanzhen Zhou,
Kaixuan Chen,
Quan Zhang,
Huating Xu,
Qinglai Guo,
Mingli Song
Abstract:
Transmission interface power flow adjustment is a critical measure to ensure the secure and economic operation of power systems. However, conventional model-based adjustment schemes are limited by the increasing variations and uncertainties occurring in power systems, where the adjustment problems of different transmission interfaces are often treated as several independent tasks, ignoring their coupling relationship and even leading to conflicting decisions. In this paper, we introduce a novel data-driven deep reinforcement learning (DRL) approach to handle multiple power flow adjustment tasks jointly instead of learning each task from scratch. At the heart of the proposed method is a multi-task attribution map (MAM), which enables the DRL agent to explicitly attribute each transmission interface task to different power system nodes with task-adaptive attention weights. Based on this MAM, the agent can further provide effective strategies to solve the multi-task adjustment problem with a near-optimal operation cost. Simulation results on the IEEE 118-bus system, a realistic 300-bus system in China, and a very large European system with 9241 buses demonstrate that the proposed method significantly improves the performance compared with several baseline methods, and exhibits high interpretability with the learnable MAM.
Submitted 24 May, 2024;
originally announced May 2024.
-
Preventive Audits for Data Applications Before Data Sharing in the Power IoT
Authors:
Bohong Wang,
Qinglai Guo,
Yanxi Lin,
Yang Yu
Abstract:
With the increase in data volume, more types of data are being used and shared, especially in the power Internet of Things (IoT). However, the processes of data sharing may lead to unexpected information leakage because of the ubiquitous relevance among the different data, thus it is necessary for data owners to conduct preventive audits for data applications before data sharing to avoid the risk of key information leakage. Considering that the same data may play completely different roles in different application scenarios, data owners should know the expected data applications of the data buyers in advance and provide modified data that are less relevant to the private information of the data owners and more relevant to the nonprivate information that the data buyers need. In this paper, data sharing in the power IoT is regarded as the background, and the mutual information of the data and their implicit information is selected as the data feature parameter to indicate the relevance between the data and their implicit information or the ability to infer the implicit information from the data. Therefore, preventive audits should be conducted based on changes in the data feature parameters before and after data sharing. The probability exchange adjustment method is proposed as the theoretical basis of preventive audits under simplified consumption, and the corresponding optimization models are constructed and extended to more practical scenarios with multivariate characteristics. Finally, case studies are used to validate the effectiveness of the proposed preventive audits.
Submitted 5 May, 2024;
originally announced May 2024.
-
Force-EvT: A Closer Look at Robotic Gripper Force Measurement with Event-based Vision Transformer
Authors:
Qianyu Guo,
Ziqing Yu,
Jiaming Fu,
Yawen Lu,
Yahya Zweiri,
Dongming Gan
Abstract:
Robotic grippers are receiving increasing attention in various industries as essential components of robots for interacting and manipulating objects. While significant progress has been made in the past, conventional rigid grippers still have limitations in handling irregular objects and can damage fragile objects. We have shown that soft grippers offer deformability to adapt to a variety of object shapes and maximize object protection. At the same time, dynamic vision sensors (e.g., event-based cameras) are capable of capturing small changes in brightness and streaming them asynchronously as events, unlike RGB cameras, which do not perform well in low-light and fast-moving environments. In this paper, a dynamic-vision-based algorithm is proposed to measure the force applied to the gripper. In particular, we first set up a DVXplorer Lite series event camera to capture twenty-five sets of event data. Second, motivated by the impressive performance of the Vision Transformer (ViT) algorithm in dense image prediction tasks, we propose a new approach that demonstrates the potential for real-time force estimation and meets the requirements of real-world scenarios. We extensively evaluate the proposed algorithm on a wide range of scenarios and settings, and show that it consistently outperforms recent approaches.
Submitted 1 April, 2024;
originally announced April 2024.
-
Diffusion Attack: Leveraging Stable Diffusion for Naturalistic Image Attacking
Authors:
Qianyu Guo,
Jiaming Fu,
Yawen Lu,
Dongming Gan
Abstract:
In Virtual Reality (VR), adversarial attack remains a significant security threat. Most deep learning-based methods for physical and digital adversarial attacks focus on enhancing attack performance by crafting adversarial examples that contain large printable distortions that are easy for human observers to identify. However, attackers rarely impose limitations on the naturalness and comfort of the appearance of the generated attack image, resulting in a noticeable and unnatural attack. To address this challenge, we propose a framework to incorporate style transfer to craft adversarial inputs of natural styles that exhibit minimal detectability and maximum natural appearance, while maintaining superior attack capabilities.
Submitted 21 March, 2024;
originally announced March 2024.
-
HyperColorization: Propagating spatially sparse noisy spectral clues for reconstructing hyperspectral images
Authors:
M. Kerem Aydin,
Qi Guo,
Emma Alexander
Abstract:
Hyperspectral cameras face challenging spatial-spectral resolution trade-offs and are more affected by shot noise than RGB photos taken over the same total exposure time. Here, we present a colorization algorithm to reconstruct hyperspectral images from a grayscale guide image and spatially sparse spectral clues. We demonstrate that our algorithm generalizes to varying spectral dimensions for hyperspectral images, and show that colorizing in a low-rank space reduces compute time and the impact of shot noise. To enhance robustness, we incorporate guided sampling, edge-aware filtering, and dimensionality estimation techniques. Our method surpasses previous algorithms in various performance metrics, including SSIM, PSNR, GFC, and EMD, which we analyze as metrics for characterizing hyperspectral image quality. Collectively, these findings provide a promising avenue for overcoming the time-space-wavelength resolution trade-off by reconstructing a dense hyperspectral image from samples obtained by whisk or push broom scanners, as well as hybrid spatial-spectral computational imaging systems.
Submitted 18 March, 2024;
originally announced March 2024.
-
Bayesian Learning for Double-RIS Aided ISAC Systems with Superimposed Pilots and Data
Authors:
Xu Gan,
Chongwen Huang,
Zhaohui Yang,
Caijun Zhong,
Xiaoming Chen,
Zhaoyang Zhang,
Qinghua Guo,
Chau Yuen,
Merouane Debbah
Abstract:
Reconfigurable intelligent surface (RIS) has great potential to improve the performance of integrated sensing and communication (ISAC) systems, especially in scenarios where line-of-sight paths between the base station and users are blocked. However, the spectral efficiency (SE) of RIS-aided ISAC uplink transmissions may be drastically reduced by the heavy burden of pilot overhead for realizing sensing capabilities. In this paper, we tackle this bottleneck by proposing a superimposed symbol scheme, which superimposes sensing pilots onto data symbols over the same time-frequency resources. Specifically, we develop a structure-aware sparse Bayesian learning framework, where decoded data symbols serve as side information to enhance sensing performance and increase SE. To meet the low-latency requirements of emerging ISAC applications, we further propose a low-complexity simultaneous communication and localization algorithm for multiple users. This algorithm employs the unitary approximate message passing in the Bayesian learning framework for initial angle estimate, followed by iterative refinements through reduced-dimension matrix calculations. Moreover, the sparse code multiple access technology is incorporated into this iterative framework for accurate data detection which also facilitates localization. Numerical results show that the proposed superimposed symbol-based scheme empowered by the developed algorithm can achieve centimeter-level localization while attaining up to 96% of the SE of conventional communications without sensing capabilities. Moreover, compared to other typical ISAC schemes, the proposed superimposed symbol scheme can provide an effective throughput improvement over 133%.
Submitted 16 February, 2024;
originally announced February 2024.
-
Uncertainty-Aware Transient Stability-Constrained Preventive Redispatch: A Distributional Reinforcement Learning Approach
Authors:
Zhengcheng Wang,
Fei Teng,
Yanzhen Zhou,
Qinglai Guo,
Hongbin Sun
Abstract:
Transient stability-constrained preventive redispatch plays a crucial role in ensuring power system security and stability. Since redispatch strategies need to simultaneously satisfy complex transient constraints and the economic need, model-based formulation and optimization become extremely challenging. In addition, the increasing uncertainty and variability introduced by renewable sources start to drive the system stability consideration from deterministic to probabilistic, which further exacerbates the complexity. In this paper, a Graph neural network guided Distributional Deep Reinforcement Learning (GD2RL) method is proposed, for the first time, to solve the uncertainty-aware transient stability-constrained preventive redispatch problem. First, a graph neural network-based transient simulator is trained by supervised learning to efficiently generate post-contingency rotor angle curves with the steady-state and contingency as inputs, which serves as a feature extractor for operating states and a surrogate time-domain simulator during the environment interaction for reinforcement learning. Distributional deep reinforcement learning with explicit uncertainty distribution of system operational conditions is then applied to generate the redispatch strategy to balance the user-specified probabilistic stability performance and economy preferences. The full distribution of the post-redispatch transient stability index is directly provided as the output. Case studies on the modified New England 39-bus system validate the proposed method.
Submitted 29 June, 2024; v1 submitted 14 February, 2024;
originally announced February 2024.
-
A Network for structural dense displacement based on 3D deformable mesh model and optical flow
Authors:
Peimian Du,
Qicheng Guo,
Yanru Li
Abstract:
This study proposes a network to recognize the displacement of an RC frame structure from video captured by a monocular camera. The proposed network consists of two modules, FlowNet2 and POFRN-Net. FlowNet2 converts two video frames into dense optical flow, and POFRN-Net takes the dense optical flow from FlowNet2 as input and outputs the pose parameter H. The displacement of any point on the structure can be calculated from parameter H. The Fast Fourier Transform (FFT) is applied to obtain the frequency-domain signal from the corresponding displacement signal. Furthermore, a comparison with the true displacement on the first floor in the first video is shown in this study. Finally, the predicted displacements on four floors of the RC frame structure in the three given videos are presented at the end of this study.
Submitted 9 February, 2024;
originally announced February 2024.
-
Active Support of Inverters for Improving Short-Term Voltage Security in 100% IBRs-Penetrated Power Systems
Authors:
Yinhong Lin,
Bin Wang,
Qinglai Guo,
Haotian Zhao,
Hongbin Sun
Abstract:
Due to the energy crisis and environmental pollution, the installed capacity of inverter-based resources (IBRs) in power grids is rapidly increasing, and grid-following control (GFL) is the most prevalent at present. Meanwhile, grid-forming control-based (GFM) devices have been installed in the grid to provide active support for frequency and voltage. In the future, GFL devices combined with GFM will be promising, especially in power systems with high penetration or 100% IBRs. When a short-circuit fault occurs in the grid, the controlled current source characteristic of the GFL devices leads to insufficient dynamic voltage support (DVS), while the GFM devices usually reduce the internal voltage to limit the current. Thus, deep voltage sags and undesired disconnections of IBRs may occur. Moreover, due to the dispersed locations and the control strategies' diversity of IBRs, the voltage support of different devices may not be fully coordinated, which is not conducive to short-term voltage security (STVS). To address this issue, a control scheme based on the simulation of transient characteristics of synchronous machines (SMs) is proposed. Then, a new fault ride-through strategy (FRT) is proposed based on the characteristic differences between GFL and GFM devices, and an optimization model of multi-device control parameters is formulated to meet the short-term voltage security constraints (SVSCs) and device capacity constraints. Finally, a fast solution method based on analytical modeling is proposed for the model. Test results based on the double-generator-one-load system, the IEEE 14-bus system, and other systems of different sizes show that the proposed method can effectively enhance the active support capability of GFL and GFM to the grid voltage, and avoid the large-scale disconnection of IBRs.
Submitted 2 February, 2024;
originally announced February 2024.
-
Hybrid Vector Message Passing for Generalized Bilinear Factorization
Authors:
Hao Jiang,
Xiaojun Yuan,
Qinghua Guo
Abstract:
In this paper, we propose a new message passing algorithm that utilizes hybrid vector message passing (HVMP) to solve the generalized bilinear factorization (GBF) problem. The proposed GBF-HVMP algorithm integrates expectation propagation (EP) and variational message passing (VMP) via variational free energy minimization, yielding tractable Gaussian messages. Furthermore, GBF-HVMP enables vector/matrix variables rather than scalar ones in message passing, resulting in a loop-free Bayesian network that improves convergence. Numerical results show that GBF-HVMP significantly outperforms state-of-the-art methods in terms of NMSE performance and computational complexity.
Submitted 7 January, 2024;
originally announced January 2024.
-
A global product of fine-scale urban building height based on spaceborne lidar
Authors:
Xiao Ma,
Guang Zheng,
Chi Xu,
L. Monika Moskal,
Peng Gong,
Qinghua Guo,
Huabing Huang,
Xuecao Li,
Yong Pang,
Cheng Wang,
Huan Xie,
Bailang Yu,
Bo Zhao,
Yuyu Zhou
Abstract:
Characterizing urban environments with broad coverage and high precision is more important than ever for achieving the UN's Sustainable Development Goals (SDGs), as half of the world's population lives in cities. Urban building height, as a fundamental 3D urban structural feature, has far-reaching applications. However, so far, producing readily available datasets of recent urban building heights with fine spatial resolution and global coverage remains a challenging task. Here, we provide an up-to-date global product of urban building heights based on a fine grid size of 150 m around 2020 by combining the spaceborne lidar instrument of GEDI and multi-sourced data including remotely sensed images (i.e., Landsat-8, Sentinel-2, and Sentinel-1) and topographic data. Our results revealed that the estimation method for building height samples based on the GEDI data was effective, with a Pearson's r of 0.78 and an RMSE of 3.67 m in comparison to the reference data. The mapping product also demonstrated good performance, as indicated by its strong correlation with the reference data (i.e., Pearson's r = 0.71, RMSE = 4.60 m). Compared with the currently existing products, our global urban building height map provides a higher spatial resolution (i.e., 150 m) with a great level of inherent detail about spatial heterogeneity, and the flexibility of updating using the GEDI samples as inputs. This work will boost future urban studies across many fields including climate, environmental, ecological, and social sciences.
Submitted 22 October, 2023;
originally announced October 2023.
-
Message Passing-Based Joint Channel Estimation and Signal Detection for OTFS with Superimposed Pilots
Authors:
Fupeng Huang,
Qinghua Guo,
Youwen Zhang,
Yuriy Zakharov
Abstract:
Receivers with joint channel estimation and signal detection using superimposed pilots (SP) can achieve high transmission efficiency in orthogonal time frequency space (OTFS) systems. However, existing receivers have high computational complexity, hindering their practical applications. In this work, with SP in the delay-Doppler (DD) domain and the generalized complex exponential (GCE) basis expansion modeling (BEM) for channels, a message passing-based SP-DD iterative receiver is proposed, which drastically reduces the computational complexity with only marginal performance loss compared to existing ones. To facilitate channel estimation (CE) in the proposed receiver, we design the pilot signal to achieve pilot power concentration in the frequency domain, thereby developing an SP-DD-D receiver that can effectively reduce the power of the pilot signal with almost no loss of CE accuracy. Extensive simulation results are provided to demonstrate the superiority of the proposed SP-DD-D receiver.
Submitted 15 April, 2024; v1 submitted 15 September, 2023;
originally announced September 2023.
-
Message Passing Based Block Sparse Signal Recovery for DOA Estimation Using Large Arrays
Authors:
Yiwen Mao,
Dawei Gao,
Qinghua Guo,
Ming Jin
Abstract:
This work deals with direction of arrival (DOA) estimation with a large antenna array. We first develop a novel signal model with a sparse system transfer matrix using an inverse discrete Fourier transform (DFT) operation, which leads to the formulation of a structured block sparse signal recovery problem with a sparse sensing matrix. This enables the development of a low complexity message passing based Bayesian algorithm with a factor graph representation. Simulation results demonstrate the superior performance of the proposed method.
Submitted 1 September, 2023;
originally announced September 2023.
-
Exploiting Structured Sparsity with Low Complexity Sparse Bayesian Learning for RIS-assisted MIMO Channel Estimation
Authors:
W. Li,
Z. Lin,
Q. Guo,
B. Vucetic
Abstract:
As an emerging communication auxiliary technology, reconfigurable intelligent surface (RIS) is expected to play a significant role in the upcoming 6G networks. Due to its total reflection characteristics, it is challenging to implement conventional channel estimation algorithms. This work focuses on RIS-assisted MIMO communications. Although many algorithms have been proposed to address this issue, there are still ample opportunities for improvement in terms of estimation accuracy, complexity, and applicability. To fully exploit the structured sparsity of the multiple-input-multiple-output (MIMO) channels, we propose a new channel estimation algorithm called unitary approximate message passing sparse Bayesian learning with partial common support identification (UAMPSBL-PCI). Thanks to the mechanism of PCI and the use of UAMP, the proposed algorithm has a lower complexity while delivering enhanced performance relative to existing channel estimation algorithms. Extensive simulations demonstrate its excellent performance in various environments.
Submitted 2 August, 2023;
originally announced August 2023.
-
Enhanced Neural Beamformer with Spatial Information for Target Speech Extraction
Authors:
Aoqi Guo,
Junnan Wu,
Peng Gao,
Wenbo Zhu,
Qinwen Guo,
Dazhi Gao,
Yujun Wang
Abstract:
Recently, deep learning-based beamforming algorithms have shown promising performance in target speech extraction tasks. However, most systems do not fully utilize spatial information. In this paper, we propose a target speech extraction network that utilizes spatial information to enhance the performance of neural beamformer. To achieve this, we first use the UNet-TCN structure to model input features and improve the estimation accuracy of the speech pre-separation module by avoiding information loss caused by direct dimensionality reduction in other models. Furthermore, we introduce a multi-head cross-attention mechanism that enhances the neural beamformer's perception of spatial information by making full use of the spatial information received by the array. Experimental results demonstrate that our approach, which incorporates a more reasonable target mask estimation network and a spatial information-based cross-attention mechanism into the neural beamformer, effectively improves speech separation performance.
Submitted 28 June, 2023;
originally announced June 2023.
-
Short-Term Voltage Security Constrained UC to Prevent Trip Faults in High Wind Power Penetrated Power Systems
Authors:
Yinhong Lin,
Bin Wang,
Qinglai Guo,
Haotian Zhao,
Hongbin Sun
Abstract:
For high wind power-penetrated power systems, the multiple renewable energy station short-circuit ratio (MRSCR) is often insufficient due to weak grid structures. Additionally, transient voltage sag/overvoltage issues may cause trip faults of wind turbines (WTs). Due to the time delay in WTs' controllers, it is difficult for WTs alone to meet the reactive power demands in different stages of the transient process. Some synchronous machines (SMs) must be retained through unit commitment (UC) scheduling to improve MRSCR and prevent trip faults of WTs. The MRSCR and short-term voltage security constrained-UC model is a mixed integer nonlinear programming (MINLP) problem with differential algebraic equations (DAEs) and symbolic matrix inversion, which is intractable to solve. Based on the dynamic characteristics of different devices, the original model is simplified as a general MINLP model without DAEs. Then, generalized Benders decomposition is applied to improve the solution efficiency. The relaxed MRSCR constraints are formulated in the master problem to improve the convergence, and the precise MRSCR constraints are formulated in the subproblems to consider the impact of voltage profiles. Case studies based on several benchmark systems and a provincial power grid verify the validity and efficiency of the proposed method.
Submitted 18 June, 2023;
originally announced June 2023.
-
Cooperative IoT Data Sharing with Heterogeneity of Participants Based on Electricity Retail
Authors:
Bohong Wang,
Qinglai Guo,
Tian Xia,
Qiang Li,
Di Liu,
Feng Zhao
Abstract:
With the development of Internet of Things (IoT) and big data technology, the data value is increasingly explored in multiple practical scenarios, including electricity transactions. However, the isolation of IoT data among several entities makes it difficult to achieve optimal allocation of data resources and convert data resources into real economic value, thus it is necessary to introduce the IoT data sharing mode to drive data circulation. To enhance the accuracy and fairness of IoT data sharing, the heterogeneity of participants is sufficiently considered, and data valuation and profit allocation in IoT data sharing are improved based on the background of electricity retail. Data valuation is supposed to be relevant to attributes of IoT data buyers, thus risk preferences of electricity retailers are applied as characteristic attributes and data premium rates are proposed to modify data value rates. Profit allocation should measure the marginal contribution shares of electricity retailers and data brokers fairly, thus asymmetric Nash bargaining model is used to guarantee that they could receive reasonable profits based on their specific contribution to the coalition of IoT data sharing. Considering the heterogeneity of participants comprehensively, the proposed IoT data sharing fits for a large coalition of IoT data sharing with multiple electricity retailers and data brokers. Finally, to demonstrate the applications of IoT data sharing in smart grids, case studies are utilized to validate the results of data value for electricity retailers with different risk preferences and the efficiency of profit allocation using asymmetric Nash bargaining model.
Submitted 31 May, 2023;
originally announced May 2023.
-
Sensing Aided Uplink Transmission in OTFS ISAC with Joint Parameter Association, Channel Estimation and Signal Detection
Authors:
Xi Yang,
Hang Li,
Qinghua Guo,
J. Andrew Zhang,
Xiaojing Huang,
Zhiqun Cheng
Abstract:
In this work, we study sensing-aided uplink transmission in an integrated sensing and communication (ISAC) vehicular network with the use of orthogonal time frequency space (OTFS) modulation. To exploit sensing parameters for improving uplink communications, the parameters must be first associated with the transmitters, which is a challenging task. We propose a scheme that jointly conducts parameter association, channel estimation and signal detection by formulating it as a constrained bilinear recovery problem. Then we develop a message passing algorithm to solve the problem, leveraging the bilinear unitary approximate message passing (Bi-UAMP) algorithm. Numerical results validate the proposed scheme and show that relevant performance bounds can be closely approached.
Submitted 19 May, 2023;
originally announced May 2023.
-
Asynchronous Grant-Free Random Access: Receiver Design with Partially Uni-Directional Message Passing and Interference Suppression Analysis
Authors:
Zhaoji Zhang,
Yuhao Chi,
Qinghua Guo,
Ying Li,
Guanghui Song,
Chongwen Huang
Abstract:
Massive Machine-Type Communications (mMTC) features a massive number of low-cost user equipments (UEs) with sparse activity. Tailor-made for these features, grant-free random access (GF-RA) serves as an efficient access solution for mMTC. However, most existing GF-RA schemes rely on strict synchronization, which incurs excessive coordination burden for the low-cost UEs. In this work, we propose a receiver design for asynchronous GF-RA, and address the joint user activity detection (UAD) and channel estimation (CE) problem in the presence of asynchronization-induced inter-symbol interference. Specifically, the delay profile is exploited at the receiver to distinguish different UEs. However, a sample correlation problem in this receiver design impedes the factorization of the joint likelihood function, which complicates the UAD and CE problem. To address this correlation problem, we design a partially uni-directional (PUD) factor graph representation for the joint likelihood function. Building on this PUD factor graph, we further propose a PUD message passing based sparse Bayesian learning (SBL) algorithm for asynchronous UAD and CE (PUDMP-SBL-aUADCE). Our theoretical analysis shows that the PUDMP-SBL-aUADCE algorithm exhibits higher signal-to-interference-and-noise ratio (SINR) in the asynchronous case than in the synchronous case, i.e., the proposed receiver design can exploit asynchronization to suppress multi-user interference. In addition, considering potential timing error from the low-cost UEs, we investigate the impacts of imperfect delay profile, and reveal the advantages of adopting the SBL method in this case. Finally, extensive simulation results are provided to demonstrate the performance of the PUDMP-SBL-aUADCE algorithm.
Submitted 17 May, 2023;
originally announced May 2023.
-
Evading DeepFake Detectors via Adversarial Statistical Consistency
Authors:
Yang Hou,
Qing Guo,
Yihao Huang,
Xiaofei Xie,
Lei Ma,
Jianjun Zhao
Abstract:
In recent years, as various realistic face forgery techniques known as DeepFake have improved by leaps and bounds, more and more DeepFake detection techniques have been proposed. These methods typically rely on detecting statistical differences between natural (i.e., real) and DeepFake-generated images in both spatial and frequency domains. In this work, we propose to explicitly minimize the statistical differences to evade state-of-the-art DeepFake detectors. To this end, we propose a statistical consistency attack (StatAttack) against DeepFake detectors, which contains two main parts. First, we select several statistical-sensitive natural degradations (i.e., exposure, blur, and noise) and add them to the fake images in an adversarial way. Second, we find that the statistical differences between natural and DeepFake images are positively associated with the distribution shifting between the two kinds of images, and we propose to use a distribution-aware loss to guide the optimization of different degradations. As a result, the feature distributions of the generated adversarial examples are close to those of natural images. Furthermore, we extend the StatAttack to a more powerful version, MStatAttack, where we extend the single-layer degradation to multi-layer degradations sequentially and use the loss to tune the combination weights jointly. Comprehensive experimental results on four spatial-based detectors and two frequency-based detectors with four datasets demonstrate the effectiveness of our proposed attack method in both white-box and black-box settings.
Submitted 23 April, 2023;
originally announced April 2023.
-
Hierarchically Structured Matrix Recovery-Based Channel Estimation for RIS-Aided Communications
Authors:
Yabo Guo,
Peng Sun,
Zhengdao Yuan,
Qinghua Guo,
Zhongyong Wang
Abstract:
Reconfigurable intelligent surface (RIS) has emerged as a promising technology for improving capacity and extending coverage of wireless networks. In this work, we consider RIS-aided millimeter wave (mmWave) multiple-input and multiple-output (MIMO) communications, where acquiring accurate channel state information is challenging due to the high dimensionality of channels. To fully exploit the structures of the channels, we formulate the channel estimation as a hierarchically structured matrix recovery problem, and design a low-complexity message passing algorithm to solve it. Simulation results demonstrate the superiority of the proposed algorithm and its performance close to the oracle bound.
Submitted 13 April, 2023;
originally announced April 2023.
-
Matrix Factorization Based Blind Bayesian Receiver for Grant-Free Random Access in mmWave MIMO mMTC
Authors:
Zhengdao Yuan,
Fei Liu,
Qinghua Guo,
Xiaojun Yuan,
Zhongyong Wang,
Yonghui Li
Abstract:
Grant-free random access is promising for massive connectivity with sporadic transmissions in massive machine type communications (mMTC), where the hand-shaking between the access point (AP) and users is skipped, leading to high access efficiency. In grant-free random access, the AP needs to identify the active users and perform channel estimation and signal detection. Conventionally, pilot signals are required for the AP to achieve user activity detection and channel estimation before active user signal detection, which may still result in substantial overhead and latency. In this paper, to further reduce the overhead and latency, we explore the problem of grant-free random access without the use of pilot signals in a millimeter wave (mmWave) multiple input and multiple output (MIMO) system, where the AP performs blind joint user activity detection, channel estimation and signal detection (UACESD). We show that the blind joint UACESD can be formulated as a constrained composite matrix factorization problem, which can be solved by exploiting the structures of the channel matrix and signal matrix. Leveraging our recently developed unitary approximate message passing based matrix factorization (UAMP-MF) algorithm, we design a message passing based Bayesian algorithm to solve the blind joint UACESD problem. Extensive simulation results demonstrate the effectiveness of the blind grant-free random access scheme.
Submitted 10 April, 2023;
originally announced April 2023.
-
RIS-Assisted Joint Uplink Communication and Imaging: Phase Optimization and Bayesian Echo Decoupling
Authors:
Shengyu Zhu,
Zehua Yu,
Qinghua Guo,
Jinshan Ding,
Qiang Cheng,
Tie Jun Cui
Abstract:
Achieving integrated sensing and communication (ISAC) via uplink transmission is challenging due to the unknown waveform and the coupling of communication and sensing echoes. In this paper, a joint uplink communication and imaging system is proposed for the first time, where a reconfigurable intelligent surface (RIS) is used to manipulate the electromagnetic signals for echo decoupling at the base station (BS). Aiming to enhance the transmission gain in desired directions and generate required radiation pattern in the region of interest (RoI), a phase optimization problem for RIS is formulated, which is high dimensional and nonconvex with discrete constraints. To tackle this problem, a back propagation based phase design scheme for both continuous and discrete phase models is developed. Moreover, the echo decoupling problem is tackled using the Bayesian method with the factor graph technique, where the problem is represented by a graph model which consists of difficult local functions. Based on the graph model, a message-passing algorithm is derived, which can efficiently cooperate with the adaptive sparse Bayesian learning (SBL) to achieve joint communication and imaging. Numerical results show that the proposed method approaches the relevant lower bound asymptotically, and the communication performance can be enhanced with the utilization of imaging echoes.
Submitted 10 January, 2023;
originally announced January 2023.