Search | arXiv e-print repository

Characterization of Rydberg-Atom Signal Reception of Dual-Frequency Signals Coupled with Two Energy Levels

Authors: Hao Wu, Chongwu Xie, Xinyuan Yao, Kang-Da Wu, Shanchi Wu, Rui Ni, Guo-Yong Xiang, Chen Gong

Abstract: Rydberg atomic sensors have been adopted for novel radio frequency (RF) measurement techniques, and their sensing capability for signals at multiple frequencies makes them attractive for multi-user communication. However, unlike traditional antennas, where signals at multiple frequencies are orthogonal, the received signals of atomic sensors corresponding to different energy levels are downconverted to the baseband simultaneously, resulting in multi-user interference. Thus, in this paper, we analyze the mutual interference characteristics of two RF signals with different carrier frequencies coupling different energy levels. We introduce the joint response coefficient based on the receiver characteristics and analyze the interference of one user to another. We analyze the bit-error rate (BER) and symbol-error rate (SER) for two signals coupling two different energy levels. We also conduct experiments to validate the BER and SER results.

Submitted 26 June, 2025; originally announced June 2025.

arXiv:2506.19684 [pdf, ps, other]

Beyond 200 Gb/s/lane: An Analytical Approach to Optimal Detection in Shaped IM-DD Optical Links with Relative Intensity Noise

Authors: Felipe Villenas, Kaiquan Wu, Yunus Can Gültekin, Jamal Riani, Alex Alvarado

Abstract: Next-generation intensity-modulation (IM) and direct-detection (DD) systems used in data centers are expected to operate at 400 Gb/s/lane and beyond. Such rates can be achieved by increasing the system bandwidth or the modulation format, which in turn requires maintaining or increasing the signal-to-noise ratio (SNR). Such SNR requirements can be achieved by increasing the transmitted optical power. This increase in optical power causes the emergence of relative intensity noise (RIN), a signal-dependent impairment inherent to the transmitter laser, which ultimately limits the performance of the system. In this paper, we develop an analytical symbol error rate (SER) expression for the optimal detector for the IM-DD optical link under study. The developed expression takes into account the signal-dependent nature of RIN and does not make any assumptions on the geometry or probability distribution of the constellation. Our expression is therefore applicable to general probabilistically and/or geometrically shaped systems. Unlike results available in the literature, our proposed expression provides a perfect match to numerical simulations of probabilistically and geometrically shaped systems.

Submitted 24 June, 2025; originally announced June 2025.

Comments: preprint

arXiv:2506.19627 [pdf, ps, other]

On Error Rate Approximations for FSO Systems with Weak Turbulence and Pointing Errors

Authors: Carmen Álvarez Roa, Yunus Can Gültekin, Kaiquan Wu, Cornelis Willem Korevaar, Alex Alvarado

Abstract: Atmospheric attenuation, atmospheric turbulence, geometric spread, and pointing errors degrade the performance of free-space optical transmission. In the weak turbulence regime, the probability density function describing the distribution of the channel fading coefficient that models these four effects is known in the literature. This function is an integral equation, which makes it difficult to find simple analytical expressions of important performance metrics such as the bit error rate (BER) and symbol error rate (SER). In this paper, we present simple and accurate approximations of the average BER and SER for pulse-amplitude modulation (PAM) in the weak turbulence regime for an intensity modulation and direct detection system. Our numerical results show that the proposed expressions exhibit excellent accuracy when compared against Monte Carlo simulations. To demonstrate the usefulness of the developed approximations, we perform two asymptotic analyses. First, we investigate the additional transmit power required to maintain the same SER when the spectral efficiency increases by 1 bit/symbol. Second, we study the asymptotic behavior of our SER approximation for dense PAM constellations and high transmit power.

Submitted 24 June, 2025; originally announced June 2025.

arXiv:2506.18512 [pdf, ps, other]

MedTVT-R1: A Multimodal LLM Empowering Medical Reasoning and Diagnosis

Authors: Yuting Zhang, Kaishen Yuan, Hao Lu, Yutao Yue, Jintai Chen, Kaishun Wu

Abstract: Accurate and interpretable multi-disease diagnosis remains a critical challenge in medical research, particularly when leveraging heterogeneous multimodal medical data. Current approaches often rely on single-modal data, limiting their ability to comprehensively understand complex diseases. To address this, we propose MedTVT-R1, a novel Multimodal Large Language Model (MLLM) framework designed to integrate clinical multimodal data for reasoning and diagnosing multiple diseases. We construct MedTVT-QA, a curated instruction dataset that provides question-answer pairs for physiological-level interpretations and disease-level diagnoses with a Chain of Evidence approach. MedTVT-R1 incorporates a modality perception layer to capture inter-modal dependencies and adaptively weight modality contributions. Additionally, we employ Group Relative Policy Optimization (GRPO)-based Reinforcement Fine-Tuning with a Jaccard Reward function to enhance diagnostic reasoning. Experimental results demonstrate MedTVT-R1's superiority in multimodal feature utilization and multi-disease diagnosis, offering significant potential for clinical applications such as diagnostic report generation and comorbidity reasoning. The dataset and code are available at https://github.com/keke-nice/MedTVT-R1.

Submitted 23 June, 2025; originally announced June 2025.

arXiv:2506.01761 [pdf, ps, other]

A New 5 bit/2D-symbol Modulation Format for Relative Intensity Noise-dominated IM-DD Systems

Authors: Felipe Villenas, Kaiquan Wu, Yunus Can Gültekin, Jamal Riani, Alex Alvarado

Abstract: We propose a novel 5-bit/2D-symbol modulation format based on PAM-6 optimized for IM-DD systems dominated by relative intensity noise. The proposed modulation scheme improves SNR by 0.94 dB compared to conventional PAM-6 and achieves near-optimal BER performance.

Submitted 2 June, 2025; originally announced June 2025.

Comments: Submitted to ECOC 2025

arXiv:2505.19539 [pdf, ps, other]

Water Level Sensing via Communication Signals in a Bi-Static System

Authors: Zhongqin Wang, J. Andrew Zhang, Kai Wu, Y. Jay Guo

Abstract: Accurate water level sensing is essential for flood monitoring, agricultural irrigation, and water resource optimization. Traditional methods require dedicated sensor deployments, leading to high installation costs, vulnerability to interference, and limited resolution. This work proposes PMNs-WaterSense, a novel scheme leveraging Channel State Information (CSI) from existing mobile networks for water level sensing. Our scheme begins with a CSI-power method to eliminate phase offsets caused by clock asynchrony in bi-static systems. We then apply multi-domain filtering across the time (Doppler), frequency (delay), and spatial (Angle-of-Arrival, AoA) domains to extract phase features that finely capture variations in path length over water. To resolve the $2\pi$ phase ambiguity, we introduce a Kalman filter-based unwrapping technique. Additionally, we exploit transceiver geometry to convert path length variations into water level height changes, even with limited antenna configurations. We validate our framework through controlled experiments with 28 GHz mmWave and 3.1 GHz LTE signals in real time, achieving average height estimation errors of 0.025 cm and 0.198 cm, respectively. Moreover, real-world river monitoring with 2.6 GHz LTE signals achieves an average error of 4.8 cm for a 1-meter water level change, demonstrating its effectiveness in practical deployments.

Submitted 26 May, 2025; originally announced May 2025.

arXiv:2505.07687 [pdf, ps, other]

ABS-Mamba: SAM2-Driven Bidirectional Spiral Mamba Network for Medical Image Translation

Authors: Feng Yuan, Yifan Gao, Wenbin Wu, Keqing Wu, Xiaotong Guo, Jie Jiang, Xin Gao

Abstract: Accurate multi-modal medical image translation requires harmonizing global anatomical semantics and local structural fidelity, a challenge complicated by intermodality information loss and structural distortion. We propose ABS-Mamba, a novel architecture integrating the Segment Anything Model 2 (SAM2) for organ-aware semantic representation, specialized convolutional neural networks (CNNs) for preserving modality-specific edge and texture details, and Mamba's selective state-space modeling for efficient long- and short-range feature dependencies. Structurally, our dual-resolution framework leverages SAM2's image encoder to capture organ-scale semantics from high-resolution inputs, while a parallel CNN branch extracts fine-grained local features. The Robust Feature Fusion Network (RFFN) integrates these representations, and the Bidirectional Mamba Residual Network (BMRN) models spatial dependencies using spiral scanning and bidirectional state-space dynamics. A three-stage skip fusion decoder enhances edge and texture fidelity. We employ Efficient Low-Rank Adaptation (LoRA+) fine-tuning to enable precise domain specialization while maintaining the foundational capabilities of the pre-trained components. Extensive experimental validation on the SynthRAD2023 and BraTS2019 datasets demonstrates that ABS-Mamba outperforms state-of-the-art methods, delivering high-fidelity cross-modal synthesis that preserves anatomical semantics and structural details to enhance diagnostic accuracy in clinical applications. The code is available at https://github.com/gatina-yone/ABS-Mamba

Submitted 12 May, 2025; originally announced May 2025.

Comments: MICCAI 2025 (under review)

arXiv:2504.17433 [pdf, other]

On Geometric Shaping for 400 Gbps IM-DD Links with Laser Intensity Noise

Authors: Felipe Villenas, Kaiquan Wu, Yunus Can Gültekin, Jamal Riani, Alex Alvarado

Abstract: We propose geometric shaping for IM-DD links dominated by relative intensity noise (RIN). For 400 Gbps links, our geometrically-shaped constellations result in error probability improvements that relax the RIN laser design by 3 dB.

Submitted 28 May, 2025; v1 submitted 24 April, 2025; originally announced April 2025.

Comments: Presented at OFC 2025

arXiv:2504.15042 [pdf, ps, other]

Bayesian Sensing for Time-Varying Channels in ISAC Systems

Authors: Xueyang Wang, Kai Wu, J. Andrew Zhang, Shiqi Gong, Chengwen Xing

Abstract: Future mobile networks are projected to support integrated sensing and communications (ISAC) in high-speed communication scenarios. Nevertheless, large Doppler shifts induced by time-varying channels may cause severe inter-carrier interference (ICI). Frequency-domain sensing shows the potential to reduce ISAC complexity compared with other domains. However, a parameter mismatch issue still exists for such sensing. In this paper, we develop a novel sensing scheme based on a sparse Bayesian framework, where the delay and Doppler estimation problem in time-varying channels is formulated as a 3D multiple measurement-sparse signal recovery (MM-SSR) problem. We then propose a novel two-layer variational Bayesian inference (VBI) method to decompose the 3D MM-SSR problem into two layers and estimate the Doppler in the first layer and the delay in the second layer alternately. Subsequently, benefiting from a newly unveiled signal construction, a simplified two-stage multiple signal classification (MUSIC)-based VBI method is proposed, where the delay and the Doppler are estimated by MUSIC and VBI, respectively. Additionally, the Cramér-Rao bound (CRB) of the considered sensing parameters is derived to characterize the lower bound for the proposed estimators. Corroborated by extensive simulation results, our proposed method achieves a lower mean square error (MSE) than its conventional counterparts and is robust against the target number and target speed, thereby validating its wide applicability and advantages over prior art.

Submitted 21 April, 2025; originally announced April 2025.

Comments: 14 pages, 8 figures, manuscript submitted to IEEE Transactions on Communications (TCOM)

arXiv:2504.08188 [pdf, other]

Safe Data-Driven Predictive Control

Authors: Amin Vahidi-Moghaddam, Kaian Chen, Kaixiang Zhang, Zhaojian Li, Yan Wang, Kai Wu

Abstract: In the realm of control systems, model predictive control (MPC) has exhibited remarkable potential; however, its reliance on accurate models and substantial computational resources has hindered its broader application, especially within real-time nonlinear systems. This study presents an innovative control framework to enhance the practical viability of MPC. The developed safe data-driven predictive control aims to eliminate the requirement for precise models and alleviate computational burdens in nonlinear MPC (NMPC). This is achieved by learning both the system dynamics and the control policy, enabling efficient data-driven predictive control while ensuring system safety. The methodology involves a spatial temporal filter (STF)-based concurrent learning for system identification, a robust control barrier function (RCBF) to ensure system safety amid model uncertainties, and an RCBF-based NMPC policy approximation. An online policy correction mechanism is also introduced to counteract performance degradation caused by the existing model uncertainties. Demonstrated through simulations on two applications, the proposed approach offers comparable performance to existing benchmarks with significantly reduced computational costs.

Submitted 10 April, 2025; originally announced April 2025.

Comments: arXiv admin note: substantial text overlap with arXiv:2306.17270

arXiv:2503.24002 [pdf, ps, other]

A Simple BER Expression for FSO Systems with Weak Turbulence and Pointing Errors

Authors: C. Álvarez Roa, Y. C. Gültekin, K. Wu, C. W. Korevaar, A. Alvarado

Abstract: We develop a simple approximation for the average BER for an FSO system impacted by weak turbulence and pointing errors. Numerical results show that the proposed expression accurately predicts the true BER.

Submitted 9 April, 2025; v1 submitted 31 March, 2025; originally announced March 2025.

arXiv:2503.07527 [pdf, other]

Real-Time Load Estimation for Load-lifting Exoskeletons Using Insole Pressure Sensors and Machine Learning

Authors: Kaida Wu, Peihao Xiang, Chaohao Lin, Lixuan Chen, Ou Bai

Abstract: This paper presents a novel method for real-time lifting-load estimation to enhance the control strategies of upper-limb assistive exoskeletons. By leveraging cost-effective insole pressure sensors, the proposed system extracts differential pressure data that minimizes disturbances from variations in body weight and sensor placement. Two modeling approaches are explored: a channel-based method that employs traditional regression techniques (Elastic Net, Support Vector Regression (SVR), and Multi-Layer Perceptron (MLP)) and a map-based method that utilizes transfer learning with a pre-trained MobileNetV2 model. The experiment is in the preliminary test stage, covering load ranges from 2 kg to 10 kg in increments of 0.5 kg, and collecting data from three subjects to test the approach. In the channel-based method, the average Weighted Mean Absolute Percentage Error (WMAPE) for three subjects showed that the SVR achieved 13.46%, with the MLP performing similarly. In the map-based method, using data from one subject, the fully fine-tuned MobileNetV2 model reached a WMAPE of 9.74%. The results indicate that the integration of insole sensor technology with advanced machine learning models provides an effective solution for dynamic load estimation, potentially reducing the risks of over- and under-compensation in exoskeleton control.

Submitted 10 March, 2025; originally announced March 2025.

arXiv:2501.15368 [pdf, other]

Baichuan-Omni-1.5 Technical Report

Authors: Yadong Li, Jun Liu, Tao Zhang, Tao Zhang, Song Chen, Tianpeng Li, Zehuan Li, Lijun Liu, Lingfeng Ming, Guosheng Dong, Da Pan, Chong Li, Yuanbo Fang, Dongdong Kuang, Mingrui Wang, Chenglin Zhu, Youwei Zhang, Hongyu Guo, Fengyu Zhang, Yuran Wang, Bowen Ding, Wei Song, Xu Li, Yuqi Huo, Zheng Liang, et al. (68 additional authors not shown)

Abstract: We introduce Baichuan-Omni-1.5, an omni-modal model that not only has omni-modal understanding capabilities but also provides end-to-end audio generation capabilities. To achieve fluent and high-quality interaction across modalities without compromising the capabilities of any modality, we prioritized optimizing three key aspects. First, we establish a comprehensive data cleaning and synthesis pipeline for multimodal data, obtaining about 500B high-quality data (text, audio, and vision). Second, an audio-tokenizer (Baichuan-Audio-Tokenizer) has been designed to capture both semantic and acoustic information from audio, enabling seamless integration and enhanced compatibility with MLLM. Lastly, we designed a multi-stage training strategy that progressively integrates multimodal alignment and multitask fine-tuning, ensuring effective synergy across all modalities. Baichuan-Omni-1.5 leads contemporary models (including GPT4o-mini and MiniCPM-o 2.6) in terms of comprehensive omni-modal capabilities. Notably, it achieves results comparable to leading models such as Qwen2-VL-72B across various multimodal medical benchmarks.

Submitted 25 January, 2025; originally announced January 2025.

arXiv:2501.11196 [pdf, other]

Enhancing Brain Tumor Segmentation Using Channel Attention and Transfer learning

Authors: Majid Behzadpour, Ebrahim Azizi, Kai Wu, Bengie L. Ortiz

Abstract: Accurate and efficient segmentation of brain tumors is critical for diagnosis, treatment planning, and monitoring in clinical practice. In this study, we present an enhanced ResUNet architecture for automatic brain tumor segmentation, integrating an EfficientNetB0 encoder, a channel attention mechanism, and an Atrous Spatial Pyramid Pooling (ASPP) module. The EfficientNetB0 encoder leverages pre-trained features to improve feature extraction efficiency, while the channel attention mechanism enhances the model's focus on tumor-relevant features. ASPP enables multiscale contextual learning, crucial for handling tumors of varying sizes and shapes. The proposed model was evaluated on two benchmark datasets: TCGA LGG and BraTS 2020. Experimental results demonstrate that our method consistently outperforms the baseline ResUNet and its EfficientNet variant, achieving Dice coefficients of 0.903 and 0.851 and HD95 scores of 9.43 and 3.54 for whole tumor and tumor core regions on the BraTS 2020 dataset, respectively. Compared with state-of-the-art methods, our approach shows competitive performance, particularly in whole tumor and tumor core segmentation. These results indicate that combining a powerful encoder with attention mechanisms and ASPP can significantly enhance brain tumor segmentation performance. The proposed approach holds promise for further optimization and application in other medical image segmentation tasks.

Submitted 19 January, 2025; originally announced January 2025.

Comments: 13 pages, 1 figure

arXiv:2412.03799 [pdf, other]

High-Spatial Resolution Transmission and Storage Expansion Planning for High Renewable Grids: A Case Study

Authors: Kevin Wu, Rabab Haider, Pascal Van Hentenryck

Abstract: Transmission Expansion Planning (TEP) is the process of optimizing the development and upgrade of the power grid to ensure reliable, efficient, and cost-effective electricity delivery while addressing grid constraints. To support growing demand and renewable energy integration, energy storage is emerging as a pivotal asset that provides temporal flexibility and alleviates congestion. This paper presents a TEP model that incorporates the sizing and siting of short-duration storage. With a focus on high spatial resolution, the model is applied to a 2,000-bus synthetic Texas power system, offering detailed insights into geographic investment and operational patterns. To maintain computational feasibility, a simple yet effective storage candidates (SC) method is introduced, significantly reducing the search space. Results highlight that transmission investments are primarily driven by renewable energy expansion, while storage investments are shaped by renewable curtailment and load-shedding events, with their primary function being peak load shaving. The findings underscore the importance of co-optimizing transmission and storage to minimize costs and enhance grid reliability. However, limitations in the ability of the SC method to identify optimal storage locations to meet long-term needs suggest opportunities for future research, including dynamic candidate selection and hybrid optimization techniques.

Submitted 4 December, 2024; originally announced December 2024.

arXiv:2411.17870 [pdf, other]

Breast Tumor Classification Using EfficientNet Deep Learning Model

Authors: Majid Behzadpour, Bengie L. Ortiz, Ebrahim Azizi, Kai Wu

Abstract: Precise breast cancer classification on histopathological images has the potential to greatly improve the diagnosis and patient outcome in oncology. The data imbalance problem largely stems from the inherent imbalance within medical image datasets, where certain tumor subtypes may appear much less frequently. This constitutes a considerable limitation, leading to biased model predictions that can overlook critical but rare classes. In this work, we adopted EfficientNet, a state-of-the-art convolutional neural network (CNN) model that balances high accuracy with computational cost efficiency. To address data imbalance, we introduce an intensive data augmentation pipeline and cost-sensitive learning, improving representation and ensuring that the model does not overly favor majority classes. This approach provides the ability to learn effectively from rare tumor types, improving its robustness. Additionally, we fine-tuned the model using transfer learning, where weights initially trained on a binary classification task were adapted to multi-class classification, improving the capability to detect complex patterns within the BreakHis dataset. Our results underscore significant improvements in the binary classification performance, achieving a recall increase for benign cases from 0.92 to 0.95, alongside an accuracy enhancement from 97.35% to 98.23%. Our approach improved the performance of multi-class tasks from 91.27% with regular augmentation to 94.54% with intensive augmentation, reaching 95.04% with transfer learning. This framework demonstrated substantial gains in precision in the minority classes, such as Mucinous carcinoma and Papillary carcinoma, while maintaining high recall consistently across these critical subtypes, as further confirmed by confusion matrix analysis.

Submitted 26 November, 2024; originally announced November 2024.

Comments: 19 pages, 7 figures

arXiv:2410.00356 [pdf, other]

A Digital Twin Framework for Physical-Virtual Integration in V2X-Enabled Connected Vehicle Corridors

Authors: Keshu Wu, Pei Li, Yang Cheng, Steven T. Parker, Bin Ran, David A. Noyce, Xinyue Ye

Abstract: Transportation Cyber-Physical Systems (T-CPS) enhance safety and mobility by integrating cyber and physical transportation systems. A key component of T-CPS is the Digital Twin (DT), a virtual representation that enables simulation, analysis, and optimization through real-time data exchange and communication. Although existing studies have explored DTs for vehicles, communications, pedestrians, and traffic, real-world validations and implementations of DTs that encompass infrastructure, vehicles, signals, communications, and more remain limited due to several challenges. These include accessing real-world connected infrastructure, integrating heterogeneous, multi-sourced data, ensuring real-time data processing, and synchronizing the digital and physical systems. To address these challenges, this study develops a traffic DT based on a real-world connected vehicle corridor. Leveraging the Cellular Vehicle-to-Everything (C-V2X) infrastructure in the corridor, along with communication, computing, and simulation technologies, the proposed DT accurately replicates physical vehicle behaviors, signal timing, communications, and traffic patterns within the virtual environment. Building upon the previous data pipeline, the digital system ensures robust synchronization with the physical environment. Moreover, the DT's scalable and redundant architecture enhances data integrity, making it capable of supporting future large-scale C-V2X deployments. Furthermore, its ability to provide feedback to the physical system is demonstrated through applications such as signal timing adjustments, vehicle advisory messages, and incident notifications. The proposed DT is a vital tool in T-CPS, enabling real-time traffic monitoring, prediction, and optimization to enhance the reliability and safety of transportation systems.

Submitted 26 February, 2025; v1 submitted 30 September, 2024; originally announced October 2024.

arXiv:2409.20414 [pdf]

KANDU-Net: A Dual-Channel U-Net with KAN for Medical Image Segmentation

Authors: Chenglin Fang, Kaigui Wu

Abstract: The U-Net model has consistently demonstrated strong performance in the field of medical image segmentation, with various improvements and enhancements made since its introduction. This paper presents a novel architecture that integrates KAN networks with U-Net, leveraging the powerful nonlinear representation capabilities of KAN networks alongside the established strengths of U-Net. We introduce a KAN-convolution dual-channel structure that enables the model to more effectively capture both local and global features. We explore effective methods for fusing features extracted by KAN with those obtained through convolutional layers, utilizing an auxiliary network to facilitate this integration process. Experiments conducted across multiple datasets show that our model performs well in terms of accuracy, indicating that the KAN-convolution dual-channel approach has significant potential in medical image segmentation tasks.

Submitted 30 September, 2024; originally announced September 2024.

arXiv:2409.02430 [pdf, other]

Transfer-based Adversarial Poisoning Attacks for Online (MIMO-)Deep Receivers

Authors: Kunze Wu, Weiheng Jiang, Dusit Niyato, Yinghuan Li, Chuang Luo

Abstract: Recently, the design of wireless receivers using deep neural networks (DNNs), known as deep receivers, has attracted extensive attention for ensuring reliable communication in complex channel environments. To adapt quickly to dynamic channels, online learning has been adopted to update the weights of deep receivers with over-the-air data (e.g., pilots). However, the fragility of neural models and the openness of wireless channels expose these systems to malicious attacks. To this end, understanding these attack methods is essential for robust receiver design. In this paper, we propose a transfer-based adversarial poisoning attack method for online receivers. Without knowledge of the attack target, adversarial perturbations are injected to the pilots, poisoning the online deep receiver and impairing its ability to adapt to dynamic channels and nonlinear effects. In particular, our attack method targets Deep Soft Interference Cancellation (DeepSIC)[1] using online meta-learning. As a classical model-driven deep receiver, DeepSIC incorporates wireless domain knowledge into its architecture. This integration allows it to adapt efficiently to time-varying channels with only a small number of pilots, achieving optimal performance in a multi-input and multi-output (MIMO) scenario. The deep receiver in this scenario has a number of applications in the field of wireless communication, which motivates our study of the attack methods targeting it. Specifically, we demonstrate the effectiveness of our attack in simulations on synthetic linear, synthetic nonlinear, static, and COST 2100 channels. Simulation results indicate that the proposed poisoning attack significantly reduces the performance of online receivers in rapidly changing scenarios.

Submitted 23 September, 2024; v1 submitted 4 September, 2024; originally announced September 2024.

Comments: 15 pages, 14 figures

arXiv:2408.09357 [pdf, other]

Meta-Learning Empowered Meta-Face: Personalized Speaking Style Adaptation for Audio-Driven 3D Talking Face Animation

Authors: Xukun Zhou, Fengxin Li, Ziqiao Peng, Kejian Wu, Jun He, Biao Qin, Zhaoxin Fan, Hongyan Liu

Abstract: Audio-driven 3D face animation is increasingly vital in live streaming and augmented reality applications. While remarkable progress has been observed, most existing approaches are designed for specific individuals with predefined speaking styles, thus neglecting the adaptability to varied speaking styles. To address this limitation, this paper introduces MetaFace, a novel methodology meticulously crafted for speaking style adaptation. Grounded in the novel concept of meta-learning, MetaFace is composed of several key components: the Robust Meta Initialization Stage (RMIS) for fundamental speaking style adaptation, the Dynamic Relation Mining Neural Process (DRMN) for forging connections between observed and unobserved speaking styles, and the Low-rank Matrix Memory Reduction Approach to enhance the efficiency of model optimization as well as learning style details. Leveraging these novel designs, MetaFace not only significantly outperforms robust existing baselines but also establishes a new state-of-the-art, as substantiated by our experimental results.

Submitted 18 August, 2024; originally announced August 2024.

arXiv:2408.06667 [pdf, other]

Joint Source-Channel Optimization for UAV Video Coding and Transmission

Authors: Kesong Wu, Xianbin Cao, Peng Yang, Haijun Zhang, Tony Q. S. Quek, Dapeng Oliver Wu

Abstract: This paper is concerned with unmanned aerial vehicle (UAV) video coding and transmission in scenarios such as emergency rescue and environmental monitoring. Unlike existing methods of modeling UAV video source coding and channel transmission separately, we investigate the joint source-channel optimization issue for video coding and transmission. Particularly, we design eight-dimensional delay-power-rate-distortion models in terms of source coding and channel transmission and characterize the correlation between video coding and transmission, with which a joint source-channel optimization problem is formulated. Its objective is to minimize end-to-end distortion and UAV power consumption by optimizing fine-grained parameters related to UAV video coding and transmission. This problem is confirmed to be a challenging sequential-decision and non-convex optimization problem. We therefore decompose it into a family of repeated optimization problems by Lyapunov optimization and design an approximate convex optimization scheme with provable performance guarantees to tackle these problems. Based on the theoretical transformation, we propose a Lyapunov repeated iteration (LyaRI) algorithm. Both objective and subjective experiments are conducted to comprehensively evaluate the performance of LyaRI. The results indicate that, compared to its counterparts, LyaRI achieves better video quality and stability performance, with a 47.74% reduction in the variance of the obtained encoding bitrate.

Submitted 24 December, 2024; v1 submitted 13 August, 2024; originally announced August 2024.

arXiv:2407.21514 [pdf]

Wireless Communications in Doubly Selective Channels with Domain Adaptivity

Authors: J. Andrew Zhang, Hongyang Zhang, Kai Wu, Xiaojing Huang, Jinhong Yuan, Y. Jay Guo

Abstract: Wireless communications are significantly impacted by the propagation environment, particularly in doubly selective channels with variations in both time and frequency domains. Orthogonal Time Frequency Space (OTFS) modulation has emerged as a promising solution; however, its high equalization complexity, if performed in the delay-Doppler domain, limits its universal application. This article explores domain-adaptive system design, with an emphasis on adaptive equalization, while also discussing modulation and pilot placement strategies. It investigates the dynamic selection of best-fit domains based on channel conditions to enhance performance across diverse environments. We examine channel domain connections, signal designs, and equalization techniques with domain adaptivity, and highlight future research opportunities.

Submitted 30 October, 2024; v1 submitted 31 July, 2024; originally announced July 2024.

Comments: Magazine article, 7 pages, 4 figures, 2 tables

arXiv:2404.16484 [pdf, other]

Real-Time 4K Super-Resolution of Compressed AVIF Images. AIS 2024 Challenge Survey

Authors: Marcos V. Conde, Zhijun Lei, Wen Li, Cosmin Stejerean, Ioannis Katsavounidis, Radu Timofte, Kihwan Yoon, Ganzorig Gankhuyag, Jiangtao Lv, Long Sun, Jinshan Pan, Jiangxin Dong, Jinhui Tang, Zhiyuan Li, Hao Wei, Chenyang Ge, Dongyang Zhang, Tianle Liu, Huaian Chen, Yi Jin, Menghan Zhou, Yiqiang Yan, Si Gao, Biao Wu, Shaoli Liu, et al. (50 additional authors not shown)

Abstract: This paper introduces a novel benchmark as part of the AIS 2024 Real-Time Image Super-Resolution (RTSR) Challenge, which aims to upscale compressed images from 540p to 4K resolution (4x factor) in real-time on commercial GPUs. For this, we use a diverse test set containing a variety of 4K images ranging from digital art to gaming and photography. The images are compressed using the modern AVIF codec, instead of JPEG. All the proposed methods improve PSNR fidelity over Lanczos interpolation, and process images under 10ms. Out of the 160 participants, 25 teams submitted their code and models. The solutions present novel designs tailored for memory-efficiency and runtime on edge devices. This survey describes the best solutions for real-time SR of compressed high-resolution images.

Submitted 25 April, 2024; originally announced April 2024.

Comments: CVPR 2024, AI for Streaming (AIS) Workshop

arXiv:2403.17701

Rotate to Scan: UNet-like Mamba with Triplet SSM Module for Medical Image Segmentation

Authors: Hao Tang, Lianglun Cheng, Guoheng Huang, Zhengguang Tan, Junhao Lu, Kaihong Wu

Abstract: Image segmentation holds a vital position in the realms of diagnosis and treatment within the medical domain. Traditional convolutional neural networks (CNNs) and Transformer models have made significant advancements in this realm, but they still encounter challenges because of limited receptive field or high computing complexity. Recently, State Space Models (SSMs), particularly Mamba and its variants, have demonstrated notable performance in the field of vision. However, their feature extraction methods may not be sufficiently effective and retain some redundant structures, leaving room for parameter reduction. Motivated by previous spatial and channel attention methods, we propose Triplet Mamba-UNet. The method leverages residual VSS Blocks to extract intensive contextual features, while Triplet SSM is employed to fuse features across spatial and channel dimensions. We conducted experiments on ISIC17, ISIC18, CVC-300, CVC-ClinicDB, Kvasir-SEG, CVC-ColonDB, and Kvasir-Instrument datasets, demonstrating the superior segmentation performance of our proposed TM-UNet. Additionally, compared to the previous VM-UNet, our model achieves a one-third reduction in parameters.

Submitted 3 May, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

Comments: Experimental method encountered errors, undergoing experiment again

arXiv:2402.09048 [pdf, other]

Sensing in Bi-Static ISAC Systems with Clock Asynchronism: A Signal Processing Perspective

Authors: Kai Wu, Jacopo Pegoraro, Francesca Meneghello, J. Andrew Zhang, Jesus O. Lacruz, Joerg Widmer, Francesco Restuccia, Michele Rossi, Xiaojing Huang, Daqing Zhang, Giuseppe Caire, Y. Jay Guo

Abstract: Integrated Sensing and Communication (ISAC) has been identified as a pillar usage scenario for the impending 6G era. Bi-static sensing, a major type of sensing in ISAC, is promising to expedite ISAC in the near future, as it requires minimal changes to the existing network infrastructure. However, a critical challenge for bi-static sensing is clock asynchronism due to the use of different clocks at far-separated transmitters and receivers. This causes the received signal to be affected by time-varying random phase offsets, severely degrading, or even failing, direct sensing. Hence, to effectively enable ISAC, considerable research has been directed toward addressing the clock asynchronism issue in bi-static sensing. This paper provides an overview of the issue and existing techniques developed in an ISAC background. Based on the review and comparison, we also draw insights into the future research directions and open problems, aiming to nurture the maturation of bi-static sensing in ISAC.

Submitted 24 June, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

Comments: 20 pages, 6 figures, 1 table

arXiv:2401.12974 [pdf, other]

SegmentAnyBone: A Universal Model that Segments Any Bone at Any Location on MRI

Authors: Hanxue Gu, Roy Colglazier, Haoyu Dong, Jikai Zhang, Yaqian Chen, Zafer Yildiz, Yuwen Chen, Lin Li, Jichen Yang, Jay Willhite, Alex M. Meyer, Brian Guo, Yashvi Atul Shah, Emily Luo, Shipra Rajput, Sally Kuehn, Clark Bulleit, Kevin A. Wu, Jisoo Lee, Brandon Ramirez, Darui Lu, Jay M. Levin, Maciej A. Mazurowski

Abstract: Magnetic Resonance Imaging (MRI) is pivotal in radiology, offering non-invasive and high-quality insights into the human body. Precise segmentation of MRIs into different organs and tissues would be highly beneficial since it would allow for a higher level of understanding of the image content and enable important measurements, which are essential for accurate diagnosis and effective treatment planning. Specifically, segmenting bones in MRI would allow for more quantitative assessments of musculoskeletal conditions, while such assessments are largely absent in current radiological practice. The difficulty of bone MRI segmentation is illustrated by the fact that limited algorithms are publicly available for use, and those contained in the literature typically address a specific anatomic area. In our study, we propose a versatile, publicly available deep-learning model for bone segmentation in MRI across multiple standard MRI locations. The proposed model can operate in two modes: fully automated segmentation and prompt-based segmentation. Our contributions include (1) collecting and annotating a new MRI dataset across various MRI protocols, encompassing over 300 annotated volumes and 8485 annotated slices across diverse anatomic regions; (2) investigating several standard network architectures and strategies for automated segmentation; (3) introducing SegmentAnyBone, an innovative foundational model-based approach that extends Segment Anything Model (SAM); (4) comparative analysis of our algorithm and previous approaches; and (5) generalization analysis of our algorithm across different anatomical locations and MRI sequences, as well as an external dataset. We publicly release our model at https://github.com/mazurowski-lab/SegmentAnyBone.

Submitted 23 January, 2024; originally announced January 2024.

Comments: 15 pages, 15 figures

arXiv:2401.09064 [pdf, other]

Performance Bounds and Optimization for CSI-Ratio based Bi-static Doppler Sensing in ISAC Systems

Authors: Yanmo Hu, Kai Wu, J. Andrew Zhang, Weibo Deng, Y. Jay Guo

Abstract: Bi-static sensing is crucial for exploring the potential of networked sensing capabilities in integrated sensing and communications (ISAC). However, it suffers from the challenging clock asynchronism issue. CSI ratio-based sensing is an effective means to address the issue. Its performance bounds, particularly for Doppler sensing, have not been fully understood yet. This work endeavors to fill the research gap. Focusing on a single dynamic path in high-SNR scenarios, we derive the closed-form CRB. Then, through analyzing the mutual interference between dynamic and static paths, we simplify the CRB results by deriving close approximations, further unveiling new insights into the impact of numerous physical parameters on Doppler sensing. Moreover, utilizing the new CRB and analyses, we propose novel waveform optimization strategies for noise- and interference-limited sensing scenarios, which are also empowered by closed-form and efficient solutions. Extensive simulation results are provided to validate the preciseness of the derived CRB results and analyses, with the aid of the maximum-likelihood estimator. The results also demonstrate the substantially enhanced Doppler sensing accuracy and the sensing capabilities for low-speed targets achieved by the proposed waveform design.

Submitted 17 January, 2024; originally announced January 2024.

Comments: 14 pages, 15 figures, journal paper
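
A minimal sketch of why a CSI ratio can sidestep clock asynchronism. It assumes (this is not stated in the abstract) that two receive antennas share one oscillator, so both CSI samples carry the same random phase offset at each time instant. The function name and the channel values are hypothetical and chosen only for illustration.

```python
import cmath
import random

def csi_ratio(h_a, h_b):
    """Ratio of two CSI samples taken at the same time on two antennas."""
    return h_a / h_b

random.seed(0)

# Hypothetical noise-free channel coefficients at the two antennas.
true_a = 1.0 + 0.5j
true_b = 0.8 - 0.2j
expected = true_a / true_b

ratios = []
for _ in range(5):
    # Assumed model: one common, time-varying random phase offset per snapshot.
    offset = cmath.exp(1j * random.uniform(-cmath.pi, cmath.pi))
    ratios.append(csi_ratio(true_a * offset, true_b * offset))

# The common offset cancels in the ratio, up to floating-point rounding.
max_dev = max(abs(r - expected) for r in ratios)
```

The sketch omits noise and any dynamic path, so it only illustrates the cancellation of the shared offset, not the Doppler estimation or the bounds the paper derives.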

arXiv:2312.14511 [pdf]

3D Programming of Patterned Heterogeneous Interface for 4D Smart Robotics

Authors: Kewei Song, Chunfeng Xiong, Ze Zhang, Kunlin Wu, Weiyang Wan, Yifan Wang, Shinjiro Umezu, Hirotaka Sato

Abstract: Shape memory structures are playing an important role in many cutting-edge intelligent fields. However, the existing technologies can only realize 4D printing of a single polymer or metal, which limits practical applications. Here, we report a construction strategy for TSMP/M heterointerface, which uses Pd2+-containing shape memory polymer (AP-SMR) to induce electroless plating reaction and relies on molecular dynamics, which has both shape memory properties and metal activity and information processing power. Through multi-material DLP 3D printing technology, the interface can be 3D selectively programmed on functional substrate parts of arbitrary shapes to become 4D electronic smart devices (Robotics). Microscopically, this type of interface appears as a composite structure with a nanometer-micrometer interface height, which is composed of a pure substrate layer (smart materials), an intermediate layer (a composite structure in which metal particles are embedded in a polymer cross-linked network) and a pure metal layer. The structure programmed by TSMP/M heterointerface exhibits both SMA characteristics and metal properties, thus having more intelligent functions (electroactive, electrothermal deformation, electronically controlled denaturation) and higher performance (selectivity of shape memory structures can be realized control, remote control, inline control and low voltage control). This is expected to provide a more flexible manufacturing process as platform technology for designing, manufacturing and applying smart devices with new concepts, and promote the development of cutting-edge industries such as smart robots and smart electronics.

Submitted 22 December, 2023; originally announced December 2023.

Comments: 37 Pages, 11 Figures

arXiv:2312.12581 [pdf]

Wireless, Customizable Coaxially-shielded Coils for Magnetic Resonance Imaging

Authors: Ke Wu, Xia Zhu, Stephan W. Anderson, Xin Zhang

Abstract: Anatomy-specific RF receive coil arrays routinely adopted in magnetic resonance imaging (MRI) for signal acquisition, are commonly burdened by their bulky, fixed, and rigid configurations, which may impose patient discomfort, bothersome positioning, and suboptimal sensitivity in certain situations. Herein, leveraging coaxial cables' inherent flexibility and electric field confining property, for the first time, we present wireless, ultra-lightweight, coaxially-shielded MRI coils achieving a signal-to-noise ratio (SNR) comparable to or surpassing that of commercially available cutting-edge receive coil arrays with the potential for improved patient comfort, ease of implementation, and significantly reduced costs. The proposed coils demonstrate versatility by functioning both independently in form-fitting configurations, closely adapting to relatively small anatomical sites, and collectively by inductively coupling together as metamaterials, allowing for extension of the field-of-view of their coverage to encompass larger anatomical regions without compromising coil sensitivity. The wireless, coaxially-shielded MRI coils reported herein pave the way toward next generation MRI coils.

Submitted 19 December, 2023; originally announced December 2023.

arXiv:2312.11306 [pdf]

Human-machine cooperation: optimization of drug retrieval sequencing in automated drug dispensing systems

Authors: Mengge Yuan, Kan Wu, Ning Zhao

Abstract: Automated drug dispensing systems (ADDSs) are increasingly in demand in today's pharmacies, primarily driven by the growing ageing population. Recognizing the practical challenges faced by pharmacies implementing ADDSs, this study aims to optimize the layout design and sequencing issues within a human-machine cooperation environment to enhance the system throughput of ADDSs. Specifically, we develop models for drug retrieval sequencing under different system layout designs, taking into account the stochastic sorting time of pharmacists. The prescription order arrival pattern follows a successive arrival mode. To assess the efficiency of ADDSs with one input/output point and two input/output points, we propose dual command retrieval sequencing models that optimize the retrieval sequence of drugs in adjacent prescription orders. Notably, our models incorporate the stochastic sorting time of pharmacists to analyze its impact on ADDS performance. Through experimental comparisons of average picking times for prescription orders under various operational conditions, we demonstrate that a system layout design incorporating two input/output points significantly enhances the efficiency of prescription order fulfilment within a human-machine cooperation environment. Furthermore, our proposed retrieval sequencing method outperforms dynamic programming, greedy, and random strategies in terms of improving prescription order-picking efficiency. By addressing the layout design and sequencing challenges, our research contributes to the field of intelligent warehousing, particularly in smart pharmacies. The findings provide valuable insights for healthcare facilities and organizations seeking to optimize ADDS performance and enhance drug dispensing efficiency.

Submitted 16 January, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

arXiv:2312.10018 [pdf]

Wearable Coaxially-shielded Metamaterial for Magnetic Resonance Imaging

Authors: Xia Zhu, Ke Wu, Stephan W. Anderson, Xin Zhang

Abstract: Recent advancements in metamaterials have yielded the possibility of a wireless solution to improve signal-to-noise ratio (SNR) in magnetic resonance imaging (MRI). Unlike traditional closely packed local coil arrays with rigid designs and numerous components, these lightweight, cost-effective metamaterials eliminate the need for radio frequency (RF) cabling, baluns, adapters, and interfaces. However, their clinical adoption has been limited by their low sensitivity, bulky physical footprint, and limited, specific use cases. Herein, we introduce a wearable metamaterial developed using commercially available coaxial cable, designed for a 3.0 T MRI system. This metamaterial inherits the coaxially-shielded structure of its constituent coaxial cable, effectively containing the electric field within the cable, thereby mitigating the electric coupling to its loading while ensuring safer clinical adoption, lower signal loss, and resistance to frequency shifts. Weighing only 50g, the metamaterial maximizes its sensitivity by conforming to the anatomical region of interest. MRI images acquired using this metamaterial with various pulse sequences demonstrate an up to 2-fold SNR enhancement when compared to a state-of-the-art 16-channel knee coil. This work introduces a novel paradigm for constructing metamaterials in the MRI environment, paving the way for the development of next-generation wireless MRI technology.

Submitted 15 December, 2023; originally announced December 2023.

arXiv:2311.04049 [pdf, other]

3D EAGAN: 3D edge-aware attention generative adversarial network for prostate segmentation in transrectal ultrasound images

Authors: Mengqing Liu, Xiao Shao, Liping Jiang, Kaizhi Wu

Abstract: Automatic prostate segmentation in TRUS images has always been a challenging problem, since prostates in TRUS images have ambiguous boundaries and inhomogeneous intensity distribution. Although many prostate segmentation methods have been proposed, they still need to be improved due to the lack of sensitivity to edge information. Consequently, the objective of this study is to devise a highly effective prostate segmentation method that overcomes these limitations and achieves accurate segmentation of prostates in TRUS images. A 3D edge-aware attention generative adversarial network (3D EAGAN)-based prostate segmentation method is proposed in this paper, which consists of an edge-aware segmentation network (EASNet) that performs the prostate segmentation and a discriminator network that distinguishes predicted prostates from real prostates. The proposed EASNet is composed of an encoder-decoder-based U-Net backbone network, a detail compensation module, four 3D spatial and channel attention modules, an edge enhance module, and a global feature extractor. The detail compensation module is proposed to compensate for the loss of detailed information caused by the down-sampling process of the encoder. The features of the detail compensation module are selectively enhanced by the 3D spatial and channel attention module. Furthermore, an edge enhance module is proposed to guide shallow layers in the EASNet to focus on contour and edge information in prostates. Finally, features from shallow layers and hierarchical features from the decoder module are fused through the global feature extractor to predict the segmented prostates.

Submitted 7 November, 2023; originally announced November 2023.

arXiv:2311.00535 [pdf]

Active Noise Control Portable Device Design

Authors: Kai Wu, Yuanyuan Chen

Abstract: While our world is filled with its own natural sounds that we can't resist enjoying, it is also chock-full of other sounds that can be irritating: this is noise. Noise affects not only working efficiency but also human health. The problem of reducing noise is one of great importance and great difficulty. The problem has been addressed in many ways over the years. The current methods for noise reduction mostly rely on the materials and transmission medium, which are only effective to some extent for high-frequency noise. However, effective methods for reducing noise, especially low-frequency noise, are very limited. Here we propose a noise reduction system consisting of a sensor that detects the noise in the environment. The noise is then sent to an electronic control system, which processes it and generates a reverse phase frequency signal to counteract the disturbance. Finally, the processed, smaller noise is broadcast by the speaker. Through this smart noise reduction system, even low-frequency noise can be eliminated. The system is also integrated with sleep tracking and music player applications. It can also remember and store settings for the same environment, sense temperature, and provide smart control of home furniture, fire alarms, etc. This smart system can transfer data easily by Wi-Fi or Bluetooth and is controlled by its app. In this project, we will present a model of the above technology which can be used in various environments to prevent noise pollution and provide a solution to the people who have difficulties finding a peaceful and quiet environment for sleep, work or study.

Submitted 1 November, 2023; originally announced November 2023.
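The reverse-phase idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' system: the tone frequency, sample rate, and the helper names `tone` and `residual_peak` are assumptions chosen for illustration. It only shows that adding a phase-inverted copy cancels a tone, and that a small phase error leaves a residual.

```python
import math

def tone(freq_hz, sample_rate, n, phase=0.0):
    """Sampled sine wave; all parameters here are illustrative assumptions."""
    return [math.sin(2 * math.pi * freq_hz * k / sample_rate + phase)
            for k in range(n)]

def residual_peak(phase_error, freq_hz=100.0, sample_rate=8000, n=80):
    """Peak of noise + anti-noise when the anti-noise phase is off by phase_error (rad)."""
    noise = tone(freq_hz, sample_rate, n)
    # Ideal anti-noise is the same tone shifted by pi; phase_error models an imperfect controller.
    anti = tone(freq_hz, sample_rate, n, phase=math.pi + phase_error)
    return max(abs(a + b) for a, b in zip(noise, anti))

perfect = residual_peak(0.0)      # essentially zero: full cancellation
imperfect = residual_peak(0.1)    # a 0.1 rad phase error leaves a visible residual
print(perfect, imperfect)
```

For a single tone the residual amplitude is 2·sin(phase_error/2), which is why cancellation quality depends on how accurately the control system tracks the phase.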

arXiv:2310.16869 [pdf]

Single-pixel imaging based on deep learning

Authors: Kai Song, Yaoxing Bian, Ku Wu, Hongrui Liu, Shuangping Han, Jiaming Li, Jiazhao Tian, Chengbin Qin, Jianyong Hu, Liantuan Xiao

Abstract: Single-pixel imaging can collect images at the wavelengths outside the reach of conventional focal plane array detectors. However, the limited image quality and lengthy computational times for iterative reconstruction still impede the practical application of single-pixel imaging. Recently, deep learning has been introduced into single-pixel imaging, which has attracted a lot of attention due to its exceptional reconstruction quality, fast reconstruction speed, and the potential to complete advanced sensing tasks without reconstructing images. Here, this advance is discussed and some opinions are offered. Firstly, based on the fundamental principles of single-pixel imaging and deep learning, the principles and algorithms of single-pixel imaging based on deep learning are described and analyzed. Subsequently, the implementation technologies of single-pixel imaging based on deep learning are reviewed. They are divided into super-resolution single-pixel imaging, single-pixel imaging through scattering media, photon-level single-pixel imaging, optical encryption based on single-pixel imaging, color single-pixel imaging, and image-free sensing according to diverse application fields. Finally, major challenges and corresponding feasible approaches are discussed, as well as more possible applications in the future.

Submitted 16 November, 2023; v1 submitted 25 October, 2023; originally announced October 2023.

arXiv:2310.10957 [pdf, other]

Medical Image Segmentation via Sparse Coding Decoder

Authors: Long Zeng, Kaigui Wu

Abstract: Transformers have achieved significant success in medical image segmentation, owing to their capability to capture long-range dependencies. Previous works incorporate convolutional layers into the encoder module of transformers, thereby enhancing their ability to learn local relationships among pixels. However, transformers may suffer from limited generalization capabilities and reduced robustness, attributed to the insufficient spatial recovery ability of their decoders. To address this issue, a convolutional sparse vector coding based decoder is proposed, namely the CAScaded multi-layer Convolutional Sparse vector Coding DEcoder (CASCSCDE), which represents features extracted by the encoder using sparse vectors. To prove the effectiveness of our CASCSCDE, the widely used TransUNet model is chosen for demonstration, and the CASCSCDE is incorporated with TransUNet to establish the TransCASCSCDE architecture. Our experiments demonstrate that TransUNet with CASCSCDE significantly enhances performance on the Synapse benchmark, obtaining up to 3.15% and 1.16% improvements in DICE and mIoU scores, respectively. CASCSCDE opens new ways for constructing decoders based on convolutional sparse vector coding.

Submitted 16 October, 2023; originally announced October 2023.

Comments: 8 pages, 1 figure

MSC Class: 68T07; 68U10 ACM Class: I.4.6; I.4.7; I.5.1

arXiv:2310.03297 [pdf, other]

Passive Respiration Detection via mmWave Communication Signal Under Interference

Authors: Kehan Wu, Renqi Chen, Haiyu Wang, Chenqing Ji, Jiayuan Zhu, Guang Wu

Abstract: Recent research has highlighted the detection of human respiration rate using commodity WiFi devices. Nevertheless, these devices encounter challenges in accurately discerning human respiration amidst the prevailing human motion interference encountered in daily life. To tackle this predicament, this paper introduces a passive sensing and communication system designed specifically for respiration detection in the presence of robust human motion interference. Operating within the 60.48 GHz band, the proposed system aims to detect human respiration even when confronted with substantial human motion interference within close proximity. Subsequently, a neural network is trained using the data we collected to enable human respiration detection. The experimental results demonstrate a consistently high accuracy of over 90% for human respiration detection under interference, given an adequate sensing duration. Finally, an empirical model is derived analytically to achieve respiratory rate counting in 10 seconds.

Submitted 4 January, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

Comments: Submitted to WCNC2024 Workshop

arXiv:2310.02347 [pdf, other]

Strong Mixed-Integer Formulations for Transmission Expansion Planning with FACTS Devices

Authors: Kevin Wu, Mathieu Tanneau, Pascal Van Hentenryck

Abstract: Transmission Network Expansion Planning (TNEP) problems find the most economical way of expanding a given grid given long-term growth in generation capacity and demand patterns. The recent development of Flexible AC Transmission System (FACTS) devices, which can dynamically re-route power flows by adjusting individual branches' impedance, calls for their integration into TNEP problems. However, the resulting TNEP+FACTS formulations are significantly harder to solve than traditional TNEP instances, due to the nonlinearity of FACTS behavior. This paper proposes a new mixed-integer formulation for TNEP+FACTS, which directly represents the change in power flow induced by individual FACTS devices. The proposed formulation uses an extended formulation and facet-defining constraints, which are stronger than big-M constraints typically used in the literature. The paper conducts numerical experiments on a synthetic model of the Texas system with high renewable penetration. The results demonstrate the computational superiority of the proposed approach, which achieves a 4x speedup over state-of-the-art formulations, and highlight the potential of FACTS devices to mitigate congestion.

Submitted 8 April, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

arXiv:2310.00153 [pdf]

Conformal Metamaterials with Active Tunability and Self-adaptivity for Magnetic Resonance Imaging

Authors: Ke Wu, Xia Zhu, Xiaoguang Zhao, Stephan W. Anderson, Xin Zhang

Abstract: Ongoing effort has been devoted to applying metamaterials to boost the imaging performance of magnetic resonance imaging owing to their unique capacity for electromagnetic field confinement and enhancement. However, there are still major obstacles to widespread clinical adoption of conventional metamaterials due to several notable restrictions, namely: their typically bulky and rigid structures, deviations in their optimal resonance frequency, and their inevitable interference with the transmission RF field in MRI. Herein, we address these restrictions and report a conformal, smart metamaterial, which may not only be readily tuned to achieve the desired, precise frequency match with MRI by a controlling circuit, but is also capable of selectively amplifying the magnetic field during the RF reception phase by sensing the excitation signal strength passively, thereby remaining off during the RF transmission phase and ensuring its optimal performance when applied to MRI as an additive technology. By addressing a host of current technological challenges, the metamaterial presented herein paves the way toward the wide-ranging utilization of metamaterials in clinical MRI, thereby translating this promising technology to the MRI bedside.

Submitted 29 September, 2023; originally announced October 2023.

Comments: 21 pages, 7 figures

arXiv:2309.13890 [pdf, other]

Bitstream-Corrupted Video Recovery: A Novel Benchmark Dataset and Method

Authors: Tianyi Liu, Kejun Wu, Yi Wang, Wenyang Liu, Kim-Hui Yap, Lap-Pui Chau

Abstract: The past decade has witnessed great strides in video recovery by specialist technologies, like video inpainting, completion, and error concealment. However, they typically simulate the missing content by manually designed error masks, thus failing to fill in the realistic video loss in video communication (e.g., telepresence, live streaming, and internet video) and multimedia forensics. To address this, we introduce the bitstream-corrupted video (BSCV) benchmark, the first benchmark dataset with more than 28,000 video clips, which can be used for bitstream-corrupted video recovery in the real world. The BSCV is a collection of 1) a proposed three-parameter corruption model for video bitstream, 2) a large-scale dataset containing rich error patterns, multiple corruption levels, and flexible dataset branches, and 3) a plug-and-play module in video recovery framework that serves as a benchmark. We evaluate state-of-the-art video inpainting methods on the BSCV dataset, demonstrating existing approaches' limitations and our framework's advantages in solving the bitstream-corrupted video recovery problem. The benchmark and dataset are released at https://github.com/LIUTIGHE/BSCV-Dataset.

Submitted 26 September, 2023; v1 submitted 25 September, 2023; originally announced September 2023.

Comments: Accepted by NeurIPS Dataset and Benchmark Track 2023

arXiv:2309.11811 [pdf, other]

Multimodal Transformers for Wireless Communications: A Case Study in Beam Prediction

Authors: Yu Tian, Qiyang Zhao, Zine el abidine Kherroubi, Fouzi Boukhalfa, Kebin Wu, Faouzi Bader

Abstract: Wireless communications at high-frequency bands with large antenna arrays face challenges in beam management, which can potentially be improved by multimodality sensing information from cameras, LiDAR, radar, and GPS. In this paper, we present a multimodal transformer deep learning framework for sensing-assisted beam prediction. We employ a convolutional neural network to extract the features from a sequence of images, point clouds, and radar raw data sampled over time. At each convolutional layer, we use transformer encoders to learn the hidden relations between feature tokens from different modalities and time instances over abstraction space and produce encoded vectors for the next-level feature extraction. We train the model on a combination of different modalities with supervised learning. We try to enhance the model over imbalanced data by utilizing focal loss and exponential moving average. We also evaluate data processing and augmentation techniques such as image enhancement, segmentation, background filtering, multimodal data flipping, radar signal transformation, and GPS angle calibration. Experimental results show that our solution trained on image and GPS data produces the best distance-based accuracy of predicted beams at 78.44%, with effective generalization to unseen day scenarios near 73% and night scenarios over 84%. This outperforms using other modalities and arbitrary data processing techniques, which demonstrates the effectiveness of transformers with feature fusion in performing radio beam prediction from images and GPS. Furthermore, our solution could be pretrained from large sequences of multimodality wireless data, on fine-tuning for multiple downstream radio network tasks.

Submitted 21 September, 2023; originally announced September 2023.
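The abstract above names focal loss and exponential moving average as its tools for imbalanced data. As a rough illustration of those two standard techniques (not the authors' implementation), the sketch below uses the common per-sample focal loss form, -(1 - p)^gamma * log(p), and a scalar moving-average update. The function names and the default values of `gamma` and `decay` are assumptions.

```python
import math

def focal_loss(p_true, gamma=2.0):
    """Per-sample focal loss given the predicted probability of the true class.

    gamma=2.0 is an assumed default; gamma=0 reduces to plain cross-entropy.
    """
    return -((1.0 - p_true) ** gamma) * math.log(p_true)

def ema_update(avg, value, decay=0.99):
    """One exponential-moving-average step; decay=0.99 is an assumed default."""
    return decay * avg + (1.0 - decay) * value

# Well-classified samples (p close to 1) are down-weighted far more than hard ones.
easy, hard = focal_loss(0.9), focal_loss(0.5)
print(easy, hard)

# Smoothing a scalar parameter toward a sequence of new values.
smoothed = 1.0
for new_value in (0.0, 0.0, 0.0):
    smoothed = ema_update(smoothed, new_value, decay=0.9)
print(smoothed)
```

The down-weighting of easy samples is what makes focal loss a frequent choice when a few classes (here, presumably a few beam indices) dominate the training data.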

arXiv:2307.12264 [pdf, ps, other]

QoE-Driven Video Transmission: Energy-Efficient Multi-UAV Network Optimization

Authors: Kesong Wu, Xianbin Cao, Peng Yang, Zongyang Yu, Dapeng Oliver Wu, Tony Q. S. Quek

Abstract: This paper is concerned with the issue of improving video subscribers' quality of experience (QoE) by deploying a multi-unmanned aerial vehicle (UAV) network. Different from existing works, we characterize subscribers' QoE by video bitrates, latency, and frame freezing and propose to improve their QoE by energy-efficiently and dynamically optimizing the multi-UAV network in terms of serving UAV selection, UAV trajectory, and UAV transmit power. The dynamic multi-UAV network optimization problem is formulated as a challenging sequential-decision problem with the goal of maximizing subscribers' QoE while minimizing the total network power consumption, subject to some physical resource constraints. We propose a novel network optimization algorithm to solve this challenging problem, in which a Lyapunov technique is first explored to decompose the sequential-decision problem into several repeatedly optimized sub-problems to avoid the curse of dimensionality. To solve the sub-problems, iterative and approximate optimization mechanisms with provable performance guarantees are then developed. Finally, we design extensive simulations to verify the effectiveness of the proposed algorithm. Simulation results show that the proposed algorithm can effectively improve the QoE of subscribers and is 66.75% more energy-efficient than benchmarks.

Submitted 23 July, 2023; originally announced July 2023.

arXiv:2306.17270 [pdf, other]

A Unified Framework for Online Data-Driven Predictive Control with Robust Safety Guarantees

Authors: Amin Vahidi-Moghaddam, Kaian Chen, Kaixiang Zhang, Zhaojian Li, Yan Wang, Kai Wu

Abstract: Despite great successes, model predictive control (MPC) relies on an accurate dynamical model and requires high onboard computational power, impeding its wider adoption in engineering systems, especially for nonlinear real-time systems with limited computation power. These shortcomings of MPC motivate this work to make such a control framework more practically viable for real-world applications. Specifically, to remove the required accurate dynamical model and reduce the computational cost for nonlinear MPC (NMPC), this paper develops a unified online data-driven predictive control pipeline to efficiently control a system with guaranteed safety without incurring large computational complexity. The new aspect of this idea is learning not only the real system but also the control policy, which results in a reasonable computational cost for the data-driven predictive controllers. More specifically, we first develop a spatial temporal filter (STF)-based concurrent learning scheme to systematically identify system dynamics for general nonlinear systems. We then develop a robust control barrier function (RCBF) for safety guarantees in the presence of model uncertainties and learn the RCBF-based NMPC policy. Furthermore, to mitigate the performance degradation due to the existing model uncertainties, we propose an online policy correction scheme through perturbation analysis and design of an ancillary feedback controller. Finally, extensive simulations on two applications, cart-inverted pendulum and automotive powertrain control, are performed to demonstrate the efficacy of the proposed framework, which shows comparable performance with much lower computational cost in comparison with several benchmark algorithms.

Submitted 29 June, 2023; originally announced June 2023.

arXiv:2306.08779 [pdf]

Theory of Periodic Sequence: Bridging Time-Domain and Frequency-Domain for Computational Electromagnetics

Authors: Ben You, Ke Wu

Abstract: Time-periodic form or expression is a ubiquitous natural and man-made phenomenon observable in all the scientific and engineering disciplines. In this article, we propose a theory of periodic sequence (TPS), which can be formulated as a foundational theory for computational sciences and engineering, to transform arbitrary time-periodic electromagnetic (EM) problems into a computational space with mapped discrete events, which is characterized in neither frequency domain nor time domain. Within the TPS framework, periodic-sequential Maxwell's curl equations are decomposed and decoupled to independent and paralleled instances via designated mappings. The fundamental solutions and mapped responses of EM periodic sequences are elucidated, and corroborated by RF/microwave measurements. The nature of outstanding computational parallelism and the unique frequency-independent property make TPS a promising methodology for computational electromagnetics such as analysis of high-speed signal integrity and broadband RF transmission.

Submitted 14 June, 2023; originally announced June 2023.

arXiv:2306.07873 [pdf, other]

Low-Complexity Soft-Decision Detection for Combating DFE Burst Errors in IM/DD Links

Authors: Kaiquan Wu, Gabriele Liga, Jamal Riani, Alex Alvarado

Abstract: The deployment of non-binary pulse amplitude modulation (PAM) and soft decision (SD)-forward error correction (FEC) in future intensity-modulation (IM)/direct-detection (DD) links is inevitable. However, high-speed IM/DD links suffer from inter-symbol interference (ISI) due to bandwidth-limited hardware. Traditional approaches to mitigate the effects of ISI are filters and trellis-based algorithms targeting symbol-wise maximum a posteriori (MAP) detection. The former approach includes decision-feedback equalizer (DFE), and the latter includes Max-Log-MAP (MLM) and soft-output Viterbi algorithm (SOVA). Although DFE is easy to implement, it introduces error propagation. Such burst errors distort the log-likelihood ratios (LLRs) required by SD-FEC, causing performance degradation. On the other hand, MLM and SOVA provide near-optimum performance, but their complexity is very high for high-order PAM. In this paper, we consider a one-tap partial response channel model, which is relevant for high-speed IM/DD links. We propose to combine DFE with either MLM or SOVA in a low-complexity architecture. The key idea is to allow MLM or SOVA to detect only 3 typical DFE symbol errors, and use the detected error information to generate LLRs in a modified demapper. The proposed structure enables a tradeoff between complexity and performance: (i) the complexity of MLM or SOVA is reduced and (ii) the decoding penalty due to error propagation is mitigated. Compared to SOVA detection, the proposed scheme can achieve a significant complexity reduction of up to 94% for PAM-8 transmission. Simulation and experimental results show that the resulting SNR loss is roughly 0.3 to 0.4 dB for PAM-4, and becomes a marginal 0.18 dB for PAM-8.

Submitted 28 September, 2023; v1 submitted 13 June, 2023; originally announced June 2023.

Comments: This manuscript has been submitted to JLT

arXiv:2306.06603 [pdf, ps, other]

Task-Oriented Integrated Sensing, Computation and Communication for Wireless Edge AI

Authors: Hong Xing, Guangxu Zhu, Dongzhu Liu, Haifeng Wen, Kaibin Huang, Kaishun Wu

Abstract: With the advent of emerging IoT applications such as autonomous driving, digital-twin and metaverse etc. featuring massive data sensing, analyzing and inference as well as critical latency in beyond 5G (B5G) networks, edge artificial intelligence (AI) has been proposed to provide high-performance computation of a conventional cloud down to the network edge. Recently, convergence of wireless sensing, computation and communication (SC$^2$) for specific edge AI tasks has prompted a paradigm shift by enabling (partial) sharing of the radio-frequency (RF) transceivers and information processing pipelines among these three fundamental functionalities of IoT. However, most existing design frameworks separate these designs, incurring unnecessary signaling overhead and waste of energy, and it is therefore of paramount importance to advance fully integrated sensing, computation and communication (ISCC) to achieve ultra-reliable and low-latency edge intelligence acquisition. In this article, we provide an overview of principles of enabling ISCC technologies followed by two concrete use cases of edge AI tasks demonstrating the advantage of task-oriented ISCC, and point out some practical challenges in edge AI design with advanced ISCC solutions.

Submitted 11 June, 2023; originally announced June 2023.

Comments: 18 pages, 6 figures, submitted for possible journal publication

arXiv:2305.15349 [pdf, other]

On the Convergence of Black-Box Variational Inference

Authors: Kyurae Kim, Jisu Oh, Kaiwen Wu, Yi-An Ma, Jacob R. Gardner

Abstract: We provide the first convergence guarantee for full black-box variational inference (BBVI), also known as Monte Carlo variational inference. While preliminary investigations worked on simplified versions of BBVI (e.g., bounded domain, bounded support, only optimizing for the scale, and such), our setup does not need any such algorithmic modifications. Our results hold for log-smooth posterior densities with and without strong log-concavity and the location-scale variational family. Also, our analysis reveals that certain algorithm design choices commonly employed in practice, particularly, nonlinear parameterizations of the scale of the variational approximation, can result in suboptimal convergence rates. Fortunately, running BBVI with proximal stochastic gradient descent fixes these limitations, and thus achieves the strongest known convergence rate guarantees. We evaluate this theoretical insight by comparing proximal SGD against other standard implementations of BBVI on large-scale Bayesian inference problems.

Submitted 10 January, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

Comments: Accepted to NeurIPS'23; previous title: "Black-Box Variational Inference Converges"

arXiv:2304.06983 [pdf, other]

A Byte Sequence is Worth an Image: CNN for File Fragment Classification Using Bit Shift and n-Gram Embeddings

Authors: Wenyang Liu, Yi Wang, Kejun Wu, Kim-Hui Yap, Lap-Pui Chau

Abstract: File fragment classification (FFC) on small chunks of memory is essential in memory forensics and Internet security. Existing methods mainly treat file fragments as 1d byte signals and utilize the captured inter-byte features for classification, while the bit information within bytes, i.e., intra-byte information, is seldom considered. This is inherently inapt for classifying variable-length coding files whose symbols are represented as a variable number of bits. Conversely, we propose Byte2Image, a novel data augmentation technique, to introduce the neglected intra-byte information into file fragments and re-treat them as 2d gray-scale images, which allows us to capture both inter-byte and intra-byte correlations simultaneously through powerful convolutional neural networks (CNNs). Specifically, to convert file fragments to 2d images, we employ a sliding byte window to expose the neglected intra-byte information and stack their n-gram features row by row. We further propose a byte sequence & image fusion network as a classifier, which can jointly model the raw 1d byte sequence and the converted 2d image to perform FFC. Experiments on the FFT-75 dataset validate that our proposed method can achieve notable accuracy improvements over state-of-the-art methods in nearly all scenarios. The code will be released at https://github.com/wenyang001/Byte2Image.

Submitted 14 April, 2023; originally announced April 2023.

Comments: Accepted by AICAS 2023

arXiv:2304.05084 [pdf, other]

A Self-attention Knowledge Domain Adaptation Network for Commercial Lithium-ion Batteries State-of-health Estimation under Shallow Cycles

Authors: Xin Chen, Yuwen Qin, Weidong Zhao, Qiming Yang, Ningbo Cai, Kai Wu

Abstract: Accurate state-of-health (SOH) estimation is critical to guarantee the safety, efficiency and reliability of battery-powered applications. Most SOH estimation methods focus on the 0-100% full state-of-charge (SOC) range that has similar distributions. However, the batteries in real-world applications usually work in the partial SOC range under shallow-cycle conditions and follow different degradation profiles with no labeled data available, thus making SOH estimation challenging. To estimate shallow-cycle battery SOH, a novel unsupervised deep transfer learning method is proposed to bridge different domains using a self-attention distillation module and a multi-kernel maximum mean discrepancy technique. The proposed method automatically extracts domain-variant features from charge curves to transfer knowledge from the large-scale labeled full cycles to the unlabeled shallow cycles. The CALCE and SNL battery datasets are employed to verify the effectiveness of the proposed method to estimate the battery SOH for different SOC ranges, temperatures, and discharge rates. The proposed method achieves a root-mean-square error within 2% and outperforms other transfer learning methods for different SOC ranges. When applied to batteries with different operating conditions and from different manufacturers, the proposed method still exhibits superior SOH estimation performance. The proposed method is the first attempt at accurately estimating battery SOH under shallow-cycle conditions without needing a full-cycle characteristic test.

Submitted 11 April, 2023; originally announced April 2023.

arXiv:2303.07138 [pdf, other]

Transferable Deep Learning Power System Short-Term Voltage Stability Assessment with Physics-Informed Topological Feature Engineering

Authors: Zijian Feng, Xin Chen, Zijian Lv, Peiyuan Sun, Kai Wu

Abstract: Deep learning (DL) algorithms have been widely applied to short-term voltage stability (STVS) assessment in power systems. However, transferring the knowledge learned in one power grid to other power grids with topology changes is still a challenging task. This paper proposes a transferable DL-based model for STVS assessment by constructing the topology-aware voltage dynamic features from raw PMU data. Since the reactive power flow and grid topology are essential to voltage stability, the topology-aware and physics-informed voltage dynamic features are utilized to effectively represent the topological and temporal patterns from post-disturbance system dynamic trajectories. The proposed DL-based STVS assessment model is tested under random operating conditions on the New England 39-bus system. It has 99.99% classification accuracy of the short-term voltage stability status using the topology-aware and physics-informed voltage dynamic features. In addition to high accuracy, the experiments show good adaptability to PMU errors. Moreover, the proposed STVS assessment method has outstanding performance on new grid topologies after fine-tuning. In particular, the highest accuracy reaches 99.68% in evaluation, which demonstrates a good knowledge transfer ability of the proposed model for power grid topology change.

Submitted 13 March, 2023; originally announced March 2023.

Comments: This work has been submitted to the IEEE Transactions on Power Systems for possible publication

arXiv:2301.11501 [pdf, ps, other]

Practical Frequency-Hopping MIMO Joint Radar Communications: Design and Experiment

Authors: Jiangtao Liu, Kai Wu, Tao Su, J. Andrew Zhang

Abstract: Joint radar and communications (JRC) can realize two radio frequency (RF) functions using one set of resources, greatly saving hardware, energy and spectrum for wireless systems needing both functions. Frequency-hopping (FH) MIMO radar is a popular candidate for JRC, as the achieved communication symbol rate can greatly exceed the radar pulse repetition frequency. However, practical transceiver imperfections can cause many existing theoretical designs to fail. In this work, we unveil for the first time the non-trivial impact of hardware imperfections on FH-MIMO JRC and analytically model the impact. We also design new waveforms and, accordingly, develop a low-complexity algorithm to jointly estimate the hardware imperfections of an unsynchronized receiver. Moreover, employing low-cost software-defined radios and commercial off-the-shelf (COTS) products, we build the first FH-MIMO JRC experiment platform with radar and communications simultaneously validated over the air. Corroborated by simulation and experiment results, the proposed designs achieve high performance for both radar and communications.

Submitted 26 January, 2023; originally announced January 2023.

Comments: 11 pages; 12 figures
