Search | arXiv e-print repository

doi 10.1109/TCOMM.2025.3587076

Computation-resource-efficient Task-oriented Communications

Authors: Jingwen Fu, Ming Xiao, Chao Ren, Mikael Skoglund

Abstract: The rapid development of deep-learning-enabled task-oriented communications (TOC) significantly shifts the paradigm of wireless communications. However, the high computation demands, particularly in resource-constrained systems such as mobile phones and UAVs, make TOC challenging for many tasks. To address the problem, we propose a novel TOC method with two models: a static and a dynamic model. In the static model, we apply a neural network (NN) as a task-oriented encoder (TOE) when there is no computation budget constraint. The dynamic model is used when device computation resources are limited, and it uses dynamic NNs with multiple exits as the TOE. The dynamic model sorts input data by complexity with thresholds, allowing the efficient allocation of computation resources. Furthermore, we analyze the convergence of the proposed TOC methods and show that the model converges at rate $O\left(\frac{1}{\sqrt{T}}\right)$ with an epoch of length $T$. Experimental results demonstrate that the static model outperforms baseline models in terms of transmitted dimensions, floating-point operations (FLOPs), and accuracy simultaneously. The dynamic model can further improve accuracy and reduce computational demand, providing an improved solution for resource-constrained systems.
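
The multi-exit idea described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the confidence-based exit rule, the function names (`softmax`, `early_exit_predict`), and the toy exits are assumptions made purely for illustration. Each exit is evaluated lazily, so an "easy" input that clears an early threshold never pays for the later exits.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def early_exit_predict(x, exits, thresholds):
    """Run exits in order and stop at the first sufficiently confident one.

    exits      -- list of callables mapping an input to class logits
                  (hypothetical stand-ins for the exits of a dynamic NN)
    thresholds -- one confidence threshold per non-final exit (assumed rule)
    Returns (predicted_class, index_of_exit_used).
    """
    last = len(exits) - 1
    for i, exit_fn in enumerate(exits):
        probs = softmax(exit_fn(x))  # later exits are only computed if needed
        confidence = max(probs)
        if i == last or confidence >= thresholds[i]:
            return probs.index(confidence), i


# Toy exits: a cheap, unsure exit followed by a costlier, confident one.
toy_exits = [lambda x: [2.0, 0.0], lambda x: [0.0, 5.0]]
strict = early_exit_predict(None, toy_exits, thresholds=[0.9])
lenient = early_exit_predict(None, toy_exits, thresholds=[0.8])
```

With the stricter threshold the input falls through to the final exit; with the looser one it leaves at the first exit, which is the compute-versus-accuracy trade-off that thresholds control.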

Submitted 10 July, 2025; originally announced July 2025.

arXiv:2507.01574 [pdf, ps, other]

Vision-Aided ISAC in Low-Altitude Economy Networks via De-Diffused Visual Priors

Authors: Yulan Gao, Ziqiang Ye, Zhonghao Lyu, Ming Xiao, Yue Xiao, Ping Yang, Agata Manolova

Abstract: Emerging low-altitude economy networks (LAENets) require agile and privacy-preserving resource control under dynamic agent mobility and limited infrastructure support. To meet these challenges, we propose a vision-aided integrated sensing and communication (ISAC) framework for UAV-assisted access systems, where onboard masked De-Diffusion models extract compact semantic tokens, including agent type, activity class, and heading orientation, while explicitly suppressing sensitive visual content. These tokens are fused with mmWave radar measurements to construct a semantic risk heatmap reflecting motion density, occlusion, and scene complexity, which guides access technology selection and resource scheduling. We formulate a multi-objective optimization problem to jointly maximize weighted energy and perception efficiency via radio access technology (RAT) assignment, power control, and beamforming, subject to agent-specific QoS constraints. To solve this, we develop the De-Diffusion-driven vision-aided risk-aware resource optimization algorithm (DeDiff-VARARO), a novel two-stage cross-modal control algorithm: the first stage reconstructs visual scenes from tokens via a De-Diffusion model for semantic parsing, while the second stage employs a deep deterministic policy gradient (DDPG)-based policy to adapt RAT selection, power control, and beam assignment based on fused radar-visual states. Simulation results show that DeDiff-VARARO consistently outperforms baselines in reward convergence, link robustness, and semantic fidelity, achieving within $4\%$ of the performance of a raw-image upper bound while preserving user privacy and scalability in dense environments.

Submitted 2 July, 2025; originally announced July 2025.

arXiv:2506.13995 [pdf, ps, other]

DREAM: On hallucinations in AI-generated content for nuclear medicine imaging

Authors: Menghua Xia, Reimund Bayerlein, Yanis Chemli, Xiaofeng Liu, Jinsong Ouyang, Georges El Fakhri, Ramsey D. Badawi, Quanzheng Li, Chi Liu

Abstract: Artificial intelligence-generated content (AIGC) has shown remarkable performance in nuclear medicine imaging (NMI), offering cost-effective software solutions for tasks such as image enhancement, motion correction, and attenuation correction. However, these advancements come with the risk of hallucinations, generating realistic yet factually incorrect content. Hallucinations can misrepresent anatomical and functional information, compromising diagnostic accuracy and clinical trust. This paper presents a comprehensive perspective of hallucination-related challenges in AIGC for NMI, introducing the DREAM report, which covers recommendations for definition, representative examples, detection and evaluation metrics, underlying causes, and mitigation strategies. This position statement paper aims to initiate a common understanding for discussions and future research toward enhancing AIGC applications in NMI, thereby supporting their safe and effective deployment in clinical practice.

Submitted 18 June, 2025; v1 submitted 16 June, 2025; originally announced June 2025.

Comments: 12 pages, 7 figures

arXiv:2506.11779 [pdf, ps, other]

Semantic Communications in 6G: Coexistence, Multiple Access, and Satellite Networks

Authors: Ishtiaque Ahmed, Yingzhuo Sun, Jingwen Fu, Alper Kose, Leila Musavian, Ming Xiao, Berna Ozbek

Abstract: The exponential growth of wireless users and bandwidth constraints necessitate innovative communication paradigms for next-generation networks. Semantic Communication (SemCom) emerges as a promising solution by transmitting extracted meaning rather than raw bits, enhancing spectral efficiency and enabling intelligent resource allocation. This paper explores the integration of SemCom with conventional Bit-based Communication (BitCom) in heterogeneous networks, highlighting key challenges and opportunities. We analyze multiple access techniques, including Non-Orthogonal Multiple Access (NOMA), to support coexisting SemCom and BitCom users. Furthermore, we examine multi-modal SemCom frameworks for handling diverse data types and discuss their applications in satellite networks, where semantic techniques mitigate bandwidth limitations and harsh channel conditions. Finally, we identify future directions for deploying semantic-aware systems in 6G and beyond.

Submitted 13 June, 2025; originally announced June 2025.

arXiv:2506.06754 [pdf, ps, other]

MIMO Pinching-Antenna-Aided SWIPT

Authors: Haoyun Li, Zhonghao Lyu, Yulan Gao, Ming Xiao, H. Vincent Poor

Abstract: Pinching-antenna systems (PASS) have recently emerged as a promising technology for improving wireless communications by establishing or strengthening reliable line-of-sight (LoS) links through adjusting the positions of pinching antennas (PAs). Motivated by these benefits, we propose a novel PASS-aided multi-input multi-output (MIMO) system for simultaneous wireless information and power transfer (SWIPT), where the PASS are equipped with multiple waveguides to provide information transmission and wireless power transfer (WPT) for several multiple-antenna information decoding receivers (IDRs) and energy harvesting receivers (EHRs), respectively. Based on the system, we consider maximizing the sum-rate of all IDRs while guaranteeing the minimum harvested energy of each EHR by jointly optimizing the pinching beamforming and the PA positions. To solve this highly non-convex problem, we iteratively optimize the pinching beamforming based on a weighted minimum mean-squared-error (WMMSE) method and update the PA positions with a Gauss-Seidel-based approach in an alternating optimization (AO) framework. Numerical results verify the significant superiority of the PASS compared with conventional designs.

Submitted 7 June, 2025; originally announced June 2025.

arXiv:2506.05637 [pdf, ps, other]

Joint User Association and Beamforming Design for ISAC Networks with Large Language Models

Authors: Haoyun Li, Ming Xiao, Kezhi Wang, Robert Schober, Dong In Kim, Yong Liang Guan

Abstract: Integrated sensing and communication (ISAC) has been envisioned to play a more important role in future wireless networks. However, the design of ISAC networks is challenging, especially when there are multiple communication and sensing (C&S) nodes and multiple sensing targets. We investigate a multi-base station (BS) ISAC network in which multiple BSs equipped with multiple antennas simultaneously provide C&S services for multiple ground communication users (CUs) and targets. To enhance the overall performance of C&S, we formulate a joint user association (UA) and multi-BS transmit beamforming optimization problem with the objective of maximizing the total sum rate of all CUs while ensuring both the minimum target detection and parameter estimation requirements. To efficiently solve the highly non-convex mixed integer nonlinear programming (MINLP) optimization problem, we propose an alternating optimization (AO)-based algorithm that decomposes the problem into two sub-problems, i.e., UA optimization and multi-BS transmit beamforming optimization. Inspired by large language models (LLMs) for prediction and inference, we propose a unified framework integrating LLMs with convex-based optimization methods. First, we propose a comprehensive design of prompt engineering, including few-shot, chain of thought, and self-reflection techniques to guide LLMs in solving the binary integer programming UA optimization problem. Second, we utilize convex-based optimization methods to handle the non-convex beamforming optimization problem based on fractional programming (FP), majorization minimization (MM), and the alternating direction method of multipliers (ADMM) with an optimized UA from LLMs. Numerical results demonstrate that our proposed LLM-enabled AO-based algorithm achieves fast convergence and near upper-bound performance with the GPT-o1 model, outperforming various benchmark schemes.

Submitted 5 June, 2025; originally announced June 2025.

arXiv:2505.09940 [pdf, other]

Low-Complexity Hybrid Beamforming for Multi-Cell mmWave Massive MIMO: A Primitive Kronecker Decomposition Approach

Authors: Teng Sun, Guangxu Zhu, Xiaofan Li, Jiancun Fan, Minghua Xia

Abstract: To circumvent the high path loss of mmWave propagation and reduce the hardware cost of massive multiple-input multiple-output antenna systems, full-dimensional hybrid beamforming is critical in 5G and beyond wireless communications. Concerning an uplink multi-cell system with a large-scale uniform planar antenna array, this paper designs an efficient hybrid beamformer using primitive Kronecker decomposition and dynamic factor allocation, where the analog beamformer is applied to null the inter-cell interference and simultaneously enhance the desired signals. In contrast, the digital beamformer mitigates the intra-cell interference using the minimum mean square error (MMSE) criterion. Then, due to the low accuracy of phase shifters inherent in the analog beamformer, a low-complexity hybrid beamformer is developed to slow its adjustment speed. Next, an optimality analysis from a subspace perspective is performed, and a sufficient condition for optimal antenna configuration is established. Finally, simulation results demonstrate that the achievable sum rate of the proposed beamformer approaches that of the optimal pure digital MMSE scheme, yet with much lower computational complexity and hardware cost.

Submitted 14 May, 2025; originally announced May 2025.

Comments: 12 pages, 6 figures, 2 tables; accepted for publication in Signal Processing

arXiv:2505.07559 [pdf, ps, other]

Pinching-Antenna Systems (PASS) Aided Over-the-air Computation

Authors: Zhonghao Lyu, Haoyun Li, Yulan Gao, Ming Xiao, H. Vincent Poor

Abstract: Over-the-air computation (AirComp) enables fast data aggregation for edge intelligence applications. However, the performance of AirComp can be severely degraded by channel misalignments. Pinching antenna systems (PASS) have recently emerged as a promising solution for physically reshaping favorable wireless channels to reduce misalignments and thus AirComp errors, via low-cost, fully passive, and highly reconfigurable antenna deployment. Motivated by these benefits, we propose a novel PASS-aided AirComp system that introduces new design degrees of freedom through flexible pinching antenna (PA) placement. To improve performance, we consider a mean squared error (MSE) minimization problem by jointly optimizing the PA position, transmit power, and decoding vector. To solve this highly non-convex problem, we propose an alternating-optimization-based framework with Gauss-Seidel-based PA position updates. Simulation results show that our proposed joint PA position and communication design significantly outperforms various benchmark schemes in AirComp accuracy.

Submitted 12 May, 2025; originally announced May 2025.

Comments: 5 figures

arXiv:2504.10806 [pdf, other]

ACSNet: A Deep Neural Network for Compound GNSS Jamming Signal Classification

Authors: Min Jiang, Ziqiang Ye, Yue Xiao, Yulan Gao, Ming Xiao, Dusit Niyato

Abstract: In the global navigation satellite system (GNSS), identifying not only single but also compound jamming signals is crucial for ensuring reliable navigation and positioning, particularly in future wireless communication scenarios such as the space-air-ground integrated network (SAGIN). However, conventional techniques often struggle with low recognition accuracy and high computational complexity, especially under low jamming-to-noise ratio (JNR) conditions. To overcome the challenge of accurately identifying compound jamming signals embedded within GNSS signals, we propose ACSNet, a novel convolutional neural network designed specifically for this purpose. Unlike traditional methods that tend to exhibit lower accuracy and higher computational demands, particularly in low JNR environments, ACSNet addresses these issues by integrating asymmetric convolution blocks, which enhance its sensitivity to subtle signal variations. Simulations demonstrate that ACSNet significantly improves accuracy in low JNR regions and shows robust resilience to power ratio (PR) variations, confirming its effectiveness and efficiency for practical GNSS interference management applications.

Submitted 14 April, 2025; originally announced April 2025.

arXiv:2503.16635 [pdf, other]

Fed-NDIF: A Noise-Embedded Federated Diffusion Model For Low-Count Whole-Body PET Denoising

Authors: Yinchi Zhou, Huidong Xie, Menghua Xia, Qiong Liu, Bo Zhou, Tianqi Chen, Jun Hou, Liang Guo, Xinyuan Zheng, Hanzhong Wang, Biao Li, Axel Rominger, Kuangyu Shi, Nicha C. Dvornek, Chi Liu

Abstract: Low-count positron emission tomography (LCPET) imaging can reduce patients' exposure to radiation but often suffers from increased image noise and reduced lesion detectability, necessitating effective denoising techniques. Diffusion models have shown promise in LCPET denoising for recovering degraded image quality. However, training such models requires large and diverse datasets, which are challenging to obtain in the medical domain. To address data scarcity and privacy concerns, we combine diffusion models with federated learning -- a decentralized training approach where models are trained individually at different sites, and their parameters are aggregated on a central server over multiple iterations. The variation in scanner types and image noise levels within and across institutions poses additional challenges for federated learning in LCPET denoising. In this study, we propose a novel noise-embedded federated learning diffusion model (Fed-NDIF) to address these challenges, leveraging a multicenter dataset and varying count levels. Our approach incorporates liver normalized standard deviation (NSTD) noise embedding into a 2.5D diffusion model and utilizes the Federated Averaging (FedAvg) algorithm to aggregate locally trained models into a global model, which is subsequently fine-tuned on local datasets to optimize performance and obtain personalized models. Extensive validation on datasets from the University of Bern, Ruijin Hospital in Shanghai, and Yale-New Haven Hospital demonstrates the superior performance of our method in enhancing image quality and improving lesion quantification. The Fed-NDIF model shows significant improvements in PSNR, SSIM, and NMSE of the entire 3D volume, as well as enhanced lesion detectability and quantification, compared to local diffusion models and federated UNet-based models.
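
The Federated Averaging (FedAvg) aggregation step named in the abstract above can be sketched as a sample-count-weighted average of client parameters. This is a generic illustration of FedAvg, not the Fed-NDIF code: the parameter layout (a dict of flat float lists), the function name `fedavg`, and the example site sizes are assumptions.

```python
def fedavg(client_params, client_sizes):
    """Aggregate client models by a weighted average of their parameters.

    client_params -- list of dicts mapping parameter name -> list of floats
                     (a simplified stand-in for real model tensors)
    client_sizes  -- number of local training samples at each client
    Returns a dict with the same keys holding the averaged parameters.
    """
    total = float(sum(client_sizes))
    global_params = {}
    for name in client_params[0]:
        length = len(client_params[0][name])
        global_params[name] = [
            sum(p[name][j] * n for p, n in zip(client_params, client_sizes)) / total
            for j in range(length)
        ]
    return global_params


# Two hypothetical sites; the second holds three times as much data.
site_a = {"w": [1.0, 2.0]}
site_b = {"w": [3.0, 6.0]}
global_model = fedavg([site_a, site_b], client_sizes=[1, 3])
```

In a full federated round, the server would send `global_model` back to each site for further local training; the abstract additionally describes a final local fine-tuning step to obtain personalized models, which is not shown here.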

Submitted 20 March, 2025; originally announced March 2025.

arXiv:2503.13257 [pdf, other]

Anatomically and Metabolically Informed Diffusion for Unified Denoising and Segmentation in Low-Count PET Imaging

Authors: Menghua Xia, Kuan-Yin Ko, Der-Shiun Wang, Ming-Kai Chen, Qiong Liu, Huidong Xie, Liang Guo, Wei Ji, Jinsong Ouyang, Reimund Bayerlein, Benjamin A. Spencer, Quanzheng Li, Ramsey D. Badawi, Georges El Fakhri, Chi Liu

Abstract: Positron emission tomography (PET) image denoising, along with lesion and organ segmentation, are critical steps in PET-aided diagnosis. However, existing methods typically treat these tasks independently, overlooking inherent synergies between them as correlated steps in the analysis pipeline. In this work, we present the anatomically and metabolically informed diffusion (AMDiff) model, a unified framework for denoising and lesion/organ segmentation in low-count PET imaging. By integrating multi-task functionality and exploiting the mutual benefits of these tasks, AMDiff enables direct quantification of clinical metrics, such as total lesion glycolysis (TLG), from low-count inputs. The AMDiff model incorporates a semantic-informed denoiser based on a diffusion strategy and a denoising-informed segmenter utilizing the nnMamba architecture. The segmenter constrains denoised outputs via a lesion-organ-specific regularizer, while the denoiser enhances the segmenter by providing enriched image information through a denoising revision module. These components are connected via a warming-up mechanism to optimize multitask interactions. Experiments on multi-vendor, multi-center, and multi-noise-level datasets demonstrate the superior performance of AMDiff. For test cases below 20% of the clinical count levels from participating sites, AMDiff achieves a TLG quantification bias of -26.98%, outperforming its ablated versions, which yield biases of -35.85% (without the lesion-organ-specific regularizer) and -40.79% (without the denoising revision module).

Submitted 17 March, 2025; originally announced March 2025.

arXiv:2502.18022 [pdf, other]

Multi-Cell Coordinated Beamforming for Integrate Communication and Multi-TMT Localization

Authors: Meidong Xia, Wei Xu, Jindan Xu, Zhenyao He, Zhaohui Yang, Derrick Wing Kwan Ng

Abstract: This paper investigates integrated localization and communication in a multi-cell system and proposes a coordinated beamforming algorithm to enhance target localization accuracy while preserving communication performance. Within this integrated sensing and communication (ISAC) system, the Cramer-Rao lower bound (CRLB) is adopted to quantify the accuracy of target localization, with its closed-form expression derived for the first time. It is shown that the nuisance parameters can be disregarded without impacting the CRLB of time of arrival (TOA)-based target localization. Capitalizing on the derived CRLB, we formulate a nonconvex coordinated beamforming problem to minimize the CRLB while satisfying signal-to-interference-plus-noise ratio (SINR) constraints in communication. To facilitate the development of a solution, we reformulate the original problem into a more tractable form and solve it through semi-definite programming (SDP). Notably, we show that the proposed algorithm can always obtain rank-one globally optimal solutions under mild conditions. Finally, numerical results demonstrate the superiority of the proposed algorithm over benchmark algorithms and reveal the performance trade-off between localization accuracy and communication SINR.

Submitted 25 February, 2025; originally announced February 2025.

Journal ref: 2025 IEEE International Conference on Communications

arXiv:2502.05845 [pdf]

Exploiting the Hidden Capacity of MMC Through Accurate Quantification of Modulation Indices

Authors: Qianhao Sun, Jingwei Meng, Ruofan Li, Mingchao Xia, Qifang Chen, Jiejie Zhou, Meiqi Fan, Peiqian Guo

Abstract: The modular multilevel converter (MMC) has become increasingly important in voltage-source converter-based high-voltage direct current (VSC-HVDC) systems. Direct and indirect modulation are widely used as mainstream modulation techniques in MMCs. However, due to the challenge of quantitatively evaluating the operation of different modulation schemes, the academic and industrial communities still hold differing opinions on their performance. To address this controversy, this paper employs state-of-the-art computational methods and quantitative metrics to compare the performance of different modulation schemes. The findings indicate that direct modulation offers superior modulation potential for MMCs, highlighting its higher ac voltage output capability and broader linear PQ operation region. Conversely, indirect modulation is disadvantaged in linear modulation, which indicates inferior output voltage capability. Furthermore, this paper delves into the conditions under which direct and indirect modulation techniques become equivalent in steady state. The study findings suggest that the modulation capability of direct modulation is the same as that of indirect modulation in steady state when additional controls, including closed-loop capacitor voltage control and circulating current suppression control (CCSC), are simultaneously active. Simulation and experiments verify the correctness and validity.

Submitted 9 February, 2025; originally announced February 2025.

arXiv:2502.05842 [pdf]

A Grid-Forming HVDC Series Tapping Converter Using Extended Techniques of Flex-LCC

Authors: Qianhao Sun, Ruofan Li, Jichen Wang, Mingchao Xia, Qifang Chen, Meiqi Fan, Gen Li, Xuebo Qiao

Abstract: This paper discusses an extension technology for the previously proposed Flexible Line-Commutated Converter (Flex-LCC) [1]. The proposed extension involves modifying the arm internal-electromotive-force control, redesigning the main-circuit parameters, and integrating a low-power coordination strategy. As a result, the Flex-LCC transforms from a grid-forming (GFM) voltage source converter (VSC) based on series-connected LCC and FBMMC into a novel GFM HVDC series tapping converter, referred to as the Extended Flex-LCC (EFLCC). The EFLCC provides dc characteristics resembling those of current source converters (CSCs) and ac characteristics resembling those of GFM VSCs. This makes it easier to integrate relatively small renewable energy sources (RESs) that operate in islanded or weak-grid supported conditions with an existing LCC-HVDC. Meanwhile, the EFLCC distinguishes itself by requiring fewer full-controlled switches and less energy storage, resulting in lower losses and costs compared to the FBMMC HVDC series tap solution. In particular, the reduced capacity requirement and the wide allowable range of valve-side ac voltages in the FBMMC part facilitate the matching of current-carrying capacities between full-controlled switches and thyristors. The application scenario, system-level analysis, implementation, converter-level operation, and comparison of the EFLCC are presented in detail in this paper. The theoretical analysis is confirmed by experimental and simulation results.

Submitted 9 February, 2025; originally announced February 2025.

arXiv:2501.17059 [pdf, ps, other]

Channel Estimation for XL-MIMO Systems with Decentralized Baseband Processing: Integrating Local Reconstruction with Global Refinement

Authors: Anzheng Tang, Jun-Bo Wang, Yijin Pan, Cheng Zeng, Yijian Chen, Hongkang Yu, Ming Xiao, Rodrigo C. de Lamare, Jiangzhou Wang

Abstract: In this paper, we investigate the channel estimation problem for extremely large-scale multiple-input multiple-output (XL-MIMO) systems with a hybrid analog-digital architecture, implemented within a decentralized baseband processing (DBP) framework with a star topology. Existing centralized and fully decentralized channel estimation methods face limitations due to excessive computational complexity or degraded performance. To overcome these challenges, we propose a novel two-stage channel estimation scheme that integrates local sparse reconstruction with global fusion and refinement. Specifically, in the first stage, by exploiting the sparsity of channels in the angular-delay domain, the local reconstruction task is formulated as a sparse signal recovery problem. To solve it, we develop a graph neural networks-enhanced sparse Bayesian learning (SBL-GNNs) algorithm, which effectively captures dependencies among channel coefficients, significantly improving estimation accuracy. In the second stage, the local estimates from the local processing units (LPUs) are aligned into a global angular domain for fusion at the central processing unit (CPU). Based on the aggregated observations, the channel refinement is modeled as a Bayesian denoising problem. To efficiently solve it, we devise a variational message passing algorithm that incorporates a Markov chain-based hierarchical sparse prior, effectively leveraging both the sparsity and the correlations of the channels in the global angular-delay domain. Simulation results validate the effectiveness and superiority of the proposed SBL-GNNs algorithm over existing methods, demonstrating improved estimation performance and reduced computational complexity.

Submitted 26 April, 2025; v1 submitted 28 January, 2025; originally announced January 2025.

Comments: This manuscript has been accepted by IEEE TCOM

arXiv:2501.07808 [pdf]

A Low-cost and Ultra-lightweight Binary Neural Network for Traffic Signal Recognition

Authors: Mingke Xiao, Yue Su, Liang Yu, Guanglong Qu, Yutong Jia, Yukuan Chang, Xu Zhang

Abstract: The deployment of neural networks in vehicle platforms and wearable Artificial Intelligence-of-Things (AIOT) scenarios has become a research area that has attracted much attention. With the continuous evolution of deep learning technology, many image classification models are committed to improving recognition accuracy, but this is often accompanied by problems such as large model resource usage, complex structure, and high power consumption, which makes it challenging to deploy on resource-constrained platforms. Herein, we propose an ultra-lightweight binary neural network (BNN) model designed for hardware deployment, and conduct image classification research based on the German Traffic Sign Recognition Benchmark (GTSRB) dataset. In addition, we also verify it on the Chinese Traffic Sign (CTS) and Belgian Traffic Sign (BTS) datasets. The proposed model shows excellent recognition performance with an accuracy of up to 97.64%, making it one of the best performing BNN models in the GTSRB dataset. Compared with the full-precision model, the accuracy loss is controlled within 1%, and the parameter storage overhead of the model is only 10% of that of the full-precision model. More importantly, our network model only relies on logical operations and low-bit width fixed-point addition and subtraction operations during the inference phase, which greatly simplifies the design complexity of the processing element (PE). Our research shows the great potential of BNN in the hardware deployment of computer vision models, especially in the field of computer vision tasks related to autonomous driving.

Submitted 13 January, 2025; originally announced January 2025.

arXiv:2412.16573 [pdf, other]

A Generalizable 3D Diffusion Framework for Low-Dose and Few-View Cardiac SPECT

Authors: Huidong Xie, Weijie Gan, Wei Ji, Xiongchao Chen, Alaa Alashi, Stephanie L. Thorn, Bo Zhou, Qiong Liu, Menghua Xia, Xueqi Guo, Yi-Hwa Liu, Hongyu An, Ulugbek S. Kamilov, Ge Wang, Albert J. Sinusas, Chi Liu

Abstract: Myocardial perfusion imaging using SPECT is widely utilized to diagnose coronary artery diseases, but image quality can be negatively affected in low-dose and few-view acquisition settings. Although various deep learning methods have been introduced to improve image quality from low-dose or few-view SPECT data, previous approaches often fail to generalize across different acquisition settings, limiting their applicability in reality. This work introduced DiffSPECT-3D, a diffusion framework for 3D cardiac SPECT imaging that effectively adapts to different acquisition settings without requiring further network re-training or fine-tuning. Using both image and projection data, a consistency strategy is proposed to ensure that diffusion sampling at each step aligns with the low-dose/few-view projection measurements, the image data, and the scanner geometry, thus enabling generalization to different low-dose/few-view settings. Incorporating anatomical spatial information from CT and a total variation constraint, we proposed a 2.5D conditional strategy to allow DiffSPECT-3D to observe 3D contextual information from the entire image volume, addressing the 3D memory issues in diffusion models. We extensively evaluated the proposed method on 1,325 clinical 99mTc tetrofosmin stress/rest studies from 795 patients. Each study was reconstructed into 5 different low-count and 5 different few-view levels for model evaluations, ranging from 1% to 50% and from 1 view to 9 views, respectively. Validated against cardiac catheterization results and diagnostic comments from nuclear cardiologists, the presented results show the potential to achieve low-dose and few-view SPECT imaging without compromising clinical performance. Additionally, DiffSPECT-3D could be directly applied to full-dose SPECT images to further improve image quality, especially in a low-dose stress-first cardiac SPECT imaging protocol.

Submitted 21 December, 2024; originally announced December 2024.

Comments: 13 pages, 6 figures, 2 tables. Paper under review. Oral presentation at IEEE MIC 2024

arXiv:2410.18382 [pdf, other]

Sensing-Communication-Computing-Control Closed-Loop Optimization for 6G Unmanned Robotic Systems

Authors: Xinran Fang, Chengleyang Lei, Wei Feng, Yunfei Chen, Ming Xiao, Ning Ge, Chengxiang Wang

Abstract: Rapid advancements in field robots have brought a new kind of cyber physical system (CPS)--unmanned robotic system--under the spotlight. In the upcoming sixth-generation (6G) era, these systems hold great potential to replace humans in hazardous tasks. This paper investigates an unmanned robotic system comprising a multi-functional unmanned aerial vehicle (UAV), sensors, and actuators. The UAV carries communication and computing modules, acting as an edge information hub (EIH) that transfers and processes information. During the task execution, the EIH gathers sensing data, calculates control commands, and transmits commands to actuators--leading to reflex-arc-like sensing-communication-computing-control ($\mathbf{SC}^3$) loops. Unlike existing studies that design $\mathbf{SC}^3$ loop components separately, we take each $\mathbf{SC}^3$ loop as an integrated structure and propose a goal-oriented closed-loop optimization scheme. This scheme jointly optimizes uplink and downlink (UL&DL) communication and computing within and across the $\mathbf{SC}^3$ loops to minimize the total linear quadratic regulator (LQR) cost. We derive optimal closed-form solutions for intra-loop allocation and propose an efficient iterative algorithm for inter-loop optimization. Under the condition of adequate CPU frequency availability, we derive an approximate closed-form solution for inter-loop bandwidth allocation. Simulation results demonstrate that the proposed scheme achieves a two-tier task-level balance within and across $\mathbf{SC}^3$ loops.

Submitted 23 October, 2024; originally announced October 2024.

arXiv:2410.11002 [pdf, other]

Optimizing Radio Access Technology Selection and Precoding in CV-Aided ISAC Systems

Authors: Yulan Gao, Ziqiang Ye, Ming Xiao, Yue Xiao

Abstract: Integrated Sensing and Communication (ISAC) systems promise to revolutionize wireless networks by concurrently supporting high-resolution sensing and high-performance communication. This paper presents a novel radio access technology (RAT) selection framework that capitalizes on vision sensing from base station (BS) cameras to optimize both communication and perception capabilities within the ISAC system. Our framework strategically employs two distinct RATs, LTE and millimeter wave (mmWave), to enhance system performance. We propose a vision-based user localization method that employs a 3D detection technique to capture the spatial distribution of users within the surrounding environment. This is followed by geometric calculations to accurately determine the state of mmWave communication links between the BS and individual users. Additionally, we integrate the SlowFast model to recognize user activities, facilitating adaptive transmission rate allocation based on observed behaviors. We develop a Deep Deterministic Policy Gradient (DDPG)-based algorithm, utilizing the joint distribution of users and their activities, designed to maximize the total transmission rate for all users through joint RAT selection and precoding optimization, while adhering to constraints on sensing mutual information and minimum transmission rates. Numerical simulation results demonstrate the effectiveness of the proposed framework in dynamically adjusting resource allocation, ensuring high-quality communication under challenging conditions.

Submitted 14 October, 2024; originally announced October 2024.

arXiv:2410.05062 [pdf, other]

Large Language Model Based Multi-Objective Optimization for Integrated Sensing and Communications in UAV Networks

Authors: Haoyun Li, Ming Xiao, Kezhi Wang, Dong In Kim, Merouane Debbah

Abstract: This letter investigates an unmanned aerial vehicle (UAV) network with integrated sensing and communication (ISAC) systems, where multiple UAVs simultaneously sense the locations of ground users and provide communication services with radars. To find the trade-off between communication and sensing (C&S) in the system, we formulate a multi-objective optimization problem (MOP) to maximize the total network utility and the localization Cramér-Rao bounds (CRB) of ground users, which jointly optimizes the deployment and power control of UAVs. Inspired by the huge potential of large language models (LLM) for prediction and inference, we propose an LLM-enabled decomposition-based multi-objective evolutionary algorithm (LEDMA) for solving the highly non-convex MOP. First, we adopt a decomposition-based scheme to decompose the MOP into a series of optimization sub-problems. Second, we integrate LLMs as black-box search operators, with prompt engineering designed specifically for the MOP, into the framework of MOEA to solve the optimization sub-problems simultaneously. Numerical results demonstrate that the proposed LEDMA can find a clear trade-off between C&S and outperforms baseline MOEAs in terms of obtained Pareto fronts and convergence.

Submitted 26 November, 2024; v1 submitted 7 October, 2024; originally announced October 2024.

arXiv:2409.11543 [pdf, other]

Noise-aware Dynamic Image Denoising and Positron Range Correction for Rubidium-82 Cardiac PET Imaging via Self-supervision

Authors: Huidong Xie, Liang Guo, Alexandre Velo, Zhao Liu, Qiong Liu, Xueqi Guo, Bo Zhou, Xiongchao Chen, Yu-Jung Tsai, Tianshun Miao, Menghua Xia, Yi-Hwa Liu, Ian S. Armstrong, Ge Wang, Richard E. Carson, Albert J. Sinusas, Chi Liu

Abstract: Rb-82 is a radioactive isotope widely used for cardiac PET imaging. Despite numerous benefits of 82-Rb, there are several factors that limit its image quality and quantitative accuracy. First, the short half-life of 82-Rb results in noisy dynamic frames. Low signal-to-noise ratio would result in inaccurate and biased image quantification. Noisy dynamic frames also lead to highly noisy parametric images. The noise levels also vary substantially in different dynamic frames due to radiotracer decay and short half-life. Existing denoising methods are not applicable for this task due to the lack of paired training inputs/labels and inability to generalize across varying noise levels. Second, 82-Rb emits high-energy positrons. Compared with other tracers such as 18-F, 82-Rb travels a longer distance before annihilation, which negatively affects image spatial resolution. Here, the goal of this study is to propose a self-supervised method for simultaneous (1) noise-aware dynamic image denoising and (2) positron range correction for 82-Rb cardiac PET imaging. Tested on a series of PET scans from a cohort of normal volunteers, the proposed method produced images with superior visual quality. To demonstrate the improvement in image quantification, we compared image-derived input functions (IDIFs) with arterial input functions (AIFs) from continuous arterial blood samples. The IDIF derived from the proposed method led to lower AUC differences, decreasing from 11.09% to 7.58% on average, compared to the original dynamic frames. The proposed method also improved the quantification of myocardium blood flow (MBF), as validated against 15-O-water scans, with mean MBF differences decreased from 0.43 to 0.09, compared to the original dynamic frames. We also conducted a generalizability experiment on 37 patient scans obtained from a different country using a different scanner.

Submitted 17 September, 2024; originally announced September 2024.

Comments: 15 Pages, 10 Figures, 5 tables. Paper Under review. Oral Presentation at IEEE MIC 2023

arXiv:2408.13056 [pdf, other]

GNSS Interference Classification Using Federated Reservoir Computing

Authors: Ziqiang Ye, Yulan Gao, Xinyue Liu, Yue Xiao, Ming Xiao, Saviour Zammit

Abstract: The expanding use of Unmanned Aerial Vehicles (UAVs) in vital areas like traffic management, surveillance, and environmental monitoring highlights the need for robust communication and navigation systems. Particularly vulnerable are Global Navigation Satellite Systems (GNSS), which face a spectrum of interference and jamming threats that can significantly undermine their performance. While traditional deep learning approaches are adept at mitigating these issues, they often fall short for UAV applications due to significant computational demands and the complexities of managing large, centralized datasets. In response, this paper introduces Federated Reservoir Computing (FedRC) as a potent and efficient solution tailored to enhance interference classification in GNSS systems used by UAVs. Our experimental results demonstrate that FedRC not only achieves faster convergence but also sustains lower loss levels than traditional models, highlighting its exceptional adaptability and operational efficiency.

Submitted 23 August, 2024; originally announced August 2024.

arXiv:2408.08979 [pdf, ps, other]

Electroencephalogram Emotion Recognition via AUC Maximization

Authors: Minheng Xiao

Abstract: Imbalanced datasets pose significant challenges in areas including neuroscience, cognitive science, and medical diagnostics, where accurately detecting minority classes is essential for robust model performance. This study addresses the issue of class imbalance, using the 'Liking' label in the DEAP dataset as an example. Such imbalances are often overlooked by prior research, which typically focuses on the more balanced arousal and valence labels and predominantly uses accuracy metrics to measure model performance. To tackle this issue, we adopt numerical optimization techniques aimed at maximizing the area under the curve (AUC), thus enhancing the detection of underrepresented classes. Our approach, which begins with a linear classifier, is compared against traditional linear classifiers, including logistic regression and support vector machines (SVM). Our method significantly outperforms these models, increasing recall from 41.6% to 79.7% and improving the F1-score from 0.506 to 0.632. These results highlight the efficacy of AUC maximization via numerical optimization in managing imbalanced datasets, providing an effective solution for enhancing predictive accuracy in detecting minority but crucial classes in out-of-sample datasets.

Submitted 10 June, 2025; v1 submitted 16 August, 2024; originally announced August 2024.

arXiv:2407.17691 [pdf, other]

System-Level Simulation Framework for NB-IoT: Key Features and Performance Evaluation

Authors: Shutao Zhang, Wenkun Wen, Peiran Wu, Hongqing Huang, Liya Zhu, Yijia Guo, Tingting Yang, Minghua Xia

Abstract: Narrowband Internet of Things (NB-IoT) is a technology specifically designated by the 3rd Generation Partnership Project (3GPP) to meet the explosive demand for massive machine-type communications (mMTC), and it is evolving to RedCap. Industrial companies have increasingly adopted NB-IoT as the solution for mMTC due to its lightweight design and comprehensive technical specifications released by 3GPP. This paper presents a system-level simulation framework for NB-IoT networks to evaluate their performance. The system-level simulator is structured into four parts: initialization, pre-generation, main simulation loop, and post-processing. Additionally, three essential features are investigated to enhance coverage, support massive connections, and ensure low power consumption, respectively. Simulation results demonstrate that the cumulative distribution function curves of the signal-to-interference-and-noise ratio fully comply with industrial standards. Furthermore, the throughput performance explains how NB-IoT networks realize massive connections at the cost of data rate. This work highlights its practical utility and paves the way for developing NB-IoT networks.

Submitted 13 August, 2024; v1 submitted 24 July, 2024; originally announced July 2024.

arXiv:2407.15330 [pdf, other]

A Methodology for Power Dispatch Based on Traction Station Clusters in the Flexible Traction Power Supply System

Authors: Ruofan Li, Qianhao Sun, Qifang Chen, Mingchao Xia

Abstract: The flexible traction power supply system (FTPSS) eliminates the neutral zone but leads to increased complexity in power flow coordinated control and power mismatch. To address these challenges, a methodology for power dispatch (PD) based on traction station clusters (TSCs) in FTPSS is proposed, in which each TSC with a consistent structure performs independent local phase angle control. First, to simplify the PD problem of TSCs, the system is transformed into an equivalent model with constant topology, so that it can be solved by univariate numerical optimization with higher computational performance. Next, the calculation methods for the feasible phase angle domain under strict and relaxed power circulation constraints are described, respectively, which ensure that power circulation can be either eliminated or precisely controlled. Finally, the PD method with three unique modes for uncertain train loads is introduced to enhance power flow flexibility: specified power distribution coefficients between traction substations (TSs), constant output power of TSs, and maximum consumption of renewable resources within TSs. In the experimental section, the performance of the TSC methodology for PD is verified through detailed train operation scenarios.

Submitted 21 July, 2024; originally announced July 2024.

arXiv:2407.05928 [pdf, other]

CA-FedRC: Codebook Adaptation via Federated Reservoir Computing in 5G NR

Authors: Ziqiang Ye, Sikai Liao, Yulan Gao, Shu Fang, Yue Xiao, Ming Xiao, Saviour Zammit

Abstract: With the burgeoning deployment of the fifth-generation new radio (5G NR) networks, the codebook plays a crucial role in enabling the base station (BS) to acquire the channel state information (CSI). Different 5G NR codebooks incur varying overheads and exhibit performance disparities under diverse channel conditions, necessitating codebook adaptation based on channel conditions to reduce feedback overhead while enhancing performance. However, existing methods of 5G NR codebook adaptation require significant overhead for model training and feedback or fall short in performance. To address these limitations, this letter introduces a federated reservoir computing framework designed for efficient codebook adaptation in computationally and feedback resource-constrained mobile devices. This framework utilizes a novel series of indicators as input training data, striking an effective balance between performance and feedback overhead. Compared to conventional models, the proposed codebook adaptation via federated reservoir computing (CA-FedRC) achieves rapid convergence and significant loss reduction in both speed and accuracy. Extensive simulations under various channel conditions demonstrate that our algorithm not only reduces resource consumption of users but also accurately identifies channel types, thereby optimizing the trade-off between spectrum efficiency, computational complexity, and feedback overhead.

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2406.08374 [pdf, other]

2.5D Multi-view Averaging Diffusion Model for 3D Medical Image Translation: Application to Low-count PET Reconstruction with CT-less Attenuation Correction

Authors: Tianqi Chen, Jun Hou, Yinchi Zhou, Huidong Xie, Xiongchao Chen, Qiong Liu, Xueqi Guo, Menghua Xia, James S. Duncan, Chi Liu, Bo Zhou

Abstract: Positron Emission Tomography (PET) is an important clinical imaging tool but inevitably introduces radiation hazards to patients and healthcare providers. Reducing the tracer injection dose and eliminating the CT acquisition for attenuation correction can reduce the overall radiation dose, but often results in PET with high noise and bias. Thus, it is desirable to develop 3D methods to translate the non-attenuation-corrected low-dose PET (NAC-LDPET) into attenuation-corrected standard-dose PET (AC-SDPET). Recently, diffusion models have emerged as a new state-of-the-art deep learning method for image-to-image translation, better than traditional CNN-based methods. However, due to the high computation cost and memory burden, it is largely limited to 2D applications. To address these challenges, we developed a novel 2.5D Multi-view Averaging Diffusion Model (MADM) for 3D image-to-image translation with application on NAC-LDPET to AC-SDPET translation. Specifically, MADM employs separate diffusion models for axial, coronal, and sagittal views, whose outputs are averaged in each sampling step to ensure the 3D generation quality from multiple views. To accelerate the 3D sampling process, we also proposed a strategy to use the CNN-based 3D generation as a prior for the diffusion model. Our experimental results on human patient studies suggested that MADM can generate high-quality 3D translation images, outperforming previous CNN-based and Diffusion-based baseline methods.

Submitted 15 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

Comments: 15 pages, 7 figures

arXiv:2405.12996 [pdf, ps, other]

Dose-aware Diffusion Model for 3D PET Image Denoising: Multi-institutional Validation with Reader Study and Real Low-dose Data

Authors: Huidong Xie, Weijie Gan, Reimund Bayerlein, Bo Zhou, Ming-Kai Chen, Michal Kulon, Annemarie Boustani, Kuan-Yin Ko, Der-Shiun Wang, Benjamin A. Spencer, Wei Ji, Xiongchao Chen, Qiong Liu, Xueqi Guo, Menghua Xia, Yinchi Zhou, Hui Liu, Liang Guo, Hongyu An, Ulugbek S. Kamilov, Hanzhong Wang, Biao Li, Axel Rominger, Kuangyu Shi, Ge Wang, et al. (2 additional authors not shown)

Abstract: Reducing scan times, radiation dose, and enhancing image quality for lower-performance scanners, are critical in low-dose PET imaging. Deep learning techniques have been investigated for PET image denoising. However, existing models have often resulted in compromised image quality when achieving low-count/low-dose PET and have limited generalizability to different image noise-levels, acquisition protocols, and patient populations. Recently, diffusion models have emerged as the new state-of-the-art generative model to generate high-quality samples and have demonstrated strong potential for medical imaging tasks. However, for low-dose PET imaging, existing diffusion models failed to generate consistent 3D reconstructions, were unable to generalize across varying noise-levels, often produced visually-appealing but distorted image details, and produced images with biased tracer uptake. Here, we develop DDPET-3D, a dose-aware diffusion model for 3D low-dose PET imaging to address these challenges. Collected from 4 medical centers globally with different scanners and clinical protocols, we evaluated the proposed model using a total of 9,783 18F-FDG studies with low-dose levels ranging from 1% to 50%. With a cross-center, cross-scanner validation, the proposed DDPET-3D demonstrated its potential to generalize to different low-dose levels, different scanners, and different clinical protocols. As confirmed with reader studies performed by board-certified nuclear medicine physicians, experienced readers judged the images to be similar or superior to the full-dose images and previous DL baselines based on qualitative visual impression. Lesion-level quantitative accuracy was evaluated using a Monte Carlo simulation study and a lesion segmentation network. The presented results show the potential to achieve low-dose PET while maintaining image quality. Real low-dose scans were also included for evaluation.

Submitted 16 June, 2025; v1 submitted 2 May, 2024; originally announced May 2024.

Comments: 18 Pages, 16 Figures, 5 Tables. Paper under review. First-place Freek J. Beekman Young Investigator Award at SNMMI 2024. Code available after paper publication. arXiv admin note: substantial text overlap with arXiv:2311.04248

arXiv:2405.12377 [pdf]

Spatio-temporal Attention-based Hidden Physics-informed Neural Network for Remaining Useful Life Prediction

Authors: Feilong Jiang, Xiaonan Hou, Min Xia

Abstract: Predicting the Remaining Useful Life (RUL) is essential in Prognostic Health Management (PHM) for industrial systems. Although deep learning approaches have achieved considerable success in predicting RUL, issues such as low prediction accuracy and poor interpretability pose significant challenges, hindering their practical implementation. In this work, we introduce a Spatio-temporal Attention-based Hidden Physics-informed Neural Network (STA-HPINN) for RUL prediction, which can utilize the associated physics of the system degradation. The spatio-temporal attention mechanism can extract important features from the input data. With the self-attention mechanism on both the sensor dimension and time step dimension, the proposed model can effectively extract degradation information. The hidden physics-informed neural network is utilized to capture the physics mechanisms that govern the evolution of RUL. With the constraint of physics, the model can achieve higher accuracy and reasonable predictions. The approach is validated on a benchmark dataset, demonstrating exceptional performance when compared to cutting-edge methods, especially in the case of complex conditions.

Submitted 20 May, 2024; originally announced May 2024.

arXiv:2404.17994 [pdf]

LeqMod: Adaptable Lesion-Quantification-Consistent Modulation for Deep Learning Low-Count PET Image Denoising

Authors: Menghua Xia, Huidong Xie, Qiong Liu, Bo Zhou, Hanzhong Wang, Biao Li, Axel Rominger, Quanzheng Li, Ramsey D. Badawi, Kuangyu Shi, Georges El Fakhri, Chi Liu

Abstract: Deep learning-based positron emission tomography (PET) image denoising offers the potential to reduce radiation exposure and scanning time by transforming low-count images into high-count equivalents. However, existing methods typically blur crucial details, leading to inaccurate lesion quantification. This paper proposes a lesion-perceived and quantification-consistent modulation (LeqMod) strategy for enhanced PET image denoising, via employing downstream lesion quantification analysis as auxiliary tools. The LeqMod is a plug-and-play design adaptable to a wide range of model architectures, modulating the sampling and optimization procedures of model training without adding any computational burden to the inference phase. Specifically, the LeqMod consists of two components, the lesion-perceived modulation (LeMod) and the multiscale quantification-consistent modulation (QuMod). The LeMod enhances lesion contrast and visibility by allocating higher sampling weights and stricter loss criteria to lesion-present samples determined by an auxiliary segmentation network than lesion-absent ones. The QuMod further emphasizes quantification accuracy for both the mean and maximum standardized uptake value (SUVmean and SUVmax) across multiscale sub-regions throughout the entire image, thereby reducing biases of denoised results relative to high-count references. Experiments conducted on large PET datasets from multiple centers and vendors, and varying noise levels demonstrated the LeqMod efficacy across various denoising frameworks. Compared to frameworks without LeqMod, the integration of LeqMod reduces the lesion SUVmax bias by 5.92% on average and increases the peak signal-to-noise ratio (PSNR) by 0.36 on average, when denoising images across participating sites.

Submitted 4 March, 2025; v1 submitted 27 April, 2024; originally announced April 2024.

arXiv:2404.09226 [pdf]

Breast Cancer Image Classification Method Based on Deep Transfer Learning

Authors: Weimin Wang, Yufeng Li, Xu Yan, Mingxuan Xiao, Min Gao

Abstract: To address the issues of limited samples, time-consuming feature design, and low accuracy in detection and classification of breast cancer pathological images, a breast cancer image classification model algorithm combining deep learning and transfer learning is proposed. This algorithm is based on the DenseNet structure of deep neural networks, constructs a network model by introducing attention mechanisms, and trains on the enhanced dataset using multi-level transfer learning. Experimental results demonstrate that the algorithm achieves an efficiency of over 84.0% on the test set, with a significantly improved classification accuracy compared to previous models, making it applicable to medical breast cancer detection tasks.

Submitted 11 September, 2024; v1 submitted 14 April, 2024; originally announced April 2024.

Comments: 12 pages, 8 figures, 2024 International Conference on Image Processing, Machine Learning and Pattern Recognition

arXiv:2404.08713 [pdf, other]

Survival Prediction Across Diverse Cancer Types Using Neural Networks

Authors: Xu Yan, Weimin Wang, MingXuan Xiao, Yufeng Li, Min Gao

Abstract: Gastric cancer and colon adenocarcinoma represent widespread and challenging malignancies with high mortality rates and complex treatment landscapes. In response to the critical need for accurate prognosis in cancer patients, the medical community has embraced the 5-year survival rate as a vital metric for estimating patient outcomes. This study introduces a pioneering approach to enhance survival prediction models for gastric cancer and colon adenocarcinoma patients. Leveraging advanced image analysis techniques, we sliced whole slide images (WSI) of these cancers, extracting comprehensive features to capture nuanced tumor characteristics. Subsequently, we constructed patient-level graphs, encapsulating intricate spatial relationships within tumor tissues. These graphs served as inputs for a sophisticated 4-layer graph convolutional neural network (GCN), designed to exploit the inherent connectivity of the data for comprehensive analysis and prediction. By integrating patients' total survival time and survival status, we computed C-index values for gastric cancer and colon adenocarcinoma, yielding 0.57 and 0.64, respectively. Significantly surpassing previous convolutional neural network models, these results underscore the efficacy of our approach in accurately predicting patient survival outcomes. This research holds profound implications for both the medical and AI communities, offering insights into cancer biology and progression while advancing personalized treatment strategies. Ultimately, our study represents a significant stride in leveraging AI-driven methodologies to revolutionize cancer prognosis and improve patient outcomes on a global scale.

Submitted 11 April, 2024; originally announced April 2024.

arXiv:2404.08279 [pdf, other]

Convolutional neural network classification of cancer cytopathology images: taking breast cancer as an example

Authors: MingXuan Xiao, Yufeng Li, Xu Yan, Min Gao, Weimin Wang

Abstract: Breast cancer is a relatively common cancer among gynecological cancers. Its diagnosis often relies on the pathology of cells in the lesion. The pathological diagnosis of breast cancer not only requires professionals and time, but also sometimes involves subjective judgment. To address the challenges of dependence on pathologists' expertise and the time-consuming nature of achieving accurate breast pathological image classification, this paper introduces an approach utilizing convolutional neural networks (CNNs) for the rapid categorization of pathological images, aiming to enhance the efficiency of breast pathological image detection. The approach enables the rapid and automatic classification of pathological images into benign and malignant groups. The methodology uses a CNN model based on the Inceptionv3 architecture and a transfer learning algorithm to extract features from pathological images. A neural network with fully connected layers and the SoftMax function is then used for image classification. Additionally, the concept of image partitioning is introduced to handle high-resolution images. To achieve the ultimate classification outcome, the classification probabilities of each image block are aggregated using three algorithms: summation, product, and maximum. Experimental validation was conducted on the BreaKHis public dataset, resulting in accuracy rates surpassing 0.92 across all four magnification coefficients (40X, 100X, 200X, and 400X). This demonstrates that the proposed method effectively enhances the accuracy in classifying pathological images of breast cancer.

Submitted 12 April, 2024; originally announced April 2024.

arXiv:2404.05257 [pdf, other]

Sensing-Resistance-Oriented Beamforming for Privacy Protection from ISAC Devices

Authors: Teng Ma, Yue Xiao, Xia Lei, Ming Xiao

Abstract: With the evolution of integrated sensing and communication (ISAC) technology, a growing number of devices go beyond conventional communication functions with sensing abilities. Therefore, future networks are expected to encounter new privacy concerns on sensing, such as the exposure of position information to unintended receivers. In contrast to traditional privacy preserving schemes aiming to prevent eavesdropping, this contribution conceives a novel beamforming design toward sensing resistance (SR). Specifically, we expect to guarantee the communication quality while masking the real direction of the SR transmitter during the communication. To evaluate the SR performance, a metric termed angular-domain peak-to-average ratio (ADPAR) is first defined and analyzed. Then, we resort to the null-space technique to conceal the real direction and hence convert the optimization problem to a more tractable form. Moreover, semidefinite relaxation along with index optimization is further utilized to obtain the optimal beamformer. Finally, simulation results demonstrate the feasibility of the proposed SR-oriented beamforming design toward privacy protection from ISAC receivers.

Submitted 8 April, 2024; originally announced April 2024.

Comments: Accepted for presentation at WS29 ICC 2024 Workshop - ISAC6G

arXiv:2404.00780 [pdf, ps, other]

Supplementary File: Cooperative Gradient Coding for Semi-Decentralized Federated Learning

Authors: Shudi Weng, Chengxi Li, Ming Xiao, Mikael Skoglund

Abstract: Stragglers' effects are known to degrade FL performance. In this paper, we investigate federated learning (FL) over wireless networks in the presence of communication stragglers, where the power-constrained clients collaboratively train a global model by iteratively optimizing a local objective function with their local datasets and transmitting local model updates to the central parameter server (PS) through fading channels. To tackle communication stragglers without dataset sharing or prior information about the network at PS, we propose cooperative gradient coding (CoGC) for semi-decentralized FL to enable the exact global model recovery at PS. Furthermore, we conduct a thorough theoretical analysis of the proposed approach. Namely, an outage analysis of the proposed approach is provided, followed by a convergence analysis based on the failure probability of the global model recovery at PS. Simulation results reveal the superiority of the proposed approach in the presence of stragglers under imbalanced data distribution.

Submitted 8 August, 2024; v1 submitted 31 March, 2024; originally announced April 2024.

arXiv:2403.16397 [pdf, other]

doi 10.1109/TWC.2024.3457157

RadioGAT: A Joint Model-based and Data-driven Framework for Multi-band Radiomap Reconstruction via Graph Attention Networks

Authors: Xiaojie Li, Songyang Zhang, Hang Li, Xiaoyang Li, Lexi Xu, Haigao Xu, Hui Mei, Guangxu Zhu, Nan Qi, Ming Xiao

Abstract: Multi-band radiomap reconstruction (MB-RMR) is a key component in wireless communications for tasks such as spectrum management and network planning. However, traditional machine-learning-based MB-RMR methods, which rely heavily on simulated data or complete structured ground truth, face significant deployment challenges. These challenges stem from the differences between simulated and actual data, as well as the scarcity of real-world measurements. To address these challenges, our study presents RadioGAT, a novel framework based on Graph Attention Network (GAT) tailored for MB-RMR within a single area, eliminating the need for multi-region datasets. RadioGAT innovatively merges model-based spatial-spectral correlation encoding with data-driven radiomap generalization, thus minimizing the reliance on extensive data sources. The framework begins by transforming sparse multi-band data into a graph structure through an innovative encoding strategy that leverages radio propagation models to capture the spatial-spectral correlation inherent in the data. This graph-based representation not only simplifies data handling but also enables tailored label sampling during training, significantly enhancing the framework's adaptability for deployment. Subsequently, the GAT is employed to generalize the radiomap information across various frequency bands. Extensive experiments using raytracing datasets based on real-world environments have demonstrated RadioGAT's enhanced accuracy in supervised learning settings and its robustness in semi-supervised scenarios. These results underscore RadioGAT's effectiveness and practicality for MB-RMR in environments with limited data availability.

Submitted 29 July, 2024; v1 submitted 24 March, 2024; originally announced March 2024.

Comments: IEEE Transactions on Wireless Communications, early access, 2024

Journal ref: IEEE Transactions on Wireless Communications, vol. 23, no. 11, pp. 17777-17792, Nov. 2024

arXiv:2403.14905 [pdf, other]

Adaptive Coded Federated Learning: Privacy Preservation and Straggler Mitigation

Authors: Chengxi Li, Ming Xiao, Mikael Skoglund

Abstract: In this article, we address the problem of federated learning in the presence of stragglers. For this problem, a coded federated learning framework has been proposed, where the central server aggregates gradients received from the non-stragglers and the gradient computed from a privacy-preserving global coded dataset to mitigate the negative impact of the stragglers. However, when aggregating these gradients, fixed weights are consistently applied across iterations, neglecting the generation process of the global coded dataset and the dynamic nature of the trained model over iterations. This oversight may result in diminished learning performance. To overcome this drawback, we propose a new method named adaptive coded federated learning (ACFL). In ACFL, before the training, each device uploads a coded local dataset with additive noise to the central server to generate a global coded dataset under privacy preservation requirements. During each iteration of the training, the central server aggregates the gradients received from the non-stragglers and the gradient computed from the global coded dataset, where an adaptive policy for varying the aggregation weights is designed. Under this policy, we optimize the performance in terms of privacy and learning, where the learning performance is analyzed through convergence analysis and the privacy performance is characterized via mutual information differential privacy. Finally, we perform simulations to demonstrate the superiority of ACFL compared with the non-adaptive methods.

Submitted 21 March, 2024; originally announced March 2024.

arXiv:2402.15939 [pdf]

Deep Separable Spatiotemporal Learning for Fast Dynamic Cardiac MRI

Authors: Zi Wang, Min Xiao, Yirong Zhou, Chengyan Wang, Naiming Wu, Yi Li, Yiwen Gong, Shufu Chang, Yinyin Chen, Liuhong Zhu, Jianjun Zhou, Congbo Cai, He Wang, Di Guo, Guang Yang, Xiaobo Qu

Abstract: Dynamic magnetic resonance imaging (MRI) plays an indispensable role in cardiac diagnosis. To enable fast imaging, the k-space data can be undersampled but the image reconstruction poses a great challenge of high-dimensional processing. This challenge necessitates extensive training data in deep learning reconstruction methods. In this work, we propose a novel and efficient approach, leveraging a dimension-reduced separable learning scheme that can perform exceptionally well even with highly limited training data. We design this new approach by incorporating spatiotemporal priors into the development of a Deep Separable Spatiotemporal Learning network (DeepSSL), which unrolls an iteration process of a 2D spatiotemporal reconstruction model with both temporal low-rankness and spatial sparsity. Intermediate outputs can also be visualized to provide insights into the network behavior and enhance interpretability. Extensive results on cardiac cine datasets demonstrate that the proposed DeepSSL surpasses state-of-the-art methods both visually and quantitatively, while reducing the demand for training cases by up to 75%. Additionally, its preliminary adaptability to unseen cardiac patients has been verified through a blind reader study conducted by experienced radiologists and cardiologists. Furthermore, DeepSSL enhances the accuracy of the downstream task of cardiac segmentation and exhibits robustness in prospectively undersampled real-time cardiac MRI.

Submitted 2 October, 2024; v1 submitted 24 February, 2024; originally announced February 2024.

Comments: 12 pages, 14 figures, 4 tables

arXiv:2401.00153 [pdf, other]

USFM: A Universal Ultrasound Foundation Model Generalized to Tasks and Organs towards Label Efficient Image Analysis

Authors: Jing Jiao, Jin Zhou, Xiaokang Li, Menghua Xia, Yi Huang, Lihong Huang, Na Wang, Xiaofan Zhang, Shichong Zhou, Yuanyuan Wang, Yi Guo

Abstract: Inadequate generality across different organs and tasks constrains the application of ultrasound (US) image analysis methods in smart healthcare. Building a universal US foundation model holds the potential to address these issues. Nevertheless, the development of such foundational models encounters intrinsic challenges in US analysis, i.e., insufficient databases, low quality, and ineffective features. In this paper, we present a universal US foundation model, named USFM, generalized to diverse tasks and organs towards label efficient US image analysis. First, a large-scale Multi-organ, Multi-center, and Multi-device US database was built, comprehensively containing over two million US images. Organ-balanced sampling was employed for unbiased learning. Then, USFM is self-supervised pre-trained on the sufficient US database. To extract the effective features from low-quality US images, we proposed a spatial-frequency dual masked image modeling method. A productive spatial noise addition-recovery approach was designed to learn meaningful US information robustly, while a novel frequency band-stop masking learning approach was also employed to extract complex, implicit grayscale distribution and textural variations. Extensive experiments were conducted on the various tasks of segmentation, classification, and image enhancement from diverse organs and diseases. Comparisons with representative US image analysis models illustrate the universality and effectiveness of USFM. The label efficiency experiments suggest the USFM obtains robust performance with only 20% annotation, laying the groundwork for the rapid development of US models in clinical practices.

Submitted 2 January, 2024; v1 submitted 30 December, 2023; originally announced January 2024.

Comments: Submit to MedIA, 17 pages, 11 figures

arXiv:2312.15668 [pdf, ps, other]

Air-to-Ground Communications Beyond 5G: UAV Swarm Formation Control and Tracking

Authors: Xiao Fan, Peiran Wu, Minghua Xia

Abstract: Unmanned aerial vehicle (UAV) communications have been widely accepted as promising technologies to support air-to-ground communications in the forthcoming sixth-generation (6G) wireless networks. This paper proposes a novel air-to-ground communication model consisting of aerial base stations served by UAVs and terrestrial user equipments (UEs) by integrating the technique of coordinated multi-point (CoMP) transmission with the theory of stochastic geometry. In particular, a CoMP set consisting of multiple UAVs is developed based on the theory of Poisson-Delaunay tetrahedralization. Effective UAV formation control and UAV swarm tracking schemes for two typical scenarios, including static and mobile UEs, are also developed using the multi-agent system theory to ensure that collaborative UAVs can efficiently reach target spatial positions for mission execution. Thanks to the ease of mathematical tractability, this model provides explicit performance expressions for a typical UE's coverage probability and achievable ergodic rate. Extensive simulation and numerical results corroborate that the proposed scheme outperforms UAV communications without CoMP transmission and obtains similar performance to the conventional CoMP scheme while avoiding search overhead.

Submitted 25 December, 2023; originally announced December 2023.

Comments: 14 pages, 9 figures, to appear in IEEE TWC

arXiv:2312.15244 [pdf, ps, other]

Fluid Antenna Array Enhanced Over-the-Air Computation

Authors: Deyou Zhang, Sicong Ye, Ming Xiao, Kezhi Wang, Marco Di Renzo, Mikael Skoglund

Abstract: Over-the-air computation (AirComp) has emerged as a promising technology for fast wireless data aggregation by harnessing the superposition property of wireless multiple-access channels. This paper investigates a fluid antenna (FA) array-enhanced AirComp system, employing the new degrees of freedom achieved by antenna movements. Specifically, we jointly optimize the transceiver design and antenna position vector (APV) to minimize the mean squared error (MSE) between target and estimated function values. To tackle the resulting highly non-convex problem, we adopt an alternating optimization technique to decompose it into three subproblems. These subproblems are then iteratively solved until convergence, leading to a locally optimal solution. Numerical results show that FA arrays with the proposed transceiver and APV design significantly outperform the traditional fixed-position antenna arrays in terms of MSE.

Submitted 13 February, 2025; v1 submitted 23 December, 2023; originally announced December 2023.

arXiv:2311.18418 [pdf, ps, other]

Beamforming Design for Active RIS-Aided Over-the-Air Computation

Authors: Deyou Zhang, Ming Xiao, Mikael Skoglund, H. Vincent Poor

Abstract: Over-the-air computation (AirComp) is emerging as a promising technology for wireless data aggregation. However, its performance is hampered by users with poor channel conditions. To mitigate such a performance bottleneck, this paper introduces an active reconfigurable intelligent surface (RIS) into the AirComp system. Specifically, we begin by exploring the ideal RIS model and propose a joint optimization of the transceiver design and RIS configuration to minimize the mean squared error (MSE) between the target and estimated function values. To manage the resultant tri-convex optimization problem, we employ the alternating optimization (AO) technique to decompose it into three convex subproblems, each solvable optimally. Subsequently, we investigate two specific cases and analyze their respective asymptotic performance to reveal the superiority of the active RIS in mitigating the MSE relative to its passive counterpart. Lastly, we adapt our transceiver and RIS configuration design to account for the self-interference of the active RIS. To handle the resultant highly non-convex problem, we further devise a two-layer AO framework. Simulation results demonstrate the superiority of the active RIS in enhancing AirComp performance compared to its passive counterpart.

Submitted 30 November, 2023; originally announced November 2023.

arXiv:2311.03982 [pdf, ps, other]

Federated Learning via Active RIS Assisted Over-the-Air Computation

Authors: Deyou Zhang, Ming Xiao, Mikael Skoglund, H. Vincent Poor

Abstract: In this paper, we propose leveraging the active reconfigurable intelligent surface (RIS) to support reliable gradient aggregation for over-the-air computation (AirComp) enabled federated learning (FL) systems. An analysis of the FL convergence property reveals that minimizing gradient aggregation errors in each training round is crucial for narrowing the convergence gap. As such, we formulate an optimization problem, aiming to minimize these errors by jointly optimizing the transceiver design and RIS configuration. To handle the formulated highly non-convex problem, we devise a two-layer alternating optimization framework to decompose it into several convex subproblems, each solvable optimally. Simulation results demonstrate the superiority of the active RIS in reducing gradient aggregation errors compared to its passive counterpart.

Submitted 7 November, 2023; originally announced November 2023.

Comments: This paper was submitted to the IEEE International Conference on Machine Learning for Communication and Networking (ICMLCN), Stockholm, Sweden, 2024

arXiv:2311.03974 [pdf, ps, other]

NOMA Enabled Multi-Access Edge Computing: A Joint MU-MIMO Precoding and Computation Offloading Design

Authors: Deyou Zhang, Meng Wang, Shuo Shi, Ming Xiao

Abstract: This letter investigates computation offloading and transmit precoding co-design for multi-access edge computing (MEC), where multiple MEC users (MUs) equipped with multiple antennas access the MEC server in a non-orthogonal multiple access manner. We aim to minimize the total energy consumption of all MUs while satisfying the latency constraints by jointly optimizing the computational frequency, offloading ratio, and precoding matrix of each MU. For tractability, we first decompose the original problem into three subproblems and then solve these subproblems iteratively until convergence. Simulation results validate the convergence of the proposed method and demonstrate its superiority over baseline algorithms.

Submitted 7 November, 2023; originally announced November 2023.

arXiv:2311.00483 [pdf, other]

DEFN: Dual-Encoder Fourier Group Harmonics Network for Three-Dimensional Indistinct-Boundary Object Segmentation

Authors: Xiaohua Jiang, Yihao Guo, Jian Huang, Yuting Wu, Meiyi Luo, Zhaoyang Xu, Qianni Zhang, Xingru Huang, Hong He, Shaowei Jiang, Jing Ye, Mang Xiao

Abstract: The precise spatial and quantitative delineation of indistinct-boundary medical objects is paramount for the accuracy of diagnostic protocols, efficacy of surgical interventions, and reliability of postoperative assessments. Despite their significance, the effective segmentation and instantaneous three-dimensional reconstruction are significantly impeded by the paucity of representative samples in available datasets and noise artifacts. To surmount these challenges, we introduce Stochastic Defect Injection (SDi) to augment the representational diversity of challenging indistinct-boundary objects within training corpora. We then propose the Dual-Encoder Fourier Group Harmonics Network (DEFN) to tailor noise filtration, amplify detailed feature recognition, and bolster representation across diverse medical imaging scenarios. By incorporating a Dynamic Weight Composing (DWC) loss that dynamically adjusts the model's focus based on training progression, DEFN achieves SOTA performance on the OIMHS public dataset, showcasing effectiveness in indistinct boundary contexts. Source code for DEFN is available at: https://github.com/IMOP-lab/DEFN-pytorch.

Submitted 19 June, 2024; v1 submitted 1 November, 2023; originally announced November 2023.

Comments: 36 pages, 16 figures, 7 tables

MSC Class: 68; 92 ACM Class: I.4; J.3

arXiv:2310.07405 [pdf, ps, other]

IRS Assisted Federated Learning: A Broadband Over-the-Air Aggregation Approach

Authors: Deyou Zhang, Ming Xiao, Zhibo Pang, Lihui Wang, H. Vincent Poor

Abstract: We consider a broadband over-the-air computation empowered model aggregation approach for wireless federated learning (FL) systems and propose to leverage an intelligent reflecting surface (IRS) to combat wireless fading and noise. We first investigate the conventional node-selection based framework, where a few edge nodes are dropped in model aggregation to control the aggregation error. We analyze the performance of this node-selection based framework and derive an upper bound on its performance loss, which is shown to be related to the selected edge nodes. Then, we seek to minimize the mean-squared error (MSE) between the desired global gradient parameters and the actually received ones by optimizing the selected edge nodes, their transmit equalization coefficients, the IRS phase shifts, and the receive factors of the cloud server. By resorting to the matrix lifting technique and difference-of-convex programming, we successfully transform the formulated optimization problem into a convex one and solve it using off-the-shelf solvers. To improve learning performance, we further propose a weight-selection based FL framework. In such a framework, we assign each edge node a proper weight coefficient in model aggregation instead of discarding any of them to reduce the aggregation error, i.e., amplitude alignment of the received local gradient parameters from different edge nodes is not required. We also analyze the performance of this weight-selection based framework and derive an upper bound on its performance loss, followed by minimizing the MSE via optimizing the weight coefficients of the edge nodes, their transmit equalization coefficients, the IRS phase shifts, and the receive factors of the cloud server. Furthermore, we use the MNIST dataset for simulations to evaluate the performance of both node-selection and weight-selection based FL frameworks.

Submitted 11 October, 2023; originally announced October 2023.

Comments: This paper has been accepted by IEEE Transactions on Wireless Communications

arXiv:2309.06681 [pdf]

A plug-and-play synthetic data deep learning for undersampled magnetic resonance image reconstruction

Authors: Min Xiao, Zi Wang, Jiefeng Guo, Xiaobo Qu

Abstract: Magnetic resonance imaging (MRI) plays an important role in modern medical diagnostics but suffers from prolonged scan time. Current deep learning methods for undersampled MRI reconstruction exhibit good performance in image de-aliasing which can be tailored to the specific k-space undersampling scenario. But it is very troublesome to configure different deep networks when the sampling setting changes. In this work, we propose a deep plug-and-play method for undersampled MRI reconstruction, which effectively adapts to different sampling settings. Specifically, the image de-aliasing prior is first learned by a deep denoiser trained to remove general white Gaussian noise from synthetic data. Then the learned deep denoiser is plugged into an iterative algorithm for image reconstruction. Results on in vivo data demonstrate that the proposed method provides nice and robust accelerated image reconstruction performance under different undersampling patterns and sampling rates, both visually and quantitatively.

Submitted 8 October, 2023; v1 submitted 12 September, 2023; originally announced September 2023.

Comments: 5 pages, 3 figures

arXiv:2308.04003 [pdf, ps, other]

doi 10.1109/LWC.2023.3336949

Low-complexity Resource Allocation for Uplink RSMA in Future 6G Wireless Networks

Authors: Jiewen Hu, Gang Liu, Zheng Ma, Ming Xiao, Pingzhi Fan

Abstract: Uplink rate-splitting multiple access (RSMA) requires optimization of decoding order and power allocation, while decoding order is a discrete variable, and it is very complex to find the optimal decoding order if the number of users is large enough. This letter proposes a low-complexity user pairing-based resource allocation algorithm with the objective of minimizing the maximum latency. Closed-form expressions for power and bandwidth allocation for a given latency are first derived. Then a bisection method is used to determine the minimum latency and optimal resource allocation. Finally, the proposed algorithm is compared with unpaired RSMA using an exhaustive method to obtain the optimal decoding order, unpaired RSMA using a suboptimal decoding order, paired non-orthogonal multiple access (NOMA) and unpaired NOMA. The results show that our proposed algorithm outperforms NOMA and achieves similar performance to unpaired RSMA. In addition, the complexity of the proposed algorithm is significantly reduced.

Submitted 27 November, 2023; v1 submitted 7 August, 2023; originally announced August 2023.

arXiv:2306.08309 [pdf, other]

doi 10.1109/TVCG.2023.3278691

Taming Reversible Halftoning via Predictive Luminance

Authors: Cheuk-Kit Lau, Menghan Xia, Tien-Tsin Wong

Abstract: Traditional halftoning usually drops colors when dithering images with binary dots, which makes it difficult to recover the original color information. We proposed a novel halftoning technique that converts a color image into a binary halftone with full restorability to its original version. Our novel base halftoning technique consists of two convolutional neural networks (CNNs) to produce the reversible halftone patterns, and a noise incentive block (NIB) to mitigate the flatness degradation issue of CNNs. Furthermore, to tackle the conflicts between the blue-noise quality and restoration accuracy in our novel base method, we proposed a predictor-embedded approach to offload predictable information from the network, which in our case is the luminance information resembling from the halftone pattern. Such an approach allows the network to gain more flexibility to produce halftones with better blue-noise quality without compromising the restoration quality. Detailed studies on the multiple-stage training method and loss weightings have been conducted. We have compared our predictor-embedded method and our novel method regarding spectrum analysis on halftone, halftone accuracy, restoration accuracy, and the data embedding studies. Our entropy evaluation evidences our halftone contains less encoding information than our novel base method. The experiments show our predictor-embedded method gains more flexibility to improve the blue-noise quality of halftones and maintains a comparable restoration quality with a higher tolerance for disturbances.

Submitted 7 February, 2024; v1 submitted 14 June, 2023; originally announced June 2023.

Comments: published in IEEE Transactions on Visualization and Computer Graphics

arXiv:2305.00383 [pdf, other]

Edge Learning for Large-Scale Internet of Things With Task-Oriented Efficient Communication

Authors: Haihui Xie, Minghua Xia, Peiran Wu, Shuai Wang, H. Vincent Poor

Abstract: In the Internet of Things (IoT) networks, edge learning for data-driven tasks provides intelligent applications and services. As the network size becomes large, different users may generate distinct datasets. Thus, to suit multiple edge learning tasks for large-scale IoT networks, this paper performs efficient communication under the task-oriented principle by using the collaborative design of wireless resource allocation and edge learning error prediction. In particular, we start with multi-user scheduling to alleviate co-channel interference in dense networks. Then, we perform optimal power allocation in parallel for different learning tasks. Thanks to the high parallelization of the designed algorithm, extensive experimental results corroborate that the multi-user scheduling and task-oriented power allocation improve the performance of distinct edge learning tasks efficiently compared with the state-of-the-art benchmark algorithms.

Submitted 30 April, 2023; originally announced May 2023.

Comments: 16 pages, 8 figures; accepted for publication in IEEE TWC

Showing 1–50 of 86 results for author: Xia, M