Search | arXiv e-print repository

Capturing Stable HDR Videos Using a Dual-Camera System

Authors: Qianyu Zhang, Bolun Zheng, Hangjia Pan, Lingyu Zhu, Zunjie Zhu, Zongpeng Li, Shiqi Wang

Abstract: In HDR video reconstruction, exposure fluctuations in reference images from alternating exposure methods often result in flickering. To address this issue, we propose a dual-camera system (DCS) for HDR video acquisition, where one camera is assigned to capture consistent reference sequences, while the other is assigned to capture non-reference sequences for information supplementation. To tackle the challenges posed by video data, we introduce an exposure-adaptive fusion network (EAFNet) to achieve more robust results. EAFNet introduces a pre-alignment subnetwork to explore the influence of exposure, selectively emphasizing the valuable features across different exposure levels. Then, the enhanced features are fused by the asymmetric cross-feature fusion subnetwork, which explores reference-dominated attention maps to improve image fusion by aligning cross-scale features and performing cross-feature fusion. Finally, the reconstruction subnetwork adopts a DWT-based multiscale architecture to reduce ghosting artifacts and refine features at different resolutions. Extensive experimental evaluations demonstrate that the proposed method achieves state-of-the-art performance on different datasets, validating the great potential of the DCS in HDR video reconstruction. The code and data captured by DCS will be available at https://github.com/zqqqyu/DCS.

Submitted 9 July, 2025; originally announced July 2025.
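
The abstract above mentions a DWT-based multiscale reconstruction stage but does not say which wavelet is used. As a rough illustration only, the sketch below assumes a single-level orthonormal Haar transform on a small grayscale grid; the function names `haar_dwt2` and `haar_idwt2` and the sub-band names are hypothetical and are not taken from the paper or its repository.

```python
def haar_dwt2(img):
    """One level of a 2D orthonormal Haar transform on an even-sized grid.

    Assumption: Haar is chosen purely for illustration; the abstract does
    not name the wavelet. `img` is a list of equal-length rows of numbers.
    """
    h, w = len(img), len(img[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must both be even")
    approx, horiz, vert, diag = [], [], [], []
    for r in range(0, h, 2):
        row_a, row_h, row_v, row_d = [], [], [], []
        for c in range(0, w, 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            row_a.append((a + b + d + e) / 2.0)  # low-pass in both directions
            row_h.append((a - b + d - e) / 2.0)  # difference between columns
            row_v.append((a + b - d - e) / 2.0)  # difference between rows
            row_d.append((a - b - d + e) / 2.0)  # diagonal difference
        approx.append(row_a)
        horiz.append(row_h)
        vert.append(row_v)
        diag.append(row_d)
    return approx, horiz, vert, diag


def haar_idwt2(approx, horiz, vert, diag):
    """Invert haar_dwt2 and return the full-resolution grid."""
    out = []
    for ra, rh, rv, rd in zip(approx, horiz, vert, diag):
        top, bottom = [], []
        for s, x, y, z in zip(ra, rh, rv, rd):
            top.extend([(s + x + y + z) / 2.0, (s - x + y - z) / 2.0])
            bottom.extend([(s + x - y - z) / 2.0, (s - x - y + z) / 2.0])
        out.append(top)
        out.append(bottom)
    return out


# Toy 4x4 frame (made-up values, not data from the paper).
frame = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
bands = haar_dwt2(frame)
restored = haar_idwt2(*bands)
```

Each sub-band is half the resolution of the input, so applying `haar_dwt2` again to the approximation band gives the next, coarser scale.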

arXiv:2506.20158 [pdf, ps, other]

Efficient Channel Estimation for Rotatable Antenna-Enabled Wireless Communication

Authors: Xue Xiong, Beixiong Zheng, Wen Wu, Xiaodan Shao, Liang Dai, Ming-Min Zhao, Jie Tang

Abstract: Non-fixed flexible antenna architectures, such as fluid antenna system (FAS), movable antenna (MA), and pinching antenna, have garnered significant interest in recent years. Among them, rotatable antenna (RA) is a promising antenna architecture that exploits additional spatial degrees of freedom (DoFs) to enhance the communication performance. To fully obtain the performance gain provided by RAs, accurate channel state information (CSI) is essential for adjusting the orientation/boresight of each antenna. In this letter, we propose an efficient channel estimation scheme for RA communication systems, where the base station (BS) can sequentially and adaptively adjust the orientations of RAs to enrich the environmental observations from diverse angular perspectives, thereby enhancing the channel estimation accuracy. The proposed scheme includes two main procedures that are conducted alternately during each channel training period. Specifically, the first procedure is to estimate the CSI with given RAs' orientations, involving the angle-of-arrival (AoA) information and path gains. Then, based on the estimated CSI, the second procedure adjusts the RAs' orientations to maximize the effective channel gain. Simulation results demonstrate that the proposed channel estimation method outperforms other benchmark schemes.

Submitted 29 June, 2025; v1 submitted 25 June, 2025; originally announced June 2025.

Comments: 5 pages, 4 figures

arXiv:2505.24820 [pdf, ps, other]

Masked Self-distilled Transducer-based Keyword Spotting with Semi-autoregressive Decoding

Authors: Yu Xi, Xiaoyu Gu, Haoyu Li, Jun Song, Bo Zheng, Kai Yu

Abstract: RNN-T-based keyword spotting (KWS) with autoregressive (AR) decoding has gained attention due to its streaming architecture and superior performance. However, the simplicity of the prediction network in RNN-T poses an overfitting issue, especially under challenging scenarios, resulting in degraded performance. In this paper, we propose a masked self-distillation (MSD) training strategy that keeps RNN-Ts from relying too heavily on prediction networks, thereby alleviating overfitting. Such training enables masked non-autoregressive (NAR) decoding, which fully masks the RNN-T predictor output during KWS decoding. In addition, we propose a semi-autoregressive (SAR) decoding approach to integrate the advantages of AR and NAR decoding. Our experiments across multiple KWS datasets demonstrate that MSD training effectively alleviates overfitting. The SAR decoding method preserves the superior performance of AR decoding while benefiting from the overfitting suppression of NAR decoding, achieving excellent results.

Submitted 30 May, 2025; originally announced May 2025.

arXiv:2505.24687 [pdf, ps, other]

TumorGen: Boundary-Aware Tumor-Mask Synthesis with Rectified Flow Matching

Authors: Shengyuan Liu, Wenting Chen, Boyun Zheng, Wentao Pan, Xiang Li, Yixuan Yuan

Abstract: Tumor data synthesis offers a promising solution to the shortage of annotated medical datasets. However, current approaches either limit tumor diversity by using predefined masks or employ computationally expensive two-stage processes with multiple denoising steps, causing computational inefficiency. Additionally, these methods typically rely on binary masks that fail to capture the gradual transitions characteristic of tumor boundaries. We present TumorGen, a novel Boundary-Aware Tumor-Mask Synthesis with Rectified Flow Matching for efficient 3D tumor synthesis with three key components: a Boundary-Aware Pseudo Mask Generation module that replaces strict binary masks with flexible bounding boxes; a Spatial-Constraint Vector Field Estimator that simultaneously synthesizes tumor latents and masks using rectified flow matching to ensure computational efficiency; and a VAE-guided mask refiner that enhances boundary realism. TumorGen significantly improves computational efficiency by requiring fewer sampling steps while maintaining pathological accuracy through coarse and fine-grained spatial constraints. Experimental results demonstrate TumorGen's superior performance over existing tumor synthesis methods in both efficiency and realism, offering a valuable contribution to AI-driven cancer diagnostics.

Submitted 30 May, 2025; originally announced May 2025.

Comments: 10 pages, 4 figures

arXiv:2505.07294 [pdf, other]

HuB: Learning Extreme Humanoid Balance

Authors: Tong Zhang, Boyuan Zheng, Ruiqian Nai, Yingdong Hu, Yen-Jen Wang, Geng Chen, Fanqi Lin, Jiongye Li, Chuye Hong, Koushil Sreenath, Yang Gao

Abstract: The human body demonstrates exceptional motor capabilities, such as standing steadily on one foot or performing a high kick with the leg raised over 1.5 meters, both requiring precise balance control. While recent research on humanoid control has leveraged reinforcement learning to track human motions for skill acquisition, applying this paradigm to balance-intensive tasks remains challenging. In this work, we identify three key obstacles: instability from reference motion errors, learning difficulties due to morphological mismatch, and the sim-to-real gap caused by sensor noise and unmodeled dynamics. To address these challenges, we propose HuB (Humanoid Balance), a unified framework that integrates reference motion refinement, balance-aware policy learning, and sim-to-real robustness training, with each component targeting a specific challenge. We validate our approach on the Unitree G1 humanoid robot across challenging quasi-static balance tasks, including extreme single-legged poses such as Swallow Balance and Bruce Lee's Kick. Our policy remains stable even under strong physical disturbances, such as a forceful soccer strike, while baseline methods consistently fail to complete these tasks. Project website: https://hub-robot.github.io

Submitted 12 May, 2025; originally announced May 2025.

Comments: Project website: https://hub-robot.github.io

arXiv:2504.19438 [pdf, other]

Dual Attention Driven Lumbar Magnetic Resonance Image Feature Enhancement and Automatic Diagnosis of Herniation

Authors: Lingrui Zhang, Liang Guo, Xiao An, Feng Lin, Binlong Zheng, Jiankun Wang, Zhirui Li

Abstract: Lumbar disc herniation (LDH) is a common musculoskeletal disease that requires magnetic resonance imaging (MRI) for effective clinical management. However, the interpretation of MRI images heavily relies on the expertise of radiologists, leading to delayed diagnosis and high costs for training physicians. Therefore, this paper proposes an innovative automated LDH classification framework. To address these key issues, the framework utilizes T1-weighted and T2-weighted MRI images from 205 people. The framework extracts clinically actionable LDH features and generates standardized diagnostic outputs by leveraging data augmentation and channel and spatial attention mechanisms. These outputs can help physicians make confident and time-effective care decisions when needed. The proposed framework achieves an area under the receiver operating characteristic curve (AUC-ROC) of 0.969 and an accuracy of 0.9486 for LDH detection. The experimental results demonstrate the performance of the proposed framework. Our framework only requires a small number of datasets for training to demonstrate high diagnostic accuracy. This is expected to be a solution to enhance the LDH detection capabilities of primary hospitals.

Submitted 27 April, 2025; originally announced April 2025.

Comments: 9 pages, 7 figures
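
The abstract above reports an AUC-ROC and an accuracy figure. For readers unfamiliar with how these two metrics are computed from classifier scores, here is a minimal sketch using the pairwise-ranking definition of AUC. The labels and scores are made up for illustration and are unrelated to the paper's data, and the 0.5 decision threshold is an assumption.

```python
def auc_roc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    correct = sum(int(s >= threshold) == y for y, s in zip(labels, scores))
    return correct / len(labels)


# Made-up example: 1 = herniation present, 0 = absent.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc_roc(toy_labels, toy_scores)
toy_acc = accuracy(toy_labels, toy_scores)
```

AUC does not depend on the threshold, whereas accuracy does, which is why the two numbers in an abstract can differ noticeably.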

arXiv:2504.10473 [pdf, other]

Rotatable Antenna-Enabled Secure Wireless Communication

Authors: Liang Dai, Beixiong Zheng, Qingjie Wu, Changsheng You, Robert Schober, Rui Zhang

Abstract: Rotatable antenna (RA) is a promising technology that exploits new spatial degree-of-freedom (DoF) to improve wireless communication and sensing performance. In this letter, we investigate an RA-enabled secure communication system where confidential information is transmitted from an RA-based access point (AP) to a single-antenna legitimate user in the presence of multiple eavesdroppers. We aim to maximize the secrecy rate by jointly optimizing the transmit beamforming and the deflection angles of all RAs at the AP. Accordingly, we propose an efficient alternating optimization (AO) algorithm to obtain a high-quality suboptimal solution in an iterative manner, where the generalized Rayleigh quotient-based beamforming is applied and the RAs' deflection angles are optimized by the successive convex approximation (SCA) technique. Our simulation results show that the proposed RA-enabled secure communication system achieves a significantly higher secrecy rate as compared to various benchmark schemes.

Submitted 30 April, 2025; v1 submitted 14 April, 2025; originally announced April 2025.
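
The abstract above refers to generalized Rayleigh quotient-based beamforming without giving the expression. The following is a textbook sketch for the simplified case of a single eavesdropper, which is an assumption made here for illustration; the letter itself considers multiple eavesdroppers and its exact formulation may differ. The symbols $\mathbf{h}$ (legitimate channel), $\mathbf{g}$ (eavesdropper channel), $\mathbf{w}$ (beamformer), $P$ (transmit power), and $\sigma^2$ (noise power) are likewise introduced here and are not taken from the source.

```latex
% Single-eavesdropper secrecy rate with \|\mathbf{w}\|^2 = P (illustrative assumption)
R_s(\mathbf{w}) = \log_2 \frac{1 + |\mathbf{h}^H \mathbf{w}|^2 / \sigma^2}
                              {1 + |\mathbf{g}^H \mathbf{w}|^2 / \sigma^2}
% Using \mathbf{w}^H \mathbf{w} / P = 1, the ratio becomes a generalized Rayleigh quotient:
\frac{\mathbf{w}^H \mathbf{A} \mathbf{w}}{\mathbf{w}^H \mathbf{B} \mathbf{w}},
\qquad
\mathbf{A} = \tfrac{1}{P}\mathbf{I} + \tfrac{1}{\sigma^2}\mathbf{h}\mathbf{h}^H,
\quad
\mathbf{B} = \tfrac{1}{P}\mathbf{I} + \tfrac{1}{\sigma^2}\mathbf{g}\mathbf{g}^H
% B is positive definite, so the maximizer is the principal generalized eigenvector:
\mathbf{w}^\star = \sqrt{P}\, \frac{\mathbf{v}_{\max}}{\|\mathbf{v}_{\max}\|},
\qquad
\mathbf{B}^{-1}\mathbf{A}\, \mathbf{v}_{\max} = \lambda_{\max} \mathbf{v}_{\max},
\qquad
R_s(\mathbf{w}^\star) = \log_2 \lambda_{\max}
```

Because the quotient is invariant to the scale of $\mathbf{w}$, the eigenvector can be found first and rescaled to meet the power constraint afterwards.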

arXiv:2503.20323 [pdf, other]

Derivation and analysis of power offset in fiber-longitudinal power profile estimation using pre-FEC hard-decision data

Authors: Du Tang, Yingjie Jiang, Ji Luo, Yu Chen, Bofang Zheng, Yaojun Qiao

Abstract: Utilizing the precise reference waveform regenerated by post-forward error correction (FEC) data, the fiber-longitudinal power profile estimation based on the minimum-mean-square-error method (MMSE-PPE) has been validated as an effective tool for absolute power monitoring. However, when post-FEC data is unavailable, it becomes necessary to rely on pre-FEC hard-decision data, which inevitably introduces hard-decision errors. These hard-decision errors will result in a power offset that undermines the accuracy of absolute power monitoring. In this paper, we present the first analytical expression for power offset in MMSE-PPE when using pre-FEC hard-decision data, achieved by introducing a virtual hard-decision nonlinear perturbation term. Based on this analytical expression, we also establish the first nonlinear relationship between the power offset and the symbol error rate (SER) of M-ary quadrature amplitude modulation (M-QAM) formats based on Gaussian assumptions. Verified in a numerical 130-GBaud single-wavelength coherent optical fiber transmission system, the correctness of the analytical expression of power offset has been confirmed with 4-QAM, 16-QAM, and 64-QAM formats under different SER situations. Furthermore, the nonlinear relationship between the power offset and SER of M-QAM formats has also been thoroughly validated under both linear scale (measured in mW) and logarithmic scale (measured in dB). These theoretical insights offer significant contributions to the design of potential power offset mitigation strategies in MMSE-PPE, thereby enhancing its real-time application.

Submitted 26 March, 2025; originally announced March 2025.

arXiv:2503.18240 [pdf, other]

A Tutorial on Six-Dimensional Movable Antenna for 6G Networks: Synergizing Positionable and Rotatable Antennas

Authors: Xiaodan Shao, Weidong Mei, Changsheng You, Qingqing Wu, Beixiong Zheng, Cheng-Xiang Wang, Junling Li, Rui Zhang, Robert Schober, Lipeng Zhu, Weihua Zhuang, Xuemin Shen

Abstract: Six-dimensional movable antenna (6DMA) is a new and revolutionary technique that fully exploits the wireless channel spatial variations at the transmitter/receiver by flexibly adjusting the three-dimensional (3D) positions and/or 3D rotations of antennas/antenna surfaces (sub-arrays), thereby improving the performance of wireless networks cost-effectively without the need to deploy additional antennas. It is thus expected that the integration of new 6DMAs into future sixth-generation (6G) wireless networks will fundamentally enhance antenna agility and adaptability, and introduce new degrees of freedom (DoFs) for system design. Despite its great potential, 6DMA faces new challenges to be efficiently implemented in wireless networks, including corresponding architectures, antenna position and rotation optimization, channel estimation, and system design from both communication and sensing perspectives. In this paper, we provide a tutorial on 6DMA-enhanced wireless networks to address the above issues by unveiling associated new channel models, hardware implementations and practical position/rotation constraints, as well as various appealing applications in wireless networks. Moreover, we discuss two special cases of 6DMA, namely, rotatable 6DMA with fixed antenna position and positionable 6DMA with fixed antenna rotation, and highlight their respective design challenges and applications. We further present prototypes developed for 6DMA-enhanced communication along with experimental results obtained with these prototypes. Finally, we outline promising directions for further investigation.

Submitted 7 May, 2025; v1 submitted 23 March, 2025; originally announced March 2025.

Comments: 46 pages, submitted to IEEE for publication

arXiv:2503.10472 [pdf, ps, other]

Rotatable Antennas for Integrated Sensing and Communications

Authors: Chao Zhou, Changsheng You, Beixiong Zheng, Xiaodan Shao, Rui Zhang

Abstract: In this letter, we propose to deploy rotatable antennas (RAs) at the base station (BS) to enhance both communication and sensing (C&S) performances, by exploiting a new spatial degree-of-freedom (DoF) offered by array rotation. Specifically, we formulate a multi-objective optimization problem to simultaneously maximize the sum-rate of multiple communication users and minimize the Cramér-Rao bound (CRB) for target angle estimation, by jointly optimizing the transmit beamforming vectors and the array rotation angle at the BS. To solve this problem, we first equivalently decompose it into two subproblems, corresponding to an inner problem for beamforming optimization and an outer problem for array rotation optimization. Although these two subproblems are non-convex, we obtain their high-quality solutions by applying the block coordinate descent (BCD) technique and one-dimensional exhaustive search, respectively. Moreover, we show that for the communication-only case, RAs provide an additional rotation gain to improve communication performance; while for the sensing-only case, the equivalent spatial aperture can be enlarged by RAs for achieving higher sensing accuracy. Finally, numerical results are presented to showcase the performance gains of RAs over fixed-rotation antennas in integrated sensing and communications (ISAC).

Submitted 13 March, 2025; originally announced March 2025.

Comments: This work is submitted to IEEE for possible publication
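
The abstract above solves its outer array-rotation problem by one-dimensional exhaustive search. The sketch below shows that generic pattern only: a uniform grid over the angle, with an arbitrary callable standing in for the inner beamforming solve. The function name `exhaustive_search`, the grid resolution, the angle range, and the toy objective are all assumptions for illustration and do not come from the letter.

```python
import math


def exhaustive_search(objective, lo, hi, num_points):
    """Evaluate `objective` on a uniform grid over [lo, hi] and keep the best.

    `objective` is any callable mapping an angle (radians) to a score to be
    maximized; in a two-level scheme it would wrap the inner optimization.
    """
    if num_points < 2:
        raise ValueError("num_points must be at least 2")
    best_x, best_val = None, -math.inf
    for k in range(num_points):
        x = lo + (hi - lo) * k / (num_points - 1)
        val = objective(x)
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val


def toy_objective(theta):
    """Placeholder score peaking at 0.3 rad (not the paper's utility)."""
    return math.cos(theta - 0.3)


# One-degree grid over a half-plane of rotation angles.
best_theta, best_score = exhaustive_search(
    toy_objective, -math.pi / 2, math.pi / 2, 181
)
```

The cost is one inner solve per grid point, so the grid resolution trades accuracy in the rotation angle against total run time.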

arXiv:2503.05205 [pdf, ps, other]

Intelligent Reflecting Surface-Aided Electromagnetic Stealth over Extended Regions

Authors: Qingjie Wu, Beixiong Zheng, Guangchi Zhang, Derrick Wing Kwan Ng, A. Lee Swindlehurst

Abstract: Compared to traditional electromagnetic stealth (ES) materials, which are effective only within specific frequencies and orientations, intelligent reflecting surface (IRS) technology introduces a novel paradigm for achieving dynamic and adaptive ES by adapting its reflection pattern in real time to neutralize radar probing signals echoed back from the target. In this letter, we study an IRS-aided ES system mounted on an aerial target to evade radar detection amidst uncertain/moving radar positions over an extended area. Specifically, we aim to optimize the IRS's passive reflection to minimize the maximum received signal-to-noise ratio (SNR) of the target echo signal in the area. A semi-closed-form solution is derived by first discretizing the continuous spatial frequency deviation to approximate the semi-infinite reflection gain constraint and then leveraging the Lagrange dual method. Simulation results are provided to validate that the proposed IRS-aided ES strategy can consistently reduce the reflection gains for radars located across a large region.

Submitted 7 March, 2025; originally announced March 2025.

Comments: 5 pages, 4 figures

arXiv:2503.00922 [pdf, ps, other]

Confidence Based Asynchronous Integrated Communication and Localization Networks Using Pulsed UWB Signals

Authors: Fan Liu, Bofeng Zheng, Tingting Zhang, Qinyu Zhang

Abstract: In recent years, UWB has garnered widespread attention in academia and industry due to its low power consumption, wide bandwidth, and high time resolution characteristics. This paper introduces the design of an asynchronous IR-UWB integrated communication and localization (ICL) downlink network, which employs unified waveforms to enable simultaneous data transmission and localization. A differential sequential detection strategy is proposed for data demodulation. To address errors caused by symbol misalignment, a novel symbol confidence metric model is introduced to ensure reliable pulse detection and time-of-arrival (TOA) estimation. Additionally, an asynchronous start-of-frame delimiter (SFD) detection model is constructed to guide parameter optimization for practical applications. Furthermore, the clock drift estimation is improved by leveraging the confidence metric within a modified weighted least squares (MWLS) framework. Simulation results demonstrate that the proposed system achieves reliable clock drift estimation, communication, and self-localization simultaneously. The operational range of the confidence metric required for these outcomes is also quantified, providing valuable insights for parameter design and system implementation. Finally, practical measurements with commercial UWB devices show that agent localization accuracy within 10 cm is achieved at over 90% confidence.

Submitted 2 March, 2025; originally announced March 2025.
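
The abstract above estimates clock drift with a modified weighted least squares (MWLS) framework driven by a confidence metric, but it does not describe the modification. The sketch below therefore shows only plain weighted least squares for a straight-line fit, where the slope plays the role of the drift and the weights stand in for per-measurement confidence. The function name `weighted_line_fit` and all sample values are hypothetical.

```python
def weighted_line_fit(t, y, w):
    """Closed-form weighted least squares fit of y ~ intercept + slope * t.

    Assumption: ordinary WLS, not the paper's MWLS. Each weight w[i] >= 0
    expresses how much the i-th measurement is trusted.
    """
    sw = sum(w)
    swt = sum(wi * ti for wi, ti in zip(w, t))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swtt = sum(wi * ti * ti for wi, ti in zip(w, t))
    swty = sum(wi * ti * yi for wi, ti, yi in zip(w, t, y))
    denom = sw * swtt - swt * swt
    if denom == 0:
        raise ValueError("need at least two distinct weighted time points")
    slope = (sw * swty - swt * swy) / denom
    intercept = (swy - slope * swt) / sw
    return intercept, slope


# Made-up timestamps: the third sample is an outlier with very low confidence.
times = [0.0, 1.0, 2.0, 3.0]
offsets = [1.0, 3.0, 50.0, 7.0]
confidence = [1.0, 1.0, 1e-9, 1.0]
offset0, drift = weighted_line_fit(times, offsets, confidence)
_, drift_unweighted = weighted_line_fit(times, offsets, [1.0] * 4)
```

With the low-confidence outlier down-weighted, the fitted slope stays close to the trend of the trusted samples, whereas the unweighted fit is pulled far off.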

arXiv:2503.00747 [pdf, other]

Unifying Light Field Perception with Field of Parallax

Authors: Fei Teng, Buyin Deng, Boyuan Zheng, Kai Luo, Kunyu Peng, Jiaming Zhang, Kailun Yang

Abstract: Field of Parallax (FoP), a spatial field that distills the common features from different LF representations to provide flexible and consistent support for multi-task learning. FoP is built upon three core features--projection difference, adjacency divergence, and contextual consistency--which are essential for cross-task adaptability. To implement FoP, we design a two-step angular adapter: the first step captures angular-specific differences, while the second step consolidates contextual consistency to ensure robust representation. Leveraging the FoP-based representation, we introduce the LFX framework, the first to handle arbitrary LF representations seamlessly, unifying LF multi-task vision. We evaluated LFX across three different tasks, achieving new state-of-the-art results, compared with previous task-specific architectures: 84.74% in mIoU for semantic segmentation on UrbanLF, 0.84% in AP for object detection on PKU, and 0.030 in MAE and 0.026 in MAE for salient object detection on Duftv2 and PKU, respectively. The source code will be made publicly available at https://github.com/warriordby/LFX.

Submitted 2 March, 2025; originally announced March 2025.

Comments: The source code will be made publicly available at https://github.com/warriordby/LFX

arXiv:2502.21036 [pdf, other]

A Demo of Radar Sensing Aided Rotatable Antenna for Wireless Communication System

Authors: Qi Dai, Beixiong Zheng, Qiyao Wang, Xue Xiong, Xiaodan Shao, Lipeng Zhu, Rui Zhang

Abstract: Rotatable antenna (RA) represents a novel antenna architecture that enhances wireless communication system performance by independently or collectively adjusting each antenna's boresight/orientation. In this demonstration, we develop a prototype of radar sensing-aided rotatable antenna that integrates radar sensing with dynamic antenna orientation to enhance wireless communication performance while maintaining low hardware costs. The proposed prototype consists of a transmitter (TX) module and a receiver (RX) module, both of which employ universal software radio peripherals (USRPs) for transmitting and receiving signals. Specifically, the TX utilizes a laser radar to detect the RX's location and conveys the angle of arrival (AoA) information to its antenna servo, which enables the RA to align its boresight direction with the identified RX. Experimental results examine the effectiveness of the proposed prototype and indicate that the RA significantly outperforms the traditional fixed-antenna system in terms of increasing received signal-to-noise ratio (SNR).

Submitted 17 April, 2025; v1 submitted 28 February, 2025; originally announced February 2025.

arXiv:2502.20674 [pdf, other]

doi 10.1109/TVT.2025.3545644

Linear Model of RIS-Aided High-Mobility Communication System

Authors: Shuaijun Li, Jie Tang, Beixiong Zheng, Xiaokai Song, Guixin Pan, Kai-Kit Wong

Abstract: Reconfigurable intelligent surface (RIS)-aided vehicle-to-everything (V2X) communication has emerged as a crucial solution for providing reliable data services to vehicles on the road. However, in delay-sensitive or high-mobility communications, the rapid movement of vehicles can lead to random scattering in the environment and time-selective fading in the channel. In view of this, we investigate in this paper an innovative linear model with low-complexity transmitter signal design and receiver detection methods, which boost stability in fast-fading environments and reduce channel training overhead. Specifically, considering the differences in hardware design and signal processing at the receiving end between uplink and downlink communication systems, distinct solutions are proposed. Accordingly, we first integrate the Rician channel introduced by the RIS with the corresponding signal processing algorithms to model the RIS-aided downlink communication system as a Doppler-robust linear model. Inspired by this property, we design a precoding scheme based on the linear model to reduce the complexity of precoding. Then, by leveraging the linear model and the large-scale antenna array at the base station (BS) side, we improve the linear model for the uplink communication system and derive its asymptotic performance in closed-form. Simulation results demonstrate the performance advantages of the proposed RIS-aided high-mobility communication system compared to other benchmark schemes.

Submitted 27 February, 2025; originally announced February 2025.

Comments: Accepted in IEEE Transactions on Vehicular Technology (Early Access)

Journal ref: IEEE Transactions on Vehicular Technology, 25 February 2025

arXiv:2502.17097 [pdf, other]

Rotatable Antenna Enabled Wireless Communication System with Visual Recognition: A Prototype Implementation

Authors: Liang Dai, Beixiong Zheng, Yanhua Tan, Lipeng Zhu, Fangjiong Chen, Rui Zhang

Abstract: Rotatable antenna (RA) is an emerging technology that has great potential to exploit additional spatial degrees of freedom (DoFs) by flexibly altering the three-dimensional (3D) orientation/boresight of each antenna. In this demonstration, we present a prototype of the RA-enabled wireless communication system with a visual recognition module to evaluate the performance gains provided by the RA in practical environments. In particular, a mechanically-driven RA is developed by integrating a digital servo motor, a directional antenna, and a microcontroller, which enables the dynamic adjustment of the RA orientation. Moreover, the orientation adjustment of the RA is guided by the user's direction information provided by the visual recognition module, thereby significantly enhancing system response speed and self-orientation accuracy. The experimental results demonstrate that the RA-enabled communication system achieves significant improvement in communication coverage performance compared to the conventional fixed antenna system.

Submitted 23 March, 2025; v1 submitted 24 February, 2025; originally announced February 2025.

arXiv:2502.16864 [pdf, other]

Joint Size and Placement Optimization for IRS-Aided Communications with Active and Passive Elements

Authors: Qiaoyan Peng, Qingqing Wu, Wen Chen, Chaoying Huang, Beixiong Zheng, Shaodan Ma, Mengnan Jian, Yijian Chen, Jun Yang

Abstract: Different types of intelligent reflecting surfaces (IRS) are exploited for assisting wireless communications. The joint use of passive IRS (PIRS) and active IRS (AIRS) emerges as a promising solution owing to their complementary advantages. They can be integrated into a single hybrid active-passive IRS (HIRS) or deployed in a distributed manner, which poses challenges in determining the IRS element allocation and placement for rate maximization. In this paper, we investigate the capacity of an IRS-aided wireless communication system with both active and passive elements. Specifically, we consider three deployment schemes: 1) base station (BS)-HIRS-user (BHU); 2) BS-AIRS-PIRS-user (BAPU); 3) BS-PIRS-AIRS-user (BPAU). Under the line-of-sight channel model, we formulate a rate maximization problem via a joint optimization of the IRS element allocation and placement. We first derive the optimized number of active and passive elements for BHU, BAPU, and BPAU schemes, respectively. Then, low-complexity HIRS/AIRS placement strategies are provided. To obtain more insights, we characterize the system capacity scaling orders for the three schemes with respect to the large total number of IRS elements, amplification power budget, and BS transmit power. Finally, simulation results are presented to validate our theoretical findings and show the performance difference among the BHU, BAPU, and BPAU schemes with the proposed joint design under various system setups.

Submitted 24 February, 2025; originally announced February 2025.

arXiv:2502.04399 [pdf, other]

Online Location Planning for AI-Defined Vehicles: Optimizing Joint Tasks of Order Serving and Spatio-Temporal Heterogeneous Model Fine-Tuning

Authors: Bokeng Zheng, Bo Rao, Tianxiang Zhu, Chee Wei Tan, Jingpu Duan, Zhi Zhou, Xu Chen, Xiaoxi Zhang

Abstract: Advances in artificial intelligence (AI), including foundation models (FMs), are increasingly transforming human society, with smart cities driving the evolution of urban living. Meanwhile, vehicle crowdsensing (VCS) has emerged as a key enabler, leveraging vehicles' mobility and sensor-equipped capabilities. In particular, ride-hailing vehicles can effectively facilitate flexible data collection and contribute towards urban intelligence, despite resource limitations. Therefore, this work explores a promising scenario, where edge-assisted vehicles perform joint tasks of order serving and the emerging foundation model fine-tuning using various urban data. However, integrating the VCS AI task with the conventional order serving task is challenging, due to their inconsistent spatio-temporal characteristics: (i) the distributions of ride orders and data points of interest (PoIs) may not coincide in geography, both following a priori unknown patterns; (ii) they have distinct forms of temporal effects, i.e., prolonged waiting makes orders become instantly invalid while data with increased staleness gradually reduces its utility for model fine-tuning. To overcome these obstacles, we propose an online framework based on multi-agent reinforcement learning (MARL) with careful augmentation. A new quality-of-service (QoS) metric is designed to characterize and balance the utility of the two joint tasks, under the effects of varying data volumes and staleness. We also integrate graph neural networks (GNNs) with MARL to enhance state representations, capturing graph-structured, time-varying dependencies among vehicles and across locations. Extensive experiments on our testbed simulator, utilizing various real-world foundation model fine-tuning tasks and the New York City Taxi ride order dataset, demonstrate the advantage of our proposed method.

Submitted 6 February, 2025; originally announced February 2025.

arXiv:2501.03727 [pdf, ps, other]

Detecting Neurocognitive Disorders through Analyses of Topic Evolution and Cross-modal Consistency in Visual-Stimulated Narratives

Authors: Jinchao Li, Yuejiao Wang, Junan Li, Jiawen Kang, Bo Zheng, Simon Wong, Brian Mak, Helene Fung, Jean Woo, Man-Wai Mak, Timothy Kwok, Vincent Mok, Xianmin Gong, Xixin Wu, Xunying Liu, Patrick Wong, Helen Meng

Abstract: Early detection of neurocognitive disorders (NCDs) is crucial for timely intervention and disease management. Given that language impairments manifest early in NCD progression, visual-stimulated narrative (VSN)-based analysis offers a promising avenue for NCD detection. Current VSN-based NCD detection methods primarily focus on linguistic microstructures (e.g., pauses, lexical diversity), which are potentially linked to bottom-up (stimulus-driven) cognitive processing. While these features illuminate basic language abilities, the higher-order linguistic macrostructures (e.g., thematic or logical development), which may reflect top-down (concept-driven) cognitive abilities, remain underexplored. These patterns are crucial for NCD detection yet challenging to quantify due to their abstract and complex nature. To bridge this gap, we propose two novel dynamic macrostructural approaches: (1) Dynamic Topic Model (DTM) to track topic evolution over time, and (2) Text-Image Temporal Alignment Network (TITAN) to measure cross-modal consistency between speech and visual stimuli. Experimental results validated the efficiency of proposed approaches in NCD detection, with TITAN achieving superior performance both on the CU-MARVEL-RABBIT corpus (F1 = 0.7238) and the ADReSS corpus (F1 = 0.8889). The feature contribution analysis revealed that macrostructural features (e.g., topic variability, topic change rate, and topic consistency) constituted the most significant contributors in the model's decision pathways, outperforming investigated microstructural features. These findings underscore the critical role of macrostructural patterns in understanding cognitive impairment mechanisms in NCDs.

Submitted 18 June, 2025; v1 submitted 7 January, 2025; originally announced January 2025.

Comments: 13 pages, 7 figures, submitted to JSTSP

arXiv:2501.02595 [pdf, ps, other]

Rotatable Antenna Enabled Wireless Communication: Modeling and Optimization

Authors: Beixiong Zheng, Qingjie Wu, Tiantian Ma, Rui Zhang

Abstract: Non-fixed flexible antenna architectures, such as fluid antenna system (FAS), movable antenna (MA), and pinching antenna, have garnered significant interest in recent years. In this paper, we propose a new rotatable antenna (RA) model to improve the performance of wireless communication systems. Different from conventional fixed antennas, the proposed RA system can flexibly and independently alter the three-dimensional (3D) boresight direction of each antenna to achieve a desired array directional gain pattern. Specifically, we investigate an RA-enabled uplink communication system, where the receive beamforming and the boresight directions of all RAs at the base station (BS) are jointly optimized to maximize the minimum signal-to-interference-plus-noise ratio (SINR) among all the users. In the special single-user and free-space propagation setup, the optimal boresight directions of RAs are derived in closed form with the maximum-ratio combining (MRC) beamformer applied at the BS. Moreover, we analyze the asymptotic performance with an infinite number of antennas based on this solution, which theoretically proves that the RA system can achieve a higher array gain than the fixed-antenna system. In the general multi-user and multipath channel setup, we first propose an alternating optimization (AO) algorithm to alternately optimize the receive beamforming and the boresight directions of RAs in an iterative manner. Then, a two-stage algorithm that solves the formulated problem without the need for iteration is proposed to further reduce computational complexity. Simulation results are provided to validate our analytical results and demonstrate that the proposed RA system can significantly improve the communication performance as compared to other benchmark schemes.

Submitted 24 June, 2025; v1 submitted 5 January, 2025; originally announced January 2025.

Comments: 16 pages, 11 figures

arXiv:2412.04720 [pdf, other]

Passive Six-Dimensional Movable Antenna (6DMA)-Assisted Multiuser Communication

Authors: Haozhe Wang, Xiaodan Shao, Beixiong Zheng, Xiaoming Shi, Rui Zhang

Abstract: Six-dimensional movable antenna (6DMA) is a promising solution for enhancing wireless network capacity through the adjustment of both three-dimensional (3D) positions and 3D rotations of distributed antenna surfaces. Previous works mainly consider 6DMA surfaces composed of active antenna elements, thus termed as active 6DMA. In this letter, we propose a new passive 6DMA system consisting of distributed passive intelligent reflecting surfaces (IRSs) that can be adjusted in terms of 3D position and 3D rotation. Specifically, we study a passive 6DMA-aided multiuser uplink system and aim to maximize the users' achievable sum rate by jointly optimizing the 3D positions, 3D rotations, and reflection coefficients of all passive 6DMA surfaces, as well as the receive beamforming matrix at the base station (BS). To solve this challenging non-convex optimization problem, we propose an alternating optimization (AO) algorithm that decomposes it into three subproblems and solves them alternately in an iterative manner. Numerical results are presented to investigate the performance of the proposed passive 6DMA system under different configurations and demonstrate its superior performance over the traditional fixed-IRS counterpart for both directive and isotropic radiation patterns of passive reflecting elements.

Submitted 5 December, 2024; originally announced December 2024.

arXiv:2412.01270 [pdf, other]

6DMA-Aided Cell-Free Massive MIMO Communication

Authors: Xiaoming Shi, Xiaodan Shao, Beixiong Zheng, Rui Zhang

Abstract: In this letter, we propose a six-dimensional movable antenna (6DMA)-aided cell-free massive multiple-input multiple-output (MIMO) system to fully exploit its macro spatial diversity, where a set of distributed access points (APs), each equipped with multiple 6DMA surfaces, cooperatively serve all users in a given area. Connected to a central processing unit (CPU) via fronthaul links, 6DMA-APs can optimize their combining vectors for decoding the users' information based on either local channel state information (CSI) or global CSI shared among them. We aim to maximize the average achievable sum-rate via jointly optimizing the rotation angles of all 6DMA surfaces at all APs, based on the users' spatial distribution. Since the formulated problem is non-convex and highly non-linear, we propose a Bayesian optimization-based algorithm to solve it efficiently. Simulation results show that, by enhancing signal power and mitigating interference through reduced channel cross-correlation among users, 6DMA-APs with optimized rotations can significantly improve the average sum-rate, as compared to the conventional cell-free network with fixed-position antennas and that with only a single centralized AP with optimally rotated 6DMAs, especially when the user distribution is more spatially diverse.

Submitted 2 December, 2024; originally announced December 2024.

arXiv:2411.08411 [pdf, other]

Modeling and Optimization for Rotatable Antenna Enabled Wireless Communication

Authors: Qingjie Wu, Beixiong Zheng, Tiantian Ma, Rui Zhang

Abstract: Fluid antenna system (FAS)/movable antenna (MA) has emerged as a promising technology to fully exploit the spatial degrees of freedom (DoFs). In this paper, we propose a new rotatable antenna (RA) model, as a simplified implementation of six-dimensional movable antenna (6DMA), to improve the performance of wireless communication systems. Different from conventional fixed antennas, the proposed RA system can independently and flexibly change the three-dimensional (3D) orientation/boresight of each antenna by adjusting its deflection angles to achieve desired channel realizations. Specifically, we study an RA-enabled uplink communication system, where the receive beamforming and the deflection angles of all RAs are jointly optimized to maximize the minimum signal-to-interference-plus-noise ratio (SINR) among all the users. In the special single-user and free-space propagation setup, the optimal deflection angles are derived in closed form with the maximum-ratio combining (MRC) beamformer applied at the base station (BS). In the general multi-user and multi-path setup, we propose an alternating optimization (AO) algorithm to alternately optimize the receive beamforming and the deflection angles in an iterative manner. Simulation results are provided to demonstrate that the proposed RA-enabled system can significantly outperform other benchmark schemes.

Submitted 26 February, 2025; v1 submitted 13 November, 2024; originally announced November 2024.

Comments: 7 pages, 6 figures

arXiv:2409.14088 [pdf, ps, other]

Intelligent Reflecting Surface-Aided Multiuser Communication: Co-design of Transmit Diversity and Active/Passive Precoding

Authors: Beixiong Zheng, Tiantian Ma, Jie Tang, Changsheng You, Shaoe Lin, Kai-Kit Wong

Abstract: Intelligent reflecting surface (IRS) has become a cost-effective solution for constructing a smart and adaptive radio environment. Most previous works on IRS have jointly designed the active and passive precoding based on perfectly or partially known channel state information (CSI). However, in delay-sensitive or high-mobility communications, it is imperative to explore more effective methods for leveraging IRS to enhance communication reliability without the need for any CSI. In this paper, we investigate an innovative IRS-aided multiuser communication system, which integrates an IRS with its aided multi-antenna base station (BS) to simultaneously serve multiple high-mobility users through transmit diversity and multiple low-mobility users through active/passive precoding. Specifically, we first reveal that when dynamically tuning the IRS's common phase-shift shared by all reflecting elements, its passive precoding gain to any low-mobility user remains unchanged. Inspired by this property, we utilize the design of common phase-shift at the IRS for achieving transmit diversity to serve high-mobility users, yet without requiring any CSI at the BS. Meanwhile, the active/passive precoding design is incorporated into the IRS-integrated BS to serve low-mobility users (assuming the CSI is known). Then, taking into account the interference among different users, we formulate and solve a joint optimization problem of the IRS's reflect precoding and the BS's transmit precoding, with the aim of minimizing the total transmit power at the BS.

Submitted 21 September, 2024; originally announced September 2024.

Comments: 13 pages, 9 figures, Early Access in IEEE TWC

arXiv:2408.17014 [pdf, ps, other]

Channel Estimation for XL-IRS Assisted Wireless Systems with Double-sided Visibility Regions

Authors: Chao Zhou, Changsheng You, Shiqi Gong, Bin Lyu, Beixiong Zheng, Yi Gong

Abstract: In this paper, we study efficient channel estimation design for an extremely large-scale intelligent reflecting surface (XL-IRS) assisted multi-user communication system, where both the base station (BS) and users are located in the near-field region of the XL-IRS. Two unique channel characteristics of XL-IRS are considered, namely, the near-field spherical wavefronts and double-sided visibility regions (VRs) at the BS and users, which render the channel estimation for XL-IRS highly challenging. To address this issue, we propose in this paper an efficient three-step XL-IRS channel estimation method. Specifically, in the first step, an anchor node is delicately deployed near the XL-IRS to estimate the cascaded BS-IRS-anchor channel. Then, an efficient VR detection method is devised to estimate the VR information between the BS and XL-IRS. In this way, only the channels from the visible XL-IRS elements to the BS are estimated, thereby reducing the dimension of the cascaded BS-IRS-users channels to be estimated. Third, by leveraging the common BS-IRS channel, the cascaded channels for all users are consecutively estimated accounting for the VRs of the IRS-user channels. Finally, numerical results are provided to demonstrate the effectiveness of our proposed channel estimation scheme as compared to various benchmark schemes.

Submitted 30 August, 2024; originally announced August 2024.

Comments: 6 pages, 5 figures

arXiv:2407.03655 [pdf, other]

Pathological Semantics-Preserving Learning for H&E-to-IHC Virtual Staining

Authors: Fuqiang Chen, Ranran Zhang, Boyun Zheng, Yiwen Sun, Jiahui He, Wenjian Qin

Abstract: Conventional hematoxylin-eosin (H&E) staining is limited to revealing cell morphology and distribution, whereas immunohistochemical (IHC) staining provides precise and specific visualization of protein activation at the molecular level. Virtual staining technology has emerged as a solution for highly efficient IHC examination, which directly transforms H&E-stained images to IHC-stained images. However, virtual staining is challenged by the insufficient mining of pathological semantics and the spatial misalignment of pathological semantics. To address these issues, we propose the Pathological Semantics-Preserving Learning method for Virtual Staining (PSPStain), which directly incorporates the molecular-level semantic information and enhances semantics interaction despite any spatial inconsistency. Specifically, PSPStain comprises two novel learning strategies: 1) Protein-Aware Learning Strategy (PALS) with Focal Optical Density (FOD) map maintains the coherence of protein expression level, which represents molecular-level semantic information; 2) Prototype-Consistent Learning Strategy (PCLS), which enhances cross-image semantic interaction by prototypical consistency learning. We evaluate PSPStain on two public datasets using five metrics: three clinically relevant metrics and two for image quality. Extensive experiments indicate that PSPStain outperforms current state-of-the-art H&E-to-IHC virtual staining methods and demonstrates a high pathological correlation between the staging of real and virtual stains.

Submitted 28 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

Comments: accepted by MICCAI2024

arXiv:2405.06951 [pdf, ps, other]

Intelligent Reflecting Surface-Aided Radar Spoofing

Authors: Haozhe Wang, Beixiong Zheng, Xiaodan Shao, Rui Zhang

Abstract: Electronic countermeasure (ECM) technology plays a critical role in modern electronic warfare, which can interfere with enemy radar detection systems by noise or deceptive signals. However, the conventional active jamming strategy incurs additional hardware and power costs and has the potential threat of exposing the target itself. To tackle the above challenges, we propose a new intelligent reflecting surface (IRS)-aided radar spoofing strategy in this letter, where IRS is deployed on the surface of a target to help eliminate the signals reflected towards the hostile radar to shield the target, while simultaneously redirecting its reflected signal towards a surrounding clutter to generate deceptive angle-of-arrival (AoA) sensing information for the radar. We optimize the IRS's reflection to maximize the received signal power at the radar from the direction of the selected clutter subject to the constraint that its received power from the direction of the target is lower than a given detection threshold. We first solve this non-convex optimization problem using the semidefinite relaxation (SDR) method and further propose a lower-complexity solution for real-time implementation. Simulation results validate the efficacy of our proposed IRS-aided spoofing system as compared to various benchmark schemes.

Submitted 11 May, 2024; originally announced May 2024.

Comments: 5 pages, 4 figures

arXiv:2404.15946 [pdf]

Mammo-CLIP: Leveraging Contrastive Language-Image Pre-training (CLIP) for Enhanced Breast Cancer Diagnosis with Multi-view Mammography

Authors: Xuxin Chen, Yuheng Li, Mingzhe Hu, Ella Salari, Xiaoqian Chen, Richard L. J. Qiu, Bin Zheng, Xiaofeng Yang

Abstract: Although fusion of information from multiple views of mammograms plays an important role in increasing the accuracy of breast cancer detection, developing multi-view mammograms-based computer-aided diagnosis (CAD) schemes still faces challenges and no such CAD schemes have been used in clinical practice. To overcome the challenges, we investigate a new approach based on Contrastive Language-Image Pre-training (CLIP), which has sparked interest across various medical imaging tasks. By solving the challenges in (1) effectively adapting the single-view CLIP for multi-view feature fusion and (2) efficiently fine-tuning this parameter-dense model with limited samples and computational resources, we introduce Mammo-CLIP, the first multi-modal framework to process multi-view mammograms and corresponding simple texts. Mammo-CLIP uses an early feature fusion strategy to learn multi-view relationships in four mammograms acquired from the CC and MLO views of the left and right breasts. To enhance learning efficiency, plug-and-play adapters are added into CLIP image and text encoders for fine-tuning parameters and limiting updates to about 1% of the parameters. For framework evaluation, we assembled two datasets retrospectively. The first dataset, comprising 470 malignant and 479 benign cases, was used for few-shot fine-tuning and internal evaluation of the proposed Mammo-CLIP via 5-fold cross-validation. The second dataset, including 60 malignant and 294 benign cases, was used to test generalizability of Mammo-CLIP. Study results show that Mammo-CLIP outperforms the state-of-the-art cross-view transformer in AUC (0.841 vs. 0.817, 0.837 vs. 0.807) on both datasets. It also surpasses two previous CLIP-based methods by 20.3% and 14.3%. This study highlights the potential of applying the finetuned vision-language models for developing next-generation, image-text-based CAD schemes of breast cancer.

Submitted 24 April, 2024; originally announced April 2024.

arXiv:2404.08366 [pdf, other]

Intelligent Reflecting Surface-Enabled Anti-Detection for Secure Sensing and Communications

Authors: Beixiong Zheng, Xue Xiong, Tiantian Ma, Jie Tang, Derrick Wing Kwan Ng, A. Lee Swindlehurst, Rui Zhang

Abstract: The ever-increasing reliance on wireless communication and sensing has led to growing concerns over the vulnerability of sensitive information to unauthorized detection and interception. Traditional anti-detection methods are often inadequate, suffering from limited adaptability and diminished effectiveness against advanced detection technologies. To overcome these challenges, this article presents the intelligent reflecting surface (IRS) as a groundbreaking technology for enabling flexible electromagnetic manipulation, which has the potential to revolutionize anti-detection in both electromagnetic stealth/spoofing (evading radar detection) and covert communications (facilitating secure information exchange). We explore the fundamental principles of IRS and its advantages over traditional anti-detection techniques and discuss various design challenges associated with implementing IRS-based anti-detection systems. Through the examination of case studies and future research directions, we provide a comprehensive overview of the potential of IRS technology to serve as a formidable shield in the modern wireless landscape.

Submitted 21 April, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

Comments: 7 pages, 5 figures

arXiv:2404.06830 [pdf, ps, other]

EMF Exposure Mitigation via MAC Scheduling

Authors: Silvio Mandelli, Lorenzo Maggi, Bill Zheng, Christophe Grangeat, Azra Zejnilagic

Abstract: International standards bodies define Electromagnetic field (EMF) emission requirements that can be translated into control of the base station actual Effective Isotropic Radiated Power (EIRP), i.e., averaged over a sliding time window. In this work we show how to comply with such requirements by designing a water-filling power allocation method operating at the MAC scheduler level. Our method ensures throughput fairness across users while constraining the EIRP to a value that is produced by an outer-loop procedure which is not the focus of our paper. The low computational complexity of our technique is appealing given the tight computational requirements of the MAC scheduler. Our proposal is evaluated against the prior art approaches through massive-MIMO system level simulations that include realistic modeling of physical and MAC level cellular procedures. We conclude that our proposal effectively mitigates EMF exposure with considerably less impact on network performance, making it a standout candidate for 5G and future 6G MAC scheduler implementations.

Submitted 19 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

Comments: 5 pages, 3 figures. This work has been submitted to the IEEE for possible publication

arXiv:2403.17627 [pdf, other]

Waveform Design for Joint Communication and SAR Imaging Under Random Signaling

Authors: Bowen Zheng, Fan Liu

Abstract: Conventional synthetic aperture radar (SAR) imaging systems typically employ deterministic signal designs, which lack the capability to convey communication information and are thus not suitable for integrated sensing and communication (ISAC) scenarios. In this letter, we propose a joint communication and SAR imaging (JCASAR) system based on orthogonal frequency-division multiplexing (OFDM) signal with cyclic prefix (CP), which is capable of reconstructing the target profile while serving a communication user. In contrast to traditional matched filters, we propose a least squares (LS) estimator for range profiling. The SAR image is then obtained through range cell migration correction (RCMC) and azimuth processing. By minimizing the mean squared error (MSE) of the proposed LS estimator, we investigate the optimal waveform design for SAR imaging, and JCASAR under random signaling, where power allocation strategies are conceived for Gaussian-distributed ISAC signals, in an effort to strike a flexible performance tradeoff between the communication and SAR imaging tasks. Numerical results are provided to validate the effectiveness of the proposed ISAC waveform design for JCASAR systems.

Submitted 26 March, 2024; originally announced March 2024.

Comments: 5 pages

arXiv:2403.12352 [pdf, other]

A New Intelligent Reflecting Surface-Aided Electromagnetic Stealth Strategy

Authors: Xue Xiong, Beixiong Zheng, A. Lee Swindlehurst, Jie Tang, Wen Wu

Abstract: Electromagnetic wave absorbing material (EWAM) plays an essential role in manufacturing stealth aircraft, which can achieve the electromagnetic stealth (ES) by reducing the strength of the signal reflected back to the radar system. However, the stealth performance is limited by the coating thickness, incident wave angles, and working frequencies. To tackle these limitations, we propose a new intelligent reflecting surface (IRS)-aided ES system where an IRS is deployed at the target to synergize with EWAM for effectively mitigating the echo signal and thus reducing the radar detection probability. Considering the monotonic relationship between the detection probability and the received signal-to-noise-ratio (SNR) at the radar, we formulate an optimization problem that minimizes the SNR under the reflection constraint of each IRS element, and a semi-closed-form solution is derived by using Karush-Kuhn-Tucker (KKT) conditions. Simulation results validate the superiority of the proposed IRS-aided ES system compared to various benchmarks.

Submitted 18 March, 2024; originally announced March 2024.

Comments: 5 pages, 4 figures

arXiv:2403.11556 [pdf, other]

Hierarchical Frequency-based Upsampling and Refining for Compressed Video Quality Enhancement

Authors: Qianyu Zhang, Bolun Zheng, Xinying Chen, Quan Chen, Zhunjie Zhu, Canjin Wang, Zongpeng Li, Chengang Yan

Abstract: Video compression artifacts arise due to the quantization operation in the frequency domain. The goal of video quality enhancement is to reduce compression artifacts and reconstruct a visually pleasant result. In this work, we propose a hierarchical frequency-based upsampling and refining neural network (HFUR) for compressed video quality enhancement. HFUR consists of two modules: implicit frequency upsampling module (ImpFreqUp) and hierarchical and iterative refinement module (HIR). ImpFreqUp exploits DCT-domain prior derived through implicit DCT transform, and accurately reconstructs the DCT-domain loss via a coarse-to-fine transfer. Consequently, HIR is introduced to facilitate cross-collaboration and information compensation between the scales, thereby further refining the feature maps and promoting the visual quality of the final output. We demonstrate the effectiveness of the proposed modules via ablation experiments and visualized results. Extensive experiments on public benchmarks show that HFUR achieves state-of-the-art performance for both constant bit rate and constant QP modes.

Submitted 18 March, 2024; originally announced March 2024.

arXiv:2401.02678 [pdf, other]

MusicAOG: an Energy-Based Model for Learning and Sampling a Hierarchical Representation of Symbolic Music

Authors: Yikai Qian, Tianle Wang, Xinyi Tong, Xin Jin, Duo Xu, Bo Zheng, Tiezheng Ge, Feng Yu, Song-Chun Zhu

Abstract: In addressing the challenge of interpretability and generalizability of artificial music intelligence, this paper introduces a novel symbolic representation that amalgamates both explicit and implicit musical information across diverse traditions and granularities. Utilizing a hierarchical and-or graph representation, the model employs nodes and edges to encapsulate a broad spectrum of musical elements, including structures, textures, rhythms, and harmonies. This hierarchical approach expands the representability across various scales of music. This representation serves as the foundation for an energy-based model, uniquely tailored to learn musical concepts through a flexible algorithm framework relying on the minimax entropy principle. Utilizing an adapted Metropolis-Hastings sampling technique, the model enables fine-grained control over music generation. A comprehensive empirical evaluation, contrasting this novel approach with existing methodologies, manifests considerable advancements in interpretability and controllability. This study marks a substantial contribution to the fields of music analysis, composition, and computational musicology.

Submitted 5 January, 2024; originally announced January 2024.

arXiv:2312.16918 [pdf, other]

Intelligent Surfaces Empowered Wireless Network: Recent Advances and The Road to 6G

Authors: Qingqing Wu, Beixiong Zheng, Changsheng You, Lipeng Zhu, Kaiming Shen, Xiaodan Shao, Weidong Mei, Boya Di, Hongliang Zhang, Ertugrul Basar, Lingyang Song, Marco Di Renzo, Zhi-Quan Luo, Rui Zhang

Abstract: Intelligent surfaces (ISs) have emerged as a key technology to empower a wide range of appealing applications for wireless networks, due to their low cost, high energy efficiency, flexibility of deployment and capability of constructing favorable wireless channels/radio environments. Moreover, the recent advent of several new IS architectures further expanded their electromagnetic functionalities from passive reflection to active amplification, simultaneous reflection and refraction, as well as holographic beamforming. However, the research on ISs is still in rapid progress and there have been recent technological advances in ISs and their emerging applications that are worthy of a timely review. Thus, we provide in this paper a comprehensive survey on the recent development and advances of ISs aided wireless networks. Specifically, we start with an overview on the anticipated use cases of ISs in future wireless networks such as 6G, followed by a summary of the recent standardization activities related to ISs. Then, the main design issues of the commonly adopted reflection-based IS and their state-of-the-art solutions are presented in detail, including reflection optimization, deployment, signal modulation, wireless sensing, and integrated sensing and communications. Finally, recent progress and new challenges in advanced IS architectures are discussed to inspire future research.

Submitted 24 March, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

arXiv:2312.01940 [pdf, ps, other]

Intelligent Reflecting Surface-Aided Electromagnetic Stealth Against Radar Detection

Authors: Beixiong Zheng, Xue Xiong, Jie Tang, Rui Zhang

Abstract: While traditional electromagnetic stealth materials/metasurfaces can render a target virtually invisible to some extent, they lack flexibility and adaptability, and can only operate within a limited frequency and angle/direction range, making it challenging to ensure the expected stealth performance. In view of this, we propose in this paper a new intelligent reflecting surface (IRS)-aided electromagnetic stealth system mounted on targets to evade radar detection, by utilizing the tunable passive reflecting elements of IRS to achieve flexible and adaptive electromagnetic stealth in a cost-effective manner. Specifically, we optimize the IRS's reflection at the target to minimize the sum received signal power of all adversary radars. We first address the IRS's reflection optimization problem using the Lagrange multiplier method and derive a semi-closed-form optimal solution for the single-radar setup, which is then generalized to the multi-radar case. To meet real-time processing requirements, we further propose low-complexity closed-form solutions based on the reverse alignment/cancellation and minimum mean-square error (MMSE) criteria for the single-radar and multi-radar cases, respectively. Additionally, we propose practical low-complexity estimation schemes at the target to acquire angle-of-arrival (AoA) and/or path gain information via a small number of receive sensing devices. Simulation results validate the performance advantages of our proposed IRS-aided electromagnetic stealth system with the proposed IRS reflection designs.

Submitted 4 December, 2023; originally announced December 2023.

Comments: 13 pages (double-column), 10 figures, submitted in October

arXiv:2310.01342 [pdf, other]

Near-field Integrated Sensing and Communication: Opportunities and Challenges

Authors: Jiayi Cong, Changsheng You, Jiapeng Li, Li Chen, Beixiong Zheng, Yuanwei Liu, Wen Wu, Yi Gong, Shi Jin, Rui Zhang

Abstract: With the extremely large-scale array (XL-array) deployed in future wireless systems, wireless communication and sensing are expected to operate in the radiative near-field region, which needs to be characterized by the spherical rather than planar wavefronts. Unlike most existing works that considered far-field integrated sensing and communication (ISAC), we study in this article the new near-field ISAC, which integrates both functions of sensing and communication in the near-field region. To this end, we first discuss the appealing advantages of near-field communication and sensing over their far-field counterparts, respectively. Then, we introduce three approaches for near-field ISAC, including joint near-field communication and sensing, sensing-assisted near-field communication, and communication-assisted near-field sensing. We discuss their individual research opportunities and new design issues, and propose promising solutions. Finally, several important directions in near-field ISAC are also highlighted to motivate future work.

Submitted 26 July, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

Comments: This work has been accepted by IEEE Wireless Communications Magazine

arXiv:2307.16440 [pdf, other]

Towards Head Computed Tomography Image Reconstruction Standardization with Deep Learning Assisted Automatic Detection

Authors: Bowen Zheng, Chenxi Huang, Yuemei Luo

Abstract: Three-dimensional (3D) reconstruction of head Computed Tomography (CT) images elucidates the intricate spatial relationships of tissue structures, thereby assisting in accurate diagnosis. Nonetheless, securing an optimal head CT scan without deviation is challenging in clinical settings, owing to poor positioning by technicians, patient's physical constraints, or CT scanner tilt angle restrictions. Manual formatting and reconstruction not only introduce subjectivity but also strain time and labor resources. To address these issues, we propose an efficient automatic head CT images 3D reconstruction method, improving accuracy and repeatability, as well as diminishing manual intervention. Our approach employs a deep learning-based object detection algorithm, identifying and evaluating orbitomeatal line landmarks to automatically reformat the images prior to reconstruction. Given the dearth of existing evaluations of object detection algorithms in the context of head CT images, we compared ten methods from both theoretical and experimental perspectives. By exploring their precision, efficiency, and robustness, we singled out the lightweight YOLOv8 as the aptest algorithm for our task, with an mAP of 92.77% and impressive robustness against class imbalance. Our qualitative evaluation of standardized reconstruction results demonstrates the clinical practicability and validity of our method.

Submitted 15 September, 2023; v1 submitted 31 July, 2023; originally announced July 2023.

arXiv:2306.16206 [pdf, other]

Near-Field Beam Management for Extremely Large-Scale Array Communications

Authors: Changsheng You, Yunpu Zhang, Chenyu Wu, Yong Zeng, Beixiong Zheng, Li Chen, Linglong Dai, A. Lee Swindlehurst

Abstract: Extremely large-scale arrays (XL-arrays) have emerged as a promising technology to achieve super-high spectral efficiency and spatial resolution in future wireless systems. The large aperture of XL-arrays means that spherical rather than planar wavefronts must be considered, and a paradigm shift from far-field to near-field communications is necessary. Unlike existing works that have mainly considered far-field beam management, we study the new near-field beam management for XL-arrays. We first provide an overview of near-field communications and introduce various applications of XL-arrays in both outdoor and indoor scenarios. Then, three typical near-field beam management methods for XL-arrays are discussed: near-field beam training, beam tracking, and beam scheduling. We point out their main design issues and propose promising solutions to address them. Moreover, other important directions in near-field communications are also highlighted to motivate future research.

Submitted 28 June, 2023; originally announced June 2023.

Comments: We studied the new near-field beam management for XL-arrays. This paper has been submitted to IEEE for possible publication

arXiv:2303.08019 [pdf, other]

Leveraging Pretrained Representations with Task-related Keywords for Alzheimer's Disease Detection

Authors: Jinchao Li, Kaitao Song, Junan Li, Bo Zheng, Dongsheng Li, Xixin Wu, Xunying Liu, Helen Meng

Abstract: With the global population aging rapidly, Alzheimer's disease (AD) is particularly prominent in older adults, which has an insidious onset and leads to a gradual, irreversible deterioration in cognitive domains (memory, communication, etc.). Speech-based AD detection opens up the possibility of widespread screening and timely disease intervention. Recent advances in pre-trained models motivate AD detection modeling to shift from low-level features to high-level representations. This paper presents several efficient methods to extract better AD-related cues from high-level acoustic and linguistic features. Based on these features, the paper also proposes a novel task-oriented approach by modeling the relationship between the participants' description and the cognitive task. Experiments are carried out on the ADReSS dataset in a binary classification setup, and models are evaluated on the unseen test set. Results and comparison with recent literature demonstrate the efficiency and superior performance of proposed acoustic, linguistic and task-oriented methods. The findings also show the importance of semantic and syntactic information, and feasibility of automation and generalization with the promising audio-only and task-oriented methods for the AD detection task.

Submitted 14 March, 2023; originally announced March 2023.

Comments: 5 pages, 3 figures, 3 tables

arXiv:2302.12428 [pdf]

A holistically 3D-printed flexible millimeter-wave Doppler radar: Towards fully printed high-frequency multilayer flexible hybrid electronics systems

Authors: Hong Tang, Yingjie Zhang, Bowen Zheng, Sensong An, Mohammad Haerinia, Yunxi Dong, Yi Huang, Wei Guo, Hualiang Zhang

Abstract: Flexible hybrid electronics (FHE) is an emerging technology enabled through the integration of advanced semiconductor devices and 3D printing technology. It unlocks tremendous market potential by realizing low-cost flexible circuits and systems that can be conformally integrated into various applications. However, the operating frequencies of most reported FHE systems are relatively low. It is also worth noting that reported FHE systems have been limited to relatively simple design concepts (since complex systems will impose challenges in aspects such as multilayer interconnections, printing materials, and bonding layers). Here, we report a fully 3D-printed flexible four-layer millimeter-wave Doppler radar (i.e., a millimeter-wave FHE system). The sensing performance and flexibility of the 3D-printed radar are characterized and validated by general field tests and bending tests, respectively. Our results demonstrate the feasibility of developing fully 3D-printed high-frequency multilayer FHE, which can be conformally integrated into irregular surfaces (e.g., vehicle bumpers) for applications such as vehicle radars and wearable electronics.

Submitted 23 February, 2023; originally announced February 2023.

MSC Class: 78-05

arXiv:2301.07277 [pdf, other]

Mixed Near- and Far-Field Communications for Extremely Large-Scale Array: An Interference Perspective

Authors: Yunpu Zhang, Changsheng You, Li Chen, Beixiong Zheng

Abstract: Extremely large-scale array (XL-array) is envisioned to achieve super-high spectral efficiency in future wireless networks. Different from the existing works that mostly focus on the near-field communications, we consider in this paper a new and practical scenario, called mixed near- and far-field communications, where there exist both near- and far-field users in the network. For this scenario, we first obtain a closed-form expression for the inter-user interference at the near-field user caused by the far-field beam by using Fresnel functions, based on which the effects of the number of BS antennas, far-field user (FU) angle, near-field user (NU) angle and distance are analyzed. We show that strong interference exists when the number of BS antennas and the NU distance are relatively small, and/or the NU and FU angle-difference is small. Then, we further obtain the achievable rate of the NU as well as its rate loss caused by the FU interference. Last, numerical results are provided to corroborate our analytical results.

Submitted 28 January, 2023; v1 submitted 17 January, 2023; originally announced January 2023.

Comments: We studied the multi-user interference in the mixed near- and far-field communications. This paper has been submitted to IEEE for possible publication

arXiv:2210.16539 [pdf, other]

Exploiting prompt learning with pre-trained language models for Alzheimer's Disease detection

Authors: Yi Wang, Jiajun Deng, Tianzi Wang, Bo Zheng, Shoukang Hu, Xunying Liu, Helen Meng

Abstract: Early diagnosis of Alzheimer's disease (AD) is crucial in facilitating preventive care and delaying further progression. Speech based automatic AD screening systems provide a non-intrusive and more scalable alternative to other clinical screening techniques. Textual embedding features produced by pre-trained language models (PLMs) such as BERT are widely used in such systems. However, PLM domain fine-tuning is commonly based on the masked word or sentence prediction costs that are inconsistent with the back-end AD detection task. To this end, this paper investigates the use of prompt-based fine-tuning of PLMs that consistently uses AD classification errors as the training objective function. Disfluency features based on hesitation or pause filler token frequencies are further incorporated into prompt phrases during PLM fine-tuning. The decision voting based combination among systems using different PLMs (BERT and RoBERTa) or systems with different fine-tuning paradigms (conventional masked-language modelling fine-tuning and prompt-based fine-tuning) is further applied. Mean, standard deviation and the maximum among accuracy scores over 15 experiment runs are adopted as performance measurements for the AD detection system. Mean detection accuracy of 84.20% (with std 2.09%, best 87.5%) and 82.64% (with std 4.0%, best 89.58%) were obtained using manual and ASR speech transcripts respectively on the ADReSS20 test set consisting of 48 elderly speakers.

Submitted 31 March, 2023; v1 submitted 29 October, 2022; originally announced October 2022.

Comments: Accepted at ICASSP 2023 (will update with IEEE version later)

arXiv:2209.06390 [pdf, ps, other]

Multi-Active Multi-Passive (MAMP)-IRS Aided Wireless Communication: A Multi-Hop Beam Routing Design

Authors: Yunpu Zhang, Changsheng You, Beixiong Zheng

Abstract: Prior studies on intelligent reflecting surface (IRS) have mostly considered wireless communication systems aided by a single passive IRS, which, however, has limited control over wireless propagation environment and suffers severe product-distance path-loss. To address these issues, we propose in this paper a new multi-active multi-passive (MAMP)-IRS aided wireless communication system, where a number of active and passive IRSs are deployed to assist the downlink communication in complex environment, by establishing a multi-hop reflection path across active and passive IRSs. An optimization problem is formulated to maximize the achievable rate of a typical user by designing the active-and-passive IRS routing path as well as the joint beamforming of the BS and selected active/passive IRSs. To draw useful insights into the optimal design, we first consider a special case of the single-active multi-passive (SAMP)-IRS aided system. For this case, we propose an efficient algorithm to obtain its optimal solution by first optimizing the joint beamforming given any SAMP-IRS routing path, and then optimizing the routing path by using a new path decomposition method and graph theory. Next, for the general MAMP-IRS aided system, we show that its challenging beam routing optimization problem can be efficiently solved by a new two-phase approach. Its key idea is to first optimize the inner passive-IRS beam routing between each two active IRSs for effective channel power gain maximization, followed by an outer active-IRS beam routing optimization for rate maximization. Last, numerical results are provided to demonstrate the effectiveness of the proposed MAMP-IRS beam routing scheme.

Submitted 6 January, 2023; v1 submitted 13 September, 2022; originally announced September 2022.

Comments: In this updated version, we refine some results in the original paper. We studied the multi-hop beam routing design for a new multi-active multi-passive (MAMP)-IRS aided wireless communication system. This paper has been submitted to IEEE for possible publication. arXiv admin note: text overlap with arXiv:2208.11877

arXiv:2207.03157 [pdf, other]

Roadside IRS-Aided Vehicular Communication: Efficient Channel Estimation and Low-Complexity Beamforming Design

Authors: Zixuan Huang, Beixiong Zheng, Rui Zhang

Abstract: Intelligent reflecting surface (IRS) has emerged as a promising technique to control wireless propagation environment for enhancing the communication performance cost-effectively. However, the rapidly time-varying channel in high-mobility communication scenarios such as vehicular communication renders it challenging to obtain the instantaneous channel state information (CSI) efficiently for IRS with a large number of reflecting elements. In this paper, we propose a new roadside IRS-aided vehicular communication system to tackle this challenge. Specifically, by exploiting the symmetrical deployment of IRSs with inter-laced equal intervals on both sides of the road and the cooperation among nearby IRS controllers, we propose a new two-stage channel estimation scheme with off-line and online training, respectively, to obtain the static/time-varying CSI required by the proposed low-complexity passive beamforming scheme efficiently. The proposed IRS beamforming and online channel estimation designs leverage the existing uplink pilots in wireless networks and do not require any change of the existing transmission protocol. Moreover, they can be implemented by each of IRS controllers independently, without the need of any real-time feedback from the user's serving BS. Simulation results show that the proposed designs can efficiently achieve the high IRS passive beamforming gain and thus significantly enhance the achievable communication throughput for high-speed vehicular communications.

Submitted 15 February, 2023; v1 submitted 7 July, 2022; originally announced July 2022.

arXiv:2206.10096 [pdf]

Transformers Improve Breast Cancer Diagnosis from Unregistered Multi-View Mammograms

Authors: Xuxin Chen, Ke Zhang, Neman Abdoli, Patrik W. Gilley, Ximin Wang, Hong Liu, Bin Zheng, Yuchen Qiu

Abstract: Deep convolutional neural networks (CNNs) have been widely used in various medical imaging tasks. However, due to the intrinsic locality of convolution operation, CNNs generally cannot model long-range dependencies well, which are important for accurately identifying or mapping corresponding breast lesion features computed from unregistered multiple mammograms. This motivates us to leverage the architecture of Multi-view Vision Transformers to capture long-range relationships of multiple mammograms from the same patient in one examination. For this purpose, we employ local Transformer blocks to separately learn patch relationships within four mammograms acquired from two-view (CC/MLO) of two-side (right/left) breasts. The outputs from different views and sides are concatenated and fed into global Transformer blocks, to jointly learn patch relationships between four images representing two different views of the left and right breasts. To evaluate the proposed model, we retrospectively assembled a dataset involving 949 sets of mammograms, which include 470 malignant cases and 479 normal or benign cases. We trained and evaluated the model using a five-fold cross-validation method. Without any arduous preprocessing steps (e.g., optimal window cropping, chest wall or pectoral muscle removal, two-view image registration, etc.), our four-image (two-view-two-side) Transformer-based model achieves case classification performance with an area under ROC curve (AUC = 0.818), which significantly outperforms AUC = 0.784 achieved by the state-of-the-art multi-view CNNs (p = 0.009). It also outperforms two one-view-two-side models that achieve AUC of 0.724 (CC view) and 0.769 (MLO view), respectively. The study demonstrates the potential of using Transformers to develop high-performing computer-aided diagnosis schemes that combine four mammograms.

Submitted 20 June, 2022; originally announced June 2022.

arXiv:2203.10219 [pdf, other]

doi 10.1109/TSP.2022.3146791

Efficient DOA Estimation Method for Reconfigurable Intelligent Surfaces Aided UAV Swarm

Authors: Peng Chen, Zhimin Chen, Beixiong Zheng, Xianbin Wang

Abstract: The conventional direction of arrival (DOA) estimation methods are performed with multiple receiving channels. In this paper, a challenging DOA estimation problem is addressed in a different scenario with only one full-functional receiving channel. A new unmanned aerial vehicle (UAV) swarm system using multiple lifted reconfigurable intelligent surface (RIS) is proposed for the DOA estimation. The UAV movement degrades the DOA estimation performance significantly, and the existing atomic norm minimization (ANM) methods cannot be used in the scenario with array perturbation. Specifically, considering the position perturbation of UAVs, a new atomic norm-based DOA estimation method is proposed, where an atomic norm is defined with the parameter of the position perturbation. Then, a customized semi-definite programming (SDP) method is derived to solve the atomic norm-based method, where different from the traditional SDP method, an additional transforming matrix is formulated. Moreover, a gradient descent method is applied to refine the estimated DOA and the position perturbation further. Simulation results show that the proposed method achieves much better DOA estimation performance in the RIS-aided UAV swarm system with only one receiving channel than various benchmark schemes.

Submitted 18 March, 2022; originally announced March 2022.

Journal ref: IEEE Transactions on Signal Processing (2022): 743 - 755

arXiv:2202.04370 [pdf, ps, other]

Simultaneous Transmit Diversity and Passive Beamforming with Large-Scale Intelligent Reflecting Surface: Far-Field or Near-Field?

Authors: Beixiong Zheng, Rui Zhang

Abstract: Intelligent reflecting surface (IRS) has emerged as a cost-effective solution to enhance wireless communication performance via passive signal reflection. Existing works on IRS have mainly focused on investigating IRS's passive beamforming/reflection design to boost the communication rate for users assuming that their channel state information (CSI) is fully or partially known. However, how to exploit IRS to improve the wireless transmission reliability without any CSI, which is typical in high-mobility/delay-sensitive communication scenarios, remains largely open. In this paper, we study a new IRS-aided communication system with the IRS integrated to its aided access point (AP) to achieve both functions of transmit diversity and passive beamforming simultaneously. Specifically, we first show an interesting result that the IRS's passive beamforming gain in any direction is invariant to the common phase-shift applied to all of its reflecting elements. Accordingly, we design the common phase-shift of IRS elements to achieve transmit diversity at the AP side without the need of any CSI of the users. In addition, we propose a practical method for the users to estimate the CSI at the receiver side for information decoding. Meanwhile, we show that the conventional passive beamforming gain of IRS can be retained for the other users with their CSI known at the AP. Furthermore, we derive the asymptotic performance of both IRS-aided transmit diversity and passive beamforming in closed-form, by considering the large-scale IRS with an infinite number of elements. Numerical results validate our analysis and show the performance gains of the proposed IRS-aided simultaneous transmit diversity and passive beamforming scheme over other benchmark schemes.

Submitted 16 July, 2022; v1 submitted 9 February, 2022; originally announced February 2022.

Comments: Large-scale IRS-aided simultaneous "transmit diversity" and "passive beamforming": Far-Field or Near-Field? (31 pages, 9 figures)

arXiv:2202.02550 [pdf, ps, other]

Intelligent Reflecting Surface-Aided Spectrum Sensing for Cognitive Radio

Authors: Shaoe Lin, Beixiong Zheng, Fangjiong Chen, Rui Zhang

Abstract: Spectrum sensing is a key enabling technique for cognitive radio (CR), which provides essential information on the spectrum availability. However, due to severe wireless channel fading and path loss, the primary user (PU) signals received at the CR or secondary user (SU) can be practically too weak for reliable detection. To tackle this issue, we consider in this letter a new intelligent reflecting surface (IRS)-aided spectrum sensing scheme for CR, by exploiting the large aperture and passive beamforming gains of IRS to boost the PU signal strength received at the SU to facilitate its spectrum sensing. Specifically, by dynamically changing the IRS reflection over time according to a given codebook, its reflected signal power varies substantially at the SU, which is utilized for opportunistic signal detection. Furthermore, we propose a weighted energy detection method by combining the received signal power values over different IRS reflections, which significantly improves the detection performance. Simulation results validate the performance gain of the proposed IRS-aided spectrum sensing scheme, as compared to different benchmark schemes.

Submitted 5 February, 2022; originally announced February 2022.

Comments: Accepted by IEEE Wireless Communications Letters (5 pages, 4 figures)

Journal ref: IEEE Wireless Communications Letters, 2022

arXiv:2201.10675 [pdf]

Virtual Adversarial Training for Semi-supervised Breast Mass Classification

Authors: Xuxin Chen, Ximin Wang, Ke Zhang, Kar-Ming Fung, Theresa C. Thai, Kathleen Moore, Robert S. Mannel, Hong Liu, Bin Zheng, Yuchen Qiu

Abstract: This study aims to develop a novel computer-aided diagnosis (CAD) scheme for mammographic breast mass classification using semi-supervised learning. Although supervised deep learning has achieved huge success across various medical image analysis tasks, its success relies on large amounts of high-quality annotations, which can be challenging to acquire in practice. To overcome this limitation, we propose employing a semi-supervised method, i.e., virtual adversarial training (VAT), to leverage and learn useful information in unlabeled data for better classification of breast masses. Accordingly, our VAT-based models have two types of losses, namely supervised and virtual adversarial losses. The former loss acts as in supervised classification, while the latter loss aims at enhancing model robustness against virtual adversarial perturbation, thus improving model generalizability. To evaluate the performance of our VAT-based CAD scheme, we retrospectively assembled a total of 1024 breast mass images, with an equal number of benign and malignant masses. A large CNN and a small CNN were used in this investigation, and both were trained with and without the adversarial loss. When the labeled ratios were 40% and 80%, VAT-based CNNs delivered the highest classification accuracy of 0.740 and 0.760, respectively. The experimental results suggest that the VAT-based CAD scheme can effectively utilize meaningful knowledge from unlabeled data to better classify mammographic breast mass images.

Submitted 25 January, 2022; originally announced January 2022.

Comments: To appear in the conference Biophotonics and Immune Responses of SPIE

Showing 1–50 of 74 results for author: Zheng, B