DarkDiff: Advancing Low-Light Raw Enhancement by Retasking Diffusion Models for Camera ISP

Authors: Amber Yijia Zheng, Yu Zhang, Jun Hu, Raymond A. Yeh, Chen Chen

Abstract: High-quality photography in extreme low-light conditions is challenging but impactful for digital cameras. With advanced computing hardware, traditional camera image signal processor (ISP) algorithms are gradually being replaced by efficient deep networks that enhance noisy raw images more intelligently. However, existing regression-based models often minimize pixel errors and result in oversmoothing of low-light photos or deep shadows. Recent work has attempted to address this limitation by training a diffusion model from scratch, yet those models still struggle to recover sharp image details and accurate colors. We introduce a novel framework to enhance low-light raw images by retasking pre-trained generative diffusion models with the camera ISP. Extensive experiments demonstrate that our method outperforms the state-of-the-art in perceptual quality across three challenging low-light raw image benchmarks.

Submitted 29 May, 2025; originally announced May 2025.

arXiv:2501.10657 [pdf, other]

doi 10.1109/LSP.2025.3530849

Channel Estimation and Beamforming Design for MF-RIS-Aided Communication Systems

Authors: Zaihao Pan, Wen Wang, Gaofeng Nie, Ailing Zheng, Wanli Ni

Abstract: In this letter, we study the beamforming design for channel estimation of multi-functional reconfigurable intelligent surface (MF-RIS)-aided multi-user communications that supports simultaneous signal reflection, refraction, and amplification. A least squares (LS) based channel estimator is proposed for MF-RIS by considering both the coupled MF-RIS beams and the introduced thermal noise. With the discrete Fourier transform (DFT) matrix, the MF-RIS beamforming design problem is simplified under the proposed LS channel estimator. The optimal MF-RIS beamforming design that achieves the Cramér-Rao lower bound (CRLB) of the channel estimator is obtained with the proposed alternating optimization algorithm. Simulation results demonstrate the effectiveness of the proposed beamforming design in reducing the impact of thermal noise.

Submitted 17 January, 2025; originally announced January 2025.

Comments: IEEE Signal Processing Letters
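
As a rough illustration of why a DFT training matrix simplifies LS estimation, the sketch below uses a generic linear model y = Φh with Φ an N×N DFT matrix. This is not the letter's MF-RIS signal model: the function names, the noise-free setup, and the model itself are assumptions made for illustration. Because Φ^H Φ = N·I for a DFT matrix, the LS solution (Φ^H Φ)^{-1} Φ^H y reduces to Φ^H y / N, with no matrix inversion.

```python
import cmath


def dft_matrix(n):
    """Return the n x n (unnormalized) DFT matrix as nested lists."""
    return [[cmath.exp(-2j * cmath.pi * r * c / n) for c in range(n)]
            for r in range(n)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def conj_transpose(matrix):
    """Return the conjugate (Hermitian) transpose of a matrix."""
    rows, cols = len(matrix), len(matrix[0])
    return [[matrix[r][c].conjugate() for r in range(rows)]
            for c in range(cols)]


def ls_estimate_dft(phi, y):
    """LS estimate of h from y = phi @ h, assuming phi^H phi = n * I.

    That identity holds for the DFT matrix built above, so the usual
    pseudo-inverse collapses to a scaled Hermitian transpose.
    """
    n = len(phi)
    return [value / n for value in matvec(conj_transpose(phi), y)]


# Hypothetical 4-tap channel, observed without noise for clarity.
true_channel = [1 + 1j, 0.5 - 0.2j, -0.3j, 2]
phi = dft_matrix(4)
observations = matvec(phi, true_channel)
estimate = ls_estimate_dft(phi, observations)
```

With additive noise the same estimator applies unchanged; the orthogonality of the DFT rows keeps the noise from being amplified by an ill-conditioned inverse.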

arXiv:2412.01251 [pdf, ps, other]

Multi-Functional RIS Integrated Sensing and Communications for 6G Networks

Authors: Dongsheng Han, Peng Wang, Wanli Ni, Wen Wang, Ailing Zheng, Dusit Niyato, Naofal Al-Dhahir

Abstract: In this paper, we propose a novel multi-functional reconfigurable intelligent surface (MF-RIS) that supports signal reflection, refraction, amplification, and target sensing simultaneously. Our MF-RIS aims to enhance integrated communication and sensing (ISAC) systems, particularly in multi-user and multi-target scenarios. Equipped with reflection and refraction components (i.e., amplifiers and phase shifters), MF-RIS is able to adjust the amplitude and phase shift of both communication and sensing signals on demand. Additionally, with the assistance of sensing elements, MF-RIS is capable of capturing the echo signals from multiple targets, thereby mitigating the signal attenuation typically associated with multi-hop links. We propose an MF-RIS-enabled multi-user and multi-target ISAC system, and formulate an optimization problem to maximize the signal-to-interference-plus-noise ratio (SINR) of sensing targets. This problem involves jointly optimizing the transmit beamforming and MF-RIS configurations, subject to constraints on the communication rate, total power budget, and MF-RIS coefficients. We decompose the formulated non-convex problem into three sub-problems, and then solve them via an efficient iterative algorithm. Simulation results demonstrate that: 1) The performance of MF-RIS varies under different operating protocols, and energy splitting (ES) exhibits the best performance in the considered MF-RIS-enabled multi-user multi-target ISAC system; 2) Under the same total power budget, the proposed MF-RIS with ES protocol attains 52.2%, 73.5% and 60.86% sensing SINR gains over active RIS, passive RIS, and simultaneously transmitting and reflecting RIS (STAR-RIS), respectively; 3) Increasing the number of sensing elements no longer improves sensing performance beyond a certain number.

Submitted 27 January, 2025; v1 submitted 2 December, 2024; originally announced December 2024.

arXiv:2410.06584 [pdf, other]

Two Birds With One Stone: Enhancing Communication and Sensing via Multi-Functional RIS

Authors: Wanli Ni, Wen Wang, Ailing Zheng, Peng Wang, Changsheng You, Yonina C. Eldar, Dusit Niyato, Robert Schober

Abstract: In this article, we propose new network architectures that integrate multi-functional reconfigurable intelligent surfaces (MF-RISs) into 6G networks to enhance both communication and sensing capabilities. Firstly, we elaborate on how to leverage MF-RISs for improving communication performance in different communication modes including unicast, multicast, and broadcast and for different multi-access schemes. Next, we emphasize synergistic benefits of integrating MF-RISs with wireless sensing, enabling more accurate and efficient target detection in 6G networks. Furthermore, we present two schemes that utilize MF-RISs to enhance the performance of integrated sensing and communication (ISAC). We also study multi-objective optimization to achieve the optimal trade-off between communication and sensing performance. Finally, we present numerical results to show the performance improvements offered by MF-RISs compared to conventional RISs in ISAC. We also outline key research directions for MF-RIS under the ambition of 6G.

Submitted 9 October, 2024; originally announced October 2024.

Comments: 8 pages, 5 figures, submitted to IEEE

Journal ref: IEEE Wireless Communications Magazine, 2025

arXiv:2405.20884 [pdf, other]

Effects of Dataset Sampling Rate for Noise Cancellation through Deep Learning

Authors: Brandon Colelough, Andrew Zheng

Abstract: Background: Active noise cancellation has been a subject of research for decades. Traditional techniques, like the Fast Fourier Transform, have limitations in certain scenarios. This research explores the use of deep neural networks (DNNs) as a superior alternative. Objective: The study aims to determine the effect that the sampling rate of the training data has on lightweight, efficient DNNs that operate within the processing constraints of mobile devices. Methods: We chose the Conv-TasNet network for its proven efficiency in speech separation and enhancement. Conv-TasNet was trained on datasets such as WHAM!, LibriMix, and the MS-2023 DNS Challenge. The datasets were sampled at rates of 8kHz, 16kHz, and 48kHz to analyze the effect of sampling rate on noise cancellation efficiency and effectiveness. The model was tested on an Intel Core i7 processor from 2023, assessing the network's ability to produce clear audio while filtering out background noise. Results: Models trained at higher sampling rates (48kHz) provided much better evaluation metrics against Total Harmonic Distortion (THD) and Quality Prediction For Generative Neural Speech Codecs (WARP-Q) values, indicating improved audio quality. However, a trade-off was noted with the processing time being longer for higher sampling rates. Conclusions: The Conv-TasNet network, trained on datasets sampled at higher rates like 48kHz, offers a robust solution for mobile devices in achieving noise cancellation through speech separation and enhancement. Future work involves optimizing the model's efficiency further and testing on mobile devices.

Submitted 30 May, 2024; originally announced May 2024.

Comments: 16 pages, 8 pictures, 3 tables

arXiv:2405.16257 [pdf, other]

From Single to Multi-Functional RIS: Architecture, Key Technologies, Challenges, and Applications

Authors: Wanli Ni, Ailing Zheng, Wen Wang, Dusit Niyato, Naofal Al-Dhahir, Merouane Debbah

Abstract: Although reconfigurable intelligent surfaces (RISs) have demonstrated the potential to boost network capacity and expand coverage by adjusting their electromagnetic properties, existing RIS architectures have certain limitations, such as double-fading attenuation and restricted half-space coverage. In this article, we delve into the progressive development from single to multi-functional RIS (MF-RIS) that enables simultaneous signal amplification, reflection, and refraction. We begin by detailing the hardware design and signal model that distinguish MF-RIS from traditional RISs. Subsequently, we introduce the key technologies underpinning MF-RIS-aided communications, along with the fundamental issues and challenges inherent to its deployment. We then outline the promising applications of MF-RIS in the realm of communication, sensing, and computation systems, highlighting its transformative impact on these domains. Lastly, we present simulation results to demonstrate the superiority of MF-RIS in enhancing network performance in terms of spectral efficiency.

Submitted 25 May, 2024; originally announced May 2024.

Comments: 9 pages, 6 figures, submitted to IEEE magazines

arXiv:2405.14878 [pdf, other]

Improving and Evaluating Machine Learning Methods for Forensic Shoeprint Matching

Authors: Divij Jain, Saatvik Kher, Lena Liang, Yufeng Wu, Ashley Zheng, Xizhen Cai, Anna Plantinga, Elizabeth Upton

Abstract: We propose a machine learning pipeline for forensic shoeprint pattern matching that improves on the accuracy and generalisability of existing methods. We extract 2D coordinates from shoeprint scans using edge detection and align the two shoeprints with iterative closest point (ICP). We then extract similarity metrics to quantify how well the two prints match and use these metrics to train a random forest that generates a probabilistic measurement of how likely two prints are to have originated from the same outsole. We assess the generalisability of machine learning methods trained on lab shoeprint scans to more realistic crime scene shoeprint data by evaluating the accuracy of our methods on several shoeprint scenarios: partial prints, prints with varying levels of blurriness, prints with different amounts of wear, and prints from different shoe models. We find that models trained on one type of shoeprint yield extremely high levels of accuracy when tested on shoeprint pairs of the same scenario but fail to generalise to other scenarios. We also discover that models trained on a variety of scenarios predict almost as accurately as models trained on specific scenarios.

Submitted 2 April, 2024; originally announced May 2024.

arXiv:2403.14464 [pdf, other]

Synthesizing Controller for Safe Navigation using Control Density Function

Authors: Joseph Moyalan, Sriram S. K. S Narayanan, Andrew Zheng, Umesh Vaidya

Abstract: We consider the problem of navigating a nonlinear dynamical system from some initial set to some target set while avoiding collision with an unsafe set. We extend the concept of density function to control density function (CDF) for solving navigation problems with safety constraints. The occupancy-based interpretation of the measure associated with the density function is instrumental in imposing the safety constraints. The navigation problem with safety constraints is formulated as a quadratic program (QP) using CDF. The existing approach using the control barrier function (CBF) also formulates the navigation problem with safety constraints as a QP. One of the main advantages of the proposed QP using CDF compared to the QP formulated using CBF is that both the convergence/stability and safety can be combined and imposed using the CDF. Simulation results involving the Duffing oscillator and safe navigation of Dubins car models are provided to verify the main findings of the paper.

Submitted 21 March, 2024; originally announced March 2024.

arXiv:2305.02938 [pdf, other]

Off-Road Navigation of Legged Robots Using Linear Transfer Operators

Authors: Joseph Moyalan, Andrew Zheng, Sriram S. K. S Narayanan, Umesh Vaidya

Abstract: This paper presents the implementation of off-road navigation on legged robots using convex optimization through linear transfer operators. Given a traversability measure that captures the off-road environment, we lift the navigation problem into the density space using the Perron-Frobenius (P-F) operator. This allows the problem formulation to be represented as a convex optimization. Due to the operator acting on an infinite-dimensional density space, we use data collected from the terrain to get a finite-dimensional approximation of the convex optimization. Results of the optimal trajectory for off-road navigation are compared with a standard iterative planner, where we show how our convex optimization generates a more traversable path for the legged robot compared to the suboptimal iterative planner.

Submitted 4 May, 2023; originally announced May 2023.

arXiv:2301.13630 [pdf, other]

doi 10.1109/LCOMM.2022.3232148

Enhancing NOMA Networks via Reconfigurable Multi-Functional Surface

Authors: Ailing Zheng, Wanli Ni, Wen Wang, Hui Tian

Abstract: By flexibly manipulating the radio propagation environment, reconfigurable intelligent surface (RIS) is a promising technique for future wireless communications. However, the single-side coverage and double-fading attenuation faced by conventional RISs largely restrict their applications. To address this issue, we propose a novel concept of multi-functional RIS (MF-RIS), which provides reflection, transmission, and amplification simultaneously for the incident signal. With the aim of enhancing the performance of a non-orthogonal multiple-access (NOMA) downlink multiuser network, we deploy an MF-RIS to maximize the sum rate by jointly optimizing the active beamforming and MF-RIS coefficients. Then, an alternating optimization algorithm is proposed to solve the formulated non-convex problem by exploiting successive convex approximation and a penalty-based method. Numerical results show that the proposed MF-RIS outperforms conventional RISs under different settings.

Submitted 31 January, 2023; originally announced January 2023.

Comments: This paper has been accepted by IEEE Communications Letters

Journal ref: IEEE Communications Letters, 2023

arXiv:2007.09870 [pdf]

A novel deep learning-based method for monochromatic image synthesis from spectral CT using photon-counting detectors

Authors: Ao Zheng, Hongkai Yang, Li Zhang, Yuxiang Xing

Abstract: With the growing technology of photon-counting detectors (PCD), spectral CT has become a topic of wide interest with the potential for material differentiation. However, due to non-ideal factors such as cross talk and pulse pile-up of the detectors, direct reconstruction from the detected spectrum without any corrections yields incorrect results. Conventional methods try to model these factors using calibration and make corrections accordingly, but they depend on the precision of the model. To solve this problem, in this paper, we propose a novel deep learning-based monochromatic image synthesis method working in the sinogram domain. Different from previous deep learning-based methods aimed at this problem, we design a novel network architecture according to the physical model of cross talk, which lets it address the problem more effectively. Our method was tested on a cone-beam CT (CBCT) system equipped with a PCD. After applying the FDK algorithm to the corrected projections, we obtained considerably more accurate results with less noise, which shows the feasibility of monochromatic image synthesis by our method.

Submitted 19 July, 2020; originally announced July 2020.

Comments: 9 pages, 4 figures, submitted to the 2020 IEEE Nuclear Science Symposium (NSS) and Medical Imaging Conference (MIC)

arXiv:1910.03746 [pdf]

A cascaded dual-domain deep learning reconstruction method for sparsely spaced multidetector helical CT

Authors: Ao Zheng, Hewei Gao, Li Zhang, Yuxiang Xing

Abstract: Helical CT has been widely used in clinical diagnosis. A sparsely spaced multidetector in the z direction can increase the coverage of the detector given a limited number of detector rows. It can speed up volumetric CT scans, lower the radiation dose and reduce motion artifacts. However, it leads to insufficient data for reconstruction. That means reconstructions from general analytical methods will have severe artifacts. Iterative reconstruction methods might be able to deal with this situation but at the cost of a huge computational load. In this work, we propose a cascaded dual-domain deep learning method that completes both data transformation in the projection domain and error reduction in the image domain. First, a convolutional neural network (CNN) in the projection domain is constructed to estimate missing helical projection data and convert helical projection data to 2D fan-beam projection data. This step suppresses helical artifacts and reduces the subsequent computational cost. Then, an analytical linear operator follows to transfer the data from the projection domain to the image domain. Finally, an image domain CNN is added to improve image quality further. These three steps work as a whole and can be trained end to end. The overall network is trained using a simulated lung CT dataset with Poisson noise from 25 patients. We evaluate the trained network on another three patients and obtain very encouraging results with both visual examination and quantitative comparison. The resulting RRMSE is 6.56% and the SSIM is 99.60%. In addition, we test the trained network on the lung CT dataset with different noise levels and a new dental CT dataset to demonstrate the generalization and robustness of our method.

Submitted 23 October, 2019; v1 submitted 8 October, 2019; originally announced October 2019.

arXiv:1905.01902 [pdf, other]

doi 10.1109/TCBB.2020.2978470

Lesion Segmentation in Ultrasound Using Semi-pixel-wise Cycle Generative Adversarial Nets

Authors: Jie Xing, Zheren Li, Biyuan Wang, Yuji Qi, Bingbin Yu, Farhad G. Zanjani, Aiwen Zheng, Remco Duits, Tao Tan

Abstract: Breast cancer is the most common invasive cancer with the highest cancer occurrence in females. Handheld ultrasound is one of the most efficient ways to identify and diagnose breast cancer. The area and shape information of a lesion is very helpful for clinicians to make diagnostic decisions. In this study we propose a new deep-learning scheme, semi-pixel-wise cycle generative adversarial net (SPCGAN), for segmenting the lesion in 2D ultrasound. The method takes advantage of a fully convolutional neural network (FCN) and a generative adversarial net to segment a lesion by using prior knowledge. We compared the proposed method to a fully connected neural network and the level set segmentation method on a test dataset consisting of 32 malignant lesions and 109 benign lesions. Our proposed method achieved a Dice similarity coefficient (DSC) of 0.92 while FCN and the level set achieved 0.90 and 0.79 respectively. In particular, for malignant lesions, our method significantly increases the DSC from 0.90 for the fully connected neural network to 0.93 (p < 0.001). The results show that our SPCGAN can obtain robust segmentation results. The framework of SPCGAN is particularly effective when sufficient training samples are not available compared to FCN. Our proposed method may be used to relieve the radiologists' burden for annotation.

Submitted 17 October, 2020; v1 submitted 6 May, 2019; originally announced May 2019.

Journal ref: IEEE/ACM Transactions on Computational Biology and Bioinformatics, 04 March 2020, pp.1-1
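
For readers unfamiliar with the metric reported in this abstract, the Dice similarity coefficient between a predicted mask A and a reference mask B is 2|A ∩ B| / (|A| + |B|). The sketch below is a minimal illustration of that standard definition. The function name, the flat-list mask representation, and the convention of returning 1.0 for two empty masks are assumptions, not details taken from the paper.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient between two binary masks.

    Both masks are flat sequences of 0/1 (or False/True) values of
    equal length, e.g. a flattened 2D segmentation.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels labelled as lesion in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total lesion pixels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


# Hypothetical flattened 2x3 masks for illustration only.
predicted_mask = [1, 1, 0, 0, 1, 0]
reference_mask = [1, 0, 0, 0, 1, 1]
score = dice_coefficient(predicted_mask, reference_mask)
```

A score of 1.0 means the two masks coincide exactly and 0.0 means they share no lesion pixels, which is the scale on which the values quoted above should be read.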

Showing 1–13 of 13 results for author: Zheng, A