-
TransUNext: towards a more advanced U-shaped framework for automatic vessel segmentation in the fundus image
Authors:
Xiang Li,
Mingsi Liu,
Lixin Duan
Abstract:
Purpose: Automatic and accurate segmentation of fundus vessel images has become an essential prerequisite for computer-aided diagnosis of ophthalmic diseases such as diabetes mellitus. The task of high-precision retinal vessel segmentation still faces difficulties due to the low contrast between the branch ends of retinal vessels and the background, the long and thin vessel span, and the variable morphology of the optic disc and optic cup in fundus vessel images. Methods: We propose TransUNext, a more advanced U-shaped hybrid Transformer-CNN architecture that integrates an Efficient Self-attention Mechanism into the encoder and decoder of U-Net to capture both local features and global dependencies with minimal computational overhead. Meanwhile, the Global Multi-Scale Fusion (GMSF) module is introduced to upgrade skip-connections, fuse high-level semantic and low-level detailed information, and eliminate high- and low-level semantic differences. Inspired by ConvNeXt, the TransNeXt Block is designed to optimize the computational complexity of each base block in U-Net and avoid the information loss caused by dimension compression when information is converted between feature spaces of different dimensions. Results: We evaluated the proposed method on four public datasets: DRIVE, STARE, CHASE-DB1, and HRF. The AUC (area under the ROC curve) values were 0.9867, 0.9869, 0.9910, and 0.9887, respectively, exceeding other state-of-the-art methods.
Submitted 4 November, 2024;
originally announced November 2024.
-
Toward Scalable Image Feature Compression: A Content-Adaptive and Diffusion-Based Approach
Authors:
Sha Guo,
Zhuo Chen,
Yang Zhao,
Ning Zhang,
Xiaotong Li,
Lingyu Duan
Abstract:
Traditional image codecs emphasize signal fidelity and human perception, often at the expense of machine vision tasks. Deep learning methods have demonstrated promising coding performance by utilizing rich semantic embeddings optimized for both human and machine vision. However, these compact embeddings struggle to capture fine details such as contours and textures, resulting in imperfect reconstructions. Furthermore, existing learning-based codecs lack scalability. To address these limitations, this paper introduces a content-adaptive diffusion model for scalable image compression. The proposed method encodes fine textures through a diffusion process, enhancing perceptual quality while preserving essential features for machine vision tasks. The approach employs a Markov palette diffusion model combined with widely used feature extractors and image generators, enabling efficient data compression. By leveraging collaborative texture-semantic feature extraction and pseudo-label generation, the method accurately captures texture information. A content-adaptive Markov palette diffusion model is then applied to represent both low-level textures and high-level semantic content in a scalable manner. This framework offers flexible control over compression ratios by selecting intermediate diffusion states, eliminating the need for retraining deep learning models at different operating points. Extensive experiments demonstrate the effectiveness of the proposed framework in both image reconstruction and downstream machine vision tasks such as object detection, segmentation, and facial landmark detection, achieving superior perceptual quality compared to state-of-the-art methods.
Submitted 8 October, 2024;
originally announced October 2024.
-
When NOMA Meets AIGC: Enhanced Wireless Federated Learning
Authors:
Ding Xu,
Lingjie Duan,
Hongbo Zhu
Abstract:
Wireless federated learning (WFL) enables devices to collaboratively train a global model via local model training, uploading, and aggregation. However, WFL faces the data scarcity/heterogeneity problem (i.e., data are limited and unevenly distributed among devices), which degrades the learning performance. In this regard, artificial intelligence generated content (AIGC) can synthesize various types of data to compensate for the insufficient local data. Nevertheless, iteratively downloading synthetic data or uploading local models takes a lot of time, especially for a large number of devices. To address this issue, we propose to leverage non-orthogonal multiple access (NOMA) to achieve efficient synthetic data and local model transmission. This paper is the first to combine AIGC and NOMA with WFL to maximally enhance the learning performance. For the proposed NOMA+AIGC-enhanced WFL, the problem of jointly optimizing the synthetic data distribution, two-way communication and computation resource allocation to minimize the global learning error is investigated. The problem is an NP-hard mixed-integer nonlinear program whose optimal solution is intractable to find. We first employ the block coordinate descent method to decouple the intricately coupled variables, and then resort to our analytical method to derive an efficient low-complexity locally optimal solution with partial closed-form results. Extensive simulations validate the superiority of the proposed scheme over existing and benchmark schemes such as the frequency/time division multiple access based AIGC-enhanced schemes.
Submitted 16 June, 2024;
originally announced June 2024.
-
Fair Computation Offloading for RSMA-Assisted Mobile Edge Computing Networks
Authors:
Ding Xu,
Lingjie Duan,
Haitao Zhao,
Hongbo Zhu
Abstract:
Rate splitting multiple access (RSMA) provides a flexible transmission framework that can be applied in mobile edge computing (MEC) systems. However, research on RSMA-assisted MEC systems is still in its infancy and many design issues remain unsolved, such as the MEC server and channel allocation problem in general multi-server and multi-channel scenarios as well as the user fairness issues. In this regard, we study an RSMA-assisted MEC system with multiple MEC servers, channels and devices, and consider the fairness among devices. A max-min fairness computation offloading problem to maximize the minimum computation offloading rate is investigated. Since the problem is difficult to solve optimally, we develop an efficient algorithm to obtain a suboptimal solution. In particular, the time allocation and the computing frequency allocation are derived as closed-form functions of the transmit power allocation and the successive interference cancellation (SIC) decoding order, while the transmit power allocation and the SIC decoding order are jointly optimized via the alternating optimization method, the bisection search method and the successive convex approximation method. For the channel and MEC server allocation problem, we transform it into a hypergraph matching problem and solve it by matching theory. Simulation results demonstrate that the proposed RSMA-assisted MEC system outperforms current MEC systems under various system setups.
Submitted 1 August, 2024; v1 submitted 16 June, 2024;
originally announced June 2024.
-
Multi-rater Prism: Learning self-calibrated medical image segmentation from multiple raters
Authors:
Junde Wu,
Huihui Fang,
Yehui Yang,
Yuanpei Liu,
Jing Gao,
Lixin Duan,
Weihua Yang,
Yanwu Xu
Abstract:
In medical image segmentation, it is often necessary to collect opinions from multiple experts to make the final decision. This clinical routine helps to mitigate individual bias. But when data is multiply annotated, standard deep learning models are often not applicable. In this paper, we propose a novel neural network framework, called Multi-Rater Prism (MrPrism) to learn the medical image segmentation from multiple labels. Inspired by the iterative half-quadratic optimization, the proposed MrPrism will combine the multi-rater confidences assignment task and calibrated segmentation task in a recurrent manner. In this recurrent process, MrPrism can learn inter-observer variability taking into account the image semantic properties, and finally converges to a self-calibrated segmentation result reflecting the inter-observer agreement. Specifically, we propose Converging Prism (ConP) and Diverging Prism (DivP) to process the two tasks iteratively. ConP learns calibrated segmentation based on the multi-rater confidence maps estimated by DivP. DivP generates multi-rater confidence maps based on the segmentation masks estimated by ConP. The experimental results show that by recurrently running ConP and DivP, the two tasks can achieve mutual improvement. The final converged segmentation result of MrPrism outperforms state-of-the-art (SOTA) strategies on a wide range of medical image segmentation tasks.
Submitted 1 December, 2022;
originally announced December 2022.
-
Online Pricing Incentive to Sample Fresh Information
Authors:
Hongbo Li,
Lingjie Duan
Abstract:
Today mobile users such as drivers are invited by content providers (e.g., Tripadvisor) to sample fresh information of diverse paths to control the age of information (AoI). However, selfish drivers prefer to travel through the shortest path instead of the others with extra costs in time and gas. To motivate drivers to route and sample diverse paths, this paper is the first to propose online pricing for a provider to economically reward drivers for diverse routing and control the actual AoI dynamics over time and spatial path domains. This online pricing optimization problem should be solved without knowing drivers' costs and even arrivals, and is intractable due to the curse of dimensionality in both time and space. If there is only one non-shortest path, we leverage the Markov decision process (MDP) techniques to analyze the problem. Accordingly, we design a linear-time algorithm for returning optimal online pricing, where a higher pricing reward is needed for a larger AoI. If there are a number of non-shortest paths, we prove that pricing one path at a time is optimal, yet it is not optimal to choose the path with the largest current AoI. Then we propose a new backward-clustered computation method and develop an approximation algorithm to alternate different paths to price over time. Perhaps surprisingly, our analysis of approximation ratio suggests that our algorithm's performance approaches closer to optimum given more paths.
Submitted 18 September, 2022;
originally announced September 2022.
-
Calibrate the inter-observer segmentation uncertainty via diagnosis-first principle
Authors:
Junde Wu,
Huihui Fang,
Hoayi Xiong,
Lixin Duan,
Mingkui Tan,
Weihua Yang,
Huiying Liu,
Yanwu Xu
Abstract:
In medical images, many of the tissues/lesions may be ambiguous. That is why medical segmentation is typically annotated by a group of clinical experts to mitigate personal bias. However, this clinical routine also brings new challenges to the application of machine learning algorithms. Without a definite ground-truth, it is difficult to train and evaluate deep learning models. When the annotations are collected from different graders, a common choice is majority vote. However, such a strategy ignores differences in grader expertise. In this paper, we consider the task of predicting the segmentation with calibrated inter-observer uncertainty. We note that in clinical practice, medical image segmentation is usually used to assist disease diagnosis. Inspired by this observation, we propose the diagnosis-first principle, which takes disease diagnosis as the criterion to calibrate the inter-observer segmentation uncertainty. Following this idea, a framework named Diagnosis First segmentation Framework (DiFF) is proposed to estimate diagnosis-first segmentation from the raw images. Specifically, DiFF first learns to fuse the multi-rater segmentation labels into a single ground-truth that maximizes the disease diagnosis performance. We dub the fused ground-truth Diagnosis First Ground-truth (DF-GT). Then, we further propose the Take and Give Model to segment DF-GT from the raw image. We verify the effectiveness of DiFF on three different medical segmentation tasks: OD/OC segmentation on fundus images, thyroid nodule segmentation on ultrasound images, and skin lesion segmentation on dermoscopic images. Experimental results show that the proposed DiFF significantly facilitates the corresponding disease diagnosis and outperforms previous state-of-the-art multi-rater learning methods.
Submitted 5 August, 2022;
originally announced August 2022.
-
WSSS4LUAD: Grand Challenge on Weakly-supervised Tissue Semantic Segmentation for Lung Adenocarcinoma
Authors:
Chu Han,
Xipeng Pan,
Lixu Yan,
Huan Lin,
Bingbing Li,
Su Yao,
Shanshan Lv,
Zhenwei Shi,
Jinhai Mai,
Jiatai Lin,
Bingchao Zhao,
Zeyan Xu,
Zhizhen Wang,
Yumeng Wang,
Yuan Zhang,
Huihui Wang,
Chao Zhu,
Chunhui Lin,
Lijian Mao,
Min Wu,
Luwen Duan,
Jingsong Zhu,
Dong Hu,
Zijie Fang,
Yang Chen
, et al. (18 additional authors not shown)
Abstract:
Lung cancer is the leading cause of cancer death worldwide, and adenocarcinoma (LUAD) is the most common subtype. Exploiting the potential value of histopathology images can promote precision medicine in oncology. Tissue segmentation is the basic upstream task of histopathology image analysis. Existing deep learning models have achieved superior segmentation performance but require sufficient pixel-level annotations, which are time-consuming and expensive to obtain. To enrich the label resources of LUAD and to alleviate the annotation effort, we organize this challenge, WSSS4LUAD, to call for outstanding weakly-supervised semantic segmentation (WSSS) techniques for histopathology images of LUAD. Participants have to design an algorithm to segment tumor epithelial, tumor-associated stroma and normal tissue with only patch-level labels. This challenge includes 10,091 patch-level annotations (the training set) and over 130 million labeled pixels (the validation and test sets), from 87 WSIs (67 from GDPH, 20 from TCGA). All the labels were generated by a pathologist-in-the-loop pipeline with the help of AI models and checked by the label review board. Among 532 registrations, 28 teams submitted results in the test phase with over 1,000 submissions. Finally, the first-place team achieved an mIoU of 0.8413 (tumor: 0.8389, stroma: 0.7931, normal: 0.8919). According to the technical reports of the top-tier teams, CAM is still the most popular approach in WSSS. Cutmix data augmentation has been widely adopted to generate more reliable samples. With the success of this challenge, we believe that WSSS approaches with patch-level annotations can be a complement to traditional pixel annotations while reducing the annotation effort. The entire dataset has been released to encourage more research on computational pathology in LUAD and more novel WSSS techniques.
Submitted 13 April, 2022; v1 submitted 13 April, 2022;
originally announced April 2022.
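The per-class scores quoted in the abstract can be checked against the headline mIoU. The short Python sketch below assumes that the challenge mIoU is the unweighted mean of the three per-class IoU values; that averaging rule is an assumption made for illustration and is not stated in the abstract. The function name `mean_iou` is hypothetical.

```python
# Per-class IoU values quoted in the abstract for the first-place team.
class_iou = {"tumor": 0.8389, "stroma": 0.7931, "normal": 0.8919}


def mean_iou(scores):
    """Unweighted mean of per-class IoU values (assumed averaging rule)."""
    return sum(scores.values()) / len(scores)


miou = mean_iou(class_iou)
print(f"{miou:.4f}")  # prints 0.8413
```

Under this assumption the three class scores reproduce the reported 0.8413, so the quoted figures are at least consistent with a plain class average.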
-
Towards Low Light Enhancement with RAW Images
Authors:
Haofeng Huang,
Wenhan Yang,
Yueyu Hu,
Jiaying Liu,
Ling-Yu Duan
Abstract:
In this paper, we make the first benchmark effort to elaborate on the superiority of using RAW images in low light enhancement and develop a novel alternative route to utilize RAW images in a more flexible and practical way. Based on a full consideration of the typical image processing pipeline, we develop a new evaluation framework, Factorized Enhancement Model (FEM), which decomposes the properties of RAW images into measurable factors and provides a tool for exploring empirically how properties of RAW images affect the enhancement performance. The empirical benchmark results show that the Linearity of data and the Exposure Time recorded in meta-data play the most critical role, bringing distinct performance gains in various measures over approaches that take sRGB images as input. With the insights obtained from the benchmark results in mind, a RAW-guiding Exposure Enhancement Network (REENet) is developed, which trades off the advantages of RAW images against their inaccessibility in real applications by using RAW images only in the training phase. REENet projects sRGB images into linear RAW domains to apply constraints with corresponding RAW images, reducing the difficulty of model training. After that, in the testing phase, REENet does not rely on RAW images. Experimental results demonstrate not only the superiority of REENet to state-of-the-art sRGB-based methods but also the effectiveness of the RAW guidance and all components.
Submitted 28 December, 2021;
originally announced December 2021.
-
Optimal UAV Hitching on Ground Vehicles
Authors:
Lihua Ruan,
Lingjie Duan,
Jianwei Huang
Abstract:
Due to its mobility and agility, the unmanned aerial vehicle (UAV) has emerged as a promising technology for various tasks, such as sensing, inspection and delivery. However, a typical UAV has limited energy storage and cannot fly a long distance without being recharged. This motivates several existing proposals to use trucks and other ground vehicles to offer rides that help UAVs save energy and expand the operation radius. We present the first theoretical study regarding how UAVs should optimally hitch on ground vehicles, considering vehicles' different travelling patterns and supporting capabilities. For a single UAV, we derive a closed-form optimal vehicle selection and hitching strategy. When vehicles only support hitching, a UAV would prefer the vehicle that can carry it closest to its final destination. When vehicles can offer hitching plus charging, the UAV may hitch on a vehicle that carries it farther away from its destination and hitch a longer distance. The UAV may also prefer to hitch on a slower vehicle for the benefit of battery recharging. For multiple UAVs in need of hitching, we develop the max-saving algorithm (MSA) to optimally match UAV-vehicle collaboration. We prove that the MSA globally optimizes the total hitching benefits for the UAVs.
Submitted 24 August, 2021;
originally announced August 2021.
-
Real-Time Video Super-Resolution on Smartphones with Deep Learning, Mobile AI 2021 Challenge: Report
Authors:
Andrey Ignatov,
Andres Romero,
Heewon Kim,
Radu Timofte,
Chiu Man Ho,
Zibo Meng,
Kyoung Mu Lee,
Yuxiang Chen,
Yutong Wang,
Zeyu Long,
Chenhao Wang,
Yifei Chen,
Boshen Xu,
Shuhang Gu,
Lixin Duan,
Wen Li,
Wang Bofei,
Zhang Diankai,
Zheng Chengjian,
Liu Shaoli,
Gao Si,
Zhang Xiaofeng,
Lu Kaidi,
Xu Tianyu,
Zheng Hui
, et al. (6 additional authors not shown)
Abstract:
Video super-resolution has recently become one of the most important mobile-related problems due to the rise of video communication and streaming services. While many solutions have been proposed for this task, the majority of them are too computationally expensive to run on portable devices with limited hardware resources. To address this problem, we introduce the first Mobile AI challenge, where the target is to develop end-to-end deep learning-based video super-resolution solutions that can achieve real-time performance on mobile GPUs. The participants were provided with the REDS dataset and trained their models to do efficient 4X video upscaling. The runtime of all models was evaluated on the OPPO Find X2 smartphone with the Snapdragon 865 SoC capable of accelerating floating-point networks on its Adreno GPU. The proposed solutions are fully compatible with any mobile GPU and can upscale videos to HD resolution at up to 80 FPS while demonstrating high fidelity results. A detailed description of all models developed in the challenge is provided in this paper.
Submitted 17 May, 2021;
originally announced May 2021.
-
Balanced Order Batching with Task-Oriented Graph Clustering
Authors:
Lu Duan,
Haoyuan Hu,
Zili Wu,
Guozheng Li,
Xinhang Zhang,
Yu Gong,
Yinghui Xu
Abstract:
The balanced order batching problem (BOBP) arises from the process of warehouse picking in Cainiao, the largest logistics platform in China. Batching orders together in the picking process to form a single picking route reduces travel distance. This matters because order picking is a labor-intensive process and, by using good batching methods, substantial savings can be obtained. The BOBP is an NP-hard combinatorial optimization problem, and designing a good problem-specific heuristic under the quasi-real-time system response requirement is non-trivial. In this paper, rather than designing heuristics, we propose an end-to-end learning and optimization framework named Balanced Task-orientated Graph Clustering Network (BTOGCN) to solve the BOBP by reducing it to a balanced graph clustering optimization problem. In BTOGCN, a task-oriented estimator network is introduced to guide the type-aware heterogeneous graph clustering networks to find a better clustering result related to the BOBP objective. Through comprehensive experiments on single-graph and multi-graph sets, we show that: 1) our balanced task-oriented graph clustering network can directly utilize the guidance of the target signal and outperforms the two-stage deep embedding and deep clustering method; 2) our method achieves average picking-distance reductions of 4.57 m and 0.13 m relative to the expert-designed algorithm on the single-graph and multi-graph sets, respectively, and generalizes well to practical scenarios.
Submitted 19 August, 2020;
originally announced August 2020.
-
Deep Controllable Backlight Dimming
Authors:
Lvyin Duan,
Demetris Marnerides,
Alan Chalmers,
Zhichun Lei,
Kurt Debattista
Abstract:
Dual-panel displays require local dimming algorithms in order to reproduce content with high fidelity and high dynamic range. In this work, a novel deep learning based local dimming method is proposed for rendering HDR images on dual-panel HDR displays. The method uses a Convolutional Neural Network to predict backlight values, using as input the HDR image that is to be displayed. The model is designed and trained via a controllable power parameter that allows a user to trade off between power and quality. The proposed method is evaluated against six other methods on a test set of 105 HDR images, using a variety of quantitative quality metrics. Results demonstrate improved display quality and better power consumption when using the proposed method compared to the best alternatives.
Submitted 19 August, 2020;
originally announced August 2020.
-
Hardware Accelerator for Adversarial Attacks on Deep Learning Neural Networks
Authors:
Haoqiang Guo,
Lu Peng,
Jian Zhang,
Fang Qi,
Lide Duan
Abstract:
Recent studies identify that Deep learning Neural Networks (DNNs) are vulnerable to subtle perturbations, which are not perceptible to the human visual system but can fool the DNN models and lead to wrong outputs. A class of adversarial attack network algorithms has been proposed to generate robust physical perturbations under different circumstances. These algorithms are the first efforts to advance secure deep learning by providing an avenue to train future defense networks; however, their intrinsic complexity prevents their broader usage.
In this paper, we propose the first hardware accelerator for adversarial attacks based on memristor crossbar arrays. Our design significantly improves the throughput of a visual adversarial perturbation system, which can further improve the robustness and security of future deep learning systems. Based on the algorithm uniqueness, we propose four implementations for the adversarial attack accelerator ($A^3$) to improve the throughput, energy efficiency, and computational efficiency.
Submitted 3 August, 2020;
originally announced August 2020.
-
Dynamic Pricing and Mean Field Analysis for Controlling Age of Information
Authors:
Xuehe Wang,
Lingjie Duan
Abstract:
Today many mobile users in various zones are invited to sense and send back real-time useful information (e.g., traffic observation and sensor data) to keep the content updates in such zones fresh. However, due to the sampling cost in sensing and transmission, a user may not have the incentive to contribute the real-time information to help reduce the age of information (AoI). We propose dynamic pricing for each zone to offer age-dependent monetary returns and encourage users to sample information at different rates over time. This dynamic pricing design problem must balance the monetary payments as rewards to users against the AoI evolution over time, and is challenging to solve, especially under incomplete information about users' arrivals and their private sampling costs. After formulating the problem as a nonlinear constrained dynamic program, to avoid the curse of dimensionality, we first propose to approximate the dynamic AoI reduction as a time-average term and successfully solve the approximate dynamic pricing in closed-form. Further, by providing the steady-state analysis for an infinite time horizon, we show that the pricing scheme (though in closed-form) can be further simplified to an $\varepsilon$-optimal version without recursive computing over time. Finally, we extend the AoI control from a single zone to many zones with heterogeneous user arrival rates and initial ages, where each zone cares not only about its own AoI dynamics but also about the average AoI of all the zones in a mean field game system to provide a holistic service. Accordingly, we propose decentralized mean field pricing for each zone to self-operate by using a mean field term to estimate the average age dynamics of all the zones, which does not even require the zones to exchange their local data with each other.
Submitted 17 May, 2021; v1 submitted 18 April, 2020;
originally announced April 2020.
-
Cooperative Double-IRS Aided Communication: Beamforming Design and Power Scaling
Authors:
Yitao Han,
Shuowen Zhang,
Lingjie Duan,
Rui Zhang
Abstract:
Intelligent reflecting surface (IRS) is a promising technology to support high performance wireless communication. By adaptively configuring the reflection amplitude and/or phase of each passive reflecting element on it, the IRS can reshape the electromagnetic environment in favour of signal transmission. This letter advances the existing research by proposing and analyzing a double-IRS aided wireless communication system. Under the reasonable assumption that the reflection channel from IRS 1 to IRS 2 is of rank 1 (e.g., line-of-sight channel), we propose a joint passive beamforming design for the two IRSs. Based on this, we show that deploying two cooperative IRSs with in total K elements can yield a power gain of order O(K^4), which greatly outperforms the case of deploying one traditional IRS with a power gain of order O(K^2). Our simulation results validate that the performance of deploying two cooperative IRSs is significantly better than that of deploying one IRS given a sufficient total number of IRS elements. We also extend our line-of-sight channel model to show how different channel models affect the performance of the double-IRS aided wireless communication system.
Submitted 4 April, 2020;
originally announced April 2020.
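The O(K^4) versus O(K^2) scaling stated in the abstract above can be illustrated with a minimal sketch. The normalization is an assumption made here, not taken from the paper: every per-element path has unit gain, path loss is ignored, and the K elements are split equally between the two IRSs. The function names are hypothetical.

```python
def single_irs_power_gain(k: int) -> float:
    """Idealized power gain of one IRS with k elements.

    Assumption: k reflected paths add coherently with unit gain,
    so the amplitude scales as k and the power as k^2.
    """
    return float(k) ** 2


def double_irs_power_gain(k: int) -> float:
    """Idealized power gain of two cooperative IRSs with k/2 elements each.

    Assumption: the double-reflection link has (k/2) * (k/2) coherent
    unit-gain paths, so the amplitude scales as k^2 / 4 and the power
    as k^4 / 16.
    """
    half = k / 2
    return (half * half) ** 2


if __name__ == "__main__":
    # Compare the two idealized gains for a few total element counts.
    for k in (2, 4, 8, 64, 512):
        single = single_irs_power_gain(k)
        double = double_irs_power_gain(k)
        print(f"K={k:4d}  single={single:.3e}  double={double:.3e}")
```

Under these simplifying assumptions the double-IRS gain overtakes the single-IRS gain only once K is large enough, which is in line with the abstract's qualifier "given a sufficient total number of IRS elements". In a real deployment the crossover point would also depend on the path loss of each link, which this sketch omits.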
-
Towards Reliable UAV Swarm Communication in D2D-Enhanced Cellular Network
Authors:
Yitao Han,
Liang Liu,
Lingjie Duan,
Rui Zhang
Abstract:
In the existing cellular networks, it remains a challenging problem to communicate with and control an unmanned aerial vehicle (UAV) swarm with both high reliability and low latency. Due to the UAV swarm's high working altitude and strong ground-to-air channels, it is generally exposed to multiple ground base stations (GBSs), while the GBSs that are serving ground users (occupied GBSs) can generate strong interference to the UAV swarm. To tackle this issue, we propose a novel two-phase transmission protocol by exploiting cellular plus device-to-device (D2D) communication for the UAV swarm. In Phase I, one swarm head is chosen for ground-to-air channel estimation, and all the GBSs that are not serving ground users (available GBSs) transmit a common control message to the UAV swarm simultaneously, using the same cellular frequency band, to combat the strong interference from occupied GBSs. In Phase II, all the UAVs that have decoded the common control message in Phase I further relay it to the other UAVs in the swarm via D2D communication, by exploiting the less interfered D2D frequency band and the proximity among UAVs. In this paper, we aim to characterize the reliability performance of the above two-phase protocol, i.e., the expected percentage of UAVs in the swarm that can decode the common control message, which is a non-trivial problem due to the complex system setup and the intricate coupling between the two phases. Nevertheless, we manage to obtain an approximated expression of the reliability performance of interest, under reasonable assumptions and with the aid of the Pearson distributions. Numerical results validate the accuracy of our analytical results and show the effectiveness of our protocol over other benchmark protocols. We also study the effect of key system parameters on the reliability performance, to reveal useful insights on the practical system design.
Submitted 12 February, 2020;
originally announced February 2020.
-
An Emerging Coding Paradigm VCM: A Scalable Coding Approach Beyond Feature and Signal
Authors:
Sifeng Xia,
Kunchangtai Liang,
Wenhan Yang,
Ling-Yu Duan,
Jiaying Liu
Abstract:
In this paper, we study a new problem arising from the emerging MPEG standardization effort Video Coding for Machine (VCM), which aims to bridge the gap between visual feature compression and classical video coding. VCM is committed to addressing the requirement of compact signal representation for both machine and human vision in a more or less scalable way. To this end, we leverage the strengths of predictive and generative models to support advanced compression techniques for both machine and human vision tasks simultaneously, in which visual features serve as a bridge connecting signal-level and task-level compact representations in a scalable manner. Specifically, we employ a conditional deep generation network to reconstruct video frames under the guidance of learned motion patterns. By learning to extract sparse motion patterns via a predictive model, the network leverages the feature representation to generate the appearance of to-be-coded frames via a generative model, relying on the appearance of the coded key frames. Meanwhile, the sparse motion patterns are compact and highly effective for high-level vision tasks, e.g., action recognition. Experimental results demonstrate that our method yields much better reconstruction quality than traditional video codecs (0.0063 gain in SSIM), as well as state-of-the-art action recognition performance on highly compressed videos (9.4% gain in recognition accuracy), which showcases a promising paradigm of coding signals for both human and machine vision.
Submitted 9 January, 2020;
originally announced January 2020.
-
Jamming-assisted Proactive Eavesdropping over Two Suspicious Communication Links
Authors:
Haiyang Zhang,
Lingjie Duan,
Rui Zhang
Abstract:
This paper studies a new and challenging wireless surveillance problem where a legitimate monitor attempts to eavesdrop on two suspicious communication links simultaneously. To facilitate concurrent eavesdropping, our multi-antenna legitimate monitor employs a proactive eavesdropping-via-jamming approach, selectively jamming suspicious receivers to lower the transmission rates of the target links. In particular, we are interested in characterizing the achievable eavesdropping rate region for the minimum-mean-squared-error (MMSE) receiver case, by optimizing the legitimate monitor's jamming transmit covariance matrix subject to its power budget. As the monitor cannot hear more than what the suspicious links transmit, the achievable eavesdropping rate region is essentially the intersection of the achievable rate region for the two suspicious links and that for the two eavesdropping links. The former region can be purposely altered by the monitor's jamming transmit covariance matrix, whereas the latter region is fixed when the MMSE receiver is employed. Therefore, we first analytically characterize the achievable rate region for the two suspicious links by optimizing the jamming transmit covariance matrix and then obtain the achievable eavesdropping rate region for the MMSE receiver case. Furthermore, we extend our study to the MMSE with successive interference cancellation (MMSE-SIC) receiver case and characterize the corresponding achievable eavesdropping rate region by jointly optimizing the time-sharing factor between different decoding orders. Finally, numerical results are provided to corroborate our analysis and examine the eavesdropping performance.
Submitted 29 July, 2019;
originally announced July 2019.
-
Deep learning enables extraction of capillary-level angiograms from single OCT volume
Authors:
Jianlong Yang,
Peng Liu,
Lixin Duan,
Yan Hu,
Jiang Liu
Abstract:
Optical coherence tomography angiography (OCTA) has drawn considerable attention in ophthalmology. However, its data acquisition is time-consuming because it is based on the temporal-decorrelation principle and thus requires multiple repeated volumetric OCT scans. In this paper, we develop a deep learning algorithm that combines a fovea attention mechanism with a residual neural network and is able to extract capillary-level angiograms directly from a single OCT scan. The segmentation results of the inner limiting membrane and outer plexiform layers and the central $1\times1$ mm$^2$ field of view of the fovea are employed in the fovea attention mechanism, so that the influence of large retinal vessels and choroidal vasculature on the extraction of capillaries can be minimized during the training of the network. The results demonstrate that the proposed algorithm visualizes capillaries around the foveal avascular zone better than the existing work using a U-Net architecture.
Submitted 14 October, 2019; v1 submitted 17 June, 2019;
originally announced June 2019.
-
Economic Analysis of Rollover and Shared Data Plans
Authors:
Xuehe Wang,
Lingjie Duan
Abstract:
In today's growing data market, wireless service providers (WSPs) compete fiercely to attract users by announcing innovative data plans. Two of the most popular innovative data plans are rollover and shared data plans: the former allows a user to carry his unused data quota over to the next month, and the latter allows users in a family to share unused data. As a pioneer in providing such data plans, a WSP faces an immediate revenue loss from existing users, who pay lower overage charges due to less data over-usage, but its market share increases gradually as it attracts new users and users of the other WSPs. In some countries, WSPs offer such innovative data plans with asymmetric timing, while WSPs in other markets have symmetric timing or do not plan to offer them at all. This raises the question of why and when competitive WSPs should offer the new data plans. This paper provides game-theoretic modelling and analysis of the WSPs' timing of offering innovative data plans, considering new user arrivals and dynamic user churn between WSPs. Our equilibrium analysis shows that the WSP with a small market share prefers to announce the innovative data plan first to attract more users, while the WSP with a large market share prefers to announce later to avoid the immediate revenue loss. In a market with many new users, WSPs with similar market shares will offer the data plans simultaneously, but such WSPs facing few new users may not offer any new plan. Perhaps surprisingly, WSPs' profits can decrease with the number of new users, and they may not benefit from the option of innovative data plans. Finally, we show that, unlike for the rollover data plan, the timing of the shared data plan further depends on the composition of users.
Submitted 29 March, 2019;
originally announced April 2019.
-
Dynamic Pricing and Capacity Allocation of UAV-provided Mobile Services
Authors:
Xuehe Wang,
Lingjie Duan
Abstract:
Due to its agility and mobility, the unmanned aerial vehicle (UAV) is a promising technology to provide high-quality mobile services (e.g., fast Internet access, edge computing, and local caching) to ground users. Major Internet Service Providers (ISPs) want to enable UAV-provided services (UPS) to improve and enrich the current mobile services for additional profit. This profit-maximization problem is not easy, as the UAV has limited energy storage and needs to fly close to users to serve them, requiring an optimal energy allocation that balances hovering time and service capacity. A further question is how the UAV, when hovering in a hotspot, should dynamically price its capacity-limited UPS for randomly arriving users with private service valuations. We prove that the UAV should ask for a higher price if the leftover hovering time is longer or its service capacity is smaller, and that its expected profit approaches that under complete user information if the hovering time is sufficiently large. As the hotspot's user occurrence rate increases, a shorter hovering time or a larger service capacity should be allocated. Finally, when the UAV faces multiple hotspot candidates with different user occurrence rates and flying distances, we prove that it is optimal to deploy the UAV to serve a single hotspot. With multiple UAVs, however, this result can be reversed, with the UAVs forking to different hotspots.
Submitted 7 December, 2018;
originally announced December 2018.
-
Wireless Power Transfer and Data Collection in Wireless Sensor Networks
Authors:
Kai Li,
Wei Ni,
Lingjie Duan,
Mehran Abolhasan,
Jianwei Niu
Abstract:
In a rechargeable wireless sensor network, data packets are generated by sensor nodes at a specific data rate and transmitted to a base station. Moreover, the base station transfers power to the nodes using Wireless Power Transfer (WPT) to extend their battery life. However, inadequate scheduling of WPT and data collection causes some of the nodes to drain their batteries and have their data buffers overflow, while the other nodes waste harvested energy in excess of what they need to transmit their packets. In this paper, we investigate a novel optimal scheduling strategy, called EHMDP, which aims to minimize data packet loss from a network of sensor nodes based on the nodes' energy consumption and data queue state information. The scheduling problem is first formulated as a centralized MDP model, assuming that the complete states of each node are known to the base station. This gives an upper bound on the data that can be collected in a rechargeable wireless sensor network. Next, we relax the assumption that full state information is available, so that the data transmission and WPT can be semi-decentralized. The simulation results show that the proposed algorithm significantly improves network performance in terms of network throughput and packet loss rate.
Submitted 14 November, 2017; v1 submitted 3 November, 2017;
originally announced November 2017.