Showing 1–50 of 73 results for author: Cai, J

Searching in archive eess.
  1. arXiv:2506.23184  [pdf, ps, other]

    eess.IV cs.AI cs.CV

    Score-based Diffusion Model for Unpaired Virtual Histology Staining

    Authors: Anran Liu, Xiaofei Wang, Jing Cai, Chao Li

    Abstract: Hematoxylin and eosin (H&E) staining visualizes histology but lacks specificity for diagnostic markers. Immunohistochemistry (IHC) staining provides protein-targeted staining but is restricted by tissue availability and antibody specificity. Virtual staining, i.e., computationally translating the H&E image to its IHC counterpart while preserving the tissue structure, is promising for efficient IHC…

    Submitted 29 June, 2025; originally announced June 2025.

    Comments: 11 pages, 3 figures

  2. arXiv:2505.06118  [pdf, ps, other]

    eess.IV cs.AI cs.CV

    The Application of Deep Learning for Lymph Node Segmentation: A Systematic Review

    Authors: Jingguo Qu, Xinyang Han, Man-Lik Chui, Yao Pu, Simon Takadiyi Gunda, Ziman Chen, Jing Qin, Ann Dorothy King, Winnie Chiu-Wing Chu, Jing Cai, Michael Tin-Cheung Ying

    Abstract: Automatic lymph node segmentation is the cornerstone for advances in computer vision tasks for early detection and staging of cancer. Traditional segmentation methods are constrained by manual delineation and variability in operator proficiency, limiting their ability to achieve high accuracy. The introduction of deep learning technologies offers new possibilities for improving the accuracy of lym…

    Submitted 9 May, 2025; originally announced May 2025.

  3. arXiv:2503.12399  [pdf, other]

    cs.CV eess.IV

    Pathology Image Restoration via Mixture of Prompts

    Authors: Jiangdong Cai, Yan Chen, Zhenrong Shen, Haotian Jiang, Honglin Xiong, Kai Xuan, Lichi Zhang, Qian Wang

    Abstract: In digital pathology, acquiring all-in-focus images is essential to high-quality imaging and a highly efficient clinical workflow. Traditional scanners achieve this by scanning at multiple focal planes of varying depths and then merging them, which is relatively slow and often struggles with complex tissue defocus. Recent image restoration techniques provide a means to restore high-quality…

    Submitted 16 March, 2025; originally announced March 2025.

  4. arXiv:2503.06563  [pdf, other]

    eess.IV cs.AI cs.CV

    LSA: Latent Style Augmentation Towards Stain-Agnostic Cervical Cancer Screening

    Authors: Jiangdong Cai, Haotian Jiang, Zhenrong Shen, Yonghao Li, Honglin Xiong, Lichi Zhang, Qian Wang

    Abstract: The deployment of computer-aided diagnosis systems for cervical cancer screening using whole slide images (WSIs) faces critical challenges due to domain shifts caused by staining variations across different scanners and imaging environments. While existing stain augmentation methods improve patch-level robustness, they fail to scale to WSIs due to two key limitations: (1) inconsistent stain patter…

    Submitted 9 March, 2025; originally announced March 2025.

  5. arXiv:2503.05678  [pdf, other]

    eess.IV cs.CV

    Towards Effective and Efficient Context-aware Nucleus Detection in Histopathology Whole Slide Images

    Authors: Zhongyi Shui, Ruizhe Guo, Honglin Li, Yuxuan Sun, Yunlong Zhang, Chenglu Zhu, Jiatong Cai, Pingyi Chen, Yanzhou Su, Lin Yang

    Abstract: Nucleus detection in histopathology whole slide images (WSIs) is crucial for a broad spectrum of clinical applications. Current approaches for nucleus detection in gigapixel WSIs utilize a sliding window methodology, which overlooks broader contextual information (e.g., tissue structure) and easily leads to inaccurate predictions. To address this problem, recent studies additionally crop a large Fi…

    Submitted 3 March, 2025; originally announced March 2025.

    Comments: under review

  6. arXiv:2502.20018  [pdf, other]

    cs.RO cs.CV eess.IV

    Multi-Keypoint Affordance Representation for Functional Dexterous Grasping

    Authors: Fan Yang, Dongsheng Luo, Wenrui Chen, Jiacheng Lin, Junjie Cai, Kailun Yang, Zhiyong Li, Yaonan Wang

    Abstract: Functional dexterous grasping requires precise hand-object interaction, going beyond simple gripping. Existing affordance-based methods primarily predict coarse interaction regions and cannot directly constrain the grasping posture, leading to a disconnection between visual perception and manipulation. To address this issue, we propose a multi-keypoint affordance representation for functional dext…

    Submitted 27 February, 2025; originally announced February 2025.

    Comments: The source code and demo videos will be publicly available at https://github.com/PopeyePxx/MKA

  7. arXiv:2502.09656  [pdf, other]

    q-bio.QM cs.CV eess.IV

    Multi-Omics Fusion with Soft Labeling for Enhanced Prediction of Distant Metastasis in Nasopharyngeal Carcinoma Patients after Radiotherapy

    Authors: Jiabao Sheng, SaiKit Lam, Jiang Zhang, Yuanpeng Zhang, Jing Cai

    Abstract: Omics fusion has emerged as a crucial preprocessing approach in the field of medical image processing, providing significant assistance to several studies. One of the challenges encountered in the integration of omics data is the presence of unpredictability arising from disparities in data sources and medical imaging equipment. In order to overcome this challenge and facilitate the integration of…

    Submitted 12 February, 2025; originally announced February 2025.

    Journal ref: Computers in Biology and Medicine, 168, 107684 (2024)

  8. arXiv:2501.02000  [pdf, other]

    eess.IV cs.AI cs.CV

    Multi-Center Study on Deep Learning-Assisted Detection and Classification of Fetal Central Nervous System Anomalies Using Ultrasound Imaging

    Authors: Yang Qi, Jiaxin Cai, Jing Lu, Runqing Xiong, Rongshang Chen, Liping Zheng, Duo Ma

    Abstract: Prenatal ultrasound evaluates fetal growth and detects congenital abnormalities during pregnancy, but the examination of ultrasound images by radiologists requires expertise and sophisticated equipment, which would otherwise fail to improve the rate of identifying specific types of fetal central nervous system (CNS) abnormalities and result in unnecessary patient examinations. We construct a deep…

    Submitted 1 January, 2025; originally announced January 2025.

  9. SCKF-LSTM Based Trajectory Tracking for Electricity-Gas Integrated Energy System

    Authors: Liang Chen, Yang Li, Jun Cai, Songlin Gu, Ying Yan

    Abstract: This paper introduces a novel approach for tracking the dynamic trajectories of integrated natural gas and power systems, leveraging a Kalman filter-based structure. To predict the states of the system, Holt's exponential smoothing technique and nonlinear dynamic equations of gas pipelines are applied to establish the power and gas system equations, respectively. The square-root cubature Kalm…

    Submitted 24 December, 2024; originally announced December 2024.

    Comments: Accepted by IEEE Transactions on Industrial Informatics

    Journal ref: IEEE Transactions on Industrial Informatics 21 (2025) 4296-4305

  10. arXiv:2411.15130  [pdf, other]

    cs.RO eess.SY

    Learning-based Trajectory Tracking for Bird-inspired Flapping-Wing Robots

    Authors: Jiaze Cai, Vishnu Sangli, Mintae Kim, Koushil Sreenath

    Abstract: Bird-sized flapping-wing robots offer significant potential for agile flight in complex environments, but achieving agile and robust trajectory tracking remains a challenge due to the complex aerodynamics and highly nonlinear dynamics inherent in flapping-wing flight. In this work, a learning-based control approach is introduced to unlock the versatility and adaptiveness of flapping-wing flight. W…

    Submitted 22 November, 2024; originally announced November 2024.

  11. arXiv:2411.12363  [pdf, other]

    cs.SD eess.AS

    DGSNA: prompt-based Dynamic Generative Scene-based Noise Addition method

    Authors: Zihao Chen, Zhentao Lin, Bi Zeng, Linyi Huang, Zhi Li, Jia Cai

    Abstract: To ensure the reliable operation of speech systems across diverse environments, noise addition methods have emerged as the prevailing solution. However, existing methods offer limited coverage of real-world noisy scenes and depend on pre-existing scene-based information and noise. This paper presents prompt-based Dynamic Generative Scene-based Noise Addition (DGSNA), a novel noise addition methodo…

    Submitted 26 May, 2025; v1 submitted 19 November, 2024; originally announced November 2024.

  12. arXiv:2411.08896  [pdf, other]

    eess.SP cs.LG cs.NI

    Demand-Aware Beam Hopping and Power Allocation for Load Balancing in Digital Twin empowered LEO Satellite Networks

    Authors: Ruili Zhao, Jun Cai, Jiangtao Luo, Junpeng Gao, Yongyi Ran

    Abstract: Low-Earth orbit (LEO) satellites utilizing beam hopping (BH) technology offer extensive coverage, low latency, high bandwidth, and significant flexibility. However, the uneven geographical distribution and temporal variability of ground traffic demands, combined with the high mobility of LEO satellites, present significant challenges for efficient beam resource utilization. Traditional BH methods…

    Submitted 28 October, 2024; originally announced November 2024.

  13. arXiv:2410.23738  [pdf, other]

    eess.IV cs.CV

    MLLA-UNet: Mamba-like Linear Attention in an Efficient U-Shape Model for Medical Image Segmentation

    Authors: Yufeng Jiang, Zongxi Li, Xiangyan Chen, Haoran Xie, Jing Cai

    Abstract: Recent advancements in medical imaging have resulted in more complex and diverse images, with challenges such as high anatomical variability, blurred tissue boundaries, low organ contrast, and noise. Traditional segmentation methods struggle to address these challenges, making deep learning approaches, particularly U-shaped architectures, increasingly prominent. However, the quadratic complexity o…

    Submitted 31 October, 2024; originally announced October 2024.

  14. arXiv:2410.16817  [pdf]

    physics.med-ph eess.IV

    A Deep Learning-Based Method for Metal Artifact-Resistant Syn-MP-RAGE Contrast Synthesis

    Authors: Ziyi Zeng, Yuhao Wang, Dianlin Hu, T. Michael O'Shea, Rebecca C. Fry, Jing Cai, Lei Zhang

    Abstract: In certain brain volumetric studies, synthetic T1-weighted magnetization-prepared rapid gradient-echo (MP-RAGE) contrast, derived from quantitative T1 MRI (T1-qMRI), proves highly valuable due to its clear white/gray matter boundaries for brain segmentation. However, generating synthetic MP-RAGE (syn-MP-RAGE) typically requires pairs of high-quality, artifact-free, multi-modality inputs, which can…

    Submitted 22 October, 2024; originally announced October 2024.

    Comments: 11 pages, 8 figures, 2 tables

  15. arXiv:2410.09289  [pdf, other]

    cs.SD cs.AI cs.LG eess.AS

    Multimodal Audio-based Disease Prediction with Transformer-based Hierarchical Fusion Network

    Authors: Jinjin Cai, Ruiqi Wang, Dezhong Zhao, Ziqin Yuan, Victoria McKenna, Aaron Friedman, Rachel Foot, Susan Storey, Ryan Boente, Sudip Vhaduri, Byung-Cheol Min

    Abstract: Audio-based disease prediction is emerging as a promising supplement to traditional medical diagnosis methods, facilitating early, convenient, and non-invasive disease detection and prevention. Multimodal fusion, which integrates features from various domains within or across bio-acoustic modalities, has proven effective in enhancing diagnostic performance. However, most existing methods in the fi…

    Submitted 14 December, 2024; v1 submitted 11 October, 2024; originally announced October 2024.

  16. arXiv:2409.15961  [pdf, ps, other]

    cs.NI eess.SP

    Toward Scalable and Efficient Visual Data Transmission in 6G Networks

    Authors: Junhao Cai, Taegun An, Changhee Joo

    Abstract: 6G network technology will emerge in a landscape where visual data transmissions dominate global mobile traffic and are expected to grow continuously, driven by the increasing demand for AI-based computer vision applications. This will make the already challenging task of visual data transmission even more difficult. In this work, we review effective techniques for visual data transmission, such as co…

    Submitted 24 September, 2024; originally announced September 2024.

  17. arXiv:2408.03361  [pdf, other]

    eess.IV cs.CV

    GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI

    Authors: Pengcheng Chen, Jin Ye, Guoan Wang, Yanjun Li, Zhongying Deng, Wei Li, Tianbin Li, Haodong Duan, Ziyan Huang, Yanzhou Su, Benyou Wang, Shaoting Zhang, Bin Fu, Jianfei Cai, Bohan Zhuang, Eric J Seibel, Junjun He, Yu Qiao

    Abstract: Large Vision-Language Models (LVLMs) are capable of handling diverse data types such as imaging, text, and physiological signals, and can be applied in various fields. In the medical field, LVLMs have a high potential to offer substantial assistance for diagnosis and treatment. Before that, it is crucial to develop benchmarks to evaluate LVLMs' effectiveness in various medical applications. Curren…

    Submitted 21 October, 2024; v1 submitted 6 August, 2024; originally announced August 2024.

    Comments: GitHub: https://github.com/uni-medical/GMAI-MMBench Hugging face: https://huggingface.co/datasets/OpenGVLab/GMAI-MMBench

  18. arXiv:2407.06662  [pdf, other]

    eess.SP

    Experimental Demonstration of 16D Voronoi Constellation with Two-Level Coding over 50km Four-Core Fiber

    Authors: Can Zhao, Bin Chen, Jiaqi Cai, Zhiwei Liang, Yi Lei, Junjie Xiong, Lin Ma, Daohui Hu, Lin Sun, Gangxiang Shen

    Abstract: A 16-dimensional Voronoi constellation concatenated with multilevel coding is experimentally demonstrated over a 50km four-core fiber transmission system. The proposed scheme reduces the required launch power by 6dB and provides a 17dB larger operating range than 16QAM with BICM at the outer HD-FEC BER threshold.

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 4 pages, 4 figures, accepted by 2024 European Conference on Optical Communication (ECOC)

  19. arXiv:2407.06612  [pdf]

    eess.IV cs.CV cs.LG

    AI-based Automatic Segmentation of Prostate on Multi-modality Images: A Review

    Authors: Rui Jin, Derun Li, Dehui Xiang, Lei Zhang, Hailing Zhou, Fei Shi, Weifang Zhu, Jing Cai, Tao Peng, Xinjian Chen

    Abstract: Prostate cancer represents a major threat to health. Early detection is vital in reducing the mortality rate among prostate cancer patients. One approach involves using multi-modality (CT, MRI, US, etc.) computer-aided diagnosis (CAD) systems for the prostate region. However, prostate segmentation is challenging due to imperfections in the images and the prostate's complex tissue structure. The ad…

    Submitted 9 July, 2024; originally announced July 2024.

  20. arXiv:2407.01469  [pdf, other]

    eess.IV

    Unrolling Plug-and-Play Gradient Graph Laplacian Regularizer for Image Restoration

    Authors: Jianghe Cai, Gene Cheung, Fei Chen

    Abstract: Generic deep learning (DL) networks for image restoration like denoising and interpolation lack mathematical interpretability, require voluminous training data to tune a large parameter set, and are fragile in the face of covariate shift. To address these shortcomings, we build interpretable networks by unrolling variants of a graph-based optimization algorithm of different complexities. Specifica…

    Submitted 12 March, 2025; v1 submitted 1 July, 2024; originally announced July 2024.

  21. arXiv:2404.09000  [pdf, other]

    eess.IV cs.CV cs.LG

    MaSkel: A Model for Human Whole-body X-rays Generation from Human Masking Images

    Authors: Yingjie Xi, Boyuan Cheng, Jingyao Cai, Jian Jun Zhang, Xiaosong Yang

    Abstract: The human whole-body X-rays could offer a valuable reference for various applications, including medical diagnostics, digital animation modeling, and ergonomic design. The traditional method of obtaining X-ray information requires the use of CT (Computed Tomography) scan machines, which emit potentially harmful radiation. Thus it faces a significant limitation for realistic applications because it…

    Submitted 13 April, 2024; originally announced April 2024.

  22. arXiv:2402.18936  [pdf, ps, other]

    cs.NI eess.SP

    Energy-Efficient UAV Swarm Assisted MEC with Dynamic Clustering and Scheduling

    Authors: Jialiuyuan Li, Jiayuan Chen, Changyan Yi, Tong Zhang, Kun Zhu, Jun Cai

    Abstract: In this paper, the energy-efficient unmanned aerial vehicle (UAV) swarm assisted mobile edge computing (MEC) with dynamic clustering and scheduling is studied. In the considered system model, UAVs are divided into multiple swarms, with each swarm consisting of a leader UAV and several follower UAVs to provide computing services to end-users. Unlike existing work, we allow UAVs to dynamically clust…

    Submitted 29 February, 2024; originally announced February 2024.

  23. arXiv:2310.11641  [pdf]

    eess.IV cs.AI physics.med-ph

    Cloud-Magnetic Resonance Imaging System: In the Era of 6G and Artificial Intelligence

    Authors: Yirong Zhou, Yanhuang Wu, Yuhan Su, Jing Li, Jianyun Cai, Yongfu You, Di Guo, Xiaobo Qu

    Abstract: Magnetic Resonance Imaging (MRI) plays an important role in medical diagnosis, generating petabytes of image data annually in large hospitals. This voluminous data stream requires a significant amount of network bandwidth and extensive storage infrastructure. Additionally, local data processing demands substantial manpower and hardware investments. Data isolation across different healthcare instit…

    Submitted 17 October, 2023; originally announced October 2023.

    Comments: 4 pages, 5 figures, letters

  24. arXiv:2310.01176  [pdf, other]

    eess.IV cs.CV

    Cross-adversarial local distribution regularization for semi-supervised medical image segmentation

    Authors: Thanh Nguyen-Duc, Trung Le, Roland Bammer, He Zhao, Jianfei Cai, Dinh Phung

    Abstract: Medical semi-supervised segmentation is a technique where a model is trained to segment objects of interest in medical images with limited annotated data. Existing semi-supervised segmentation methods are usually based on the smoothness assumption. This assumption implies that the model output distributions of two similar data samples are encouraged to be invariant. In other words, the smoothness…

    Submitted 2 October, 2023; originally announced October 2023.

    Comments: MICCAI 2023

  25. arXiv:2309.02670  [pdf, other]

    eess.IV cs.CV

    Progressive Attention Guidance for Whole Slide Vulvovaginal Candidiasis Screening

    Authors: Jiangdong Cai, Honglin Xiong, Maosong Cao, Luyan Liu, Lichi Zhang, Qian Wang

    Abstract: Vulvovaginal candidiasis (VVC) is the most prevalent human candidal infection, estimated to afflict approximately 75% of all women at least once in their lifetime. It will lead to several symptoms including pruritus, vaginal soreness, and so on. Automatic whole slide image (WSI) classification is highly demanded, for the huge burden of disease control and prevention. However, the WSI-based compute…

    Submitted 5 September, 2023; originally announced September 2023.

    Comments: Accepted in the main conference MICCAI 2023

    Journal ref: 26th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2023)

  26. arXiv:2307.04513  [pdf, other]

    eess.IV cs.CV

    CoactSeg: Learning from Heterogeneous Data for New Multiple Sclerosis Lesion Segmentation

    Authors: Yicheng Wu, Zhonghua Wu, Hengcan Shi, Bjoern Picker, Winston Chong, Jianfei Cai

    Abstract: New lesion segmentation is essential to estimate the disease progression and therapeutic effects during multiple sclerosis (MS) clinical treatments. However, the expensive data acquisition and expert annotation restrict the feasibility of applying large-scale deep learning models. Since single-time-point samples with all-lesion labels are relatively easy to collect, exploiting them to train deep m…

    Submitted 14 September, 2023; v1 submitted 10 July, 2023; originally announced July 2023.

    Comments: Accepted by MICCAI 2023 (Early Acceptance)

  27. arXiv:2306.01864  [pdf, other]

    cs.LG cs.SD eess.AS

    Discovering COVID-19 Coughing and Breathing Patterns from Unlabeled Data Using Contrastive Learning with Varying Pre-Training Domains

    Authors: Jinjin Cai, Sudip Vhaduri, Xiao Luo

    Abstract: Rapid discovery of new diseases, such as COVID-19, can enable a timely epidemic response, preventing the large-scale spread and protecting public health. However, limited research efforts have been taken on this problem. In this paper, we propose a contrastive learning-based modeling approach for COVID-19 coughing and breathing pattern discovery from non-COVID coughs. To validate our models, extens…

    Submitted 2 June, 2023; originally announced June 2023.

    Comments: Accepted by Proceedings of INTERSPEECH 2023

    Journal ref: Proceedings of INTERSPEECH 2023

  28. arXiv:2305.12793  [pdf, other]

    eess.AS cs.CL cs.MM cs.SD

    Zero-Shot End-to-End Spoken Language Understanding via Cross-Modal Selective Self-Training

    Authors: Jianfeng He, Julian Salazar, Kaisheng Yao, Haoqi Li, Jinglun Cai

    Abstract: End-to-end (E2E) spoken language understanding (SLU) is constrained by the cost of collecting speech-semantics pairs, especially when label domains change. Hence, we explore zero-shot E2E SLU, which learns E2E SLU without speech-semantics pairs, instead using only speech-text and text-semantics pairs. Previous work achieved zero-shot by pseudolabeling all speech-text transcripts with a na…

    Submitted 2 February, 2024; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: 18 pages, 7 figures

  29. arXiv:2305.03837  [pdf, other]

    eess.AS cs.LG cs.SD

    Mask The Bias: Improving Domain-Adaptive Generalization of CTC-based ASR with Internal Language Model Estimation

    Authors: Nilaksh Das, Monica Sunkara, Sravan Bodapati, Jinglun Cai, Devang Kulshreshtha, Jeff Farris, Katrin Kirchhoff

    Abstract: End-to-end ASR models trained on large amounts of data tend to be implicitly biased towards language semantics of the training data. Internal language model estimation (ILME) has been proposed to mitigate this bias for autoregressive models such as attention-based encoder-decoder and RNN-T. Typically, ILME is performed by modularizing the acoustic and language components of the model architecture,…

    Submitted 5 May, 2023; originally announced May 2023.

    Comments: Accepted to ICASSP 2023

  30. arXiv:2304.12184  [pdf, other]

    eess.SP cs.AI cs.IT cs.LG

    Active RIS-aided EH-NOMA Networks: A Deep Reinforcement Learning Approach

    Authors: Zhaoyuan Shi, Huabing Lu, Xianzhong Xie, Helin Yang, Chongwen Huang, Jun Cai, Zhiguo Ding

    Abstract: An active reconfigurable intelligent surface (RIS)-aided multi-user downlink communication system is investigated, where non-orthogonal multiple access (NOMA) is employed to improve spectral efficiency, and the active RIS is powered by energy harvesting (EH). The problem of joint control of the RIS's amplification matrix and phase shift matrix is formulated to maximize the communication success ra…

    Submitted 11 April, 2023; originally announced April 2023.

  31. arXiv:2303.04585  [pdf, other]

    cs.SD cs.AI eess.AS

    Exploring Efficient-Tuned Learning Audio Representation Method from BriVL

    Authors: Sen Fang, Yangjian Wu, Bowen Gao, Jingwen Cai, Teik Toe Teoh

    Abstract: Recently, researchers have gradually realized that in some cases, the self-supervised pre-training on large-scale Internet data is better than that of high-quality/manually labeled data sets, and multimodal/large models are better than single or bimodal/small models. In this paper, we propose a robust audio representation learning method WavBriVL based on Bridging-Vision-and-Language (BriVL). WavB…

    Submitted 28 July, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

    Comments: 13 pages, 2023.3 Finished

  32. arXiv:2212.07867  [pdf, other]

    eess.IV cs.CV cs.RO

    Localizing Scan Targets from Human Pose for Autonomous Lung Ultrasound Imaging

    Authors: Jianzhi Long, Jicang Cai, Abdullah Al-Battal, Shiwei Jin, Jing Zhang, Dacheng Tao, Truong Nguyen

    Abstract: Ultrasound is progressing toward becoming an affordable and versatile solution to medical imaging. With the advent of the COVID-19 global pandemic, there is a need to fully automate ultrasound imaging as it requires trained operators in close proximity to patients for a long period of time, therefore increasing risk of infection. In this work, we investigate the important yet seldom-studied problem of…

    Submitted 25 February, 2023; v1 submitted 15 December, 2022; originally announced December 2022.

    Comments: v2 2023/02/25

    ACM Class: I.4.9

  33. Super-resolution Reconstruction of Single Image for Latent features

    Authors: Xin Wang, Jing-Ke Yan, Jing-Ye Cai, Jian-Hua Deng, Qin Qin, Yao Cheng

    Abstract: Single-image super-resolution (SISR) typically focuses on restoring various degraded low-resolution (LR) images to a single high-resolution (HR) image. However, during SISR tasks, it is often challenging for models to simultaneously maintain high quality and rapid sampling while preserving diversity in details and texture features. This challenge can lead to issues such as model collapse, lack of…

    Submitted 9 November, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

    Journal ref: Computational Visual Media, 2023

  34. arXiv:2211.11144  [pdf]

    eess.IV cs.CV

    Coarse-Super-Resolution-Fine Network (CoSF-Net): A Unified End-to-End Neural Network for 4D-MRI with Simultaneous Motion Estimation and Super-Resolution

    Authors: Shaohua Zhi, Yinghui Wang, Haonan Xiao, Ti Bai, Hong Ge, Bing Li, Chenyang Liu, Wen Li, Tian Li, Jing Cai

    Abstract: Four-dimensional magnetic resonance imaging (4D-MRI) is an emerging technique for tumor motion management in image-guided radiation therapy (IGRT). However, current 4D-MRI suffers from low spatial resolution and strong motion artifacts owing to the long acquisition time and patients' respiratory variations; these limitations, if not managed properly, can adversely affect treatment planning and del…

    Submitted 20 November, 2022; originally announced November 2022.

  35. arXiv:2210.16514  [pdf]

    physics.med-ph eess.IV

    Extracting lung function-correlated information from CT-encoded static textures

    Authors: Yu-Hua Huang, Xinzhi Teng, Jiang Zhang, Zhi Chen, Zongrui Ma, Ge Ren, Feng-Ming Kong, Jing Cai

    Abstract: The inherent characteristics of lung tissues, which are independent of breathing manoeuvre, may provide fundamental information on lung function. This paper attempted to study function-correlated lung textures and their spatial distribution from CT. 21 lung cancer patients with thoracic 4DCT scans, DTPA-SPECT ventilation images (V), and available pulmonary function test (PFT) measurements were col…

    Submitted 29 October, 2022; originally announced October 2022.

    Comments: 6 figures, 4 tables

  36. arXiv:2206.01777  [pdf, other]

    cs.CV eess.IV

    Real-Time Super-Resolution for Real-World Images on Mobile Devices

    Authors: Jie Cai, Zibo Meng, Jiaming Ding, Chiu Man Ho

    Abstract: Image Super-Resolution (ISR) aims at recovering High-Resolution (HR) images from the corresponding Low-Resolution (LR) counterparts. Although recent progress in ISR has been remarkable, the resulting methods are too computationally intensive to be deployed on edge devices, since most of the recent approaches are deep learning-based. Besides, these methods always fail in real-world scenes, since…

    Submitted 3 June, 2022; originally announced June 2022.

    Comments: arXiv admin note: text overlap with arXiv:2004.13674

  37. arXiv:2204.14154  [pdf, other]

    cs.IT eess.SP

    Outage Performance of Uplink Rate Splitting Multiple Access with Randomly Deployed Users

    Authors: Huabing Lu, Xianzhong Xie, Zhaoyuan Shi, Hongjian Lei, Nan Zhao, Jun Cai

    Abstract: With the rapid proliferation of smart devices in wireless networks, more powerful technologies are expected to fulfill the network requirements of high throughput, massive connectivity, and diverse quality of service. To this end, rate splitting multiple access (RSMA) is proposed as a promising solution to improve spectral efficiency and provide better fairness for the next-generation mobile net…

    Submitted 10 April, 2023; v1 submitted 29 April, 2022; originally announced April 2022.

    Comments: 38 pages, 8 figures

  38. arXiv:2203.07948  [pdf, other]

    cs.ET eess.SP

    An Ultra-Compact Single FeFET Binary and Multi-Bit Associative Search Engine

    Authors: Xunzhao Yin, Franz Müller, Qingrong Huang, Chao Li, Mohsen Imani, Zeyu Yang, Jiahao Cai, Maximilian Lederer, Ricardo Olivo, Nellie Laleni, Shan Deng, Zijian Zhao, Cheng Zhuo, Thomas Kämpfe, Kai Ni

    Abstract: Content addressable memory (CAM) is widely used in associative search tasks for its highly parallel pattern matching capability. To accommodate the increasingly complex and data-intensive pattern matching tasks, it is critical to keep improving the CAM density to enhance the performance and area efficiency. In this work, we demonstrate: i) a novel ultra-compact 1FeFET CAM design that enables paral…

    Submitted 15 March, 2022; originally announced March 2022.

    Comments: 20 pages, 14 figures

  39. arXiv:2203.01324  [pdf, ps, other]

    eess.IV cs.CV

    Exploring Smoothness and Class-Separation for Semi-supervised Medical Image Segmentation

    Authors: Yicheng Wu, Zhonghua Wu, Qianyi Wu, Zongyuan Ge, Jianfei Cai

    Abstract: Semi-supervised segmentation remains challenging in medical imaging since the amount of annotated medical data is often scarce and there are many blurred pixels near the adhesive edges or in the low-contrast regions. To address the issues, we advocate to firstly constrain the consistency of pixels with and without strong perturbations to apply a sufficient smoothness constraint and further encoura…

    Submitted 28 June, 2022; v1 submitted 2 March, 2022; originally announced March 2022.

    Comments: Accepted by MICCAI 2022

  40. arXiv:2202.13372  [pdf, other]

    eess.IV cs.CV

    Weakly Supervised Learning for cell recognition in immunohistochemical cytoplasm staining images

    Authors: Shichuan Zhang, Chenglu Zhu, Honglin Li, Jiatong Cai, Lin Yang

    Abstract: Cell classification and counting in immunohistochemical cytoplasm staining images play a pivotal role in cancer diagnosis. Weakly supervised learning is a potential method to deal with labor-intensive labeling. However, the inconstant cell morphology and subtle differences between classes also bring challenges. To this end, we present a novel cell recognition framework based on multi-task learning…

    Submitted 27 February, 2022; originally announced February 2022.

  41. arXiv:2112.08561  [pdf, other]

    cs.SD eess.AS

    EmotionBox: a music-element-driven emotional music generation system using Recurrent Neural Network

    Authors: Kaitong Zheng, Ruijie Meng, Chengshi Zheng, Xiaodong Li, Jinqiu Sang, Juanjuan Cai, Jie Wang

    Abstract: With the development of deep neural networks, automatic music composition has made great progress. Although emotional music can evoke listeners' different emotions and it is important for artistic expression, only a few studies have focused on generating emotional music. This paper presents EmotionBox, a music-element-driven emotional music generator that is capable of composing music given a sp…

    Submitted 15 December, 2021; originally announced December 2021.

  42. arXiv:2111.11893  [pdf, other]

    eess.IV cs.CV

    Extending the Unmixing methods to Multispectral Images

    Authors: Jizhen Cai, Hermine Chatoux, Clotilde Boust, Alamin Mansouri

    Abstract: In the past few decades, there has been intensive research concerning the unmixing of hyperspectral images. Some methods such as NMF, VCA, and N-FINDR have become standards since they show robustness in dealing with the unmixing of hyperspectral images. However, the research concerning the unmixing of multispectral images is relatively scarce. Thus, we extend some unmixing methods to the multispec…

    Submitted 23 November, 2021; originally announced November 2021.

    Comments: 6 pages, CIC29 conference

  43. arXiv:2109.09503  [pdf, other]

    cs.IT cs.LG cs.NI eess.SP

    Deep Reinforcement Learning Based Multidimensional Resource Management for Energy Harvesting Cognitive NOMA Communications

    Authors: Zhaoyuan Shi, Xianzhong Xie, Huabing Lu, Helin Yang, Jun Cai, Zhiguo Ding

    Abstract: The combination of energy harvesting (EH), cognitive radio (CR), and non-orthogonal multiple access (NOMA) is a promising solution to improve the energy efficiency and spectral efficiency of the upcoming beyond fifth generation (B5G) network, especially for supporting wireless sensor communications in Internet of Things (IoT) systems. However, how to realize intelligent frequency, time, and energy res…

    Submitted 17 September, 2021; originally announced September 2021.

    Comments: 35 pages, 12 figures

  44. arXiv:2105.01828  [pdf, other]

    eess.IV cs.CV

    Lesion Segmentation and RECIST Diameter Prediction via Click-driven Attention and Dual-path Connection

    Authors: Youbao Tang, Ke Yan, Jinzheng Cai, Lingyun Huang, Guotong Xie, Jing Xiao, Jingjing Lu, Gigin Lin, Le Lu

    Abstract: Measuring lesion size is an important step to assess tumor growth and monitor disease progression and therapy response in oncology image analysis. Although it is tedious and highly time-consuming, radiologists have to work on this task by using RECIST criteria (Response Evaluation Criteria In Solid Tumors) routinely and manually. Even though lesion segmentation may be the more accurate and clinica…

    Submitted 4 May, 2021; originally announced May 2021.

  45. arXiv:2105.01218  [pdf, other]

    eess.IV cs.CV

    Weakly-Supervised Universal Lesion Segmentation with Regional Level Set Loss

    Authors: Youbao Tang, Jinzheng Cai, Ke Yan, Lingyun Huang, Guotong Xie, Jing Xiao, Jingjing Lu, Gigin Lin, Le Lu

    Abstract: Accurately segmenting a variety of clinically significant lesions from whole-body computed tomography (CT) scans is a critical task in precision oncology imaging, denoted as universal lesion segmentation (ULS). Manual annotation is the current clinical practice, but it is highly time-consuming and inconsistent in the longitudinal assessment of tumors. Effectively training an automatic segmentation model is…

    Submitted 3 May, 2021; originally announced May 2021.

  46. arXiv:2102.05011  [pdf, other]

    cs.LG cs.CV eess.IV

    Mars Image Content Classification: Three Years of NASA Deployment and Recent Advances

    Authors: Kiri Wagstaff, Steven Lu, Emily Dunkel, Kevin Grimes, Brandon Zhao, Jesse Cai, Shoshanna B. Cole, Gary Doran, Raymond Francis, Jake Lee, Lukas Mandrake

    Abstract: The NASA Planetary Data System hosts millions of images acquired from the planet Mars. To help users quickly find images of interest, we have developed and deployed content-based classification and search capabilities for Mars orbital and surface images. The deployed systems are publicly accessible using the PDS Image Atlas. We describe the process of training, evaluating, calibrating, and deployi…

    Submitted 9 February, 2021; originally announced February 2021.

    Comments: Published at the Thirty-Third Annual Conference on Innovative Applications of Artificial Intelligence (IAAI-21). IAAI Innovative Application Award. 10 pages, 11 figures, 6 tables

  47. arXiv:2010.10298  [pdf]

    eess.IV cs.CV

    The Detection of Thoracic Abnormalities ChestX-Det10 Challenge Results

    Authors: Jie Lian, Jingyu Liu, Yizhou Yu, Mengyuan Ding, Yaoci Lu, Yi Lu, Jie Cai, Deshou Lin, Miao Zhang, Zhe Wang, Kai He, Yijie Yu

    Abstract: The detection of thoracic abnormalities challenge is organized by the Deepwise AI Lab. The challenge is divided into two rounds. In this paper, we present the results of the 6 teams that reached the second round. The challenge adopts the ChestX-Det10 dataset proposed by the Deepwise AI Lab. ChestX-Det10 is the first chest X-Ray dataset with instance-level annotations, including 10 categories of disease…

    Submitted 21 October, 2020; v1 submitted 19 October, 2020; originally announced October 2020.

  48. arXiv:2010.05388  [pdf, other]

    cs.SD cs.HC cs.LG eess.AS

    AI Song Contest: Human-AI Co-Creation in Songwriting

    Authors: Cheng-Zhi Anna Huang, Hendrik Vincent Koops, Ed Newton-Rex, Monica Dinculescu, Carrie J. Cai

    Abstract: Machine learning is challenging the way we make music. Although research in deep generative models has dramatically improved the capability and fluency of music models, recent work has shown that it can be challenging for humans to partner with this new class of algorithms. In this paper, we present findings on what 13 musician/developer teams, a total of 61 users, needed when co-creating a song w…

    Submitted 11 October, 2020; originally announced October 2020.

    Comments: 6 pages + 3 pages of references

    ACM Class: J.5; I.2

    Journal ref: ISMIR 2020

  49. Learning Image-adaptive 3D Lookup Tables for High Performance Photo Enhancement in Real-time

    Authors: Hui Zeng, Jianrui Cai, Lida Li, Zisheng Cao, Lei Zhang

    Abstract: Recent years have witnessed the increasing popularity of learning-based methods to enhance the color and tone of photos. However, many existing photo enhancement methods either deliver unsatisfactory results or consume too many computational and memory resources, hindering their application to high-resolution images (usually with more than 12 megapixels) in practice. In this paper, we learn image-…

    Submitted 30 September, 2020; originally announced September 2020.

    Comments: High quality adaptive photo enhancement in real-time (<2ms for 4K resolution images)! Accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence

  50. arXiv:2009.06943  [pdf, other]

    eess.IV cs.CV

    AIM 2020 Challenge on Efficient Super-Resolution: Methods and Results

    Authors: Kai Zhang, Martin Danelljan, Yawei Li, Radu Timofte, Jie Liu, Jie Tang, Gangshan Wu, Yu Zhu, Xiangyu He, Wenjie Xu, Chenghua Li, Cong Leng, Jian Cheng, Guangyang Wu, Wenyi Wang, Xiaohong Liu, Hengyuan Zhao, Xiangtao Kong, Jingwen He, Yu Qiao, Chao Dong, Xiaotong Luo, Liang Chen, Jiangtao Zhang, Maitreya Suin, et al. (60 additional authors not shown)

    Abstract: This paper reviews the AIM 2020 challenge on efficient single image super-resolution with focus on the proposed solutions and results. The challenge task was to super-resolve an input image with a magnification factor x4 based on a set of prior examples of low and corresponding high resolution images. The goal is to devise a network that reduces one or several aspects such as runtime, parameter co…

    Submitted 15 September, 2020; originally announced September 2020.