Showing 1–50 of 151 results for author: Sun, L

Searching in archive eess.
  1. arXiv:2507.00388  [pdf, ps, other]

    cs.IT eess.SP

    Accuracy and Security-Guaranteed Participant Selection and Beamforming Design for RIS-Assisted Federated Learning

    Authors: Mengru Wu, Yu Gao, Weidang Lu, Huimei Han, Lei Sun, Wanli Ni

    Abstract: Federated learning (FL) has emerged as an effective approach for training neural network models without requiring the sharing of participants' raw data, thereby addressing data privacy concerns. In this paper, we propose a reconfigurable intelligent surface (RIS)-assisted FL framework in the presence of eavesdropping, where partial edge devices are selected to participate in the FL training proces…

    Submitted 30 June, 2025; originally announced July 2025.

  2. arXiv:2506.23473  [pdf, ps, other]

    eess.SP

    Cooperative Sensing in Cell-free Massive MIMO ISAC Systems: Performance Optimization and Signal Processing

    Authors: Haotian Liu, Zhiqing Wei, Luyang Sun, Ruizhong Xu, Yixin Zhang, Zhiyong Feng

    Abstract: Integrated sensing and communication (ISAC), as a technology enabling seamless connection between communication and sensing, is regarded as a core enabling technology for these applications. However, the accuracy of single-node sensing in ISAC systems is limited, prompting the emergence of multi-node cooperative sensing. In multi-node cooperative sensing, the synchronization error limits the sensing ac…

    Submitted 29 June, 2025; originally announced June 2025.

    Comments: 13 pages, 10 figures

  3. arXiv:2506.22732  [pdf, ps, other]

    cs.LG eess.SP stat.ML

    Robust Tensor Completion via Gradient Tensor Nuclear L1-L2 Norm for Traffic Data Recovery

    Authors: Hao Shu, Jicheng Li, Tianyv Lei, Lijun Sun

    Abstract: In real-world scenarios, spatiotemporal traffic data frequently experiences dual degradation from missing values and noise caused by sensor malfunctions and communication failures. Therefore, effective data recovery methods are essential to ensure the reliability of downstream data-driven applications. While classical tensor completion methods have been widely adopted, they are incapable of modeli…

    Submitted 27 June, 2025; originally announced June 2025.

  4. arXiv:2506.17599  [pdf, ps, other]

    eess.SP

    Two-Stage Prony-Based Estimation of Fractional Delay and Doppler Shifts in OTFS Modulation

    Authors: Yutaka Jitsumatsu, Liangchen Sun

    Abstract: This paper addresses the estimation of fractional delay and Doppler shifts in multipath channels that cause doubly selective fading, an essential task for integrated sensing and communication (ISAC) systems in high-mobility environments. Orthogonal Time Frequency Space (OTFS) modulation enables simple and robust channel compensation under such conditions. However, fractional delay and Doppler compo…

    Submitted 21 June, 2025; originally announced June 2025.

    Comments: 6 pages and 7 figures

  5. arXiv:2506.15835  [pdf, ps, other]

    eess.IV cs.AI cs.CV

    MoNetV2: Enhanced Motion Network for Freehand 3D Ultrasound Reconstruction

    Authors: Mingyuan Luo, Xin Yang, Zhongnuo Yan, Yan Cao, Yuanji Zhang, Xindi Hu, Jin Wang, Haoxuan Ding, Wei Han, Litao Sun, Dong Ni

    Abstract: Three-dimensional (3D) ultrasound (US) aims to provide sonographers with the spatial relationships of anatomical structures, playing a crucial role in clinical diagnosis. Recently, deep-learning-based freehand 3D US has made significant advancements. It reconstructs volumes by estimating transformations between images without external tracking. However, image-only reconstruction poses difficulties…

    Submitted 16 June, 2025; originally announced June 2025.

  6. arXiv:2506.13094  [pdf, ps, other]

    eess.IV

    MorphSAM: Learning the Morphological Prompts from Atlases for Spine Image Segmentation

    Authors: Dingwei Fan, Junyong Zhao, Chunlin Li, Xinlong Wang, Ronghan Zhang, Mingliang Wang, Qi Zhu, Haipeng Si, Daoqiang Zhang, Liang Sun

    Abstract: Spine image segmentation is crucial for clinical diagnosis and treatment of spine diseases. The complex structure of the spine and the high morphological similarity between individual vertebrae and adjacent intervertebral discs make accurate spine segmentation a challenging task. Although the Segment Anything Model (SAM) has been developed, it still struggles to effectively capture and utilize mor…

    Submitted 16 June, 2025; originally announced June 2025.

  7. arXiv:2506.07876  [pdf, ps, other]

    cs.RO eess.SY

    Versatile Loco-Manipulation through Flexible Interlimb Coordination

    Authors: Xinghao Zhu, Yuxin Chen, Lingfeng Sun, Farzad Niroui, Simon Le Cleac'h, Jiuguang Wang, Kuan Fang

    Abstract: The ability to flexibly leverage limbs for loco-manipulation is essential for enabling autonomous robots to operate in unstructured environments. Yet, prior work on loco-manipulation is often constrained to specific tasks or predetermined limb configurations. In this work, we present Reinforcement Learning for Interlimb Coordination (ReLIC), an approach that enables versatile loco-manipulation thr…

    Submitted 10 June, 2025; v1 submitted 9 June, 2025; originally announced June 2025.

  8. arXiv:2506.04861  [pdf, ps, other]

    eess.SP

    Design of OTFS Signals with Pulse Shaping and Window Function for OTFS-Based Radar

    Authors: Liangchen Sun, Yutaka Jitsumatsu

    Abstract: We propose a pulse radar system that employs a generalized window function derived from the root raised cosine (RRC), which relaxes the conventional constraint that the window values are within the range [0, 1]. The proposed window allows both negative values and values exceeding 1, enabling greater flexibility in signal design. The system transmits orthogonal time frequency space (OTFS) signals i…

    Submitted 5 June, 2025; originally announced June 2025.

    Comments: A total of 7 pages, including two tables and six figures

  9. arXiv:2506.01394  [pdf, ps, other]

    eess.IV cs.CV

    NTIRE 2025 the 2nd Restore Any Image Model (RAIM) in the Wild Challenge

    Authors: Jie Liang, Radu Timofte, Qiaosi Yi, Zhengqiang Zhang, Shuaizheng Liu, Lingchen Sun, Rongyuan Wu, Xindong Zhang, Hui Zeng, Lei Zhang

    Abstract: In this paper, we present a comprehensive overview of the NTIRE 2025 challenge on the 2nd Restore Any Image Model (RAIM) in the Wild. This challenge established a new benchmark for real-world image restoration, featuring diverse scenarios with and without reference ground truth. Participants were tasked with restoring real-captured images suffering from complex and unknown degradations, where both…

    Submitted 2 June, 2025; originally announced June 2025.

  10. arXiv:2505.11248  [pdf, ps, other]

    eess.SP cs.IT

    Unfolded Deep Graph Learning for Networked Over-the-Air Computation

    Authors: Xiao Tang, Huirong Xiao, Chao Shen, Li Sun, Qinghe Du, Dusit Niyato, Zhu Han

    Abstract: Over-the-air computation (AirComp) has emerged as a promising technology that enables simultaneous transmission and computation through wireless channels. In this paper, we investigate the networked AirComp in multiple clusters allowing diversified data computation, which is yet challenged by the transceiver coordination and interference management therein. Particularly, we aim to maximize the mul…

    Submitted 16 May, 2025; originally announced May 2025.

    Comments: Accepted @ IEEE TWC

  11. arXiv:2505.02211  [pdf, other]

    eess.IV cs.CV

    CSASN: A Multitask Attention-Based Framework for Heterogeneous Thyroid Carcinoma Classification in Ultrasound Images

    Authors: Peiqi Li, Yincheng Gao, Renxing Li, Haojie Yang, Yunyun Liu, Boji Liu, Jiahui Ni, Ying Zhang, Yulu Wu, Xiaowei Fang, Lehang Guo, Liping Sun, Jiangang Chen

    Abstract: Heterogeneous morphological features and data imbalance pose significant challenges in rare thyroid carcinoma classification using ultrasound imaging. To address this issue, we propose a novel multitask learning framework, Channel-Spatial Attention Synergy Network (CSASN), which integrates a dual-branch feature extractor - combining EfficientNet for local spatial encoding and ViT for global semant…

    Submitted 4 May, 2025; originally announced May 2025.

    Comments: 18 pages, 10 figures, 4 tables

  12. arXiv:2504.10686  [pdf, other]

    cs.CV eess.IV

    The Tenth NTIRE 2025 Efficient Super-Resolution Challenge Report

    Authors: Bin Ren, Hang Guo, Lei Sun, Zongwei Wu, Radu Timofte, Yawei Li, Yao Zhang, Xinning Chai, Zhengxue Cheng, Yingsheng Qin, Yucai Yang, Li Song, Hongyuan Yu, Pufan Xu, Cheng Wan, Zhijuan Huang, Peng Guo, Shuyuan Cui, Chenjun Li, Xuehai Hu, Pan Pan, Xin Zhang, Heng Zhang, Qing Luo, Linyan Jiang, et al. (122 additional authors not shown)

    Abstract: This paper presents a comprehensive review of the NTIRE 2025 Challenge on Single-Image Efficient Super-Resolution (ESR). The challenge aimed to advance the development of deep models that optimize key computational metrics, i.e., runtime, parameters, and FLOPs, while achieving a PSNR of at least 26.90 dB on the $\operatorname{DIV2K\_LSDIR\_valid}$ dataset and 26.99 dB on the…

    Submitted 14 April, 2025; originally announced April 2025.

    Comments: Accepted by CVPR2025 NTIRE Workshop, Efficient Super-Resolution Challenge Report. 50 pages

  13. arXiv:2504.01311  [pdf, other]

    eess.SY

    Model-Predictive Planning and Airspeed Regulation to Minimize Flight Energy Consumption

    Authors: Trevor Karpinski, Alexander Blakesley, Jakub Krol, Bani Anvari, George Gorospe, Liang Sun

    Abstract: Although battery technology has advanced tremendously over the past decade, it continues to be a bottleneck for the mass adoption of electric aircraft in long-haul cargo and passenger delivery. The onboard energy is expected to be utilized in an efficient manner. Energy consumption modeling research offers increasingly accurate mathematical models, but there is scant research pertaining to real-ti…

    Submitted 1 April, 2025; originally announced April 2025.

  14. arXiv:2503.03971  [pdf, other]

    eess.IV

    Towards Universal Learning-based Model for Cardiac Image Reconstruction: Summary of the CMRxRecon2024 Challenge

    Authors: Fanwen Wang, Zi Wang, Yan Li, Jun Lyu, Chen Qin, Shuo Wang, Kunyuan Guo, Mengting Sun, Mingkai Huang, Haoyu Zhang, Michael Tänzer, Qirong Li, Xinran Chen, Jiahao Huang, Yinzhe Wu, Kian Anvari Hamedani, Yuntong Lyu, Longyu Sun, Qing Li, Ziqiang Xu, Bingyu Xin, Dimitris N. Metaxas, Narges Razizadeh, Shahabedin Nabavi, George Yiasemis, et al. (34 additional authors not shown)

    Abstract: Cardiovascular magnetic resonance (CMR) imaging offers diverse contrasts for non-invasive assessment of cardiac function and myocardial characterization. However, CMR often requires the acquisition of many contrasts, and each contrast takes a considerable amount of time. The extended acquisition time will further increase the susceptibility to motion artifacts. Existing deep learning-based reconst…

    Submitted 13 March, 2025; v1 submitted 5 March, 2025; originally announced March 2025.

  15. arXiv:2503.02685  [pdf, other]

    q-bio.NC cs.CV eess.SP q-bio.QM

    TReND: Transformer derived features and Regularized NMF for neonatal functional network Delineation

    Authors: Sovesh Mohapatra, Minhui Ouyang, Shufang Tan, Jianlin Guo, Lianglong Sun, Yong He, Hao Huang

    Abstract: Precise parcellation of functional networks (FNs) of the early developing human brain is the fundamental basis for identifying biomarkers of developmental disorders and understanding functional development. Resting-state fMRI (rs-fMRI) enables in vivo exploration of functional changes, but adult FN parcellations cannot be directly applied to neonates due to incomplete network maturation. No standar…

    Submitted 4 March, 2025; originally announced March 2025.

    Comments: 10 Pages, 5 figures

  16. arXiv:2502.18523  [pdf, other]

    eess.IV cs.AI cs.CV

    End-to-End Deep Learning for Structural Brain Imaging: A Unified Framework

    Authors: Yao Su, Keqi Han, Mingjie Zeng, Lichao Sun, Liang Zhan, Carl Yang, Lifang He, Xiangnan Kong

    Abstract: Brain imaging analysis is fundamental in neuroscience, providing valuable insights into brain structure and function. Traditional workflows follow a sequential pipeline (brain extraction, registration, segmentation, parcellation, network generation, and classification), treating each step as an independent task. These methods rely heavily on task-specific training data and expert intervention to corr…

    Submitted 23 February, 2025; originally announced February 2025.

  17. arXiv:2502.11484  [pdf, other]

    cs.LG eess.SY

    Dictionary-Learning-Based Data Pruning for System Identification

    Authors: Tingna Wang, Sikai Zhang, Limin Sun

    Abstract: System identification normally involves augmenting time series data by time shifting and nonlinearisation (via polynomial basis), which introduce redundancy both feature-wise and sample-wise. Many research works focus on reducing redundancy feature-wise, while less attention is paid to sample-wise redundancy. This paper proposes a novel data pruning method, called (mini-batch) FastCan, to re…

    Submitted 17 February, 2025; originally announced February 2025.

  18. arXiv:2502.05130  [pdf, other]

    cs.SD cs.AI cs.CV cs.MM eess.AS

    Latent Swap Joint Diffusion for 2D Long-Form Latent Generation

    Authors: Yusheng Dai, Chenxi Wang, Chang Li, Chen Wang, Jun Du, Kewei Li, Ruoyu Wang, Jiefeng Ma, Lei Sun, Jianqing Gao

    Abstract: This paper introduces Swap Forward (SaFa), a modality-agnostic and efficient method to generate seamless and coherent long spectrum and panorama through latent swap joint diffusion across multi-views. We first investigate the spectrum aliasing problem in spectrum-based audio generation caused by existing joint diffusion methods. Through a comparative analysis of the VAE latent representation of M…

    Submitted 18 March, 2025; v1 submitted 7 February, 2025; originally announced February 2025.

  19. arXiv:2502.04230  [pdf, other]

    cs.SD cs.AI cs.CR cs.LG eess.AS

    XAttnMark: Learning Robust Audio Watermarking with Cross-Attention

    Authors: Yixin Liu, Lie Lu, Jihui Jin, Lichao Sun, Andrea Fanelli

    Abstract: The rapid proliferation of generative audio synthesis and editing technologies has raised significant concerns about copyright infringement, data provenance, and the spread of misinformation through deepfake audio. Watermarking offers a proactive solution by embedding imperceptible, identifiable, and traceable marks into audio content. While recent neural network-based watermarking methods like Wa…

    Submitted 7 February, 2025; v1 submitted 6 February, 2025; originally announced February 2025.

    Comments: 24 pages, 10 figures

  20. arXiv:2502.03839  [pdf, other]

    eess.SY

    On the Number of Control Nodes in Boolean Networks with Degree Constraints

    Authors: Liangjie Sun, Wai-Ki Ching, Tatsuya Akutsu

    Abstract: This paper studies the minimum control node set problem for Boolean networks (BNs) with degree constraints. The main contribution is to derive the nontrivial lower and upper bounds on the size of the minimum control node set through combinatorial analysis of four types of BNs (i.e., $k$-$k$-XOR-BNs, simple $k$-$k$-AND-BNs, $k$-$k$-AND-BNs with negation and $k$-$k$-NC-BNs, where the $k$-$k$-AND-BN…

    Submitted 6 February, 2025; originally announced February 2025.

  21. arXiv:2501.15368  [pdf, other]

    cs.CL cs.SD eess.AS

    Baichuan-Omni-1.5 Technical Report

    Authors: Yadong Li, Jun Liu, Tao Zhang, Tao Zhang, Song Chen, Tianpeng Li, Zehuan Li, Lijun Liu, Lingfeng Ming, Guosheng Dong, Da Pan, Chong Li, Yuanbo Fang, Dongdong Kuang, Mingrui Wang, Chenglin Zhu, Youwei Zhang, Hongyu Guo, Fengyu Zhang, Yuran Wang, Bowen Ding, Wei Song, Xu Li, Yuqi Huo, Zheng Liang, et al. (68 additional authors not shown)

    Abstract: We introduce Baichuan-Omni-1.5, an omni-modal model that not only has omni-modal understanding capabilities but also provides end-to-end audio generation capabilities. To achieve fluent and high-quality interaction across modalities without compromising the capabilities of any modality, we prioritized optimizing three key aspects. First, we establish a comprehensive data cleaning and synthesis pip…

    Submitted 25 January, 2025; originally announced January 2025.

  22. arXiv:2501.10851  [pdf, other]

    eess.IV cs.CV

    Exploring Siamese Networks in Self-Supervised Fast MRI Reconstruction

    Authors: Liyan Sun, Shaocong Yu, Chi Zhang, Xinghao Ding

    Abstract: Reconstructing MR images using deep neural networks from undersampled k-space data without using fully sampled training references offers significant value in practice, which is a self-supervised regression problem calling for effective prior knowledge and supervision. The Siamese architectures are motivated by the definition of "invariance" and show promising results in unsupervised visual represen…

    Submitted 18 January, 2025; originally announced January 2025.

  23. arXiv:2501.09972  [pdf, other]

    cs.SD cs.AI cs.MM eess.AS

    GVMGen: A General Video-to-Music Generation Model with Hierarchical Attentions

    Authors: Heda Zuo, Weitao You, Junxian Wu, Shihong Ren, Pei Chen, Mingxu Zhou, Yujia Lu, Lingyun Sun

    Abstract: Composing music for video is essential yet challenging, leading to a growing interest in automating music generation for video applications. Existing approaches often struggle to achieve robust music-video correspondence and generative diversity, primarily due to inadequate feature alignment methods and insufficient datasets. In this study, we present General Video-to-Music Generation model (GVMGe…

    Submitted 17 January, 2025; originally announced January 2025.

    Comments: Accepted by the 39th AAAI Conference on Artificial Intelligence (AAAI-25)

  24. arXiv:2501.07127  [pdf, ps, other]

    eess.IV

    QoE-oriented Communication Service Provision for Annotation Rendering in Mobile Augmented Reality

    Authors: Lulu Sun, Conghao Zhou, Shisheng Hu, Yupeng Zhu, Nan Cheng, Xu Xia

    Abstract: As mobile augmented reality (MAR) continues to evolve, future 6G networks will play a pivotal role in supporting immersive and personalized user experiences. In this paper, we address the communication service provision problem for annotation rendering in edge-assisted MAR, with the objective of optimizing spectrum resource utilization while ensuring the required quality of experience (QoE) for MA…

    Submitted 3 March, 2025; v1 submitted 13 January, 2025; originally announced January 2025.

    Comments: 6 pages, 4 figures

  25. arXiv:2412.19200  [pdf, other]

    cs.SD cs.IR eess.AS

    Personalized Dynamic Music Emotion Recognition with Dual-Scale Attention-Based Meta-Learning

    Authors: Dengming Zhang, Weitao You, Ziheng Liu, Lingyun Sun, Pei Chen

    Abstract: Dynamic Music Emotion Recognition (DMER) aims to predict the emotion of different moments in music, playing a crucial role in music information retrieval. The existing DMER methods struggle to capture long-term dependencies when dealing with sequence data, which limits their performance. Furthermore, these methods often overlook the influence of individual differences on emotion perception, even t…

    Submitted 26 December, 2024; originally announced December 2024.

    Comments: Accepted by the 39th AAAI Conference on Artificial Intelligence (AAAI-25)

  26. arXiv:2412.16530  [pdf, other]

    cs.SD cs.CL cs.CV cs.MM eess.AS

    Improving Lip-synchrony in Direct Audio-Visual Speech-to-Speech Translation

    Authors: Lucas Goncalves, Prashant Mathur, Xing Niu, Brady Houston, Chandrashekhar Lavania, Srikanth Vishnubhotla, Lijia Sun, Anthony Ferritto

    Abstract: Audio-Visual Speech-to-Speech Translation typically prioritizes improving translation quality and naturalness. However, an equally critical aspect in audio-visual content is lip-synchrony (ensuring that the movements of the lips match the spoken content), which is essential for maintaining realism in dubbed videos. Despite its importance, the inclusion of lip-synchrony constraints in AVS2S models has been lar…

    Submitted 21 December, 2024; originally announced December 2024.

    Comments: Accepted at ICASSP, 4 pages

  27. arXiv:2412.13126  [pdf, other]

    eess.IV cs.CV

    A Knowledge-enhanced Pathology Vision-language Foundation Model for Cancer Diagnosis

    Authors: Xiao Zhou, Luoyi Sun, Dexuan He, Wenbin Guan, Ruifen Wang, Lifeng Wang, Xin Sun, Kun Sun, Ya Zhang, Yanfeng Wang, Weidi Xie

    Abstract: Deep learning has enabled the development of highly robust foundation models for various pathological tasks across diverse diseases and patient cohorts. Among these models, vision-language pre-training leverages large-scale paired data to align pathology image and text embedding spaces and provides a novel zero-shot paradigm for downstream tasks. However, existing models have been primaril…

    Submitted 17 December, 2024; originally announced December 2024.

  28. Decentralized Dynamic Event-triggered Output-feedback Control of Stochastic Non-triangular Interconnected Systems with Unknown Time-varying Sensor Sensitivity

    Authors: Libei Sun, Yongduan Song, Maolong Lv

    Abstract: This study addresses the intricate challenge of decentralized output-feedback control for stochastic non-triangular nonlinear interconnected systems with unknown time-varying sensor sensitivity in a dynamic event-triggered context. The presence of stochastic disturbances, non-triangular structural uncertainties, and evolving sensor sensitivity distinguishes this problem of global asymptotic stabil…

    Submitted 5 December, 2024; originally announced December 2024.

    Comments: IEEE Transactions on Automatic Control (2024)

  29. arXiv:2411.12985  [pdf, other]

    eess.SP

    Disco Intelligent Omni-Surfaces: 360-degree Fully-Passive Jamming Attacks

    Authors: Huan Huang, Hongliang Zhang, Jide Yuan, Luyao Sun, Yitian Wang, Weidong Mei, Boya Di, Yi Cai, Zhu Han

    Abstract: Intelligent omni-surfaces (IOSs) with 360-degree electromagnetic radiation significantly improve the performance of wireless systems, while an adversarial IOS also poses a significant potential risk for physical layer security. In this paper, we propose a "DISCO" IOS (DIOS) based fully-passive jammer (FPJ) that can launch omnidirectional fully-passive jamming attacks. In the proposed DIOS-based F…

    Submitted 19 November, 2024; originally announced November 2024.

    Comments: This paper has been submitted to IEEE TWC for possible publication

  30. arXiv:2409.11299  [pdf, other]

    eess.IV cs.AI cs.CV

    TTT-Unet: Enhancing U-Net with Test-Time Training Layers for Biomedical Image Segmentation

    Authors: Rong Zhou, Zhengqing Yuan, Zhiling Yan, Weixiang Sun, Kai Zhang, Yiwei Li, Yanfang Ye, Xiang Li, Lifang He, Lichao Sun

    Abstract: Biomedical image segmentation is crucial for accurately diagnosing and analyzing various diseases. However, Convolutional Neural Networks (CNNs) and Transformers, the most commonly used architectures for this task, struggle to effectively capture long-range dependencies due to the inherent locality of CNNs and the computational complexity of Transformers. To address this limitation, we introduce T…

    Submitted 5 December, 2024; v1 submitted 17 September, 2024; originally announced September 2024.

  31. arXiv:2409.09754  [pdf, other]

    cs.CV cs.RO eess.IV physics.optics

    Towards Single-Lens Controllable Depth-of-Field Imaging via Depth-Aware Point Spread Functions

    Authors: Xiaolong Qian, Qi Jiang, Yao Gao, Shaohua Gao, Zhonghua Yi, Lei Sun, Kai Wei, Haifeng Li, Kailun Yang, Kaiwei Wang, Jian Bai

    Abstract: Controllable Depth-of-Field (DoF) imaging commonly produces amazing visual effects based on heavy and expensive high-end lenses. However, confronted with the increasing demand for mobile scenarios, it is desirable to achieve a lightweight solution with Minimalist Optical Systems (MOS). This work centers around two major limitations of MOS, i.e., the severe optical aberrations and uncontrollable Do…

    Submitted 11 February, 2025; v1 submitted 15 September, 2024; originally announced September 2024.

    Comments: Accepted to IEEE Transactions on Computational Imaging (TCI). The source code and the established dataset will be publicly available at https://github.com/XiaolongQian/DCDI

  32. arXiv:2409.05809  [pdf, other]

    physics.optics cs.CV eess.IV

    A Flexible Framework for Universal Computational Aberration Correction via Automatic Lens Library Generation and Domain Adaptation

    Authors: Qi Jiang, Yao Gao, Shaohua Gao, Zhonghua Yi, Lei Sun, Hao Shi, Kailun Yang, Kaiwei Wang, Jian Bai

    Abstract: Emerging universal Computational Aberration Correction (CAC) paradigms provide an inspiring solution to light-weight and high-quality imaging without repeated data preparation and model training to accommodate new lens designs. However, the training databases in these approaches, i.e., the lens libraries (LensLibs), suffer from their limited coverage of real-world aberration behaviors. In this wor…

    Submitted 9 September, 2024; originally announced September 2024.

  33. arXiv:2409.03265  [pdf]

    eess.IV

    Enhancing digital core image resolution using optimal upscaling algorithm: with application to paired SEM images

    Authors: Shaohua You, Shuqi Sun, Zhengting Yan, Qinzhuo Liao, Huiying Tang, Lianhe Sun, Gensheng Li

    Abstract: The porous media community extensively utilizes digital rock images for core analysis. High-resolution digital rock images that possess sufficient quality are essential but often challenging to acquire. Super-resolution (SR) approaches enhance the resolution of digital rock images and provide improved visualization of fine features and structures, aiding in the analysis and interpretation of rock…

    Submitted 5 September, 2024; originally announced September 2024.

  34. arXiv:2407.18560  [pdf, other]

    eess.SY

    On the Number of Observation Nodes in Boolean Networks

    Authors: Liangjie Sun, Wai-Ki Ching, Tatsuya Akutsu

    Abstract: A Boolean network (BN) is called observable if any initial state can be uniquely determined from the output sequence. In the existing literature on observability of BNs, there is almost no research on the relationship between the number of observation nodes and the observability of BNs, which is an important and practical issue. In this paper, we mainly focus on three types of BNs with $n$ nodes (…

    Submitted 26 July, 2024; originally announced July 2024.

    Comments: 31 pages, 2 figures, 15 tables

  35. arXiv:2407.08944  [pdf, other]

    cs.CV eess.IV

    Bora: Biomedical Generalist Video Generation Model

    Authors: Weixiang Sun, Xiaocao You, Ruizhe Zheng, Zhengqing Yuan, Xiang Li, Lifang He, Quanzheng Li, Lichao Sun

    Abstract: Generative models hold promise for revolutionizing medical education, robot-assisted surgery, and data augmentation for medical AI development. Diffusion models can now generate realistic images from text prompts, while recent advancements have demonstrated their ability to create diverse, high-quality videos. However, these models often struggle with generating accurate representations of medical…

    Submitted 15 July, 2024; v1 submitted 11 July, 2024; originally announced July 2024.

  36. arXiv:2407.06662  [pdf, other]

    eess.SP

    Experimental Demonstration of 16D Voronoi Constellation with Two-Level Coding over 50km Four-Core Fiber

    Authors: Can Zhao, Bin Chen, Jiaqi Cai, Zhiwei Liang, Yi Lei, Junjie Xiong, Lin Ma, Daohui Hu, Lin Sun, Gangxiang Shen

    Abstract: A 16-dimensional Voronoi constellation concatenated with multilevel coding is experimentally demonstrated over a 50km four-core fiber transmission system. The proposed scheme reduces the required launch power by 6dB and provides a 17dB larger operating range than 16QAM with BICM at the outer HD-FEC BER threshold.

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 4 pages, 4 figures, accepted by 2024 European Conference on Optical Communication (ECOC)

  37. arXiv:2406.19043  [pdf]

    eess.IV cs.AI cs.CV cs.DB

    CMRxRecon2024: A Multi-Modality, Multi-View K-Space Dataset Boosting Universal Machine Learning for Accelerated Cardiac MRI

    Authors: Zi Wang, Fanwen Wang, Chen Qin, Jun Lyu, Cheng Ouyang, Shuo Wang, Yan Li, Mengyao Yu, Haoyu Zhang, Kunyuan Guo, Zhang Shi, Qirong Li, Ziqiang Xu, Yajing Zhang, Hao Li, Sha Hua, Binghua Chen, Longyu Sun, Mengting Sun, Qin Li, Ying-Hua Chu, Wenjia Bai, Jing Qin, Xiahai Zhuang, Claudia Prieto, et al. (7 additional authors not shown)

    Abstract: Cardiac magnetic resonance imaging (MRI) has emerged as a clinically gold-standard technique for diagnosing cardiac diseases, thanks to its ability to provide diverse information with multiple modalities and anatomical views. Accelerated cardiac MRI is highly expected to achieve time-efficient and patient-friendly imaging, and then advanced image reconstruction approaches are required to recover h…

    Submitted 16 January, 2025; v1 submitted 27 June, 2024; originally announced June 2024.

    Comments: 23 pages, 3 figures, 2 tables

  38. arXiv:2406.12646  [pdf, other]

    eess.IV cs.AI cs.CV

    An Empirical Study on the Fairness of Foundation Models for Multi-Organ Image Segmentation

    Authors: Qin Li, Yizhe Zhang, Yan Li, Jun Lyu, Meng Liu, Longyu Sun, Mengting Sun, Qirong Li, Wenyue Mao, Xinran Wu, Yajing Zhang, Yinghua Chu, Shuo Wang, Chengyan Wang

    Abstract: The segmentation foundation model, e.g., Segment Anything Model (SAM), has attracted increasing interest in the medical image community. Early pioneering studies primarily concentrated on assessing and improving SAM's performance from the perspectives of overall accuracy and efficiency, yet little attention was given to the fairness considerations. This oversight raises questions about the potenti…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Accepted to MICCAI-2024

  39. arXiv:2406.11519  [pdf, other]

    cs.CV eess.IV

    HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model

    Authors: Di Wang, Meiqi Hu, Yao Jin, Yuchun Miao, Jiaqi Yang, Yichu Xu, Xiaolei Qin, Jiaqi Ma, Lingyu Sun, Chenxing Li, Chuan Fu, Hongruixuan Chen, Chengxi Han, Naoto Yokoya, Jing Zhang, Minqiang Xu, Lin Liu, Lefei Zhang, Chen Wu, Bo Du, Dacheng Tao, Liangpei Zhang

    Abstract: Accurate hyperspectral image (HSI) interpretation is critical for providing valuable insights into various earth observation-related applications such as urban planning, precision agriculture, and environmental monitoring. However, existing HSI processing methods are predominantly task-specific and scene-dependent, which severely limits their ability to transfer knowledge across tasks and scenes,…

    Submitted 1 April, 2025; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: Accepted by IEEE TPAMI. Project website: https://whu-sigma.github.io/HyperSIGMA

  40. arXiv:2406.08177  [pdf, other]

    eess.IV cs.CV

    One-Step Effective Diffusion Network for Real-World Image Super-Resolution

    Authors: Rongyuan Wu, Lingchen Sun, Zhiyuan Ma, Lei Zhang

    Abstract: The pre-trained text-to-image diffusion models have been increasingly employed to tackle the real-world image super-resolution (Real-ISR) problem due to their powerful generative image priors. Most of the existing methods start from random noise to reconstruct the high-quality (HQ) image under the guidance of the given low-quality (LQ) image. While promising results have been achieved, such Real-I…

    Submitted 24 October, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted by NeurIPS 2024

  41. arXiv:2406.02560  [pdf, other]

    eess.AS cs.AI cs.CL cs.LG

    Less Peaky and More Accurate CTC Forced Alignment by Label Priors

    Authors: Ruizhe Huang, Xiaohui Zhang, Zhaoheng Ni, Li Sun, Moto Hira, Jeff Hwang, Vimal Manohar, Vineel Pratap, Matthew Wiesner, Shinji Watanabe, Daniel Povey, Sanjeev Khudanpur

    Abstract: Connectionist temporal classification (CTC) models are known to have peaky output distributions. Such behavior is not a problem for automatic speech recognition (ASR), but it can cause inaccurate forced alignments (FA), especially at finer granularity, e.g., phoneme level. This paper aims at alleviating the peaky behavior for CTC and improving its suitability for forced alignment generation, by leve…

    Submitted 18 July, 2024; v1 submitted 22 April, 2024; originally announced June 2024.

    Comments: Accepted by ICASSP 2024. Github repo: https://github.com/huangruizhe/audio/tree/aligner_label_priors

  42. arXiv:2405.20617  [pdf, other]

    eess.SP

    Large-scale Outdoor Cell-free mMIMO Channel Measurement in an Urban Scenario at 3.5 GHz

    Authors: Yuning Zhang, Thomas Choi, Zihang Cheng, Issei Kanno, Masaaki Ito, Jorge Gomez-Ponce, Hussein Hammoud, Bowei Wu, Ashwani Pradhan, Kelvin Arana, Pramod Krishna, Tianyi Yang, Tyler Chen, Ishita Vasishtha, Haoyu Xie, Linyu Sun, Andreas F. Molisch

    Abstract: The design of cell-free massive MIMO (CF-mMIMO) systems requires accurate, measurement-based channel models. This paper provides the first results from by far the most extensive outdoor measurement campaign for CF-mMIMO channels in an urban environment. We measured impulse responses between over 20,000 potential access point (AP) locations and 80 user equipments (UEs) at 3.5 GHz with 350 MHz bandw…

    Submitted 6 June, 2024; v1 submitted 31 May, 2024; originally announced May 2024.

    Comments: Submitted to: VTC 2024-Fall

  43. arXiv:2405.09923  [pdf, other]

    cs.CV eess.IV

    NTIRE 2024 Restore Any Image Model (RAIM) in the Wild Challenge

    Authors: Jie Liang, Radu Timofte, Qiaosi Yi, Shuaizheng Liu, Lingchen Sun, Rongyuan Wu, Xindong Zhang, Hui Zeng, Lei Zhang

    Abstract: In this paper, we review the NTIRE 2024 challenge on Restore Any Image Model (RAIM) in the Wild. The RAIM challenge constructed a benchmark for image restoration in the wild, including real-world images with/without reference ground truth in various scenarios from real applications. The participants were required to restore the real-captured images from complex and unknown degradation, where gener…

    Submitted 16 May, 2024; originally announced May 2024.

  44. arXiv:2404.19201  [pdf, other]

    eess.IV cs.CV cs.RO physics.optics

    Exploring Quasi-Global Solutions to Compound Lens Based Computational Imaging Systems

    Authors: Yao Gao, Qi Jiang, Shaohua Gao, Lei Sun, Kailun Yang, Kaiwei Wang

    Abstract: Recently, joint design approaches that simultaneously optimize optical systems and downstream algorithms through data-driven learning have demonstrated superior performance over traditional separate design approaches. However, current joint design approaches heavily rely on the manual identification of initial lenses, posing challenges and limitations, particularly for compound lens systems with m…

    Submitted 20 February, 2025; v1 submitted 29 April, 2024; originally announced April 2024.

    Comments: Accepted to IEEE Transactions on Computational Imaging (TCI). The source code will be made publicly available at https://github.com/LiGpy/QGSO

  45. arXiv:2404.18687  [pdf, other]

    cs.RO eess.SY

    Socially Adaptive Path Planning Based on Generative Adversarial Network

    Authors: Yao Wang, Yuqi Kong, Wenzheng Chi, Lining Sun

    Abstract: The natural interaction between robots and pedestrians in the process of autonomous navigation is crucial for the intelligent development of mobile robots, which requires robots to fully consider social rules and guarantee the psychological comfort of pedestrians. Among the research results in the field of robotic path planning, the learning-based socially adaptive algorithms have performed well i…

    Submitted 29 April, 2024; originally announced April 2024.

  46. arXiv:2404.16484  [pdf, other]

    cs.CV eess.IV

    Real-Time 4K Super-Resolution of Compressed AVIF Images. AIS 2024 Challenge Survey

    Authors: Marcos V. Conde, Zhijun Lei, Wen Li, Cosmin Stejerean, Ioannis Katsavounidis, Radu Timofte, Kihwan Yoon, Ganzorig Gankhuyag, Jiangtao Lv, Long Sun, Jinshan Pan, Jiangxin Dong, Jinhui Tang, Zhiyuan Li, Hao Wei, Chenyang Ge, Dongyang Zhang, Tianle Liu, Huaian Chen, Yi Jin, Menghan Zhou, Yiqiang Yan, Si Gao, Biao Wu, Shaoli Liu, et al. (50 additional authors not shown)

    Abstract: This paper introduces a novel benchmark as part of the AIS 2024 Real-Time Image Super-Resolution (RTSR) Challenge, which aims to upscale compressed images from 540p to 4K resolution (4x factor) in real-time on commercial GPUs. For this, we use a diverse test set containing a variety of 4K images ranging from digital art to gaming and photography. The images are compressed using the modern AVIF cod…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: CVPR 2024, AI for Streaming (AIS) Workshop

  47. arXiv:2404.11313  [pdf, other]

    eess.IV cs.AI

    NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment: Methods and Results

    Authors: Xin Li, Kun Yuan, Yajing Pei, Yiting Lu, Ming Sun, Chao Zhou, Zhibo Chen, Radu Timofte, Wei Sun, Haoning Wu, Zicheng Zhang, Jun Jia, Zhichao Zhang, Linhan Cao, Qiubo Chen, Xiongkuo Min, Weisi Lin, Guangtao Zhai, Jianhui Sun, Tianyi Wang, Lei Li, Han Kong, Wenxuan Wang, Bing Li, Cheng Luo, et al. (43 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment (S-UGC VQA), where various excellent solutions were submitted and evaluated on the collected dataset KVQ from a popular short-form video platform, i.e., the Kuaishou/Kwai Platform. The KVQ database is divided into three parts, including 2926 videos for training, 420 videos for validation, and 854 videos for testing. The…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR2024 Workshop. The challenge report for CVPR NTIRE2024 Short-form UGC Video Quality Assessment Challenge

  48. arXiv:2404.10343  [pdf, other]

    cs.CV eess.IV

    The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report

    Authors: Bin Ren, Yawei Li, Nancy Mehta, Radu Timofte, Hongyuan Yu, Cheng Wan, Yuxin Hong, Bingnan Han, Zhuoyuan Wu, Yajun Zou, Yuqing Liu, Jizhe Li, Keji He, Chao Fan, Heng Zhang, Xiaolin Zhang, Xuanwu Yin, Kunlong Zuo, Bohao Liao, Peizhe Xia, Long Peng, Zhibo Du, Xin Di, Wangkai Li, Yang Wang, et al. (109 additional authors not shown)

    Abstract: This paper provides a comprehensive review of the NTIRE 2024 challenge, focusing on efficient single-image super-resolution (ESR) solutions and their outcomes. The task of this challenge is to super-resolve an input image with a magnification factor of x4 based on pairs of low and corresponding high-resolution images. The primary objective is to develop networks that optimize various aspects such…

    Submitted 25 June, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: The report paper of NTIRE2024 Efficient Super-resolution, accepted by CVPRW2024

  49. arXiv:2404.01082  [pdf, other]

    eess.IV

    The state-of-the-art in Cardiac MRI Reconstruction: Results of the CMRxRecon Challenge in MICCAI 2023

    Authors: Jun Lyu, Chen Qin, Shuo Wang, Fanwen Wang, Yan Li, Zi Wang, Kunyuan Guo, Cheng Ouyang, Michael Tänzer, Meng Liu, Longyu Sun, Mengting Sun, Qin Li, Zhang Shi, Sha Hua, Hao Li, Zhensen Chen, Zhenlin Zhang, Bingyu Xin, Dimitris N. Metaxas, George Yiasemis, Jonas Teuwen, Liping Zhang, Weitian Chen, Yidong Zhao, et al. (25 additional authors not shown)

    Abstract: Cardiac MRI, crucial for evaluating heart structure and function, faces limitations like slow imaging and motion artifacts. Undersampling reconstruction, especially data-driven algorithms, has emerged as a promising solution to accelerate scans and enhance imaging performance using highly under-sampled data. Nevertheless, the scarcity of publicly available cardiac k-space datasets and evaluation p…

    Submitted 16 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

    Comments: 25 pages, 17 figures

  50. arXiv:2403.11155  [pdf, other]

    eess.IV cs.MM

    Interactive $360^{\circ}$ Video Streaming Using FoV-Adaptive Coding with Temporal Prediction

    Authors: Yixiang Mao, Liyang Sun, Yong Liu, Yao Wang

    Abstract: For $360^{\circ}$ video streaming, FoV-adaptive coding that allocates more bits to the user's predicted field of view (FoV) is an effective way to maximize the rendered video quality under limited bandwidth. We develop a low-latency FoV-adaptive coding and streaming system for interactive applications that is robust to bandwidth variations and FoV prediction errors. To minimize the end-to-end…

    Submitted 17 March, 2024; originally announced March 2024.