Showing 1–50 of 142 results for author: Gao, J

Searching in archive eess.

  1. arXiv:2507.08403  [pdf, ps, other]

    cs.NI cs.AI cs.DC cs.LG eess.SY

    Towards AI-Native RAN: An Operator's Perspective of 6G Day 1 Standardization

    Authors: Nan Li, Qi Sun, Lehan Wang, Xiaofei Xu, Jinri Huang, Chunhui Liu, Jing Gao, Yuhong Huang, Chih-Lin I

    Abstract: Artificial Intelligence/Machine Learning (AI/ML) has become the most certain and prominent feature of 6G mobile networks. Unlike 5G, where AI/ML was not natively integrated but rather an add-on feature over existing architecture, 6G shall incorporate AI from the onset to address its complexity and support ubiquitous AI applications. Based on our extensive mobile network operation and standardizati…

    Submitted 11 July, 2025; originally announced July 2025.

  2. arXiv:2507.04821  [pdf, ps, other]

    eess.SY

    Force-IMU Fusion-Based Sensing Acupuncture Needle and Quantitative Analysis System for Acupuncture Manipulations

    Authors: Peng Tian, Kang Yu, Tianyun Jiang, Yuqi Wang, Haiying Zhang, Hao Yang, Yunfeng Wang, Jun Zhang, Shuo Gao, Junhong Gao

    Abstract: Acupuncture, one of the key therapeutic methods in Traditional Chinese Medicine (TCM), has been widely adopted in various clinical fields. Quantitative research on acupuncture manipulation parameters is critical to achieve standardized techniques. However, quantitative mechanical detection of acupuncture parameters remains limited. This study establishes a kinematic and dynamic model of acupunctur…

    Submitted 7 July, 2025; originally announced July 2025.

    Comments: This work has been submitted to the IEEE for possible publication

  3. arXiv:2506.21613  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    ChildGuard: A Specialized Dataset for Combatting Child-Targeted Hate Speech

    Authors: Gautam Siddharth Kashyap, Mohammad Anas Azeez, Rafiq Ali, Zohaib Hasan Siddiqui, Jiechao Gao, Usman Naseem

    Abstract: The increasing prevalence of child-targeted hate speech online underscores the urgent need for specialized datasets to address this critical issue. Existing hate speech datasets lack age-specific annotations, fail to capture nuanced contexts, and overlook the unique emotional impact on children. To bridge this gap, we introduce ChildGuard, a curated dataset derived from existing corpora and enrich…

    Submitted 21 June, 2025; originally announced June 2025.

  4. arXiv:2506.20762  [pdf, ps, other]

    cs.NI eess.SP

    Drift-Adaptive Slicing-Based Resource Management for Cooperative ISAC Networks

    Authors: Shisheng Hu, Jie Gao, Xue Qin, Conghao Zhou, Xinyu Huang, Mushu Li, Mingcheng He, Xuemin Shen

    Abstract: In this paper, we propose a novel drift-adaptive slicing-based resource management scheme for cooperative integrated sensing and communication (ISAC) networks. Particularly, we establish two network slices to provide sensing and communication services, respectively. In the large-timescale planning for the slices, we partition the sensing region of interest (RoI) of each mobile device and reserve n…

    Submitted 25 June, 2025; originally announced June 2025.

    Comments: Accepted by IEEE Transactions on Cognitive Communications and Networking

  5. arXiv:2506.12314  [pdf, ps, other]

    cs.RO eess.SY

    Explosive Output to Enhance Jumping Ability: A Variable Reduction Ratio Design Paradigm for Humanoid Robots Knee Joint

    Authors: Xiaoshuai Ma, Haoxiang Qi, Qingqing Li, Haochen Xu, Xuechao Chen, Junyao Gao, Zhangguo Yu, Qiang Huang

    Abstract: Enhancing the explosive power output of the knee joints is critical for improving the agility and obstacle-crossing capabilities of humanoid robots. However, a mismatch between the knee-to-center-of-mass (CoM) transmission ratio and jumping demands, coupled with motor performance degradation at high speeds, restricts the duration of high-power output and limits jump performance. To address these p…

    Submitted 13 June, 2025; originally announced June 2025.

  6. arXiv:2505.05088  [pdf, other]

    cs.MM cs.CV eess.IV

    SSH-Net: A Self-Supervised and Hybrid Network for Noisy Image Watermark Removal

    Authors: Wenyang Liu, Jianjun Gao, Kim-Hui Yap

    Abstract: Visible watermark removal is challenging due to its inherent complexities and the noise carried within images. Existing methods primarily rely on supervised learning approaches that require paired datasets of watermarked and watermark-free images, which are often impractical to obtain in real-world scenarios. To address this challenge, we propose SSH-Net, a Self-Supervised and Hybrid Network speci…

    Submitted 8 May, 2025; originally announced May 2025.

    Comments: Under Review in JVCI

  7. arXiv:2504.04829  [pdf, other]

    cs.LG eess.SP stat.ML

    Attentional Graph Meta-Learning for Indoor Localization Using Extremely Sparse Fingerprints

    Authors: Wenzhong Yan, Feng Yin, Jun Gao, Ao Wang, Yang Tian, Ruizhi Chen

    Abstract: Fingerprint-based indoor localization is often labor-intensive due to the need for dense grids and repeated measurements across time and space. Maintaining high localization accuracy with extremely sparse fingerprints remains a persistent challenge. Existing benchmark methods primarily rely on the measured fingerprints, while neglecting valuable spatial and environmental characteristics. In this p…

    Submitted 7 April, 2025; originally announced April 2025.

  8. arXiv:2504.01038  [pdf, other]

    eess.IV cs.CV cs.HC

    An Integrated AI-Enabled System Using One Class Twin Cross Learning (OCT-X) for Early Gastric Cancer Detection

    Authors: Xian-Xian Liu, Yuanyuan Wei, Mingkun Xu, Yongze Guo, Hongwei Zhang, Huicong Dong, Qun Song, Qi Zhao, Wei Luo, Feng Tien, Juntao Gao, Simon Fong

    Abstract: Early detection of gastric cancer, a leading cause of cancer-related mortality worldwide, remains hampered by the limitations of current diagnostic technologies, leading to high rates of misdiagnosis and missed diagnoses. To address these challenges, we propose an integrated system that synergizes advanced hardware and software technologies to balance speed-accuracy. Our study introduces the One C…

    Submitted 31 March, 2025; originally announced April 2025.

    Comments: 26 pages, 4 figures, 6 tables

  9. arXiv:2504.01010  [pdf, other]

    cs.CV eess.IV

    A YOLO-Based Semi-Automated Labeling Approach to Improve Fault Detection Efficiency in Railroad Videos

    Authors: Dylan Lester, James Gao, Samuel Sutphin, Pingping Zhu, Husnu Narman, Ammar Alzarrad

    Abstract: Manual labeling for large-scale image and video datasets is often time-intensive, error-prone, and costly, posing a significant barrier to efficient machine learning workflows in fault detection from railroad videos. This study introduces a semi-automated labeling method that utilizes a pre-trained You Only Look Once (YOLO) model to streamline the labeling process and enhance fault detection accur…

    Submitted 1 April, 2025; originally announced April 2025.

    Comments: Published on American Society of Engineering Education (ASEE) North Central Section Conference, 2025

  10. arXiv:2503.18078  [pdf, other]

    eess.SP

    GenMetaLoc: Learning to Learn Environment-Aware Fingerprint Generation for Sample Efficient Wireless Localization

    Authors: Jun Gao, Feng Yin, Wenzhong Yan, Qinglei Kong, Lexi Xu, Shuguang Cui

    Abstract: Existing fingerprinting-based localization methods often require extensive data collection and struggle to generalize to new environments. In contrast to previous environment-unknown MetaLoc, we propose GenMetaLoc in this paper, which first introduces meta-learning to enable the generation of dense fingerprint databases from an environment-aware perspective. In the model aspect, the learning-to-le…

    Submitted 23 March, 2025; originally announced March 2025.

  11. arXiv:2503.03465  [pdf, other]

    cs.CV eess.IV

    DTU-Net: A Multi-Scale Dilated Transformer Network for Nonlinear Hyperspectral Unmixing

    Authors: ChenTong Wang, Jincheng Gao, Fei Zhu, Abderrahim Halimi, Cédric Richard

    Abstract: Transformers have shown significant success in hyperspectral unmixing (HU). However, challenges remain. While multi-scale and long-range spatial correlations are essential in unmixing tasks, current Transformer-based unmixing networks, built on Vision Transformer (ViT) or Swin-Transformer, struggle to capture them effectively. Additionally, current Transformer-based unmixing networks rely on the l…

    Submitted 5 March, 2025; v1 submitted 5 March, 2025; originally announced March 2025.

  12. arXiv:2502.17759  [pdf]

    eess.IV cs.CV

    Label-free Prediction of Vascular Connectivity in Perfused Microvascular Networks in vitro

    Authors: Liang Xu, Pengwu Song, Shilu Zhu, Yang Zhang, Ru Zhang, Zhiyuan Zheng, Qingdong Zhang, Jie Gao, Chen Han, Mingzhai Sun, Peng Yao, Min Ye, Ronald X. Xu

    Abstract: Continuous monitoring and in-situ assessment of microvascular connectivity have significant implications for culturing vascularized organoids and optimizing the therapeutic strategies. However, commonly used methods for vascular connectivity assessment heavily rely on fluorescent labels that may either raise biocompatibility concerns or interrupt the normal cell growth process. To address this iss…

    Submitted 24 February, 2025; originally announced February 2025.

  13. arXiv:2502.05766  [pdf, other]

    eess.AS cs.SD

    Audio-Visual Representation Learning via Knowledge Distillation from Speech Foundation Models

    Authors: Jing-Xuan Zhang, Genshun Wan, Jianqing Gao, Zhen-Hua Ling

    Abstract: Audio-visual representation learning is crucial for advancing multimodal speech processing tasks, such as lipreading and audio-visual speech recognition. Recently, speech foundation models (SFMs) have shown remarkable generalization capabilities across various speech-related tasks. Building on this progress, we propose an audio-visual representation learning model that leverages cross-modal knowle…

    Submitted 8 February, 2025; originally announced February 2025.

    Comments: accepted to Pattern Recognition

  14. arXiv:2502.05130  [pdf, other]

    cs.SD cs.AI cs.CV cs.MM eess.AS

    Latent Swap Joint Diffusion for 2D Long-Form Latent Generation

    Authors: Yusheng Dai, Chenxi Wang, Chang Li, Chen Wang, Jun Du, Kewei Li, Ruoyu Wang, Jiefeng Ma, Lei Sun, Jianqing Gao

    Abstract: This paper introduces Swap Forward (SaFa), a modality-agnostic and efficient method to generate seamless and coherent long spectrum and panorama through latent swap joint diffusion across multi-views. We first investigate the spectrum aliasing problem in spectrum-based audio generation caused by existing joint diffusion methods. Through a comparative analysis of the VAE latent representation of M…

    Submitted 18 March, 2025; v1 submitted 7 February, 2025; originally announced February 2025.

  15. arXiv:2502.01143  [pdf, other]

    cs.RO cs.AI cs.LG eess.SY

    ASAP: Aligning Simulation and Real-World Physics for Learning Agile Humanoid Whole-Body Skills

    Authors: Tairan He, Jiawei Gao, Wenli Xiao, Yuanhang Zhang, Zi Wang, Jiashun Wang, Zhengyi Luo, Guanqi He, Nikhil Sobanbab, Chaoyi Pan, Zeji Yi, Guannan Qu, Kris Kitani, Jessica Hodgins, Linxi "Jim" Fan, Yuke Zhu, Changliu Liu, Guanya Shi

    Abstract: Humanoid robots hold the potential for unparalleled versatility in performing human-like, whole-body skills. However, achieving agile and coordinated whole-body motions remains a significant challenge due to the dynamics mismatch between simulation and the real world. Existing approaches, such as system identification (SysID) and domain randomization (DR) methods, often rely on labor-intensive par…

    Submitted 25 April, 2025; v1 submitted 3 February, 2025; originally announced February 2025.

    Comments: RSS 2025. Project website: https://agile.human2humanoid.com/

  16. arXiv:2501.10404  [pdf, other]

    eess.SP cs.LG

    Automated Detection of Epileptic Spikes and Seizures Incorporating a Novel Spatial Clustering Prior

    Authors: Hanyang Dong, Shurong Sheng, Xiongfei Wang, Jiahong Gao, Yi Sun, Wanli Yang, Kuntao Xiao, Pengfei Teng, Guoming Luan, Zhao Lv

    Abstract: A Magnetoencephalography (MEG) time-series recording consists of multi-channel signals collected by superconducting sensors, with each signal's intensity reflecting magnetic field changes over time at the sensor location. Automating epileptic MEG spike detection significantly reduces manual assessment time and effort, yielding substantial clinical benefits. Existing research addresses MEG spike de…

    Submitted 4 January, 2025; originally announced January 2025.

    Comments: 8 pages, 6 figures, accepted by BIBM2024

  17. arXiv:2501.03496  [pdf, other]

    eess.SY

    A Unified Attack Detection Strategy for Multi-Agent Systems over Transient and Steady Stages

    Authors: Jinming Gao, Yijing Wang, Wentao Zhang, Rui Zhao, Yang Shi, Zhiqiang Zuo

    Abstract: This paper proposes a unified detection strategy against three kinds of attacks for multi-agent systems (MASs) which is applicable to both transient and steady stages. For attacks on the communication layer, a watermarking-based detection scheme with Kullback-Leibler (KL) divergence is designed. Different from traditional communication schemes, each agent transmits a message set containing two stat…

    Submitted 6 January, 2025; originally announced January 2025.

  18. arXiv:2411.17552  [pdf, other]

    eess.SY

    Ensuring Safety in Target Pursuit Control: A CBF-Safe Reinforcement Learning Approach

    Authors: Yaosheng Deng, Junjie Gao, Jiaping Xiao, Mir Feroskhan

    Abstract: This paper addresses the target-pursuit problem, aiming to ensure each pursuer's safety regarding collision avoidance, sensing range, and input saturation. An input-constrained CBF is proposed to dynamically regulate the pursuer's control, ensuring effective target pursuit even when the target performs evasive maneuvers. To further ensure safety, two sets of CBF constraints are designed to regulat…

    Submitted 10 December, 2024; v1 submitted 26 November, 2024; originally announced November 2024.

    Comments: 12 pages

  19. arXiv:2411.14385  [pdf, other]

    eess.IV cs.CV

    Enhancing Diagnostic Precision in Gastric Bleeding through Automated Lesion Segmentation: A Deep DuS-KFCM Approach

    Authors: Xian-Xian Liu, Mingkun Xu, Yuanyuan Wei, Huafeng Qin, Qun Song, Simon Fong, Feng Tien, Wei Luo, Juntao Gao, Zhihua Zhang, Shirley Siu

    Abstract: Timely and precise classification and segmentation of gastric bleeding in endoscopic imagery are pivotal for the rapid diagnosis and intervention of gastric complications, which is critical in life-saving medical procedures. Traditional methods grapple with the challenge posed by the indistinguishable intensity values of bleeding tissues adjacent to other gastric structures. Our study seeks to rev…

    Submitted 25 November, 2024; v1 submitted 21 November, 2024; originally announced November 2024.

  20. arXiv:2411.14250  [pdf, other]

    eess.IV cs.CV

    CP-UNet: Contour-based Probabilistic Model for Medical Ultrasound Images Segmentation

    Authors: Ruiguo Yu, Yiyang Zhang, Yuan Tian, Zhiqiang Liu, Xuewei Li, Jie Gao

    Abstract: Deep learning-based segmentation methods are widely utilized for detecting lesions in ultrasound images. Throughout the imaging procedure, the attenuation and scattering of ultrasound waves cause contour blurring and the formation of artifacts, limiting the clarity of the acquired ultrasound images. To overcome this challenge, we propose a contour-based probabilistic segmentation model CP-UNet, wh…

    Submitted 21 November, 2024; originally announced November 2024.

    Comments: 4 pages, 4 figures, 2 tables; for ICASSP 2025

  21. arXiv:2411.13314   

    cs.SD eess.AS

    I2TTS: Image-indicated Immersive Text-to-speech Synthesis with Spatial Perception

    Authors: Jiawei Zhang, Tian-Hao Zhang, Jun Wang, Jiaran Gao, Xinyuan Qian, Xu-Cheng Yin

    Abstract: Controlling the style and characteristics of speech synthesis is crucial for adapting the output to specific contexts and user requirements. Previous Text-to-speech (TTS) works have focused primarily on the technical aspects of producing natural-sounding speech, such as intonation, rhythm, and clarity. However, they overlook the fact that there is a growing emphasis on spatial perception of synthe…

    Submitted 2 December, 2024; v1 submitted 20 November, 2024; originally announced November 2024.

    Comments: The paper is missing some information

  22. arXiv:2411.08896  [pdf, other]

    eess.SP cs.LG cs.NI

    Demand-Aware Beam Hopping and Power Allocation for Load Balancing in Digital Twin empowered LEO Satellite Networks

    Authors: Ruili Zhao, Jun Cai, Jiangtao Luo, Junpeng Gao, Yongyi Ran

    Abstract: Low-Earth orbit (LEO) satellites utilizing beam hopping (BH) technology offer extensive coverage, low latency, high bandwidth, and significant flexibility. However, the uneven geographical distribution and temporal variability of ground traffic demands, combined with the high mobility of LEO satellites, present significant challenges for efficient beam resource utilization. Traditional BH methods…

    Submitted 28 October, 2024; originally announced November 2024.

  23. arXiv:2411.07751  [pdf, other]

    cs.SD cs.AI cs.CV cs.MM eess.AS

    SAV-SE: Scene-aware Audio-Visual Speech Enhancement with Selective State Space Model

    Authors: Xinyuan Qian, Jiaran Gao, Yaodan Zhang, Qiquan Zhang, Hexin Liu, Leibny Paola Garcia, Haizhou Li

    Abstract: Speech enhancement plays an essential role in various applications, and the integration of visual information has been demonstrated to bring substantial advantages. However, the majority of current research concentrates on the examination of facial and lip movements, which can be compromised or entirely inaccessible in scenarios where occlusions occur or when the camera view is distant. Whereas co…

    Submitted 2 April, 2025; v1 submitted 12 November, 2024; originally announced November 2024.

    Comments: accepted by IEEE Journal of Selected Topics in Signal Processing

  24. Physical Informed-Inspired Deep Reinforcement Learning Based Bi-Level Programming for Microgrid Scheduling

    Authors: Yang Li, Jiankai Gao, Yuanzheng Li, Chen Chen, Sen Li, Mohammad Shahidehpour, Zhe Chen

    Abstract: To coordinate the interests of operator and users in a microgrid under complex and changeable operating conditions, this paper proposes a microgrid scheduling model considering the thermal flexibility of thermostatically controlled loads and demand response by leveraging physical informed-inspired deep reinforcement learning (DRL) based bi-level programming. To overcome the non-convex limitations…

    Submitted 15 October, 2024; originally announced October 2024.

    Comments: Accepted by IEEE Transactions on Industry Applications (Paper Id: 2023-KDSEM-1058)

  25. arXiv:2410.09444  [pdf]

    eess.IV cs.CV

    Diabetic retinopathy image classification method based on GreenBen data augmentation

    Authors: Yutong Liu, Jie Gao, Haijiang Zhu

    Abstract: For the diagnosis of diabetic retinopathy (DR) images, this paper proposes a classification method based on artificial intelligence. The core lies in a new data augmentation method, GreenBen, which first extracts the green channel grayscale image from the retinal image and then performs Ben enhancement. Considering that diabetic macular edema (DME) is a complication closely related to DR, this pap…

    Submitted 12 October, 2024; originally announced October 2024.

  26. arXiv:2410.02510  [pdf, other]

    cs.RO cs.MA eess.SY

    SwarmCVT: Centroidal Voronoi Tessellation-Based Path Planning for Very-Large-Scale Robotics

    Authors: James Gao, Jacob Lee, Yuting Zhou, Yunze Hu, Chang Liu, Pingping Zhu

    Abstract: Swarm robotics, or very large-scale robotics (VLSR), has many meaningful applications for complicated tasks. However, the complexity of motion control and energy costs stack up quickly as the number of robots increases. In addressing this problem, our previous studies have formulated various methods employing macroscopic and microscopic approaches. These methods enable microscopic robots to adhere…

    Submitted 3 October, 2024; originally announced October 2024.

    Comments: Submitted to American Control Conference (ACC) 2025

  27. arXiv:2409.17603  [pdf, other]

    cs.CL cs.SD eess.AS

    Deep CLAS: Deep Contextual Listen, Attend and Spell

    Authors: Mengzhi Wang, Shifu Xiong, Genshun Wan, Hang Chen, Jianqing Gao, Lirong Dai

    Abstract: Contextual-LAS (CLAS) has been shown effective in improving Automatic Speech Recognition (ASR) of rare words. It relies on phrase-level contextual modeling and attention-based relevance scoring without explicit contextual constraint, which leads to insufficient use of contextual information. In this work, we propose deep CLAS to use contextual information better. We introduce bias loss forcing model…

    Submitted 19 December, 2024; v1 submitted 26 September, 2024; originally announced September 2024.

    Comments: Submitted to JUSTC

  28. arXiv:2409.09289  [pdf, other]

    cs.SD cs.MM eess.AS

    DSCLAP: Domain-Specific Contrastive Language-Audio Pre-Training

    Authors: Shengqiang Liu, Da Liu, Anna Wang, Zhiyu Zhang, Jie Gao, Yali Li

    Abstract: Analyzing real-world multimodal signals is an essential and challenging task for intelligent voice assistants (IVAs). Mainstream approaches have achieved remarkable performance on various downstream tasks of IVAs with pre-trained audio models and text models. However, these models are pre-trained independently and usually on tasks different from target domains, resulting in sub-optimal modality re…

    Submitted 13 September, 2024; originally announced September 2024.

  29. arXiv:2409.09284  [pdf, other]

    cs.SD cs.MM eess.AS

    M$^{3}$V: A multi-modal multi-view approach for Device-Directed Speech Detection

    Authors: Anna Wang, Da Liu, Zhiyu Zhang, Shengqiang Liu, Jie Gao, Yali Li

    Abstract: With the goal of more natural and human-like interaction with virtual voice assistants, recent research in the field has focused on full duplex interaction mode without relying on repeated wake-up words. This requires that in scenes with complex sound sources, the voice assistant must classify utterances as device-oriented or non-device-oriented. The dual-encoder structure, which is jointly modele…

    Submitted 13 September, 2024; originally announced September 2024.

  30. arXiv:2409.02041  [pdf, other]

    eess.AS cs.SD

    The USTC-NERCSLIP Systems for the CHiME-8 NOTSOFAR-1 Challenge

    Authors: Shutong Niu, Ruoyu Wang, Jun Du, Gaobin Yang, Yanhui Tu, Siyuan Wu, Shuangqing Qian, Huaxin Wu, Haitao Xu, Xueyang Zhang, Guolong Zhong, Xindi Yu, Jieru Chen, Mengzhi Wang, Di Cai, Tian Gao, Genshun Wan, Feng Ma, Jia Pan, Jianqing Gao

    Abstract: This technical report outlines our submission system for the CHiME-8 NOTSOFAR-1 Challenge. The primary difficulty of this challenge is the dataset recorded across various conference rooms, which captures real-world complexities such as high overlap rates, background noises, a variable number of speakers, and natural conversation styles. To address these issues, we optimized the system in several a…

    Submitted 24 October, 2024; v1 submitted 3 September, 2024; originally announced September 2024.

  31. arXiv:2408.09715  [pdf, other]

    cs.AI cs.CV cs.LG eess.IV

    HYDEN: Hyperbolic Density Representations for Medical Images and Reports

    Authors: Zhi Qiao, Linbin Han, Xiantong Zhen, Jia-Hong Gao, Zhen Qian

    Abstract: In light of the inherent entailment relations between images and text, hyperbolic point vector embeddings, leveraging the hierarchical modeling advantages of hyperbolic space, have been utilized for visual semantic representation learning. However, point vector embedding approaches fail to address the issue of semantic uncertainty, where an image may have multiple interpretations, and text may ref…

    Submitted 19 August, 2024; v1 submitted 19 August, 2024; originally announced August 2024.

  32. arXiv:2407.13227  [pdf, other]

    eess.SY

    Solving the Model Unavailable MARE using Q-Learning Algorithm

    Authors: Fei Yan, Jie Gao, Tao Feng, Jianxing Liu

    Abstract: In this paper, the discrete-time modified algebraic Riccati equation (MARE) is solved when the system model is completely unavailable. To achieve this, firstly a brand new iterative method based on the standard discrete-time algebraic Riccati equation (DARE) and its input weighting matrix is proposed to solve the MARE. For the single-input case, the iteration can be initialized by an arbitrary pos…

    Submitted 18 July, 2024; originally announced July 2024.

  33. arXiv:2407.00218  [pdf, other]

    eess.SY cs.RO

    Resilient Estimator-based Control Barrier Functions for Dynamical Systems with Disturbances and Noise

    Authors: Chuyuan Tao, Wenbin Wan, Junjie Gao, Bihao Mo, Hunmin Kim, Naira Hovakimyan

    Abstract: Control Barrier Function (CBF) is an emerging method that guarantees safety in path planning problems by generating a control command to ensure the forward invariance of a safety set. Most of the developments up to date assume availability of correct state measurements and absence of disturbances on the system. However, if the system incurs disturbances and is subject to noise, the CBF cannot guar…

    Submitted 28 June, 2024; originally announced July 2024.

  34. arXiv:2406.16200  [pdf, other]

    cs.LG cs.CR cs.IT eess.SP

    Towards unlocking the mystery of adversarial fragility of neural networks

    Authors: Jingchao Gao, Raghu Mudumbai, Xiaodong Wu, Jirong Yi, Catherine Xu, Hui Xie, Weiyu Xu

    Abstract: In this paper, we study the adversarial robustness of deep neural networks for classification tasks. We look at the smallest magnitude of possible additive perturbations that can change the output of a classification algorithm. We provide a matrix-theoretic explanation of the adversarial fragility of deep neural network for classification. In particular, our theoretical results show that neural ne…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: 21 pages

  35. arXiv:2406.14067  [pdf]

    physics.optics eess.SP

    A microwave photonic prototype for concurrent radar detection and spectrum sensing over an 8 to 40 GHz bandwidth

    Authors: Taixia Shi, Dingding Liang, Lu Wang, Lin Li, Shaogang Guo, Jiawei Gao, Xiaowei Li, Chulun Lin, Lei Shi, Baogang Ding, Shiyang Liu, Fangyi Yang, Chi Jiang, Yang Chen

    Abstract: In this work, a microwave photonic prototype for concurrent radar detection and spectrum sensing is proposed, designed, built, and investigated. A direct digital synthesizer and an analog electronic circuit are integrated to generate an intermediate frequency (IF) linearly frequency-modulated (LFM) signal with a tunable center frequency from 2.5 to 9.5 GHz and an instantaneous bandwidth of 1 GHz.…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 18 pages, 12 figures, 1 table

  36. arXiv:2406.13038  [pdf, other]

    cs.AI eess.SP

    Traffic Prediction considering Multiple Levels of Spatial-temporal Information: A Multi-scale Graph Wavelet-based Approach

    Authors: Zilin Bian, Jingqin Gao, Kaan Ozbay, Zhenning Li

    Abstract: Although traffic prediction has been receiving considerable attention with a number of successes in the context of intelligent transportation systems, the prediction of traffic states over a complex transportation network that contains different road types has remained a challenge. This study proposes a multi-scale graph wavelet temporal convolution network (MSGWTCN) to predict the traffic states…

    Submitted 18 June, 2024; originally announced June 2024.

  37. arXiv:2405.19213  [pdf, other]

    eess.SY cs.AI cs.LG cs.NI

    EdgeSight: Enabling Modeless and Cost-Efficient Inference at the Edge

    Authors: ChonLam Lao, Jiaqi Gao, Ganesh Ananthanarayanan, Aditya Akella, Minlan Yu

    Abstract: Traditional ML inference is evolving toward modeless inference, which abstracts the complexity of model selection from users, allowing the system to automatically choose the most appropriate model for each request based on accuracy and resource requirements. While prior studies have focused on modeless inference within data centers, this paper tackles the pressing need for cost-efficient modeless…

    Submitted 14 January, 2025; v1 submitted 29 May, 2024; originally announced May 2024.

    Comments: 12 pages

  38. arXiv:2405.15863  [pdf, ps, other]

    cs.SD cs.AI eess.AS

    Quality-aware Masked Diffusion Transformer for Enhanced Music Generation

    Authors: Chang Li, Ruoyu Wang, Lijuan Liu, Jun Du, Yixuan Sun, Zilu Guo, Zhenrong Zhang, Yuan Jiang, Jianqing Gao, Feng Ma

    Abstract: Text-to-music (TTM) generation, which converts textual descriptions into audio, opens up innovative avenues for multimedia creation. Achieving high quality and diversity in this process demands extensive, high-quality data, which are often scarce in available datasets. Most open-source datasets frequently suffer from issues like low-quality waveforms and low text-audio consistency, hindering the a…

    Submitted 17 June, 2025; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: IJCAI

  39. arXiv:2405.15271  [pdf]

    eess.SY physics.ins-det physics.optics

    Seamless Integration and Implementation of Distributed Contact and Contactless Vital Sign Monitoring

    Authors: Dingding Liang, Yang Chen, Jiawei Gao, Taixia Shi, Jianping Yao

    Abstract: Real-time vital sign monitoring is gaining immense significance not only in the medical field but also in personal health management. Facing the needs of different application scenarios of the smart and healthy city in the future, the low-cost, large-scale, scalable, and distributed vital sign monitoring system is of great significance. In this work, a seamlessly integrated contact and contactless…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 14 pages, 9 figures

  40. arXiv:2405.11883  [pdf, other]

    cs.IT eess.SP

    Asynchronous MIMO-OFDM Massive Unsourced Random Access with Codeword Collisions

    Authors: Tianya Li, Yongpeng Wu, Junyuan Gao, Wenjun Zhang, Xiang-Gen Xia, Derrick Wing Kwan Ng, Chengshan Xiao

    Abstract: This paper investigates asynchronous multiple-input multiple-output (MIMO) massive unsourced random access (URA) in an orthogonal frequency division multiplexing (OFDM) system over frequency-selective fading channels, with the presence of both timing and carrier frequency offsets (TO and CFO) and non-negligible codeword collisions. The proposed coding framework segregates the data into two compone…

    Submitted 10 October, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

    Comments: This paper has been accepted by the IEEE Transactions on Wireless Communications

  41. Array SAR 3D Sparse Imaging Based on Regularization by Denoising Under Few Observed Data

    Authors: Yangyang Wang, Xu Zhan, Jing Gao, Jinjie Yao, Shunjun Wei, JianSheng Bai

    Abstract: Array synthetic aperture radar (SAR) three-dimensional (3D) imaging can obtain 3D information of the target region, which is widely used in environmental monitoring and scattering information measurement. In recent years, with the development of compressed sensing (CS) theory, sparse signal processing is used in array SAR 3D imaging. Compared with matched filter (MF), sparse SAR imaging can effect…

    Submitted 26 May, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

  42. arXiv:2404.15279  [pdf, other]

    eess.SP cs.AI

    Jointly Modeling Spatio-Temporal Features of Tactile Signals for Action Classification

    Authors: Jimmy Lin, Junkai Li, Jiasi Gao, Weizhi Ma, Yang Liu

    Abstract: Tactile signals collected by wearable electronics are essential in modeling and understanding human behavior. One of the main applications of tactile signals is action classification, especially in healthcare and robotics. However, existing tactile classification methods fail to capture the spatial and temporal features of tactile signals simultaneously, which results in sub-optimal performances.…

    Submitted 20 January, 2024; originally announced April 2024.

    Comments: Accepted by AAAI 2024

  43. arXiv:2402.18871  [pdf, other]

    eess.IV cs.CV

    LoLiSRFlow: Joint Single Image Low-light Enhancement and Super-resolution via Cross-scale Transformer-based Conditional Flow

    Authors: Ziyu Yue, Jiaxin Gao, Sihan Xie, Yang Liu, Zhixun Su

    Abstract: The visibility of real-world images is often limited by both low-light and low-resolution, however, these issues are only addressed in the literature through Low-Light Enhancement (LLE) and Super-Resolution (SR) methods. Admittedly, a simple cascade of these approaches cannot work harmoniously to cope well with the highly ill-posed problem for simultaneously enhancing visibility and resolution. I…

    Submitted 29 February, 2024; originally announced February 2024.

  44. arXiv:2402.00320  [pdf]

    eess.IV

    DARCS: Memory-Efficient Deep Compressed Sensing Reconstruction for Acceleration of 3D Whole-Heart Coronary MR Angiography

    Authors: Zhihao Xue, Fan Yang, Juan Gao, Zhuo Chen, Hao Peng, Chao Zou, Hang Jin, Chenxi Hu

    Abstract: Three-dimensional coronary magnetic resonance angiography (CMRA) demands reconstruction algorithms that can significantly suppress the artifacts from a heavily undersampled acquisition. While unrolling-based deep reconstruction methods have achieved state-of-the-art performance on 2D image reconstruction, their application to 3D reconstruction is hindered by the large amount of memory needed to tr…

    Submitted 2 February, 2024; v1 submitted 31 January, 2024; originally announced February 2024.

    Comments: 10 pages, 8 figures

  45. arXiv:2401.02099  [pdf]

    cs.CV cs.SD eess.AS

    Oceanship: A Large-Scale Dataset for Underwater Audio Target Recognition

    Authors: Zeyu Li, Suncheng Xiang, Tong Yu, Jingsheng Gao, Jiacheng Ruan, Yanping Hu, Ting Liu, Yuzhuo Fu

    Abstract: The recognition of underwater audio plays a significant role in identifying a vessel while it is in motion. Underwater target recognition tasks have a wide range of applications in areas such as marine environmental protection, detection of ship radiated noise, underwater noise control, and coastal vessel dispatch. The traditional UATR task involves training a network to extract features from audi…

    Submitted 10 June, 2024; v1 submitted 4 January, 2024; originally announced January 2024.

    Comments: Accepted by ICIC 2024

  46. arXiv:2312.17030  [pdf, other]

    eess.IV cs.CV

    Learning Multi-axis Representation in Frequency Domain for Medical Image Segmentation

    Authors: Jiacheng Ruan, Jingsheng Gao, Mingye Xie, Suncheng Xiang

    Abstract: Recently, Visual Transformer (ViT) has been extensively used in medical image segmentation (MIS) due to applying self-attention mechanism in the spatial domain to modeling global knowledge. However, many studies have focused on improving models in the spatial domain while neglecting the importance of frequency domain information. Therefore, we propose Multi-axis External Weights UNet (MEW-UNet) ba…

    Submitted 24 September, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

    Comments: This paper has been accepted by Machine Learning Journal

  47. arXiv:2312.13752  [pdf]

    eess.IV cs.AI cs.CV

    Hunting imaging biomarkers in pulmonary fibrosis: Benchmarks of the AIIB23 challenge

    Authors: Yang Nan, Xiaodan Xing, Shiyi Wang, Zeyu Tang, Federico N Felder, Sheng Zhang, Roberta Eufrasia Ledda, Xiaoliu Ding, Ruiqi Yu, Weiping Liu, Feng Shi, Tianyang Sun, Zehong Cao, Minghui Zhang, Yun Gu, Hanxiao Zhang, Jian Gao, Pingyu Wang, Wen Tang, Pengxin Yu, Han Kang, Junqiang Chen, Xing Lu, Boyu Zhang, Michail Mamalakis, et al. (16 additional authors not shown)

    Abstract: Airway-related quantitative imaging biomarkers are crucial for examination, diagnosis, and prognosis in pulmonary diseases. However, the manual delineation of airway trees remains prohibitively time-consuming. While significant efforts have been made towards enhancing airway modelling, current public-available datasets concentrate on lung diseases with moderate morphological variations. The intric…

    Submitted 16 April, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

    Comments: 19 pages

  48. arXiv:2312.11460  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG eess.SY

    Hybrid Internal Model: Learning Agile Legged Locomotion with Simulated Robot Response

    Authors: Junfeng Long, Zirui Wang, Quanyi Li, Jiawei Gao, Liu Cao, Jiangmiao Pang

    Abstract: Robust locomotion control depends on accurate state estimations. However, the sensors of most legged robots can only provide partial and noisy observations, making the estimation particularly challenging, especially for external states like terrain frictions and elevation maps. Inspired by the classical Internal Model Control principle, we consider these external states as disturbances and introdu…

    Submitted 1 January, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

    Comments: Use 1 hour to train a quadruped robot capable of traversing any terrain under any disturbances in the open world, Project Page: https://github.com/OpenRobotLab/HIMLoco

  49. arXiv:2312.02298  [pdf, other]

    eess.SP cs.CV cs.LG stat.AP

    MoE-AMC: Enhancing Automatic Modulation Classification Performance Using Mixture-of-Experts

    Authors: Jiaxin Gao, Qinglong Cao, Yuntian Chen

    Abstract: Automatic Modulation Classification (AMC) plays a vital role in time series analysis, such as signal classification and identification within wireless communications. Deep learning-based AMC models have demonstrated significant potential in this domain. However, current AMC models inadequately consider the disparities in handling signals under conditions of low and high Signal-to-Noise Ratio (SNR)…

    Submitted 4 December, 2023; originally announced December 2023.

  50. arXiv:2311.12223  [pdf, other]

    cs.NI cs.AI eess.SP

    Digital Twin-Based User-Centric Edge Continual Learning in Integrated Sensing and Communication

    Authors: Shisheng Hu, Jie Gao, Xinyu Huang, Mushu Li, Kaige Qu, Conghao Zhou, Xuemin Shen

    Abstract: In this paper, we propose a digital twin (DT)-based user-centric approach for processing sensing data in an integrated sensing and communication (ISAC) system with high accuracy and efficient resource utilization. The considered scenario involves an ISAC device with a lightweight deep neural network (DNN) and a mobile edge computing (MEC) server with a large DNN. After collecting sensing data, the…

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: submitted to IEEE ICC 2024