
Showing 1–50 of 65 results for author: Hu, P

Searching in archive eess.
  1. arXiv:2506.02465  [pdf, ps, other]

    eess.SP

    Large Language Models Can Achieve Explainable and Training-Free One-shot HRRP ATR

    Authors: Lingfeng Chen, Panhe Hu, Zhiliang Pan, Qi Liu, Zhen Liu

    Abstract: This letter introduces a pioneering, training-free and explainable framework for High-Resolution Range Profile (HRRP) automatic target recognition (ATR) utilizing large-scale pre-trained Large Language Models (LLMs). Diverging from conventional methods requiring extensive task-specific training or fine-tuning, our approach converts one-dimensional HRRP signals into textual scattering center repres…

    Submitted 3 June, 2025; originally announced June 2025.

    Comments: Submitted to IEEE SPL 2025

  2. arXiv:2505.15380  [pdf, ps, other]

    cs.SD cs.AI eess.AS

    Accelerating Autoregressive Speech Synthesis Inference With Speech Speculative Decoding

    Authors: Zijian Lin, Yang Zhang, Yougen Yuan, Yuming Yan, Jinjiang Liu, Zhiyong Wu, Pengfei Hu, Qun Yu

    Abstract: Modern autoregressive speech synthesis models leveraging language models have demonstrated remarkable performance. However, the sequential nature of next token prediction in these models leads to significant latency, hindering their deployment in scenarios where inference speed is critical. In this work, we propose Speech Speculative Decoding (SSD), a novel framework for autoregressive speech synt…

    Submitted 2 June, 2025; v1 submitted 21 May, 2025; originally announced May 2025.

    Comments: Accepted by INTERSPEECH 2025

  3. arXiv:2505.15058  [pdf, ps, other]

    cs.SD cs.AI cs.CV cs.GR eess.AS

    AsynFusion: Towards Asynchronous Latent Consistency Models for Decoupled Whole-Body Audio-Driven Avatars

    Authors: Tianbao Zhang, Jian Zhao, Yuer Li, Zheng Zhu, Ping Hu, Zhaoxin Fan, Wenjun Wu, Xuelong Li

    Abstract: Whole-body audio-driven avatar pose and expression generation is a critical task for creating lifelike digital humans and enhancing the capabilities of interactive virtual agents, with wide-ranging applications in virtual reality, digital entertainment, and remote communication. Existing approaches often generate audio-driven facial expressions and gestures independently, which introduces a signif…

    Submitted 20 May, 2025; originally announced May 2025.

    Comments: 11 pages, conference

    MSC Class: 68T10

  4. arXiv:2505.13468  [pdf, other]

    cs.CV astro-ph.IM cs.LG eess.SY

    An Edge AI Solution for Space Object Detection

    Authors: Wenxuan Zhang, Peng Hu

    Abstract: Effective Edge AI for space object detection (SOD) tasks that can facilitate real-time collision assessment and avoidance is essential with the increasing space assets in near-Earth orbits. In SOD, low Earth orbit (LEO) satellites must detect other objects with high precision and minimal delay. We explore an Edge AI solution based on deep-learning-based vision sensing for SOD tasks and propose a d…

    Submitted 8 May, 2025; originally announced May 2025.

    Comments: Accepted as poster paper at the 2025 IEEE Canadian Conference on Electrical and Computer Engineering (CCECE)

  5. arXiv:2505.05521  [pdf, other]

    eess.SY

    Model-Based Closed-Loop Control Algorithm for Stochastic Partial Differential Equation Control

    Authors: Peiyan Hu, Haodong Feng, Yue Wang, Zhiming Ma

    Abstract: Neural operators have demonstrated promise in modeling and controlling systems governed by Partial Differential Equations (PDEs). Beyond PDEs, Stochastic Partial Differential Equations (SPDEs) play a critical role in modeling systems influenced by randomness, with applications in finance, physics, and beyond. However, controlling SPDE-governed systems remains a significant challenge. On the one ha…

    Submitted 15 May, 2025; v1 submitted 8 May, 2025; originally announced May 2025.

  6. arXiv:2505.05078  [pdf, ps, other]

    cs.SD eess.AS

    Pairing Real-Time Piano Transcription with Symbol-level Tracking for Precise and Robust Score Following

    Authors: Silvan Peter, Patricia Hu, Gerhard Widmer

    Abstract: Real-time music tracking systems follow a musical performance and at any time report the current position in a corresponding score. Most existing methods approach this problem exclusively in the audio domain, typically using online time warping (OLTW) techniques on incoming audio and an audio representation of the score. Audio OLTW techniques have seen incremental improvements both in features and…

    Submitted 8 May, 2025; originally announced May 2025.

    Comments: 5 pages, 3 tables, 2 pseudocodes, to be published at the Sound and Music Computing Conference 2025

  7. arXiv:2505.05055  [pdf, other]

    cs.SD eess.AS

    How to Infer Repeat Structures in MIDI Performances

    Authors: Silvan Peter, Patricia Hu, Gerhard Widmer

    Abstract: MIDI performances are generally expedient in performance research and music information retrieval, and even more so if they can be connected to a score. This connection is usually established by means of alignment, linking either notes or time points between the score and the performance. The first obstacle when trying to establish such an alignment is that a performance realizes one (out of many)…

    Submitted 8 May, 2025; originally announced May 2025.

    Comments: 3 pages, 1 figure, 1 table, to be published in the Music Encoding Conference 2025

  8. arXiv:2505.01650  [pdf, other]

    cs.CV eess.IV

    Toward Onboard AI-Enabled Solutions to Space Object Detection for Space Sustainability

    Authors: Wenxuan Zhang, Peng Hu

    Abstract: The rapid expansion of advanced low-Earth orbit (LEO) satellites in large constellations is positioning space assets as key to the future, enabling global internet access and relay systems for deep space missions. A solution to the challenge is effective space object detection (SOD) for collision assessment and avoidance. In SOD, an LEO satellite must detect other satellites and objects with high…

    Submitted 2 May, 2025; originally announced May 2025.

    Comments: This paper has been accepted at the 18th International Conference on Space Operations (SpaceOps 2025)

  9. arXiv:2504.16036  [pdf]

    physics.med-ph eess.SP physics.app-ph

    Rotational ultrasound and photoacoustic tomography of the human body

    Authors: Yang Zhang, Shuai Na, Jonathan J. Russin, Karteekeya Sastry, Li Lin, Junfu Zheng, Yilin Luo, Xin Tong, Yujin An, Peng Hu, Konstantin Maslov, Tze-Woei Tan, Charles Y. Liu, Lihong V. Wang

    Abstract: Imaging the human body's morphological and angiographic information is essential for diagnosing, monitoring, and treating medical conditions. Ultrasonography performs the morphological assessment of the soft tissue based on acoustic impedance variations, whereas photoacoustic tomography (PAT) can visualize blood vessels based on intrinsic hemoglobin absorption. Three-dimensional (3D) panoramic ima…

    Submitted 22 April, 2025; originally announced April 2025.

  10. arXiv:2503.14287  [pdf, other]

    eess.SP eess.SY

    Cross-Environment Transfer Learning for Location-Aided Beam Prediction in 5G and Beyond Millimeter-Wave Networks

    Authors: Enrico Tosi, Panwei Hu, Aleksandar Ichkov, Marina Petrova, Ljiljana Simić

    Abstract: Millimeter-wave (mm-wave) communications require beamforming and consequent precise beam alignment between the gNodeB (gNB) and the user equipment (UE) to overcome high propagation losses. This beam alignment needs to be constantly updated for different UE locations based on beam-sweeping radio frequency measurements, leading to significant beam management overhead. One potential solution involves using m…

    Submitted 18 March, 2025; originally announced March 2025.

  11. arXiv:2503.08013  [pdf, other]

    eess.SY

    A Three-Dimensional Pursuit-Evasion Game Based on Fuzzy Actor-Critic Learning Algorithm

    Authors: Penglin Hu

    Abstract: Most of the existing research on pursuit-evasion game (PEG) is conducted in a two-dimensional (2D) environment. In this paper, we investigate the PEG in a 3D space. We extend the Apollonius circle (AC) to the 3D space and introduce its detailed analytical form. To enhance the capture efficiency, we derive the optimal motion space for both the pursuer and the evader. To address the issue arising fr…

    Submitted 10 March, 2025; originally announced March 2025.

  12. arXiv:2503.06741  [pdf, other]

    eess.SY

    A Novel Multi-Objective Reinforcement Learning Algorithm for Pursuit-Evasion Game

    Authors: Penglin Hu, Chunhui Zhao, Quan Pan

    Abstract: In practical application, the pursuit-evasion game (PEG) often involves multiple complex and conflicting objectives. The single-objective reinforcement learning (RL) usually focuses on a single optimization objective, and it is difficult to find the optimal balance among multiple objectives. This paper proposes a three-objective RL algorithm based on fuzzy Q-learning (FQL) to solve the PEG with di…

    Submitted 9 March, 2025; originally announced March 2025.

    Comments: 23 pages, 10 figures, 1 table

  13. arXiv:2501.11028  [pdf, other]

    eess.SP

    Few-shot Human Motion Recognition through Multi-Aspect mmWave FMCW Radar Data

    Authors: Hao Fan, Lingfeng Chen, Chengbai Xu, Jiadong Zhou, Yongpeng Dai, Panhe Hu

    Abstract: Radar human motion recognition methods based on deep learning models has been a heated spot of remote sensing in recent years, yet the existing methods are mostly radial-oriented. In practical application, the test data could be multi-aspect and the sample number of each motion could be very limited, causing model overfitting and reduced recognition accuracy. This paper proposed channel-DN4, a mul…

    Submitted 19 January, 2025; originally announced January 2025.

  14. arXiv:2412.08913  [pdf, other]

    cs.CV eess.IV

    Sensing for Space Safety and Sustainability: A Deep Learning Approach with Vision Transformers

    Authors: Wenxuan Zhang, Peng Hu

    Abstract: The rapid increase of space assets represented by small satellites in low Earth orbit can enable ubiquitous digital services for everyone. However, due to the dynamic space environment, numerous space objects, complex atmospheric conditions, and unexpected events can easily introduce adverse conditions affecting space safety, operations, and sustainability of the outer space environment. This chal…

    Submitted 14 December, 2024; v1 submitted 11 December, 2024; originally announced December 2024.

    Comments: To be published in the 12th Annual IEEE International Conference on Wireless for Space and Extreme Environments (WiSEE 2024)

  15. arXiv:2408.07644  [pdf, other]

    cs.RO cs.LG cs.MA eess.SY

    SigmaRL: A Sample-Efficient and Generalizable Multi-Agent Reinforcement Learning Framework for Motion Planning

    Authors: Jianye Xu, Pan Hu, Bassam Alrifaee

    Abstract: This paper introduces an open-source, decentralized framework named SigmaRL, designed to enhance both sample efficiency and generalization of multi-agent Reinforcement Learning (RL) for motion planning of connected and automated vehicles. Most RL agents exhibit a limited capacity to generalize, often focusing narrowly on specific scenarios, and are usually evaluated in similar or even the same sce…

    Submitted 10 April, 2025; v1 submitted 14 August, 2024; originally announced August 2024.

    Comments: Accepted for presentation at the IEEE International Conference on Intelligent Transportation Systems (ITSC) 2024

  16. arXiv:2408.04737  [pdf, other]

    cs.SD cs.LG eess.AS

    Quantifying the Corpus Bias Problem in Automatic Music Transcription Systems

    Authors: Lukáš Samuel Marták, Patricia Hu, Gerhard Widmer

    Abstract: Automatic Music Transcription (AMT) is the task of recognizing notes in audio recordings of music. The State-of-the-Art (SotA) benchmarks have been dominated by deep learning systems. Due to the scarcity of high quality data, they are usually trained and evaluated exclusively or predominantly on classical piano music. Unfortunately, that hinders our ability to understand how they generalize to oth…

    Submitted 8 August, 2024; originally announced August 2024.

    Comments: 2 pages, 1 figure, presented in the 1st International Workshop on Sound Signal Processing Applications (IWSSPA) 2024

  17. arXiv:2408.03124  [pdf, other]

    eess.SY cs.LG

    CL-DiffPhyCon: Closed-loop Diffusion Control of Complex Physical Systems

    Authors: Long Wei, Haodong Feng, Yuchen Yang, Ruiqi Feng, Peiyan Hu, Xiang Zheng, Tao Zhang, Dixia Fan, Tailin Wu

    Abstract: The control problems of complex physical systems have broad applications in science and engineering. Previous studies have shown that generative control methods based on diffusion models offer significant advantages for solving these problems. However, existing generative control approaches face challenges in both performance and efficiency when extended to the closed-loop setting, which is essent…

    Submitted 22 February, 2025; v1 submitted 31 July, 2024; originally announced August 2024.

    Comments: Published as a conference paper at ICLR 2025

  18. arXiv:2407.11620  [pdf]

    eess.SP

    A Deep Learning-Based Target Radial Length Estimation Method through HRRP Sequence

    Authors: Lingfeng Chen, Panhe Hu, Zhiliang Pan, Xiao Sun, Zehao Wang

    Abstract: This paper introduces an innovative deep learning-based method for end-to-end target radial length estimation from HRRP (High Resolution Range Profile) sequences. Firstly, the HRRP sequences are normalized and transformed into GAF (Gram Angular Field) images to effectively capture and utilize the temporal information. Subsequently, these GAF images serve as the input for a pretrained ResNet-101 mo…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 2 pages, 2 figures. Accepted by APCAP 2024

  19. arXiv:2407.08236  [pdf, other]

    eess.SP

    HRRPGraphNet: Make HRRPs to Be Graphs for Efficient Target Recognition

    Authors: Lingfeng Chen, Xiao Sun, Zhiliang Pan, Zehao Wang, Xiaolong Su, Zhen Liu, Panhe Hu

    Abstract: High Resolution Range Profiles (HRRP) have become a key area of focus in the domain of Radar Automatic Target Recognition (RATR). Despite the success of deep learning based HRRP recognition, these methods needs a large amount of training samples to generate good performance, which could be a severe challenge under non-cooperative circumstances. Currently, deep learning based models treat HRRP as s…

    Submitted 1 November, 2024; v1 submitted 11 July, 2024; originally announced July 2024.

    Comments: 3 pages, 3 figures. Accepted by IET Electronics Letters

  20. arXiv:2406.15160  [pdf, other]

    eess.AS eess.SP

    Exploring Audio-Visual Information Fusion for Sound Event Localization and Detection In Low-Resource Realistic Scenarios

    Authors: Ya Jiang, Qing Wang, Jun Du, Maocheng Hu, Pengfei Hu, Zeyan Liu, Shi Cheng, Zhaoxu Nian, Yuxuan Dong, Mingqi Cai, Xin Fang, Chin-Hui Lee

    Abstract: This study presents an audio-visual information fusion approach to sound event localization and detection (SELD) in low-resource scenarios. We aim at utilizing audio and video modality information through cross-modal learning and multi-modal fusion. First, we propose a cross-modal teacher-student learning (TSL) framework to transfer information from an audio-only teacher model, trained on a rich c…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Accepted by ICME 2024

  21. arXiv:2406.08454  [pdf, other]

    cs.SD eess.AS

    Towards Musically Informed Evaluation of Piano Transcription Models

    Authors: Patricia Hu, Lukáš Samuel Marták, Carlos Cancino-Chacón, Gerhard Widmer

    Abstract: Automatic piano transcription models are typically evaluated using simple frame- or note-wise information retrieval (IR) metrics. Such benchmark metrics do not provide insights into the transcription quality of specific musical aspects such as articulation, dynamics, or rhythmic precision of the output, which are essential in the context of expressive performance analysis. Furthermore, in recent y…

    Submitted 7 October, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted at the 25th International Society for Music Information Retrieval Conference (ISMIR 2024)

  22. arXiv:2405.15438  [pdf, other]

    cs.CV cs.LG eess.IV

    Comparing remote sensing-based forest biomass mapping approaches using new forest inventory plots in contrasting forests in northeastern and southwestern China

    Authors: Wenquan Dong, Edward T. A. Mitchard, Yuwei Chen, Man Chen, Congfeng Cao, Peilun Hu, Cong Xu, Steven Hancock

    Abstract: Large-scale high spatial resolution aboveground biomass (AGB) maps play a crucial role in determining forest carbon stocks and how they are changing, which is instrumental in understanding the global carbon cycle, and implementing policy to mitigate climate change. The advent of the new space-borne LiDAR sensor, NASA's GEDI instrument, provides unparalleled possibilities for the accurate and unbia…

    Submitted 24 May, 2024; originally announced May 2024.

  23. arXiv:2312.04795  [pdf, other]

    eess.SP

    Latency versus Transmission Power Trade-off in Free-Space Optical (FSO) Satellite Networks with Multiple Inter-Continental Connections

    Authors: Jintao Liang, Aizaz Chaudhry, John Chinneck, Halim Yanikomeroglu, Gunes Kurt, Peng Hu, Khaled Ahmed, Stephane Martel

    Abstract: In free-space optical satellite networks (FSOSNs), satellites connected via laser inter-satellite links (LISLs), latency is a critical factor, especially for long-distance inter-continental connections. Since satellites depend on solar panels for power supply, power consumption is also a vital factor. We investigate the minimization of total network latency (i.e., the sum of the network latencies…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: Accepted for publication in IEEE Open Journal of the Communications Society

  24. arXiv:2312.04788  [pdf, other]

    eess.SP

    Free-Space Optical (FSO) Satellite Networks Performance Analysis: Transmission Power, Latency, and Outage Probability

    Authors: Jintao Liang, Aizaz U. Chaudhry, Eylem Erdogan, Halim Yanikomeroglu, Gunes Karabulut Kurt, Peng Hu, Khaled Ahmed, Stephane Martel

    Abstract: In free-space optical satellite networks (FSOSNs), satellites can have different laser inter-satellite link (LISL) ranges for connectivity. Greater LISL ranges can reduce network latency of the path but can also result in an increase in transmission power for satellites on the path. Consequently, this tradeoff between satellite transmission power and network latency should be investigated, and in…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: Accepted for publication in IEEE Open Journal of Vehicular Technology

  25. Decoupling and Interacting Multi-Task Learning Network for Joint Speech and Accent Recognition

    Authors: Qijie Shao, Pengcheng Guo, Jinghao Yan, Pengfei Hu, Lei Xie

    Abstract: Accents, as variations from standard pronunciation, pose significant challenges for speech recognition systems. Although joint automatic speech recognition (ASR) and accent recognition (AR) training has been proven effective in handling multi-accent scenarios, current multi-task ASR-AR approaches overlook the granularity differences between tasks. Fine-grained units capture pronunciation-related a…

    Submitted 17 November, 2023; v1 submitted 12 November, 2023; originally announced November 2023.

    Comments: Accepted by IEEE Transactions on Audio, Speech and Language Processing (TASLP)

  26. arXiv:2309.07925  [pdf, other]

    eess.AS cs.AI cs.MM cs.SD

    Hierarchical Audio-Visual Information Fusion with Multi-label Joint Decoding for MER 2023

    Authors: Haotian Wang, Yuxuan Xi, Hang Chen, Jun Du, Yan Song, Qing Wang, Hengshun Zhou, Chenxi Wang, Jiefeng Ma, Pengfei Hu, Ya Jiang, Shi Cheng, Jie Zhang, Yuzhe Weng

    Abstract: In this paper, we propose a novel framework for recognizing both discrete and dimensional emotions. In our framework, deep features extracted from foundation models are used as robust acoustic and visual representations of raw video. Three different structures based on attention-guided feature gathering (AFG) are designed for deep feature fusion. Then, we introduce a joint decoding structure for e…

    Submitted 10 September, 2023; originally announced September 2023.

    Comments: 5 pages, 4 figures

    Journal ref: The 31st ACM International Conference on Multimedia (MM'23), 2023

  27. arXiv:2309.02399  [pdf, other]

    cs.SD cs.DL eess.AS

    The Batik-plays-Mozart Corpus: Linking Performance to Score to Musicological Annotations

    Authors: Patricia Hu, Gerhard Widmer

    Abstract: We present the Batik-plays-Mozart Corpus, a piano performance dataset combining professional Mozart piano sonata performances with expert-labelled scores at a note-precise level. The performances originate from a recording by Viennese pianist Roland Batik on a computer-monitored Bösendorfer grand piano, and are available both as MIDI files and audio recordings. They have been precisely aligned, no…

    Submitted 6 September, 2023; v1 submitted 5 September, 2023; originally announced September 2023.

    Comments: To be published in the Proceedings of the 24th International Society for Music Information Retrieval Conference (ISMIR 2023), Milan, Italy

  28. arXiv:2306.14471  [pdf]

    physics.med-ph eess.IV physics.ins-det physics.optics

    Single-shot 3D photoacoustic computed tomography with a densely packed array for transcranial functional imaging

    Authors: Rui Cao, Yilin Luo, Jinhua Xu, Xiaofei Luo, Ku Geng, Yousuf Aborahama, Manxiu Cui, Samuel Davis, Shuai Na, Xin Tong, Cindy Liu, Karteek Sastry, Konstantin Maslov, Peng Hu, Yide Zhang, Li Lin, Yang Zhang, Lihong V. Wang

    Abstract: Photoacoustic computed tomography (PACT) is emerging as a new technique for functional brain imaging, primarily due to its capabilities in label-free hemodynamic imaging. Despite its potential, the transcranial application of PACT has encountered hurdles, such as acoustic attenuations and distortions by the skull and limited light penetration through the skull. To overcome these challenges, we hav…

    Submitted 26 June, 2023; originally announced June 2023.

  29. arXiv:2306.09397  [pdf, other]

    cs.LG cs.MA eess.SP

    Non-Asymptotic Performance of Social Machine Learning Under Limited Data

    Authors: Ping Hu, Virginia Bordignon, Mert Kayaalp, Ali H. Sayed

    Abstract: This paper studies the probability of error associated with the social machine learning framework, which involves an independent training phase followed by a cooperative decision-making phase over a graph. This framework addresses the problem of classifying a stream of unlabeled data in a distributed manner. In this work, we examine the classification task with limited observations during the deci…

    Submitted 9 July, 2024; v1 submitted 15 June, 2023; originally announced June 2023.

  30. arXiv:2304.12939  [pdf, other]

    cs.SD cs.HC eess.AS

    The ACCompanion: Combining Reactivity, Robustness, and Musical Expressivity in an Automatic Piano Accompanist

    Authors: Carlos Cancino-Chacón, Silvan Peter, Patricia Hu, Emmanouil Karystinaios, Florian Henkel, Francesco Foscarin, Nimrod Varga, Gerhard Widmer

    Abstract: This paper introduces the ACCompanion, an expressive accompaniment system. Similarly to a musician who accompanies a soloist playing a given musical piece, our system can produce a human-like rendition of the accompaniment part that follows the soloist's choices in terms of tempo, dynamics, and articulation. The ACCompanion works in the symbolic domain, i.e., it needs a musical instrument capable…

    Submitted 30 May, 2023; v1 submitted 24 April, 2023; originally announced April 2023.

    Comments: In Proceedings of the 32nd International Joint Conference on Artificial Intelligence (IJCAI-23), Macao, China. The differences/extensions with the previous version include a technical appendix, added missing links, and minor text updates. 10 pages, 4 figures

  31. arXiv:2303.12883  [pdf, other]

    eess.SY

    HAPS-UAV-Enabled Heterogeneous Networks: A Deep Reinforcement Learning Approach

    Authors: Atefeh H. Arani, Peng Hu, Yeying Zhu

    Abstract: The integrated use of non-terrestrial network (NTN) entities such as the high-altitude platform station (HAPS) and low-altitude platform station (LAPS) has become essential elements in the space-air-ground integrated networks (SAGINs). However, the complexity, mobility, and heterogeneity of NTN entities and resources present various challenges from system design to deployment. This paper proposes…

    Submitted 22 March, 2023; originally announced March 2023.

  32. arXiv:2303.05697  [pdf]

    physics.med-ph eess.IV eess.SP

    Quantification of cervical elasticity during pregnancy based on transvaginal ultrasound imaging and stress measurement

    Authors: Peng Hu, Peinan Zhao, Yuan Qu, Konstantin Maslov, Jessica Chubiz, Methodius G. Tuuli, Molly J. Stout, Lihong V. Wang

    Abstract: Objective: Strain elastography and shear wave elastography are two commonly used methods to quantify cervical elasticity; however, they have limitations. Strain elastography is effective in showing tissue elasticity distribution in a single image, but the absence of stress information causes difficulty in comparing the results acquired from different imaging sessions. Shear wave elastography is ef…

    Submitted 9 March, 2023; originally announced March 2023.

    Comments: 26 pages, 8 figures, 1 table

  33. arXiv:2302.13130  [pdf, other]

    cs.CV eess.SP

    Point Cloud Forecasting as a Proxy for 4D Occupancy Forecasting

    Authors: Tarasha Khurana, Peiyun Hu, David Held, Deva Ramanan

    Abstract: Predicting how the world can evolve in the future is crucial for motion planning in autonomous systems. Classical methods are limited because they rely on costly human annotations in the form of semantic class labels, bounding boxes, and tracks or HD maps of cities to plan their motion and thus are difficult to scale to large unlabeled datasets. One promising self-supervised task is 3D point cloud…

    Submitted 30 April, 2023; v1 submitted 25 February, 2023; originally announced February 2023.

    Comments: CVPR 2023. Project page: https://www.cs.cmu.edu/~tkhurana/ff4d/index.html Code: https://github.com/tarashakhurana/4d-occ-forecasting

  34. arXiv:2302.05525  [pdf, other]

    cs.LG cs.NE eess.SY

    Satellite Anomaly Detection Using Variance Based Genetic Ensemble of Neural Networks

    Authors: Mohammad Amin Maleki Sadr, Yeying Zhu, Peng Hu

    Abstract: In this paper, we use a variance-based genetic ensemble (VGE) of Neural Networks (NNs) to detect anomalies in the satellite's historical data. We use an efficient ensemble of the predictions from multiple Recurrent Neural Networks (RNNs) by leveraging each model's uncertainty level (variance). For prediction, each RNN is guided by a Genetic Algorithm (GA) which constructs the optimal structure for…

    Submitted 10 February, 2023; originally announced February 2023.

  35. arXiv:2301.03641  [pdf, other]

    cs.NI eess.SY

    Toward Multi-Layer Networking for Satellite Network Operations

    Authors: Peng Hu

    Abstract: Recent advancements in low-Earth-orbit (LEO) satellites aim to bring resilience, ubiquitous, and high-quality service to future Internet infrastructure. However, the soaring number of space assets, increasing dynamics of LEO satellites and expanding dimensions of network threats call for an enhanced approach to efficient satellite operations. To address these pressing challenges, we propose an app…

    Submitted 19 November, 2024; v1 submitted 9 January, 2023; originally announced January 2023.

    Comments: To be published in the Proceedings of 12th Annual IEEE International Conference on Wireless for Space and Extreme Environments (WISEE 2024), Dec. 16 - 18, 2024, Daytona Beach, FL, USA

  36. arXiv:2212.05986  [pdf, other]

    cs.NI eess.SY

    A Cross-Layer Descent Approach for Resilient Network Operations of Proliferated LEO Satellites

    Authors: Peng Hu

    Abstract: With the proliferated low-Earth-orbit (LEO) satellites in mega-constellations, the future Internet will be able to reach any place on Earth, providing high-quality services to everyone. However, high-quality operations in terms of timeliness and resilience are lacking in the current solutions. This paper proposes a multi-layer networking approach called "Cross-Layer Descent (CLD)". Based on the pr…

    Submitted 12 December, 2022; originally announced December 2022.

    Comments: 2023 IEEE Wireless Communications and Networking Conference (WCNC), 26--29 March 2023, Glasgow, UK

  37. arXiv:2212.04148  [pdf, other]

    cs.CV eess.IV

    Relationship Quantification of Image Degradations

    Authors: Wenxin Wang, Boyun Li, Yuanbiao Gou, Peng Hu, Wangmeng Zuo, Xi Peng

    Abstract: In this paper, we study two challenging but less-touched problems in image restoration, namely, i) how to quantify the relationship between image degradations and ii) how to improve the performance of a specific restoration task using the quantified relationship. To tackle the first challenge, we proposed a Degradation Relationship Index (DRI) which is defined as the mean drop rate difference in t…

    Submitted 5 August, 2023; v1 submitted 8 December, 2022; originally announced December 2022.

  38. arXiv:2212.03729  [pdf, other]

    eess.SY cs.NI

    Enabling Resilient and Real-Time Network Operations in Space: A Novel Multi-Layer Satellite Networking Scheme

    Authors: Peng Hu

    Abstract: Recently advanced low-Earth-orbit (LEO) satellite networks represented by large constellations and advanced payloads provide great promises for enabling high-quality Internet connectivity to any place on Earth. However, the traditional access-based approach to satellite operations cannot meet the pressing requirements of real-time, reliable, and resilient operations for LEO satellites. A new schem…

    Submitted 7 December, 2022; originally announced December 2022.

    Comments: Published in the Proceedings of the 2022 IEEE Latin-American Conference on Communications (LATINCOM), 30 November - 2 December 2022, Rio de Janeiro, Brazil

  39. AccEar: Accelerometer Acoustic Eavesdropping with Unconstrained Vocabulary

    Authors: Pengfei Hu, Hui Zhuang, Panneer Selvam Santhalingamy, Riccardo Spolaor, Parth Pathaky, Guoming Zhang, Xiuzhen Cheng

    Abstract: With the increasing popularity of voice-based applications, acoustic eavesdropping has become a serious threat to users' privacy. While on smartphones the access to microphones needs an explicit user permission, acoustic eavesdropping attacks can rely on motion sensors (such as accelerometer and gyroscope), which access is unrestricted. However, previous instances of such attacks can only recogniz…

    Submitted 2 December, 2022; originally announced December 2022.

    Comments: 2022 IEEE Symposium on Security and Privacy (SP)

    Journal ref: 2022 IEEE Symposium on Security and Privacy (SP)

  40. arXiv:2211.14938  [pdf, other]

    cs.LG cs.AI eess.SP

    An Anomaly Detection Method for Satellites Using Monte Carlo Dropout

    Authors: Mohammad Amin Maleki Sadr, Yeying Zhu, Peng Hu

    Abstract: Recently, there has been a significant amount of interest in satellite telemetry anomaly detection (AD) using neural networks (NN). For AD purposes, the current approaches focus on either forecasting or reconstruction of the time series, and they cannot measure the level of reliability or the probability of correct detection. Although the Bayesian neural network (BNN)-based approaches are well kno…

    Submitted 27 November, 2022; originally announced November 2022.

    Journal ref: IEEE Transactions on Aerospace and Electronic Systems, 2022

  41. arXiv:2211.14931  [pdf, other]

    eess.SY cs.LG cs.NI

    UAV-Assisted Space-Air-Ground Integrated Networks: A Technical Review of Recent Learning Algorithms

    Authors: Atefeh H. Arani, Peng Hu, Yeying Zhu

    Abstract: Recent technological advancements in space, air, and ground components have made possible a new network paradigm called space-air-ground integrated network (SAGIN). Unmanned aerial vehicles (UAVs) play a key role in SAGINs. However, due to UAVs' high dynamics and complexity, real-world deployment of a SAGIN becomes a significant barrier to realizing such SAGINs. UAVs are expected to meet key perfo…

    Submitted 16 July, 2024; v1 submitted 27 November, 2022; originally announced November 2022.

    Comments: Accepted by the IEEE Open Journal of Vehicular Technology in July 2024

  42. arXiv:2209.04093  [pdf, other]

    cs.CV cs.MM cs.SD eess.AS

    Learning Audio-Visual embedding for Person Verification in the Wild

    Authors: Peiwen Sun, Shanshan Zhang, Zishan Liu, Yougen Yuan, Taotao Zhang, Honggang Zhang, Pengfei Hu

    Abstract: It has already been observed that audio-visual embedding is more robust than uni-modality embedding for person verification. Here, we proposed a novel audio-visual strategy that considers aggregators from a fusion perspective. First, we introduced weight-enhanced attentive statistics pooling for the first time in face verification. We find that a strong correlation exists between modalities during…

    Submitted 26 October, 2022; v1 submitted 8 September, 2022; originally announced September 2022.

  43. arXiv:2209.02205  [pdf, other]

    cs.CV eess.SY

    High Speed Rotation Estimation with Dynamic Vision Sensors

    Authors: Guangrong Zhao, Yiran Shen, Ning Chen, Pengfei Hu, Lei Liu, Hongkai Wen

    Abstract: Rotational speed is one of the important metrics to be measured for calibrating the electric motors in manufacturing, monitoring engine during car repairing, faults detection on electrical appliance and etc. However, existing measurement techniques either require prohibitive hardware (e.g., high-speed camera) or are inconvenient to use in real-world application scenarios. In this paper, we propose…

    Submitted 6 September, 2022; originally announced September 2022.

    Comments: 10 pages,13 figures

  44. arXiv:2206.00248  [pdf]

    physics.med-ph eess.SP

    Transcranial photoacoustic computed tomography of human brain function

    Authors: Yang Zhang, Shuai Na, Karteekeya Sastry, Jonathan J. Russin, Peng Hu, Li Lin, Xin Tong, Kay B. Jann, Danny J. Wang, Charles Y. Liu, Lihong V. Wang

    Abstract: Herein we report the first in-human transcranial imaging of brain function using photoacoustic computed tomography. Functional responses to benchmark motor tasks were imaged on both the skull-less and the skull-intact hemispheres of a hemicraniectomy patient. The observed brain responses in these preliminary results demonstrate the potential of photoacoustic computed tomography for achieving trans…

    Submitted 1 June, 2022; originally announced June 2022.

  45. arXiv:2205.12459  [pdf, other]

    cs.CV eess.IV

    A CNN with Noise Inclined Module and Denoise Framework for Hyperspectral Image Classification

    Authors: Zhiqiang Gong, Ping Zhong, Jiahao Qi, Panhe Hu

    Abstract: Deep Neural Networks have been successfully applied in hyperspectral image classification. However, most of prior works adopt general deep architectures while ignore the intrinsic structure of the hyperspectral image, such as the physical noise generation. This would make these deep models unable to generate discriminative features and provide impressive classification performance. To leverage suc…

    Submitted 24 May, 2022; originally announced May 2022.

    Journal ref: IET Image Processing, 2022

  46. arXiv:2204.04956  [pdf, other]

    eess.IV cs.CV

    Segmentation Network with Compound Loss Function for Hydatidiform Mole Hydrops Lesion Recognition

    Authors: Chengze Zhu, Pingge Hu, Xianxu Zeng, Xingtong Wang, Zehua Ji, Li Shi

    Abstract: Pathological morphology diagnosis is the standard diagnosis method of hydatidiform mole. As a disease with malignant potential, the hydatidiform mole section of hydrops lesions is an important basis for diagnosis. Due to incomplete lesion development, early hydatidiform mole is difficult to distinguish, resulting in a low accuracy of clinical diagnosis. As a remarkable machine learning technology,…

    Submitted 11 April, 2022; originally announced April 2022.

  47. arXiv:2204.04949  [pdf]

    eess.IV cs.CV

    A Semantic Segmentation Network Based Real-Time Computer-Aided Diagnosis System for Hydatidiform Mole Hydrops Lesion Recognition in Microscopic View

    Authors: Chengze Zhu, Pingge Hu, Xianxu Zeng, Xingtong Wang, Zehua Ji, Li Shi

    Abstract: As a disease with malignant potential, hydatidiform mole (HM) is one of the most common gestational trophoblastic diseases. For pathologists, the HM section of hydrops lesions is an important basis for diagnosis. In pathology departments, the diverse microscopic manifestations of HM lesions and the limited view under the microscope mean that physicians with extensive diagnostic experience are requ…

    Submitted 11 April, 2022; originally announced April 2022.

  48. arXiv:2204.03398  [pdf, other]

    cs.SD eess.AS

    Linguistic-Acoustic Similarity Based Accent Shift for Accent Recognition

    Authors: Qijie Shao, Jinghao Yan, Jian Kang, Pengcheng Guo, Xian Shi, Pengfei Hu, Lei Xie

    Abstract: General accent recognition (AR) models tend to directly extract low-level information from spectrums, which always significantly overfit on speakers or channels. Considering accent can be regarded as a series of shifts relative to native pronunciation, distinguishing accents will be an easier task with accent shift as input. But due to the lack of native utterance as an anchor, estimating the acce…

    Submitted 1 July, 2022; v1 submitted 7 April, 2022; originally announced April 2022.

    Comments: Accepted by Interspeech 2022

  49. Leveraging Phone Mask Training for Phonetic-Reduction-Robust E2E Uyghur Speech Recognition

    Authors: Guodong Ma, Pengfei Hu, Jian Kang, Shen Huang, Hao Huang

    Abstract: In Uyghur speech, consonant and vowel reduction are often encountered, especially in spontaneous speech with high speech rate, which will cause a degradation of speech recognition performance. To solve this problem, we propose an effective phone mask training method for Conformer-based Uyghur end-to-end (E2E) speech recognition. The idea is to randomly mask off a certain percentage features of pho…

    Submitted 2 April, 2022; originally announced April 2022.

    Comments: Accepted by INTERSPEECH 2021

    Journal ref: INTERSPEECH 2021

  50. arXiv:2203.15249  [pdf, other]

    cs.SD eess.AS

    MFA-Conformer: Multi-scale Feature Aggregation Conformer for Automatic Speaker Verification

    Authors: Yang Zhang, Zhiqiang Lv, Haibin Wu, Shanshan Zhang, Pengfei Hu, Zhiyong Wu, Hung-yi Lee, Helen Meng

    Abstract: In this paper, we present Multi-scale Feature Aggregation Conformer (MFA-Conformer), an easy-to-implement, simple but effective backbone for automatic speaker verification based on the Convolution-augmented Transformer (Conformer). The architecture of the MFA-Conformer is inspired by recent state-of-the-art models in speech recognition and speaker verification. Firstly, we introduce a convolution s…

    Submitted 10 November, 2022; v1 submitted 29 March, 2022; originally announced March 2022.

    Comments: Accepted by INTERSPEECH 2022