
Showing 1–50 of 80 results for author: Mao, D

Searching in archive eess.
  1. arXiv:2506.09377  [pdf, ps, other]

    eess.IV

    An Interpretable Two-Stage Feature Decomposition Method for Deep Learning-based SAR ATR

    Authors: Chenwei Wang, Renjie Xu, Congwen Wu, Cunyi Yin, Ziyun Liao, Deqing Mao, Sitong Zhang, Hong Yan

    Abstract: Synthetic aperture radar automatic target recognition (SAR ATR) has seen significant performance improvements with deep learning. However, the black-box nature of deep SAR ATR introduces low confidence and high risks in decision-critical SAR applications, hindering practical deployment. To address this issue, deep SAR ATR should provide an interpretable reasoning basis $r_b$ and logic $λ_w$, formi…

    Submitted 11 June, 2025; originally announced June 2025.

  2. arXiv:2506.01737  [pdf, ps, other]

    cs.NE eess.SP

    The Promise of Spiking Neural Networks for Ubiquitous Computing: A Survey and New Perspectives

    Authors: Hemanth Sabbella, Archit Mukherjee, Thivya Kandappu, Sounak Dey, Arpan Pal, Archan Misra, Dong Ma

    Abstract: Spiking neural networks (SNNs) have emerged as a class of bio-inspired networks that leverage sparse, event-driven signaling to achieve low-power computation while inherently modeling temporal dynamics. Such characteristics align closely with the demands of ubiquitous computing systems, which often operate on resource-constrained devices while continuously monitoring and processing time-series se…

    Submitted 2 June, 2025; originally announced June 2025.

    Comments: 50 pages

    ACM Class: I.2

  3. arXiv:2505.06296  [pdf, other]

    eess.SP

    Q-Heart: ECG Question Answering via Knowledge-Informed Multimodal LLMs

    Authors: Hung Manh Pham, Jialu Tang, Aaqib Saeed, Dong Ma

    Abstract: Electrocardiography (ECG) offers critical cardiovascular insights, such as identifying arrhythmias and myocardial ischemia, but enabling automated systems to answer complex clinical questions directly from ECG signals (ECG-QA) remains a significant challenge. Current approaches often lack robust multimodal reasoning capabilities or rely on generic architectures ill-suited for the nuances of physio…

    Submitted 7 May, 2025; originally announced May 2025.

  4. arXiv:2503.12864  [pdf, other]

    eess.SY

    Robust Co-Optimization of Distribution Network Hardening and Mobile Resource Scheduling with Decision-Dependent Uncertainty

    Authors: Donglai Ma, Xiaoyu Cao, Bo Zeng, Chen Chen, Qiaozhu Zhai, Qing-Shan Jia, Xiaohong Guan

    Abstract: This paper studies the robust co-planning of proactive network hardening and mobile hydrogen energy resources (MHERs) scheduling, which is to enhance the resilience of power distribution network (PDN) against the disastrous events. A decision-dependent robust optimization model is formulated with min-max resilience constraint and discrete recourse structure, which helps achieve the load survivabil…

    Submitted 17 March, 2025; originally announced March 2025.

    Comments: 15 pages, 3 figures

  5. arXiv:2503.04375  [pdf, other]

    eess.SY

    Proactive Robust Hardening of Resilient Power Distribution Network: Decision-Dependent Uncertainty Modeling and Fast Solution Strategy

    Authors: Donglai Ma, Xiaoyu Cao, Bo Zeng, Qing-Shan Jia, Chen Chen, Qiaozhu Zhai, Xiaohong Guan

    Abstract: To address the power system hardening problem, traditional approaches often adopt robust optimization (RO) that considers a fixed set of concerned contingencies, regardless of the fact that hardening some components actually renders relevant contingencies impractical. In this paper, we directly adopt a dynamic uncertainty set that explicitly incorporates the impact of hardening decisions on the wo…

    Submitted 6 March, 2025; originally announced March 2025.

  6. arXiv:2503.01248  [pdf, other]

    eess.IV cs.CV cs.LG q-bio.TO

    Comprehensive Evaluation of OCT-based Automated Segmentation of Retinal Layer, Fluid and Hyper-Reflective Foci: Impact on Diabetic Retinopathy Severity Assessment

    Authors: S. Chen, D. Ma, M. Raviselvan, S. Sundaramoorthy, K. Popuri, M. J. Ju, M. V. Sarunic, D. Ratra, M. F. Beg

    Abstract: Diabetic retinopathy (DR) is a leading cause of vision loss, requiring early and accurate assessment to prevent irreversible damage. Spectral Domain Optical Coherence Tomography (SD-OCT) enables high-resolution retinal imaging, but automated segmentation performance varies, especially in cases with complex fluid and hyperreflective foci (HRF) patterns. This study proposes an active-learning-based…

    Submitted 10 April, 2025; v1 submitted 3 March, 2025; originally announced March 2025.

    Comments: 20 pages, 11 figures

  7. arXiv:2501.02000  [pdf, other]

    eess.IV cs.AI cs.CV

    Multi-Center Study on Deep Learning-Assisted Detection and Classification of Fetal Central Nervous System Anomalies Using Ultrasound Imaging

    Authors: Yang Qi, Jiaxin Cai, Jing Lu, Runqing Xiong, Rongshang Chen, Liping Zheng, Duo Ma

    Abstract: Prenatal ultrasound evaluates fetal growth and detects congenital abnormalities during pregnancy, but the examination of ultrasound images by radiologists requires expertise and sophisticated equipment, which would otherwise fail to improve the rate of identifying specific types of fetal central nervous system (CNS) abnormalities and result in unnecessary patient examinations. We construct a deep…

    Submitted 1 January, 2025; originally announced January 2025.

  8. arXiv:2412.03959  [pdf, other]

    cs.RO eess.SY

    Is FISHER All You Need in The Multi-AUV Underwater Target Tracking Task?

    Authors: Jingzehua Xu, Guanwen Xie, Ziqi Zhang, Xiangwang Hou, Dongfang Ma, Shuai Zhang, Yong Ren, Dusit Niyato

    Abstract: It is significant to employ multiple autonomous underwater vehicles (AUVs) to execute the underwater target tracking task collaboratively. However, it's pretty challenging to meet various prerequisites utilizing traditional control methods. Therefore, we propose an effective two-stage learning from demonstrations training framework, FISHER, to highlight the adaptability of reinforcement learning (…

    Submitted 5 December, 2024; originally announced December 2024.

    Journal ref: IEEE Transactions on Mobile Computing 2025

  9. arXiv:2409.19585  [pdf, other]

    cs.SD cs.CL eess.AS

    Two-stage Framework for Robust Speech Emotion Recognition Using Target Speaker Extraction in Human Speech Noise Conditions

    Authors: Jinyi Mi, Xiaohan Shi, Ding Ma, Jiajun He, Takuya Fujimura, Tomoki Toda

    Abstract: Developing a robust speech emotion recognition (SER) system in noisy conditions faces challenges posed by different noise properties. Most previous studies have not considered the impact of human speech noise, thus limiting the application scope of SER. In this paper, we propose a novel two-stage framework for the problem by cascading target speaker extraction (TSE) method and SER. We first train…

    Submitted 17 December, 2024; v1 submitted 29 September, 2024; originally announced September 2024.

    Comments: This is the preprint version of the paper accepted at APSIPA ASC 2024

  10. arXiv:2409.11796  [pdf, other]

    eess.SY

    Communication, Sensing and Control integrated Closed-loop System: Modeling, Control Design and Resource Allocation

    Authors: Zeyang Meng, Dingyou Ma, Zhiqing Wei, Ying Zhou, Zhiyong Feng

    Abstract: The wireless communication technologies have fundamentally revolutionized industrial operations. The operation of the automated equipment is conducted in a closed-loop manner, where the status of devices is collected and sent to the control center through the uplink channel, and the control center sends the calculated control commands back to the devices via downlink communication. However, existi…

    Submitted 18 September, 2024; originally announced September 2024.

    Comments: 12 pages, 6 figures

    MSC Class: 60G99; 93D05 ACM Class: H.1.1; I.6.4

  11. arXiv:2408.16415  [pdf, other]

    eess.SP cs.ET

    UAV's Rotor Micro-Doppler Feature Extraction Using Integrated Sensing and Communication Signal: Algorithm Design and Testbed Evaluation

    Authors: Jiachen Wei, Dingyou Ma, Feiyang He, Qixun Zhang, Zhiyong Feng, Zhengfeng Liu, Taohong Liang

    Abstract: With the rapid application of unmanned aerial vehicles (UAVs) in urban areas, the identification and tracking of hovering UAVs have become critical challenges, significantly impacting the safety of aircraft take-off and landing operations. As a promising technology for 6G mobile systems, integrated sensing and communication (ISAC) can be used to detect high-mobility UAVs with a low deployment cost…

    Submitted 29 August, 2024; originally announced August 2024.

  12. arXiv:2407.15903  [pdf, other]

    eess.IV

    Semantics Guided Disentangled GAN for Chest X-ray Image Rib Segmentation

    Authors: Lili Huang, Dexin Ma, Xiaowei Zhao, Chenglong Li, Haifeng Zhao, Jin Tang, Chuanfu Li

    Abstract: The label annotations for chest X-ray image rib segmentation are time consuming and laborious, and the labeling quality heavily relies on medical knowledge of annotators. To reduce the dependency on annotated data, existing works often utilize generative adversarial network (GAN) to generate training data. However, GAN-based methods overlook the nuanced information specific to individual organs, w…

    Submitted 22 July, 2024; originally announced July 2024.

  13. arXiv:2407.06901  [pdf, other]

    cs.HC cs.SD eess.AS

    RespEar: Earable-Based Robust Respiratory Rate Monitoring

    Authors: Yang Liu, Kayla-Jade Butkow, Jake Stuchbury-Wass, Adam Pullin, Dong Ma, Cecilia Mascolo

    Abstract: Respiratory rate (RR) monitoring is integral to understanding physical and mental health and tracking fitness. Existing studies have demonstrated the feasibility of RR monitoring under specific user conditions (e.g., while remaining still, or while breathing heavily). Yet, performing accurate, continuous and non-obtrusive RR monitoring across diverse daily routines and activities remains challengi…

    Submitted 9 July, 2024; originally announced July 2024.

  14. arXiv:2407.05391  [pdf, other]

    eess.SP

    Interference Management in MIMO-ISAC Systems: A Transceiver Design Approach

    Authors: Yangyang Niu, Zhiqing Wei, Dingyou Ma, Xiaoyu Yang, Huici Wu, Zhiyong Feng, Jianhua Yuan

    Abstract: The integrated sensing and communication (ISAC) system under multi-input multi-output (MIMO) architecture achieves dual functionalities of sensing and communication on the same platform by utilizing spatial gain, which provides a feasible paradigm facing spectrum congestion. However, the dual functionalities of sensing and communication operating simultaneously in the same platform bring severe in…

    Submitted 7 July, 2024; originally announced July 2024.

  15. Hybrid Beamforming Design for Near-Field ISAC with Modular XL-MIMO

    Authors: Chunwei Meng, Dingyou Ma, Zhaolin Wang, Yuanwei Liu, Zhiqing Wei, Zhiyong Feng

    Abstract: A novel modular extremely large-scale multiple-input-multiple-output (XL-MIMO) integrated sensing and communication (ISAC) framework is proposed in this paper. We consider a downlink ISAC scenario and exploit the modular array architecture to enhance the communication spectral efficiency and sensing resolution while reducing the channel modeling complexity by employing the hybrid spherical and pla…

    Submitted 20 February, 2025; v1 submitted 18 June, 2024; originally announced June 2024.

  16. arXiv:2405.19338  [pdf, other]

    eess.SP cs.AI cs.CV

    Accurate Patient Alignment without Unnecessary Imaging Dose via Synthesizing Patient-specific 3D CT Images from 2D kV Images

    Authors: Yuzhen Ding, Jason M. Holmes, Hongying Feng, Baoxin Li, Lisa A. McGee, Jean-Claude M. Rwigema, Sujay A. Vora, Daniel J. Ma, Robert L. Foote, Samir H. Patel, Wei Liu

    Abstract: In radiotherapy, 2D orthogonally projected kV images are used for patient alignment when 3D on-board imaging (OBI) is unavailable. However, tumor visibility is constrained by the projection of the patient's anatomy onto a 2D plane, potentially leading to substantial setup errors. In treatment rooms with 3D-OBI such as cone beam CT (CBCT), the field of view (FOV) of CBCT is limited with unnecessarily high imag…

    Submitted 1 April, 2024; originally announced May 2024.

    Comments: 17 pages, 8 figures and tables

    Journal ref: Communications Medicine 4, Article number: 241 (2024)

  17. Multi-Objective Optimization-based Transmit Beamforming for Multi-Target and Multi-User MIMO-ISAC Systems

    Authors: Chunwei Meng, Zhiqing Wei, Dingyou Ma, Wanli Ni, Liyan Su, Zhiyong Feng

    Abstract: Integrated sensing and communication (ISAC) is an enabling technology for the sixth-generation mobile communications, which equips the wireless communication networks with sensing capabilities. In this paper, we investigate transmit beamforming design for multiple-input and multiple-output (MIMO)-ISAC systems in scenarios with multiple radar targets and communication users. A general form of multi…

    Submitted 14 May, 2024; originally announced May 2024.

  18. Cramer-Rao Bounds for Near-Field Sensing: A Generic Modular Architecture

    Authors: Chunwei Meng, Dingyou Ma, Xu Chen, Zhiyong Feng, Yuanwei Liu

    Abstract: A generic modular array architecture is proposed, featuring uniform/non-uniform subarray layouts that allow for flexible deployment. The bistatic near-field sensing system is considered, where the target is located in the near-field of the whole modular array and the far-field of each subarray. Then, the closed-form expressions of Cramer-Rao bounds (CRBs) for range and angle estimations are deriv…

    Submitted 11 April, 2024; originally announced April 2024.

  19. arXiv:2402.15725  [pdf, other]

    eess.AS

    Text-guided HuBERT: Self-Supervised Speech Pre-training via Generative Adversarial Networks

    Authors: Duo Ma, Xianghu Yue, Junyi Ao, Xiaoxue Gao, Haizhou Li

    Abstract: Human language can be expressed in either written or spoken form, i.e. text or speech. Humans can acquire knowledge from text to improve speaking and listening. However, the quest for speech pre-trained models to leverage unpaired text has just started. In this paper, we investigate a new way to pre-train such a joint speech-text model to learn enhanced speech representations and benefit various s…

    Submitted 3 August, 2024; v1 submitted 24 February, 2024; originally announced February 2024.

    Comments: 5 pages, 1 figure, 5 tables, accepted by IEEE Signal Processing Letters (SPL)

  20. arXiv:2311.10416  [pdf, ps, other]

    eess.SP

    Meta-DSP: A Meta-Learning Approach for Data-Driven Nonlinear Compensation in High-Speed Optical Fiber Systems

    Authors: Xinyu Xiao, Zhennan Zhou, Bin Dong, Dingjiong Ma, Li Zhou, Jie Sun

    Abstract: Nonlinear effects in high-speed optical fiber systems fundamentally limit channel capacity. While traditional Digital Backward Propagation (DBP) with adaptive filters addresses these effects, its computational complexity remains impractical. Data-driven solutions like Filtered DBP (FDBP) reduce complexity but critically lack inherent generalization: Their nonlinear compensation capability cannot b…

    Submitted 10 June, 2025; v1 submitted 17 November, 2023; originally announced November 2023.

  21. arXiv:2309.09627  [pdf, other]

    cs.SD eess.AS

    Electrolaryngeal Speech Intelligibility Enhancement Through Robust Linguistic Encoders

    Authors: Lester Phillip Violeta, Wen-Chin Huang, Ding Ma, Ryuichi Yamamoto, Kazuhiro Kobayashi, Tomoki Toda

    Abstract: We propose a novel framework for electrolaryngeal speech intelligibility enhancement through the use of robust linguistic encoders. Pretraining and fine-tuning approaches have proven to work well in this task, but in most cases, various mismatches, such as the speech type mismatch (electrolaryngeal vs. typical) or a speaker mismatch between the datasets used in each stage, can deteriorate the conv…

    Submitted 20 January, 2024; v1 submitted 18 September, 2023; originally announced September 2023.

    Comments: Accepted to ICASSP 2024. Demo page: lesterphillip.github.io/icassp2024_el_sie

  22. arXiv:2308.08313  [pdf, other]

    eess.IV cs.CV

    ECPC-IDS: A benchmark endometrial cancer PET/CT image dataset for evaluation of semantic segmentation and detection of hypermetabolic regions

    Authors: Dechao Tang, Tianming Du, Deguo Ma, Zhiyu Ma, Hongzan Sun, Marcin Grzegorzek, Huiyan Jiang, Chen Li

    Abstract: Endometrial cancer is one of the most common tumors in the female reproductive system and is the third most common gynecological malignancy that causes death after ovarian and cervical cancer. Early diagnosis can significantly improve the 5-year survival rate of patients. With the development of artificial intelligence, computer-assisted diagnosis plays an increasingly important role in improving…

    Submitted 11 October, 2023; v1 submitted 16 August, 2023; originally announced August 2023.

    Comments: 14 pages, 6 figures

  23. arXiv:2308.08172  [pdf, other]

    eess.IV cs.CV cs.LG

    AATCT-IDS: A Benchmark Abdominal Adipose Tissue CT Image Dataset for Image Denoising, Semantic Segmentation, and Radiomics Evaluation

    Authors: Zhiyu Ma, Chen Li, Tianming Du, Le Zhang, Dechao Tang, Deguo Ma, Shanchuan Huang, Yan Liu, Yihao Sun, Zhihao Chen, Jin Yuan, Qianqing Nie, Marcin Grzegorzek, Hongzan Sun

    Abstract: Methods: In this study, a benchmark Abdominal Adipose Tissue CT Image Dataset (AATCT-IDS) containing 300 subjects is prepared and published. AATCT-IDS publishes 13,732 raw CT slices, and the researchers individually annotate the subcutaneous and visceral adipose tissue regions of 3,213 of those slices that have the same slice distance to validate denoising methods, train semantic segmentati…

    Submitted 16 August, 2023; originally announced August 2023.

    Comments: 17 pages, 7 figures

  24. SAR Target Image Generation Method Using Azimuth-Controllable Generative Adversarial Network

    Authors: Chenwei Wang, Jifang Pei, Xiaoyu Liu, Yulin Huang, Deqing Mao, Yin Zhang, Jianyu Yang

    Abstract: Sufficient synthetic aperture radar (SAR) target images are very important for the development of research. However, available SAR target images are often limited in practice, which hinders the progress of SAR application. In this paper, we propose an azimuth-controllable generative adversarial network to generate precise SAR target images with an intermediate azimuth between two given SAR image…

    Submitted 10 August, 2023; originally announced August 2023.

  25. arXiv:2305.15636  [pdf]

    eess.SP

    Channelized analog microwave short-time Fourier transform in the optical domain with improved measurement performance

    Authors: Xiaowei Li, Taixia Shi, Dong Ma, Yang Chen

    Abstract: In this article, analog microwave short-time Fourier transform (STFT) with improved measurement performance is implemented in the optical domain by employing stimulated Brillouin scattering (SBS) and channelization. By jointly using three optical frequency combs and filter- and SBS-based frequency-to-time mapping (FTTM), the time-frequency information of the signal under test (SUT) in different fr…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: 18 pages, 9 figures, 1 table

  26. arXiv:2301.00504  [pdf]

    eess.IV cs.AI cs.CV eess.SP

    Spectral Bandwidth Recovery of Optical Coherence Tomography Images using Deep Learning

    Authors: Timothy T. Yu, Da Ma, Jayden Cole, Myeong Jin Ju, Mirza F. Beg, Marinko V. Sarunic

    Abstract: Optical coherence tomography (OCT) captures cross-sectional data and is used for the screening, monitoring, and treatment planning of retinal diseases. Technological developments to increase the speed of acquisition often result in systems with a narrower spectral bandwidth, and hence a lower axial resolution. Traditionally, image-processing-based techniques have been utilized to reconstruct subs…

    Submitted 1 January, 2023; originally announced January 2023.

  27. arXiv:2212.00532  [pdf, other]

    eess.IV cs.CV

    EBHI-Seg: A Novel Enteroscope Biopsy Histopathological Haematoxylin and Eosin Image Dataset for Image Segmentation Tasks

    Authors: Liyu Shi, Xiaoyan Li, Weiming Hu, Haoyuan Chen, Jing Chen, Zizhen Fan, Minghe Gao, Yujie Jing, Guotao Lu, Deguo Ma, Zhiyu Ma, Qingtao Meng, Dechao Tang, Hongzan Sun, Marcin Grzegorzek, Shouliang Qi, Yueyang Teng, Chen Li

    Abstract: Background and Purpose: Colorectal cancer is a common fatal malignancy, the fourth most common cancer in men, and the third most common cancer in women worldwide. Timely detection of cancer in its early stages is essential for treating the disease. Currently, there is a lack of datasets for histopathological image segmentation of rectal cancer, which often hampers the assessment accuracy when comp…

    Submitted 6 December, 2022; v1 submitted 1 December, 2022; originally announced December 2022.

  28. arXiv:2211.01079  [pdf, other]

    cs.SD eess.AS

    Intermediate Fine-Tuning Using Imperfect Synthetic Speech for Improving Electrolaryngeal Speech Recognition

    Authors: Lester Phillip Violeta, Ding Ma, Wen-Chin Huang, Tomoki Toda

    Abstract: Research on automatic speech recognition (ASR) systems for electrolaryngeal speakers has been relatively unexplored due to small datasets. When training data is lacking in ASR, a large-scale pretraining and fine tuning framework is often sufficient to achieve high recognition rates; however, in electrolaryngeal speech, the domain shift between the pretraining and fine-tuning data is too large to o…

    Submitted 30 May, 2023; v1 submitted 2 November, 2022; originally announced November 2022.

    Comments: Accepted to ICASSP 2023

  29. arXiv:2210.10314  [pdf, other]

    cs.SD eess.AS

    Two-stage training method for Japanese electrolaryngeal speech enhancement based on sequence-to-sequence voice conversion

    Authors: Ding Ma, Lester Phillip Violeta, Kazuhiro Kobayashi, Tomoki Toda

    Abstract: Sequence-to-sequence (seq2seq) voice conversion (VC) models have greater potential in converting electrolaryngeal (EL) speech to normal speech (EL2SP) compared to conventional VC models. However, EL2SP based on seq2seq VC requires a sufficiently large amount of parallel data for the model training and it suffers from significant performance degradation when the amount of training data is insuffici…

    Submitted 19 October, 2022; originally announced October 2022.

    Comments: Accepted to SLT 2022

  30. arXiv:2208.14635  [pdf, other]

    eess.IV cs.CV cs.LG

    Segmentation-guided Domain Adaptation and Data Harmonization of Multi-device Retinal Optical Coherence Tomography using Cycle-Consistent Generative Adversarial Networks

    Authors: Shuo Chen, Da Ma, Sieun Lee, Timothy T. L. Yu, Gavin Xu, Donghuan Lu, Karteek Popuri, Myeong Jin Ju, Marinko V. Sarunic, Mirza Faisal Beg

    Abstract: Optical Coherence Tomography (OCT) is a non-invasive technique capturing cross-sectional area of the retina in micro-meter resolutions. It has been widely used as an auxiliary imaging reference to detect eye-related pathology and predict longitudinal progression of the disease characteristics. Retina layer segmentation is one of the crucial feature extraction techniques, where the variations of reti…

    Submitted 31 August, 2022; originally announced August 2022.

    Comments: 16 pages, 10 figures

  31. arXiv:2208.09143  [pdf]

    physics.optics eess.SP

    Photonics-enabled wavelet-like transform via nonlinear optical frequency sweeping and stimulated Brillouin scattering-based frequency-to-time mapping

    Authors: Pengcheng Zuo, Dong Ma, Yang Chen

    Abstract: A photonics-enabled wavelet-like transform system, characterized by multi-resolution time-frequency analysis, is proposed based on a typical stimulated Brillouin scattering (SBS) pump-probe setup using an optical nonlinear frequency-sweep signal. In the pump path, a continuous-wave optical signal is injected into an SBS medium to generate an SBS gain. In the probe path, a periodic nonlinear freque…

    Submitted 19 August, 2022; originally announced August 2022.

    Comments: 9 pages, 6 figures

  32. arXiv:2208.04871  [pdf]

    eess.SP

    Breaking the accuracy and resolution limitation of filter- and frequency-to-time mapping-based time and frequency acquisition methods by broadening the filter bandwidth

    Authors: Pengcheng Zuo, Dong Ma, Xiaowei Li, Yang Chen

    Abstract: In this paper, the filter- and frequency-to-time mapping (FTTM)-based photonics-assisted time and frequency acquisition methods are comprehensively analyzed and the accuracy and resolution limitation in the fast sweep scenario is broken by broadening the filter bandwidth. It is found that when the sweep speed is very fast, the width of the generated pulse via FTTM is mainly determined by the impul…

    Submitted 9 August, 2022; originally announced August 2022.

    Comments: 18 pages, 11 figures

  33. arXiv:2207.01175  [pdf]

    physics.optics eess.SP

    Photonics-based short-time Fourier transform without high-frequency electronic devices and equipment

    Authors: Pengcheng Zuo, Dong Ma, Yang Chen

    Abstract: A photonics-based short-time Fourier transform (STFT) system is proposed and experimentally demonstrated based on stimulated Brillouin scattering (SBS) without using high-frequency electronic devices and equipment. The wavelength of a distributed feedback laser diode is periodically swept by using a low-speed periodic sawtooth/triangular driving current. The periodic frequency-sweep optical signal…

    Submitted 3 July, 2022; originally announced July 2022.

    Comments: 8 pages, 5 figures

  34. arXiv:2204.04579  [pdf, other]

    cs.SD eess.AS

    Inferring Pitch from Coarse Spectral Features

    Authors: Danni Ma, Neville Ryant, Mark Liberman

    Abstract: Fundamental frequency (F0) has long been treated as the physical definition of "pitch" in phonetic analysis. But there have been many demonstrations that F0 is at best an approximation to pitch, both in production and in perception: pitch is not F0, and F0 is not pitch. Changes in the pitch involve many articulatory and acoustic covariates; pitch perception often deviates from what F0 analysis pre…

    Submitted 26 August, 2022; v1 submitted 9 April, 2022; originally announced April 2022.

  35. arXiv:2203.05707  [pdf]

    cs.LG cs.AI eess.IV q-bio.GN

    Machine Learning Based Multimodal Neuroimaging Genomics Dementia Score for Predicting Future Conversion to Alzheimer's Disease

    Authors: Ghazal Mirabnahrazam, Da Ma, Sieun Lee, Karteek Popuri, Hyunwoo Lee, Jiguo Cao, Lei Wang, James E Galvin, Mirza Faisal Beg, the Alzheimer's Disease Neuroimaging Initiative

    Abstract: Background: The increasing availability of databases containing both magnetic resonance imaging (MRI) and genetic data allows researchers to utilize multimodal data to better understand the characteristics of dementia of Alzheimer's type (DAT). Objective: The goal of this study was to develop and analyze novel biomarkers that can help predict the development and progression of DAT. Methods: We use…

    Submitted 10 March, 2022; originally announced March 2022.

    Journal ref: J Alzheimers Dis 1 Jan. (2022) 1-21

  36. arXiv:2202.09954  [pdf, other]

    eess.SP cs.IT cs.LG

    Theoretical Analysis of Deep Neural Networks in Physical Layer Communication

    Authors: Jun Liu, Haitao Zhao, Dongtang Ma, Kai Mei, Jibo Wei

    Abstract: Recently, deep neural network (DNN)-based physical layer communication techniques have attracted considerable interest. Although their potential to enhance communication systems and superb performance have been validated by simulation experiments, little attention has been paid to the theoretical analysis. Specifically, most studies in the physical layer have tended to focus on the application of…

    Submitted 26 August, 2022; v1 submitted 20 February, 2022; originally announced February 2022.

    Comments: 15 pages, 13 figures, has been accepted for publication in IEEE Transactions on Communications. arXiv admin note: substantial text overlap with arXiv:2106.01124

    Journal ref: IEEE Transactions on Communications, 2022

  37. Time-varying microwave photonic filter for arbitrary waveform signal-to-noise ratio improvement

    Authors: Dong Ma, Yang Chen

    Abstract: A time-varying microwave photonic filter (TV-MPF) based on stimulated Brillouin scattering (SBS) is proposed and utilized to suppress the in-band noise of broadband arbitrary microwave waveforms, thereby improving the signal-to-noise ratio (SNR). The filter-controlling signal is designed according to the signal to be filtered and drives the TV-MPF so that the passband of the filter is always align…

    Submitted 26 January, 2022; originally announced January 2022.

    Comments: 8 pages, 5 figures

  38. Improving Across-Dataset Brain Tissue Segmentation Using Transformer

    Authors: Vishwanatha M. Rao, Zihan Wan, Soroush Arabshahi, David J. Ma, Pin-Yu Lee, Ye Tian, Xuzhe Zhang, Andrew F. Laine, Jia Guo

    Abstract: Brain tissue segmentation has demonstrated great utility in quantifying MRI data through Voxel-Based Morphometry and highlighting subtle structural changes associated with various conditions within the brain. However, manual segmentation is highly labor-intensive, and automated approaches have struggled due to properties inherent to MRI acquisition, leaving a great need for an effective segmentati…

    Submitted 31 January, 2023; v1 submitted 21 January, 2022; originally announced January 2022.

    ACM Class: I.4.6

  39. arXiv:2201.07438  [pdf, other]

    cs.SD eess.AS

    MHTTS: Fast multi-head text-to-speech for spontaneous speech with imperfect transcription

    Authors: Dabiao Ma, Yitong Zhang, Meng Li, Feng Ye

    Abstract: Neural network based end-to-end Text-to-Speech (TTS) has greatly improved the quality of synthesized speech. However, how to efficiently use massive spontaneous speech without transcription remains an open problem. In this paper, we propose MHTTS, a fast multi-speaker TTS system that is robust to transcription errors and speaking style speech data. Specifically, we introduce a multi-head model…

    Submitted 4 February, 2022; v1 submitted 19 January, 2022; originally announced January 2022.

  40. Short-time Fourier transform based on stimulated Brillouin scattering

    Authors: Pengcheng Zuo, Dong Ma, Yang Chen

    Abstract: In this paper, all-optical short-time Fourier transform (STFT) based on stimulated Brillouin scattering (SBS) is proposed and further used for real-time time-frequency analysis of different radio frequency (RF) signals. In the proposed all-optical STFT system, SBS not only provides a band-pass filter for implementing the window function in conjunction with a periodic frequency-sweep optical signal…

    Submitted 26 November, 2021; originally announced November 2021.

    Comments: 18 pages, 9 figures, 1 table

  41. Physics Assisted Deep Learning for Indoor Imaging using Phaseless Wi-Fi Measurements

    Authors: Samruddhi Deshmukh, Amartansh Dubey, Dingfei Ma, Qifeng Chen, Ross Murch

    Abstract: A physics assisted deep learning framework to perform accurate indoor imaging using phaseless Wi-Fi measurements is proposed. It is able to image objects that are large (compared to wavelength) and have high permittivity values, that existing radio frequency (RF) inverse scattering techniques find very challenging, making it suitable for indoor RF imaging. The technique utilizes a Rytov based inve…

    Submitted 4 November, 2021; originally announced November 2021.

    Comments: 14 pages, 10 figures. This work has been submitted to IEEE for possible publication

  42. arXiv:2110.12857  [pdf]

    physics.app-ph eess.SP physics.optics

    Photonics-assisted microwave pulse detection and frequency measurement based on pulse replication and frequency-to-time mapping

    Authors: Pengcheng Zuo, Dong Ma, Qingbo Liu, Lizhong Jiang, Yang Chen

    Abstract: A photonics-assisted microwave pulse detection and frequency measurement scheme is proposed. The unknown microwave pulse is converted to the optical domain and then injected into a fiber loop for pulse replication, which makes it easier to identify the microwave pulse with large pulse repetition interval (PRI), whereas stimulated Brillouin scattering-based frequency-to-time mapping (FTTM) is utili…

    Submitted 25 September, 2021; originally announced October 2021.

    Comments: 13 pages, 8 figures

  43. arXiv:2109.05627  [pdf, other]

    eess.IV cs.CV

    Differential Diagnosis of Frontotemporal Dementia and Alzheimer's Disease using Generative Adversarial Network

    Authors: Da Ma, Donghuan Lu, Karteek Popuri, Mirza Faisal Beg

    Abstract: Frontotemporal dementia and Alzheimer's disease are two common forms of dementia and are easily misdiagnosed as each other due to their similar pattern of clinical symptoms. Differentiating between the two dementia types is crucial for determining disease-specific intervention and treatment. Recent developments in deep-learning-based approaches in the field of medical image computing are delivering…

    Submitted 29 September, 2021; v1 submitted 12 September, 2021; originally announced September 2021.

  44. arXiv:2109.03904  [pdf]

    eess.SP physics.optics

    Time-frequency analysis of microwave signals based on stimulated Brillouin scattering

    Authors: Dong Ma, Pengcheng Zuo, Yang Chen

    Abstract: A novel photonic approach to the time-frequency analysis of microwave signals is proposed based on the stimulated Brillouin scattering (SBS)-assisted frequency-to-time mapping (FTTM). Two types of time-frequency analysis links, namely parallel SBS link and time-division SBS link are proposed. The parallel SBS link can be utilized to perform real-time time-frequency analysis of microwave signal, wh…

    Submitted 7 September, 2021; originally announced September 2021.

    Comments: 17 pages, 10 figures, 1 table

  45. arXiv:2107.10701  [pdf, other]

    eess.AS cs.SD

    Multitask-Based Joint Learning Approach To Robust ASR For Radio Communication Speech

    Authors: Duo Ma, Nana Hou, Van Tung Pham, Haihua Xu, Eng Siong Chng

    Abstract: To realize robust end-to-end Automatic Speech Recognition (E2E ASR) under radio communication conditions, we propose a multitask-based method to jointly train a Speech Enhancement (SE) module as the front-end and an E2E ASR model as the back-end in this paper. One of the advantages of the proposed method is that the entire system can be trained from scratch. Different from prior works, either component…

    Submitted 22 July, 2021; originally announced July 2021.

    Comments: 7 pages, 3 figures, submitted to APSIPA 2021

  46. arXiv:2107.02345  [pdf, other]

    eess.IV cs.CV cs.LG

    Domain Adaptation via CycleGAN for Retina Segmentation in Optical Coherence Tomography

    Authors: Ricky Chen, Timothy T. Yu, Gavin Xu, Da Ma, Marinko V. Sarunic, Mirza Faisal Beg

    Abstract: With the FDA approval of Artificial Intelligence (AI) for point-of-care clinical diagnoses, model generalizability is of the utmost importance as clinical decision-making must be domain-agnostic. A method of tackling the problem is to increase the dataset to include images from a multitude of domains; while this technique is ideal, the security requirements of medical data are a major limitation. A…

    Submitted 5 July, 2021; originally announced July 2021.

    Comments: 10 pages, 6 figures, 1 table

    ACM Class: I.4.0

  47. FRaC: FMCW-Based Joint Radar-Communications System via Index Modulation

    Authors: Dingyou Ma, Nir Shlezinger, Tianyao Huang, Yimin Liu, Yonina C. Eldar

    Abstract: Dual function radar communications (DFRC) systems are attractive technologies for autonomous vehicles, which utilize electromagnetic waves to constantly sense the environment while simultaneously communicating with neighbouring devices. An emerging approach to implement DFRC systems is to embed information in radar waveforms via index modulation (IM). Implementation of DFRC schemes in vehicular sy…

    Submitted 28 June, 2021; originally announced June 2021.

    Comments: 16 pages

  48. arXiv:2106.08147  [pdf, other]

    eess.IV cs.CV cs.LG

    Perceptually-inspired super-resolution of compressed videos

    Authors: Di Ma, Mariana Afonso, Fan Zhang, David R. Bull

    Abstract: Spatial resolution adaptation is a technique which has often been employed in video compression to enhance coding efficiency. This approach encodes a lower resolution version of the input video and reconstructs the original resolution during decoding. Instead of using conventional up-sampling filters, recent work has employed advanced super-resolution methods based on convolutional neural networks…

    Submitted 15 June, 2021; originally announced June 2021.

  49. arXiv:2106.01124  [pdf, other]

    eess.SP cs.IT cs.LG

    Opening the Black Box of Deep Neural Networks in Physical Layer Communication

    Authors: Jun Liu, Haitao Zhao, Dongtang Ma, Kai Mei, Jibo Wei

    Abstract: Deep Neural Network (DNN)-based physical layer techniques are attracting considerable interest due to their potential to enhance communication systems. However, most studies in the physical layer have tended to focus on the application of DNN models to wireless communication problems but not on theoretically understanding how a DNN works in a communication system. In this paper, we aim to quantit…

    Submitted 18 February, 2022; v1 submitted 2 June, 2021; originally announced June 2021.

    Comments: 6 pages, 5 figures, to be presented in the IEEE Wireless Communications and Networking Conference (WCNC) 2022 Workshop on Machine Learning for Communications: Future Large Scale MIMO and AI-Native Air-Interface

  50. arXiv:2105.11594  [pdf]

    eess.IV

    A Fast MR Fingerprinting Simulator for Direct Error Estimation and Sequence Optimization

    Authors: Siyuan Hu, Stephen Jordan, Rasim Boyacioglu, Ignacio Rozada, Matthias Troyer, Mark Griswold, Debra McGivney, Dan Ma

    Abstract: MR Fingerprinting is a novel quantitative MR technique that could simultaneously provide multiple tissue property maps. When optimizing MRF scans, modeling undersampling errors and field imperfections in cost functions will make the optimization results more practical and robust. However, this process is computationally expensive and impractical for sequence optimization algorithms when MRF signal…

    Submitted 24 May, 2021; originally announced May 2021.

    Comments: 10 pages, 7 figures