Showing 1–12 of 12 results for author: Hou, W

  1. arXiv:2506.11150  [pdf, ps, other]

    eess.IV cs.CV

    ADAgent: LLM Agent for Alzheimer's Disease Analysis with Collaborative Coordinator

    Authors: Wenlong Hou, Guangqian Yang, Ye Du, Yeung Lau, Lihao Liu, Junjun He, Ling Long, Shujun Wang

    Abstract: Alzheimer's disease (AD) is a progressive and irreversible neurodegenerative disease. Early and precise diagnosis of AD is crucial for timely intervention and treatment planning to alleviate the progressive neurodegeneration. However, most existing methods rely on single-modality data, which contrasts with the multifaceted approach used by medical experts. While some deep learning approaches proce…

    Submitted 15 June, 2025; v1 submitted 11 June, 2025; originally announced June 2025.

  2. arXiv:2502.10329  [pdf, other]

    cs.SD cs.CR cs.MM eess.AS

    VocalCrypt: Novel Active Defense Against Deepfake Voice Based on Masking Effect

    Authors: Qingyuan Fei, Wenjie Hou, Xuan Hai, Xin Liu

    Abstract: The rapid advancements in AI voice cloning, fueled by machine learning, have significantly impacted text-to-speech (TTS) and voice conversion (VC) fields. While these developments have led to notable progress, they have also raised concerns about the misuse of AI VC technology, causing economic losses and negative public perceptions. To address this challenge, this study focuses on creating active…

    Submitted 14 February, 2025; originally announced February 2025.

    Comments: 9 pages, four figures

  3. arXiv:2410.07230  [pdf, other]

    eess.SP cs.HC cs.LG

    RFBoost: Understanding and Boosting Deep WiFi Sensing via Physical Data Augmentation

    Authors: Weiying Hou, Chenshu Wu

    Abstract: Deep learning shows promising performance in wireless sensing. However, deep wireless sensing (DWS) heavily relies on large datasets. Unfortunately, building comprehensive datasets for DWS is difficult and costly, because wireless data depends on environmental factors and cannot be labeled offline. Despite recent advances in few-shot/cross-domain learning, DWS is still facing data scarcity issues.…

    Submitted 3 October, 2024; originally announced October 2024.

    Comments: Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 8, 2, Article 58 (June 2024), 26 pages

    Journal ref: Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 8, 2, Article 58 (June 2024), 26 pages

  4. arXiv:2406.11469  [pdf, other]

    eess.IV

    RMFA-Net: A Neural ISP for Real RAW to RGB Image Reconstruction

    Authors: Fei Li, Wenbo Hou, Peng Jia

    Abstract: Deep learning-based ISP algorithms have demonstrated significant potential in raw2rgb reconstruction. However, existing networks have not fully considered the specific characteristics of raw data, such as black level and CFA, which can negatively impact texture and color if mishandled. Moreover, uneven exposure in raw data is also not considered carefully, leading to adverse effects on contrast an…

    Submitted 17 June, 2024; originally announced June 2024.

  5. arXiv:2206.09783  [pdf, other]

    eess.AS cs.CL cs.SD

    Boosting Cross-Domain Speech Recognition with Self-Supervision

    Authors: Han Zhu, Gaofeng Cheng, Jindong Wang, Wenxin Hou, Pengyuan Zhang, Yonghong Yan

    Abstract: The cross-domain performance of automatic speech recognition (ASR) could be severely hampered due to the mismatch between training and testing distributions. Since the target domain usually lacks labeled data, and domain shifts exist at acoustic and linguistic levels, it is challenging to perform unsupervised domain adaptation (UDA) for ASR. Previous work has shown that self-supervised learning (S…

    Submitted 30 July, 2023; v1 submitted 20 June, 2022; originally announced June 2022.

    Comments: Accepted by IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), 2023

  6. arXiv:2203.14823  [pdf]

    physics.optics eess.SP

    Reciprocal phase transition-enabled electro-optic modulation

    Authors: Fang Zou, Lei Zou, Ye Tian, Yiming Zhang, Erwin Bente, Weigang Hou, Yu Liu, Siming Chen, Victoria Cao, Lei Guo, Songsui Li, Lianshan Yan, Wei Pan, Dusan Milosevic, Zizheng Cao, A. M. J. Koonen, Huiyun Liu, Xihua Zou

    Abstract: Electro-optic (EO) modulation is a well-known and essential topic in the field of communications and sensing. Its ultrahigh efficiency is unprecedentedly desired in the current green and data era. However, dramatically increasing the modulation efficiency is difficult due to the monotonic mapping relationship between the electrical signal and modulated optical signal. Here, a new mechanism termed…

    Submitted 22 November, 2022; v1 submitted 28 March, 2022; originally announced March 2022.

    Comments: 27 pages, 14 figures

  7. arXiv:2105.11905  [pdf, other]

    cs.CL cs.SD eess.AS

    Exploiting Adapters for Cross-lingual Low-resource Speech Recognition

    Authors: Wenxin Hou, Han Zhu, Yidong Wang, Jindong Wang, Tao Qin, Renjun Xu, Takahiro Shinozaki

    Abstract: Cross-lingual speech adaptation aims to solve the problem of leveraging multiple rich-resource languages to build models for a low-resource target language. Since the low-resource language has limited training data, speech recognition models can easily overfit. In this paper, we propose to use adapters to investigate the performance of multiple adapters for parameter-efficient cross-lingual speech…

    Submitted 17 December, 2021; v1 submitted 18 May, 2021; originally announced May 2021.

    Comments: Accepted by IEEE Transactions on Audio, Speech, and Language Processing (TASLP) as a full paper; 12 pages; code at https://github.com/jindongwang/transferlearning/tree/master/code/ASR/Adapter

  8. arXiv:2104.07491  [pdf, other]

    cs.SD cs.LG eess.AS

    Cross-domain Speech Recognition with Unsupervised Character-level Distribution Matching

    Authors: Wenxin Hou, Jindong Wang, Xu Tan, Tao Qin, Takahiro Shinozaki

    Abstract: End-to-end automatic speech recognition (ASR) can achieve promising performance with large-scale training data. However, it is known that domain mismatch between training and testing data often leads to a degradation of recognition accuracy. In this work, we focus on the unsupervised domain adaptation for ASR and propose CMatch, a Character-level distribution matching method to perform fine-graine…

    Submitted 8 June, 2021; v1 submitted 15 April, 2021; originally announced April 2021.

    Comments: Accepted to INTERSPEECH 2021; code available at https://github.com/jindongwang/transferlearning/tree/master/code/ASR/CMatch

  9. arXiv:2104.05752  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Speak or Chat with Me: End-to-End Spoken Language Understanding System with Flexible Inputs

    Authors: Sujeong Cha, Wangrui Hou, Hyun Jung, My Phung, Michael Picheny, Hong-Kwang Kuo, Samuel Thomas, Edmilson Morais

    Abstract: A major focus of recent research in spoken language understanding (SLU) has been on the end-to-end approach where a single model can predict intents directly from speech inputs without intermediate transcripts. However, this approach presents some challenges. First, since speech can be considered as personally identifiable information, in some cases only automatic speech recognition (ASR) transcri…

    Submitted 14 June, 2021; v1 submitted 7 April, 2021; originally announced April 2021.

    Comments: Accepted to Interspeech 2021

  10. arXiv:2004.04955  [pdf, other]

    cs.CV cs.LG eess.IV

    Boosting Semantic Human Matting with Coarse Annotations

    Authors: Jinlin Liu, Yuan Yao, Wendi Hou, Miaomiao Cui, Xuansong Xie, Changshui Zhang, Xian-sheng Hua

    Abstract: Semantic human matting aims to estimate the per-pixel opacity of the foreground human regions. It is quite challenging and usually requires user interactive trimaps and plenty of high quality annotated data. Annotating such kind of data is labor intensive and requires great skills beyond normal users, especially considering the very detailed hair part of humans. In contrast, coarse annotated human…

    Submitted 10 April, 2020; originally announced April 2020.

  11. arXiv:2001.01888  [pdf]

    eess.SP

    Indoor Localization System of ROS mobile robot based on Visible Light Communication

    Authors: Weipeng Guan, Shihuan Chen, Shangsheng Wen, Wenyuan Hou, Zequn Tan, Ruihong Cen

    Abstract: In this paper, an indoor robot localization system based on Robot Operating System (ROS) and visible light communication (VLC) is presented. On the basis of our previous work, we innovatively designed a VLC localization and navigation package based on Robot Operating System (ROS), which contains the LED-ID detection and recognition method, the video target tracking algorithm and the double-lamp po…

    Submitted 6 January, 2020; originally announced January 2020.

  12. Study on the spectral reconstruction of typical surface types based on spectral library and principal component analysis

    Authors: Weizhen Hou, Yilan Mao, Chi Xu, Zhengqiang Li, Donghui Li, Yan Ma, Hua Xu

    Abstract: To meet the demanding of spectral reconstruction in the visible and near-infrared wavelength, the spectral reconstruction method for typical surface types is discussed based on the USGS/ASTER spectral library and principal component analysis (PCA). A new spectral reconstructed model is proposed by the information of several typical bands instead of all of the wavelength bands, and a linear combin…

    Submitted 15 June, 2019; originally announced June 2019.

    Comments: 10 pages, 7 figures

    Journal ref: Proceedings of SPIE, 2019