Showing 1–3 of 3 results for author: Wang, S X

  1. arXiv:2506.02039  [pdf, other]

    eess.AS cs.AI cs.SD

    No Audiogram: Leveraging Existing Scores for Personalized Speech Intelligibility Prediction

    Authors: Haoshuai Zhou, Changgeng Mo, Boxuan Cao, Linkai Li, Shan Xiang Wang

    Abstract: Personalized speech intelligibility prediction is challenging. Previous approaches have mainly relied on audiograms, which are inherently limited in accuracy as they only capture a listener's hearing threshold for pure tones. Rather than incorporating additional listener features, we propose a novel approach that leverages an individual's existing intelligibility data to predict their performance…

    Submitted 31 May, 2025; originally announced June 2025.

    Comments: Accepted at Interspeech 2025

  2. arXiv:2505.08215  [pdf, ps, other]

    cs.AI cs.SD eess.AS

    Unveiling the Best Practices for Applying Speech Foundation Models to Speech Intelligibility Prediction for Hearing-Impaired People

    Authors: Haoshuai Zhou, Boxuan Cao, Changgeng Mo, Linkai Li, Shan Xiang Wang

    Abstract: Speech foundation models (SFMs) have demonstrated strong performance across a variety of downstream tasks, including speech intelligibility prediction for hearing-impaired people (SIP-HI). However, optimizing SFMs for SIP-HI has been insufficiently explored. In this paper, we conduct a comprehensive study to identify key design factors affecting SIP-HI performance with 5 SFMs, focusing on encoder…

    Submitted 13 May, 2025; originally announced May 2025.

  3. arXiv:2502.01770  [pdf, other]

    cs.LG cs.AI eess.IV

    Hamming Attention Distillation: Binarizing Keys and Queries for Efficient Long-Context Transformers

    Authors: Mark Horton, Tergel Molom-Ochir, Peter Liu, Bhavna Gopal, Chiyue Wei, Cong Guo, Brady Taylor, Deliang Fan, Shan X. Wang, Hai Li, Yiran Chen

    Abstract: Pre-trained transformer models with extended context windows are notoriously expensive to run at scale, often limiting real-world deployment due to their high computational and memory requirements. In this paper, we introduce Hamming Attention Distillation (HAD), a novel framework that binarizes keys and queries in the attention mechanism to achieve significant efficiency gains. By converting keys…

    Submitted 3 February, 2025; originally announced February 2025.