Showing 1–3 of 3 results for author: Lyon, R F

  1. arXiv:2409.18239

    cs.SD cs.LG eess.AS

    Towards Sub-millisecond Latency Real-Time Speech Enhancement Models on Hearables

    Authors: Artem Dementyev, Chandan K. A. Reddy, Scott Wisdom, Navin Chatlani, John R. Hershey, Richard F. Lyon

    Abstract: Low latency models are critical for real-time speech enhancement applications, such as hearing aids and hearables. However, the sub-millisecond latency space for resource-constrained hearables remains underexplored. We demonstrate speech enhancement using a computationally efficient minimum-phase FIR filter, enabling sample-by-sample processing to achieve mean algorithmic latency of 0.32 ms to 1.2…

    Submitted 7 March, 2025; v1 submitted 26 September, 2024; originally announced September 2024.

  2. arXiv:2404.17490

    eess.AS cs.SD eess.SP

    The CARFAC v2 Cochlear Model in Matlab, NumPy, and JAX

    Authors: Richard F. Lyon, Rob Schonberger, Malcolm Slaney, Mihajlo Velimirović, Honglin Yu

    Abstract: The open-source CARFAC (Cascade of Asymmetric Resonators with Fast-Acting Compression) cochlear model is upgraded to version 2, with improvements to the Matlab implementation, and with new Python/NumPy and JAX implementations -- but C++ version changes are still pending. One change addresses the DC (direct current, or zero frequency) quadratic distortion anomaly previously reported; another reduce…

    Submitted 26 April, 2024; originally announced April 2024.

  3. arXiv:1811.07030

    cs.SD eess.AS

    Exploring Tradeoffs in Models for Low-latency Speech Enhancement

    Authors: Kevin Wilson, Michael Chinen, Jeremy Thorpe, Brian Patton, John Hershey, Rif A. Saurous, Jan Skoglund, Richard F. Lyon

    Abstract: We explore a variety of neural networks configurations for one- and two-channel spectrogram-mask-based speech enhancement. Our best model improves on previous state-of-the-art performance on the CHiME2 speech enhancement task by 0.4 decibels in signal-to-distortion ratio (SDR). We examine trade-offs such as non-causal look-ahead, computation, and parameter count versus enhancement performance and…

    Submitted 16 November, 2018; originally announced November 2018.