Showing 1–2 of 2 results for author: Soltanmohammadi, E

  1. arXiv:2412.16500

    eess.AS cs.AI cs.CL

    Speech Retrieval-Augmented Generation without Automatic Speech Recognition

    Authors: Do June Min, Karel Mundnich, Andy Lapastora, Erfan Soltanmohammadi, Srikanth Ronanki, Kyu Han

    Abstract: One common approach for question answering over speech data is to first transcribe speech using automatic speech recognition (ASR) and then employ text-based retrieval-augmented generation (RAG) on the transcriptions. While this cascaded pipeline has proven effective in many practical settings, ASR errors can propagate to the retrieval and generation steps. To overcome this limitation, we introduc…

    Submitted 3 January, 2025; v1 submitted 21 December, 2024; originally announced December 2024.

    Comments: ICASSP 2025

  2. arXiv:2310.07032

    cs.SD cs.LG eess.AS eess.SY

    Neural Harmonium: An Interpretable Deep Structure for Nonlinear Dynamic System Identification with Application to Audio Processing

    Authors: Karim Helwani, Erfan Soltanmohammadi, Michael M. Goodwin

    Abstract: Improving the interpretability of deep neural networks has recently gained increased attention, especially when the power of deep learning is leveraged to solve problems in physics. Interpretability helps us understand a model's ability to generalize and reveal its limitations. In this paper, we introduce a causal interpretable deep structure for modeling dynamic systems. Our proposed model makes…

    Submitted 10 October, 2023; originally announced October 2023.