Showing 1–2 of 2 results for author: Chuang, G

  1. arXiv:1809.09190

    eess.AS cs.CL cs.SD

    From Audio to Semantics: Approaches to end-to-end spoken language understanding

    Authors: Parisa Haghani, Arun Narayanan, Michiel Bacchiani, Galen Chuang, Neeraj Gaur, Pedro Moreno, Rohit Prabhavalkar, Zhongdi Qu, Austin Waters

    Abstract: Conventional spoken language understanding systems consist of two main components: an automatic speech recognition module that converts audio to a transcript, and a natural language understanding module that transforms the resulting text (or top N hypotheses) into a set of domains, intents, and arguments. These modules are typically optimized independently. In this paper, we formulate audio to sem…

    Submitted 24 September, 2018; originally announced September 2018.

  2. arXiv:1804.03052

    cs.CL cs.SD eess.AS

    Vision as an Interlingua: Learning Multilingual Semantic Embeddings of Untranscribed Speech

    Authors: David Harwath, Galen Chuang, James Glass

    Abstract: In this paper, we explore the learning of neural network embeddings for natural images and speech waveforms describing the content of those images. These embeddings are learned directly from the waveforms without the use of linguistic transcriptions or conventional speech recognition technology. While prior work has investigated this setting in the monolingual case using English speech data, this…

    Submitted 9 April, 2018; originally announced April 2018.

    Comments: To appear at ICASSP 2018