
Showing 1–2 of 2 results for author: Cubuk, E D

Searching in archive eess.
  1. arXiv:2210.10879  [pdf, other]

    cs.LG cs.CL cs.SD eess.AS

    G-Augment: Searching for the Meta-Structure of Data Augmentation Policies for ASR

    Authors: Gary Wang, Ekin D. Cubuk, Andrew Rosenberg, Shuyang Cheng, Ron J. Weiss, Bhuvana Ramabhadran, Pedro J. Moreno, Quoc V. Le, Daniel S. Park

    Abstract: Data augmentation is a ubiquitous technique used to provide robustness to automatic speech recognition (ASR) training. However, even as so much of the ASR training process has become automated and more "end-to-end", the data augmentation policy (what augmentation functions to use, and how to apply them) remains hand-crafted. We present Graph-Augment, a technique to define the augmentation space as…

    Submitted 24 October, 2022; v1 submitted 19 October, 2022; originally announced October 2022.

    Comments: 6 pages, accepted at SLT 2022. Updated with copyright

  2. arXiv:1904.08779  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD stat.ML

    SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition

    Authors: Daniel S. Park, William Chan, Yu Zhang, Chung-Cheng Chiu, Barret Zoph, Ekin D. Cubuk, Quoc V. Le

    Abstract: We present SpecAugment, a simple data augmentation method for speech recognition. SpecAugment is applied directly to the feature inputs of a neural network (i.e., filter bank coefficients). The augmentation policy consists of warping the features, masking blocks of frequency channels, and masking blocks of time steps. We apply SpecAugment on Listen, Attend and Spell networks for end-to-end speech…

    Submitted 3 December, 2019; v1 submitted 18 April, 2019; originally announced April 2019.

    Comments: 5 pages, 3 figures, 6 tables; v3: references added

    Journal ref: Proc. Interspeech 2019, 2613-2617