Skip to main content

Showing 1–1 of 1 results for author: Gemmeke, J F

Searching in archive stat. Search in all archives.
.
  1. arXiv:1609.09430  [pdf, other

    cs.SD cs.LG stat.ML

    CNN Architectures for Large-Scale Audio Classification

    Authors: Shawn Hershey, Sourish Chaudhuri, Daniel P. W. Ellis, Jort F. Gemmeke, Aren Jansen, R. Channing Moore, Manoj Plakal, Devin Platt, Rif A. Saurous, Bryan Seybold, Malcolm Slaney, Ron J. Weiss, Kevin Wilson

    Abstract: Convolutional Neural Networks (CNNs) have proven very effective in image classification and show promise for audio. We use various CNN architectures to classify the soundtracks of a dataset of 70M training videos (5.24 million hours) with 30,871 video-level labels. We examine fully connected Deep Neural Networks (DNNs), AlexNet [1], VGG [2], Inception [3], and ResNet [4]. We investigate varying th… ▽ More

    Submitted 10 January, 2017; v1 submitted 29 September, 2016; originally announced September 2016.

    Comments: Accepted for publication at ICASSP 2017 Changes: Added definitions of mAP, AUC, and d-prime. Updated mAP/AUC/d-prime numbers for Audio Set based on changes of latest Audio Set revision. Changed wording to fit 4 page limit with new additions