Audio and Speech Processing

Authors and titles for June 2018

Total of 67 entries (showing entries 26-50)
[26] arXiv:1806.02169 (cross-list from cs.SD) [pdf, other]
Title: StarGAN-VC: Non-parallel many-to-many voice conversion with star generative adversarial networks
Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[27] arXiv:1806.02782 (cross-list from cs.CL) [pdf, other]
Title: Training Augmentation with Adversarial Examples for Robust Speech Recognition
Sining Sun, Ching-Feng Yeh, Mari Ostendorf, Mei-Yuh Hwang, Lei Xie
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[28] arXiv:1806.03047 (cross-list from q-bio.NC) [pdf, other]
Title: On sound-based interpretation of neonatal EEG
Sergi Gomez, Mark O'Sullivan, Emanuel Popovici, Sean Mathieson, Geraldine Boylan, Andriy Temko
Comments: ISSC 2018
Subjects: Neurons and Cognition (q-bio.NC); Sound (cs.SD); Audio and Speech Processing (eess.AS); Applications (stat.AP)
[29] arXiv:1806.03185 (cross-list from cs.SD) [pdf, other]
Title: Wave-U-Net: A Multi-Scale Neural Network for End-to-End Audio Source Separation
Daniel Stoller, Sebastian Ewert, Simon Dixon
Comments: 7 pages (1 for references), 4 figures, 3 tables. Appearing in the proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR 2018) (camera-ready version). Implementation available at this https URL
Journal-ref: 19th International Society for Music Information Retrieval Conference (ISMIR 2018)
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[30] arXiv:1806.04278 (cross-list from cs.SD) [pdf, other]
Title: The NES Music Database: A multi-instrumental dataset with expressive performance attributes
Chris Donahue, Huanru Henry Mao, Julian McAuley
Comments: Published as a conference paper at ISMIR 2018
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Audio and Speech Processing (eess.AS)
[31] arXiv:1806.04558 (cross-list from cs.CL) [pdf, other]
Title: Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
Ye Jia, Yu Zhang, Ron J. Weiss, Quan Wang, Jonathan Shen, Fei Ren, Zhifeng Chen, Patrick Nguyen, Ruoming Pang, Ignacio Lopez Moreno, Yonghui Wu
Comments: NeurIPS 2018
Journal-ref: Advances in Neural Information Processing Systems 31 (2018), 4485-4495
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[32] arXiv:1806.04699 (cross-list from cs.SD) [pdf, other]
Title: Capsule Routing for Sound Event Detection
Turab Iqbal, Yong Xu, Qiuqiang Kong, Wenwu Wang
Comments: Paper accepted for 26th European Signal Processing Conference (EUSIPCO 2018)
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[33] arXiv:1806.04841 (cross-list from cs.CL) [pdf, other]
Title: A Study of Enhancement, Augmentation, and Autoencoder Methods for Domain Adaptation in Distant Speech Recognition
Hao Tang, Wei-Ning Hsu, Francois Grondin, James Glass
Comments: Interspeech, 2018
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[34] arXiv:1806.04872 (cross-list from cs.CL) [pdf, other]
Title: Unsupervised Adaptation with Interpretable Disentangled Representations for Distant Conversational Speech Recognition
Wei-Ning Hsu, Hao Tang, James Glass
Comments: to appear in Interspeech 2018
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[35] arXiv:1806.04903 (cross-list from cs.SD) [pdf, other]
Title: A data-driven approach to mid-level perceptual musical feature modeling
Anna Aljanaki, Mohammad Soleymani
Comments: 7 pages, ISMIR conference paper
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[36] arXiv:1806.05622 (cross-list from cs.SD) [pdf, other]
Title: VoxCeleb2: Deep Speaker Recognition
Joon Son Chung, Arsha Nagrani, Andrew Zisserman
Comments: To appear in Interspeech 2018. The audio-visual dataset can be downloaded from this http URL. 1806.05622v2: minor fixes; 5 pages
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Audio and Speech Processing (eess.AS)
[37] arXiv:1806.05791 (cross-list from cs.SD) [pdf, other]
Title: Monaural source enhancement maximizing source-to-distortion ratio via automatic differentiation
Hiroaki Nakajima, Yu Takahashi, Kazunobu Kondo, Yuji Hisaminato
Comments: This paper is submitted to 16th International Workshop on Acoustic Signal Enhancement (IWAENC)
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[38] arXiv:1806.06342 (cross-list from cs.SD) [pdf, other]
Title: Extending Recurrent Neural Aligner for Streaming End-to-End Speech Recognition in Mandarin
Linhao Dong, Shiyu Zhou, Wei Chen, Bo Xu
Comments: To appear in Interspeech 2018
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[39] arXiv:1806.06347 (cross-list from cs.SD) [pdf, other]
Title: Cover Song Synthesis by Analogy
Christopher J. Tralie
Comments: 11 pages, 5 figures
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[40] arXiv:1806.06513 (cross-list from cs.CL) [pdf, other]
Title: Semi-tied Units for Efficient Gating in LSTM and Highway Networks
Chao Zhang, Philip Woodland
Comments: To appear in Proc. INTERSPEECH 2018, September 2-6, 2018, Hyderabad, India
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[41] arXiv:1806.06676 (cross-list from cs.SD) [pdf, other]
Title: Towards multi-instrument drum transcription
Richard Vogl, Gerhard Widmer, Peter Knees
Comments: Published in Proceedings of the 21st International Conference on Digital Audio Effects (DAFx18), 4-8 September 2018, Aveiro, Portugal
Subjects: Sound (cs.SD); Information Retrieval (cs.IR); Neural and Evolutionary Computing (cs.NE); Audio and Speech Processing (eess.AS)
[42] arXiv:1806.06703 (cross-list from math.HO) [pdf, other]
Title: A 5-Dimensional Tonnetz for Nearly Symmetric Hexachords
Vaibhav Mohanty
Subjects: History and Overview (math.HO); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[43] arXiv:1806.06773 (cross-list from cs.SD) [pdf, other]
Title: Towards an efficient deep learning model for musical onset detection
Rong Gong, Xavier Serra
Comments: Paper rejected by the 19th International Society for Music Information Retrieval Conference
Subjects: Sound (cs.SD); Information Retrieval (cs.IR); Audio and Speech Processing (eess.AS)
[44] arXiv:1806.06812 (cross-list from cs.SD) [pdf, other]
Title: Frequency domain variants of velvet noise and their application to speech processing and synthesis: with appendices
Hideki Kawahara, Ken-Ichi Sakakibara, Masanori Morise, Hideki Banno, Tomoki Toda, Toshio Irino
Comments: 11 pages, 20 figures, and 1 table, Interspeech 2018
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[45] arXiv:1806.07098 (cross-list from cs.CL) [pdf, other]
Title: End-to-End Speech Recognition From the Raw Waveform
Neil Zeghidour, Nicolas Usunier, Gabriel Synnaeve, Ronan Collobert, Emmanuel Dupoux
Comments: Accepted for presentation at Interspeech 2018
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[46] arXiv:1806.07407 (cross-list from cs.CL) [pdf, other]
Title: Speaker Adapted Beamforming for Multi-Channel Automatic Speech Recognition
Tobias Menne, Ralf Schlüter, Hermann Ney
Comments: submitted to IEEE SLT 2018
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[47] arXiv:1806.07506 (cross-list from cs.SD) [pdf, other]
Title: A Simple Fusion of Deep and Shallow Learning for Acoustic Scene Classification
Eduardo Fonseca, Rong Gong, Xavier Serra
Comments: accepted to SMC 2018; updated Figure 7, results unchanged
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[48] arXiv:1806.07789 (cross-list from cs.SD) [pdf, other]
Title: Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition
Titouan Parcollet, Ying Zhang, Mohamed Morchid, Chiheb Trabelsi, Georges Linarès, Renato De Mori, Yoshua Bengio
Comments: Accepted at INTERSPEECH 2018
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[49] arXiv:1806.08002 (cross-list from cs.SD) [pdf, other]
Title: Synthesizing Diverse, High-Quality Audio Textures
Joseph Antognini, Matt Hoffman, Ron J. Weiss
Comments: 10 pages, submitted to TASLP
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[50] arXiv:1806.08236 (cross-list from cs.SD) [pdf, other]
Title: Learning Transposition-Invariant Interval Features from Symbolic Music and Audio
Stefan Lattner, Maarten Grachten, Gerhard Widmer
Comments: Paper accepted at the 19th International Society for Music Information Retrieval Conference, ISMIR 2018, Paris, France, September 23-27; 8 pages, 5 figures
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)