Audio and Speech Processing

Authors and titles for June 2020

Total of 181 entries (showing 1-25)
[1] arXiv:2006.00217 [pdf, other]
Title: Exploring Filterbank Learning for Keyword Spotting
Iván López-Espejo, Zheng-Hua Tan, Jesper Jensen
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[2] arXiv:2006.00408 [pdf, other]
Title: Introducing Latent Timbre Synthesis
K. Tatar, D. Bisig, P. Pasquier
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[3] arXiv:2006.00452 [pdf, other]
Title: Crossed-Time Delay Neural Network for Speaker Recognition
Liang Chen, Yanchun Liang, Xiaohu Shi, You Zhou, Chunguo Wu
Comments: MMM 2021 Paper, add GitHub address
Journal-ref: MMM 2021
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Sound (cs.SD)
[4] arXiv:2006.00518 [pdf, other]
Title: Data-driven Detection and Analysis of the Patterns of Creaky Voice
Thomas Drugman, John Kane, Christer Gobl
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[5] arXiv:2006.00521 [pdf, other]
Title: Maximum Voiced Frequency Estimation: Exploiting Amplitude and Phase Spectra
Thomas Drugman, Yannis Stylianou
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[6] arXiv:2006.00525 [pdf, other]
Title: Residual Excitation Skewness for Automatic Speech Polarity Detection
Thomas Drugman
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[7] arXiv:2006.00687 [pdf, other]
Title: Phase-aware Single-stage Speech Denoising and Dereverberation with U-Net
Hyeong-Seok Choi, Hoon Heo, Jie Hwan Lee, Kyogu Lee
Comments: 5 pages, 3 figures, Submitted to Interspeech2020
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[8] arXiv:2006.00703 [pdf, other]
Title: Streaming Language Identification using Combination of Acoustic Representations and ASR Hypotheses
Chander Chandak, Zeynab Raeesy, Ariya Rastrow, Yuzong Liu, Xiangyang Huang, Siyu Wang, Dong Kwon Joo, Roland Maas
Comments: 5 pages, 2 figures
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[9] arXiv:2006.00751 [pdf, other]
Title: Evaluation of CNN-based Automatic Music Tagging Models
Minz Won, Andres Ferraro, Dmitry Bogdanov, Xavier Serra
Comments: 7 pages, 2 figures, Sound and Music Computing 2020 (SMC 2020)
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[10] arXiv:2006.00772 [pdf, other]
Title: Similarity-and-Independence-Aware Beamformer: Method for Target Source Extraction using Magnitude Spectrogram as Reference
Atsuo Hiroe
Comments: Accepted in INTERSPEECH 2020
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[11] arXiv:2006.00782 [pdf, other]
Title: Learning to Recognize Code-switched Speech Without Forgetting Monolingual Speech Recognition
Sanket Shah, Basil Abraham, Gurunath Reddy M, Sunayana Sitaram, Vikas Joshi
Comments: 5 pages (4 pages + 1 page references), 5 tables, 1 figure, 1 algorithm, 16 references
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[12] arXiv:2006.00848 [pdf, other]
Title: A time-scale modification dataset with subjective quality labels
Timothy Roberts, Kuldip K. Paliwal
Comments: 12 Pages, 13 Figures, Published in The Journal of the Acoustical Society of America (Vol.148, Issue 1), For associated dataset, see this http URL
Journal-ref: J. Acoust. Soc. Am. 148(1). pp. 201-210 (2020)
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[13] arXiv:2006.00877 [pdf, other]
Title: High-Fidelity Audio Generation and Representation Learning with Guided Adversarial Autoencoder
Kazi Nazmul Haque, Rajib Rana, Björn W Schuller
Comments: The paper is submitted to IEEE Access for review
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[14] arXiv:2006.01260 [pdf, other]
Title: Improving EEG based continuous speech recognition using GAN
Gautam Krishna, Co Tran, Mason Carnahan, Ahmed Tewfik
Comments: Under Review
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[15] arXiv:2006.01261 [pdf, other]
Title: Understanding effect of speech perception in EEG based speech recognition systems
Gautam Krishna, Co Tran, Mason Carnahan, Ahmed Tewfik
Comments: Under Review
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Signal Processing (eess.SP)
[16] arXiv:2006.01262 [pdf, other]
Title: Predicting Different Acoustic Features from EEG and towards direct synthesis of Audio Waveform from EEG
Gautam Krishna, Co Tran, Mason Carnahan, Ahmed Tewfik
Comments: Under Review
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Signal Processing (eess.SP)
[17] arXiv:2006.01595 [pdf, other]
Title: Large Scale Audiovisual Learning of Sounds with Weakly Labeled Data
Haytham M. Fayek, Anurag Kumar
Comments: 29th International Joint Conference on Artificial Intelligence (IJCAI 2020)
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Image and Video Processing (eess.IV); Machine Learning (stat.ML)
[18] arXiv:2006.01708 [pdf, other]
Title: Dilated U-net based approach for multichannel speech enhancement from First-Order Ambisonics recordings
Amélie Bosca, Alexandre Guérin, Lauréline Perotin, Srđan Kitić
Comments: Accepted for EUSIPCO 2020
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[19] arXiv:2006.01796 [pdf, other]
Title: Neural Speaker Diarization with Speaker-Wise Chain Rule
Yusuke Fujita, Shinji Watanabe, Shota Horiguchi, Yawen Xue, Jing Shi, Kenji Nagamatsu
Comments: Submitted to Interspeech 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[20] arXiv:2006.01906 [pdf, other]
Title: Detecting Audio Attacks on ASR Systems with Dropout Uncertainty
Tejas Jayashankar, Jonathan Le Roux, Pierre Moulin
Comments: Accepted for publication at Interspeech 2020
Subjects: Audio and Speech Processing (eess.AS); Cryptography and Security (cs.CR); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[21] arXiv:2006.01919 [pdf, other]
Title: A Dataset of Reverberant Spatial Sound Scenes with Moving Sources for Sound Event Localization and Detection
Archontis Politis, Sharath Adavanne, Tuomas Virtanen
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[22] arXiv:2006.02099 [pdf, other]
Title: Time Domain Velocity Vector for Retracing the Multipath Propagation
Jérôme Daniel, Srđan Kitić
Comments: Presented at ICASSP 2020
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[23] arXiv:2006.02547 [pdf, other]
Title: A Convolutional Deep Markov Model for Unsupervised Speech Representation Learning
Sameer Khurana, Antoine Laurent, Wei-Ning Hsu, Jan Chorowski, Adrian Lancucki, Ricard Marxer, James Glass
Comments: Proceedings of Interspeech, 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[24] arXiv:2006.02616 [pdf, other]
Title: Online End-to-End Neural Diarization with Speaker-Tracing Buffer
Yawen Xue, Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Kenji Nagamatsu
Comments: Accepted to SLT 2021
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[25] arXiv:2006.02786 [pdf, other]
Title: Multi-talker ASR for an unknown number of sources: Joint training of source counting, separation and ASR
Thilo von Neumann, Christoph Boeddeker, Lukas Drude, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Reinhold Haeb-Umbach
Comments: 5 pages, INTERSPEECH 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)