Audio and Speech Processing

Authors and titles for March 2020

Total of 117 entries : 1-25 26-50 51-75 76-100 101-117

[51] arXiv:2003.07544 [pdf, other]: Title: Deep Attention Fusion Feature for Speech Separation with End-to-End Post-filter Method

Cunhang Fan, Jianhua Tao, Bin Liu, Jiangyan Yi, Zhengqi Wen, Xuefei Liu

Comments: ACCEPTED by IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP)

Subjects: Audio and Speech Processing (eess.AS); Multimedia (cs.MM); Sound (cs.SD)
[52] arXiv:2003.07688 [pdf, other]: Title: End-to-end Recurrent Denoising Autoencoder Embeddings for Speaker Identification

Esther Rituerto-González, Carmen Peláez-Moreno

Comments: Published on Monday 10th of May 2021 in Neural Computing and Applications, Springer

Journal-ref: Online, Neural Comput & Applic (2021), pp. 1-11

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[53] arXiv:2003.07692 [pdf, other]: Title: ASR Error Correction and Domain Adaptation Using Machine Translation

Anirudh Mani, Shruti Palaskar, Nimshi Venkat Meripo, Sandeep Konam, Florian Metze

Comments: Accepted for Oral Presentation at ICASSP 2020

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[54] arXiv:2003.07704 [pdf, other]: Title: Audio inpainting with generative adversarial network

P. P. Ebner, A. Eltelt

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[55] arXiv:2003.07705 [pdf, other]: Title: Hybrid Autoregressive Transducer (hat)

Ehsan Variani, David Rybach, Cyril Allauzen, Michael Riley

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[56] arXiv:2003.07962 [pdf, other]: Title: Deliberation Model Based Two-Pass End-to-End Speech Recognition

Ke Hu, Tara N. Sainath, Ruoming Pang, Rohit Prabhavalkar

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[57] arXiv:2003.08954 [pdf, other]: Title: Voice and accompaniment separation in music using self-attention convolutional neural network

Yuzhou Liu (1), Balaji Thoshkahna (2), Ali Milani (3), Trausti Kristjansson (3) ((1) Ohio State University (2) Amazon Music, Bangalore (3) Amazon Lab126, CA)

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[58] arXiv:2003.09125 [pdf, other]: Title: Improving Embedding Extraction for Speaker Verification with Ladder Network

Fei Tao, Gokhan Tur

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG)
[59] arXiv:2003.09164 [pdf, other]: Title: Acoustic Scene Classification using Audio Tagging

Jee-weon Jung, Hye-jin Shim, Ju-ho Kim, Seung-bin Kim, Ha-Jin Yu

Comments: 5 pages, 2 figures, 6 tables, submitted to Interspeech 2020 as a conference paper

Subjects: Audio and Speech Processing (eess.AS)
[60] arXiv:2003.09180 [pdf, other]: Title: Detecting Mismatch between Text Script and Voice-over Using Utterance Verification Based on Phoneme Recognition Ranking

Yoonjae Jeong, Hoon-Young Cho

Comments: Accepted by ICASSP 2020

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)
[61] arXiv:2003.09542 [pdf, other]: Title: Deep Generative Variational Autoencoding for Replay Spoof Detection in Automatic Speaker Verification

Bhusan Chettri, Tomi Kinnunen, Emmanouil Benetos

Comments: Accepted to Computer Speech and Language Special issue on Advances in Automatic Speaker Verification Anti-spoofing, 2020

Subjects: Audio and Speech Processing (eess.AS)
[62] arXiv:2003.09889 [pdf, other]: Title: Audio Impairment Recognition Using a Correlation-Based Feature Representation

Alessandro Ragano, Emmanouil Benetos, Andrew Hines

Comments: This publication has been accepted in 2020 Twelfth International Conference on Quality of Multimedia Experience (QoMEX)

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG)
[63] arXiv:2003.09891 [pdf, other]: Title: Low Latency ASR for Simultaneous Speech Translation

Thai Son Nguyen, Jan Niehues, Eunah Cho, Thanh-Le Ha, Kevin Kilgour, Markus Muller, Matthias Sperber, Sebastian Stueker, Alex Waibel

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[64] arXiv:2003.10022 [pdf, other]: Title: High Performance Sequence-to-Sequence Model for Streaming Speech Recognition

Thai-Son Nguyen, Ngoc-Quan Pham, Sebastian Stueker, Alex Waibel

Comments: To appear in Interspeech 2020

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[65] arXiv:2003.10183 [pdf, other]: Title: Dialect Identification of Spoken North Sámi Language Varieties Using Prosodic Features

Sofoklis Kakouros, Katri Hiovain, Martti Vainio, Juraj Šimko

Subjects: Audio and Speech Processing (eess.AS)
[66] arXiv:2003.10369 [pdf, other]: Title: Low Latency End-to-End Streaming Speech Recognition with a Scout Network

Chengyi Wang, Yu Wu, Shujie Liu, Jinyu Li, Liang Lu, Guoli Ye, Ming Zhou

Subjects: Audio and Speech Processing (eess.AS)
[67] arXiv:2003.10724 [pdf, other]: Title: Evaluation of Error and Correlation-Based Loss Functions For Multitask Learning Dimensional Speech Emotion Recognition

Bagus Tris Atmaja, Masato Akagi

Comments: 3 figures, 3 tables, submitted to ANV 2020

Subjects: Audio and Speech Processing (eess.AS)
[68] arXiv:2003.11750 [pdf, other]: Title: Non-parallel Voice Conversion System with WaveNet Vocoder and Collapsed Speech Suppression

Yi-Chiao Wu, Patrick Lumban Tobing, Kazuhiro Kobayashi, Tomoki Hayashi, Tomoki Toda

Comments: 13 pages, 13 figures, 1 table, accepted to publish in IEEE Access

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[69] arXiv:2003.11882 [pdf, other]: Title: Speech Quality Factors for Traditional and Neural-Based Low Bit Rate Vocoders

Wissam A. Jassim, Jan Skoglund, Michael Chinen, Andrew Hines

Comments: 6 pages, 11 figures, conference

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[70] arXiv:2003.11982 [pdf, other]: Title: In defence of metric learning for speaker recognition

Joon Son Chung, Jaesung Huh, Seongkyu Mun, Minjae Lee, Hee Soo Heo, Soyeon Choe, Chiheon Ham, Sunghwan Jung, Bong-Jin Lee, Icksang Han

Comments: The code can be found at this https URL

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[71] arXiv:2003.12108 [pdf, other]: Title: A Review of Multi-Objective Deep Learning Speech Denoising Methods

Arian Azarang, Nasser Kehtarnavaz

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[72] arXiv:2003.12266 [pdf, other]: Title: Dual Attention in Time and Frequency Domain for Voice Activity Detection

Joohyung Lee, Youngmoon Jung, Hoirin Kim

Comments: Accepted to Interspeech 2020

Subjects: Audio and Speech Processing (eess.AS)
[73] arXiv:2003.12326 [pdf, other]: Title: Separating Varying Numbers of Sources with Auxiliary Autoencoding Loss

Yi Luo, Nima Mesgarani

Comments: Interspeech 2020

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[74] arXiv:2003.12362 [pdf, other]: Title: Can you hear me $\textit{now}$? Sensitive comparisons of human and machine perception

Michael A Lepori, Chaz Firestone

Comments: 24 pages; 4 figures

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[75] arXiv:2003.12366 [pdf, html, other]: Title: Training for Speech Recognition on Coprocessors

Sebastian Baunsgaard, Sebastian B. Wrede, Pınar Tozun

Comments: published at ADMS 2020

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
