Audio and Speech Processing

Authors and titles for September 2019

Total of 113 entries
[1] arXiv:1909.00082 [pdf, other]
Title: Enhancements for Audio-only Diarization Systems
Dimitrios Dimitriadis
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[2] arXiv:1909.00521 [pdf, other]
Title: Modeling Long-Range Context for Concurrent Dialogue Acts Recognition
Yue Yu, Siyao Peng, Grace Hui Yang
Comments: Accepted to CIKM '19
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[3] arXiv:1909.01008 [pdf, other]
Title: The LOCATA Challenge: Acoustic Source Localization and Tracking
Christine Evers, Heinrich Loellmann, Heinrich Mellmann, Alexander Schmidt, Hendrik Barfuss, Patrick Naylor, Walter Kellermann
Comments: Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[4] arXiv:1909.01145 [pdf, other]
Title: Maximizing Mutual Information for Tacotron
Peng Liu, Xixin Wu, Shiyin Kang, Guangzhi Li, Dan Su, Dong Yu
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[5] arXiv:1909.02667 [pdf, other]
Title: Bandwidth Embeddings for Mixed-bandwidth Speech Recognition
Gautam Mantena, Ozlem Kalinli, Ossama Abdel-Hamid, Don McAllaster
Comments: A part of this work is accepted in Interspeech 2019
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG)
[6] arXiv:1909.02851 [pdf, other]
Title: Avaya Conversational Intelligence: A Real-Time System for Spoken Language Understanding in Human-Human Call Center Conversations
Jan Mizgajski, Adrian Szymczak, Robert Głowski, Piotr Szymański, Piotr Żelasko, Łukasz Augustyniak, Mikołaj Morzy, Yishay Carmiel, Jeff Hodson, Łukasz Wójciak, Daniel Smoczyk, Adam Wróbel, Bartosz Borowik, Adam Artajew, Marcin Baran, Cezary Kwiatkowski, Marzena Żyła-Hoppe
Comments: Accepted for Interspeech 2019
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[7] arXiv:1909.02859 [pdf, other]
Title: Receptive-field-regularized CNN variants for acoustic scene classification
Khaled Koutini, Hamid Eghbal-zadeh, Gerhard Widmer
Comments: Accepted at Detection and Classification of Acoustic Scenes and Events 2019 (DCASE Workshop 2019)
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[8] arXiv:1909.02869 [pdf, other]
Title: Exploiting Parallel Audio Recordings to Enforce Device Invariance in CNN-based Acoustic Scene Classification
Paul Primus, Hamid Eghbal-zadeh, David Eitelsebner, Khaled Koutini, Andreas Arzt, Gerhard Widmer
Comments: Published at the Workshop on Detection and Classification of Acoustic Scenes and Events, 25-26 October 2019, New York, USA
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[9] arXiv:1909.03965 [pdf, other]
Title: Evaluating Long-form Text-to-Speech: Comparing the Ratings of Sentences and Paragraphs
Rob Clark, Hanna Silen, Tom Kenter, Ralph Leith
Comments: Accepted for The 10th ISCA Speech Synthesis Workshop (SSW10), 6 pages
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[10] arXiv:1909.03974 [pdf, other]
Title: DNN-based cross-lingual voice conversion using Bottleneck Features
M Kiran Reddy, K Sreenivasa Rao
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[11] arXiv:1909.04157 [pdf, other]
Title: Self-Teaching Networks
Liang Lu, Eric Sun, Yifan Gong
Comments: 5 pages, Interspeech 2019
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[12] arXiv:1909.04301 [pdf, other]
Title: Frequency domain variant of Velvet noise and its application to acoustic measurements
Hideki Kawahara, Ken-Ichi Sakakibara, Mitsunori Mizumachi, Hideki Banno, Masanori Morise, Toshio Irino
Comments: 10 pages, 14 figures, APSIPA ASC 2019. arXiv admin note: text overlap with arXiv:1806.06812
Journal-ref: 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Lanzhou, China, 2019, pp. 1523-1532
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[13] arXiv:1909.04776 [pdf, other]
Title: Generative Speech Enhancement Based on Cloned Networks
Michael Chinen, W. Bastiaan Kleijn, Felicia S. C. Lim, Jan Skoglund
Comments: Accepted WASPAA 2019
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[14] arXiv:1909.05330 [pdf, other]
Title: Large-Scale Multilingual Speech Recognition with a Streaming End-to-End Model
Anjuli Kannan, Arindrima Datta, Tara N. Sainath, Eugene Weinstein, Bhuvana Ramabhadran, Yonghui Wu, Ankur Bapna, Zhifeng Chen, Seungji Lee
Comments: Accepted in Interspeech 2019
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[15] arXiv:1909.05639 [pdf, other]
Title: Quantifying and Correlating Rhythm Formants in Speech
Dafydd Gibbon, Peng Li
Comments: 6 pages, 7 figures, 2 tables, accepted: LPSS (Linguistic Properties of Spontaneous Speech, Taipei 2019)
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[16] arXiv:1909.05746 [pdf, other]
Title: Sams-Net: A Sliced Attention-based Neural Network for Music Source Separation
Tingle Li, Jiawei Chen, Haowen Hou, Ming Li
Comments: Submitted to Interspeech 2020
Subjects: Audio and Speech Processing (eess.AS); Information Retrieval (cs.IR); Machine Learning (cs.LG); Sound (cs.SD)
[17] arXiv:1909.05952 [pdf, other]
Title: End-to-End Neural Speaker Diarization with Permutation-Free Objectives
Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Kenji Nagamatsu, Shinji Watanabe
Comments: Accepted to INTERSPEECH 2019
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[18] arXiv:1909.06178 [pdf, other]
Title: Guided Learning Convolution System for DCASE 2019 Task 4
Liwei Lin, Xiangdong Wang, Hong Liu, Yueliang Qian
Comments: Accepted by DCASE2019 Workshop
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[19] arXiv:1909.06247 [pdf, other]
Title: End-to-End Neural Speaker Diarization with Self-attention
Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Yawen Xue, Kenji Nagamatsu, Shinji Watanabe
Comments: Accepted for ASRU 2019
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[20] arXiv:1909.06351 [pdf, other]
Title: Probing the Information Encoded in X-vectors
Desh Raj, David Snyder, Daniel Povey, Sanjeev Khudanpur
Comments: Accepted at IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) 2019
Journal-ref: IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) (2019): 726-733
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[21] arXiv:1909.06522 [pdf, other]
Title: Multilingual Graphemic Hybrid ASR with Massive Data Augmentation
Chunxi Liu, Qiaochu Zhang, Xiaohui Zhang, Kritika Singh, Yatharth Saraf, Geoffrey Zweig
Comments: Accepted for publication at the 1st Joint Workshop of SLTU (Spoken Language Technologies for Under-resourced languages) and CCURL (Collaboration and Computing for Under-Resourced Languages) (SLTU-CCURL 2020)
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[22] arXiv:1909.06532 [pdf, other]
Title: Bootstrapping non-parallel voice conversion from speaker-adaptive text-to-speech
Hieu-Thi Luong, Junichi Yamagishi
Comments: Accepted for IEEE ASRU 2019
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[23] arXiv:1909.06614 [pdf, other]
Title: Integrating Source-channel and Attention-based Sequence-to-sequence Models for Speech Recognition
Qiujia Li, Chao Zhang, Philip C. Woodland
Comments: To appear in Proc. ASRU2019, December 14-18, 2019, Sentosa, Singapore
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[24] arXiv:1909.06678 [pdf, other]
Title: An Investigation Into On-device Personalization of End-to-end Automatic Speech Recognition Models
Khe Chai Sim, Petr Zadrazil, Françoise Beaufays
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[25] arXiv:1909.06805 [pdf, other]
Title: Many-to-Many Voice Conversion using Cycle-Consistent Variational Autoencoder with Multiple Decoders
Keonnyeong Lee, In-Chul Yoo, Dongsuk Yook
Comments: 6 pages
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)
[26] arXiv:1909.07352 [pdf, other]
Title: Audio-Visual Speech Separation and Dereverberation with a Two-Stage Multimodal Network
Ke Tan, Yong Xu, Shi-Xiong Zhang, Meng Yu, Dong Yu
Comments: 13 pages, accepted by IEEE JSTSP Special Issue on Deep Learning for Multi-modal Intelligence across Speech, Language, Vision, and Heterogeneous Signals
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[27] arXiv:1909.07655 [pdf, other]
Title: Black-box Attacks on Automatic Speaker Verification using Feedback-controlled Voice Conversion
Xiaohai Tian, Rohan Kumar Das, Haizhou Li
Comments: 6 pages, 3 figures, This paper is submitted to ICASSP 2020
Subjects: Audio and Speech Processing (eess.AS)
[28] arXiv:1909.08315 [pdf, other]
Title: Bayesian Strategies for Likelihood Ratio Computation in Forensic Voice Comparison with Automatic Systems
Daniel Ramos, Juan Maroñas, Alicia Lozano-Diez
Subjects: Audio and Speech Processing (eess.AS); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[29] arXiv:1909.08500 [pdf, other]
Title: Emotion Filtering at the Edge
Ranya Aloufi, Hamed Haddadi, David Boyle
Comments: 6 pages, 6 figures, Sensys-ML19 workshop in conjunction with the 17th ACM Conference on Embedded Networked Sensor Systems (SenSys 2019)
Subjects: Audio and Speech Processing (eess.AS); Cryptography and Security (cs.CR); Human-Computer Interaction (cs.HC); Sound (cs.SD)
[30] arXiv:1909.08961 [pdf, other]
Title: Acoustic scene analysis with multi-head attention networks
Weimin Wang, Weiran Wang, Ming Sun, Chao Wang
Comments: 8 pages, 6 figures
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[31] arXiv:1909.09024 [pdf, other]
Title: WEnets: A Convolutional Framework for Evaluating Audio Waveforms
Andrew A. Catellier, Stephen D. Voran
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[32] arXiv:1909.09132 [pdf, other]
Title: Spoken Speech Enhancement using EEG
Gautam Krishna, Co Tran, Yan Han, Mason Carnahan, Ahmed H Tewfik
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[33] arXiv:1909.10200 [pdf, other]
Title: Automatic Lyrics Alignment and Transcription in Polyphonic Music: Does Background Music Help?
Chitralekha Gupta, Emre Yılmaz, Haizhou Li
Comments: Submitted to 45th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2020)
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[34] arXiv:1909.10302 [pdf, other]
Title: Sequence to Sequence Neural Speech Synthesis with Prosody Modification Capabilities
Slava Shechtman, Alex Sorin
Comments: Published at 10th ISCA Speech Synthesis Workshop (SSW-10, 2019)
Journal-ref: Proc. 10th ISCA Speech Synthesis Workshop, 275-280 (2019)
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[35] arXiv:1909.10924 [pdf, other]
Title: Understanding Semantics from Speech Through Pre-training
Pengwei Wang, Liangchen Wei, Yong Cao, Jinghui Xie, Yuji Cao, Zaiqing Nie
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG)
[36] arXiv:1909.11200 [pdf, other]
Title: Improving Noise Robustness In Speaker Identification Using A Two-Stage Attention Model
Yanpei Shi, Qiang Huang, Thomas Hain
Comments: Submitted to Interspeech2020
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Sound (cs.SD)
[37] arXiv:1909.11549 [pdf, other]
Title: MPEG-H Audio for Improving Accessibility in Broadcasting and Streaming
Christian Simon, Matteo Torcoli, Jouni Paulus
Comments: White Paper
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[38] arXiv:1909.11727 [pdf, other]
Title: Disentangling Speech and Non-Speech Components for Building Robust Acoustic Models from Found Data
Nishant Gurunath, Sai Krishna Rallabandi, Alan Black
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[39] arXiv:1909.11886 [pdf, other]
Title: Self-Adaptive Soft Voice Activity Detection using Deep Neural Networks for Robust Speaker Verification
Youngmoon Jung, Yeunju Choi, Hoirin Kim
Comments: Accepted at 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2019)
Journal-ref: Proc. of ASRU 2019, pp. 365-372
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[40] arXiv:1909.13037 [pdf, other]
Title: Self-Attention Transducers for End-to-End Speech Recognition
Zhengkun Tian, Jiangyan Yi, Jianhua Tao, Ye Bai, Zhengqi Wen
Journal-ref: Proc. Interspeech 2019, 4395-4399
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[41] arXiv:1909.13387 [pdf, other]
Title: FaSNet: Low-latency Adaptive Beamforming for Multi-microphone Audio Processing
Yi Luo, Enea Ceolini, Cong Han, Shih-Chii Liu, Nima Mesgarani
Comments: Accepted to ASRU 2019
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Signal Processing (eess.SP)
[42] arXiv:1909.13447 [pdf, other]
Title: DiPCo -- Dinner Party Corpus
Maarten Van Segbroeck, Ahmed Zaid, Ksenia Kutsenko, Cirenia Huerta, Tinh Nguyen, Xuewen Luo, Björn Hoffmeister, Jan Trmal, Maurizio Omologo, Roland Maas
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[43] arXiv:1909.13695 [pdf, other]
Title: Non-native Speaker Verification for Spoken Language Assessment
Linlin Wang, Yu Wang, Mark J. F. Gales
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[44] arXiv:1909.13759 [pdf, other]
Title: Acoustic Model Adaptation from Raw Waveforms with SincNet
Joachim Fainberg, Ondřej Klejch, Erfan Loweimi, Peter Bell, Steve Renals
Comments: Accepted to IEEE ASRU 2019
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[45] arXiv:1909.00935 (cross-list from cs.SD) [pdf, other]
Title: Voice Spoofing Detection Corpus for Single and Multi-order Audio Replays
Roland Baumann, Khalid Mahmood Malik, Ali Javed, Andersen Ball, Brandon Kujawa, Hafiz Malik
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[46] arXiv:1909.01019 (cross-list from cs.SD) [pdf, other]
Title: On Loss Functions for Supervised Monaural Time-Domain Speech Enhancement
Morten Kolbæk, Zheng-Hua Tan, Søren Holdt Jensen, Jesper Jensen
Comments: Published in the IEEE Transactions on Audio, Speech and Language Processing
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[47] arXiv:1909.01067 (cross-list from cs.LG) [pdf, other]
Title: Multimodal Deep Learning for Mental Disorders Prediction from Audio Speech Samples
Habibeh Naderi, Behrouz Haji Soleimani, Stan Matwin
Comments: arXiv admin note: text overlap with arXiv:1811.09362 by other authors
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[48] arXiv:1909.01167 (cross-list from cs.HC) [pdf, other]
Title: Feasibility of Using Automatic Speech Recognition with Voices of Deaf and Hard-of-Hearing Individuals
Abraham Glasser, Kesavan Kushalnagar, Raja Kushalnagar
Comments: 2 pages, 3 figures
Subjects: Human-Computer Interaction (cs.HC); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[49] arXiv:1909.01174 (cross-list from cs.SD) [pdf, other]
Title: Demucs: Deep Extractor for Music Sources with extra unlabeled data remixed
Alexandre Défossez (SIERRA, PSL, FAIR), Nicolas Usunier (FAIR), Léon Bottou (FAIR), Francis Bach (PSL, DI-ENS, SIERRA)
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[50] arXiv:1909.01218 (cross-list from cs.CV) [pdf, other]
Title: Translating Visual Art into Music
Maximilian Müller-Eberstein, Nanne van Noord
Comments: Accepted for ICCV 2019 Workshop on Fashion, Art and Design
Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[51] arXiv:1909.01265 (cross-list from cs.SD) [pdf, other]
Title: Multiresolution analysis (discrete wavelet transform) through Daubechies family for emotion recognition in speech
Damian Campo, Manuela Bastidas, Olga Lucía Quintero
Comments: Published in: XX Congreso Argentino de Bioingeniería (SABI 2015), October 28-30, 2015
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Applications (stat.AP)
[52] arXiv:1909.01302 (cross-list from cs.SD) [pdf, other]
Title: An efficient and perceptually motivated auditory neural encoding and decoding algorithm for spiking neural networks
Zihan Pan, Yansong Chua, Jibin Wu, Malu Zhang, Haizhou Li, Eliathamby Ambikairajah
Subjects: Sound (cs.SD); Neural and Evolutionary Computing (cs.NE); Audio and Speech Processing (eess.AS)
[53] arXiv:1909.01417 (cross-list from cs.CV) [pdf, other]
Title: Multi-level Attention network using text, audio and video for Depression Prediction
Anupama Ray, Siddharth Kumar, Rutvik Reddy, Prerana Mukherjee, Ritu Garg
Comments: in Proceedings of the 9th International Workshop on Audio/Visual Emotion Challenge, AVEC 2019, ACM Multimedia Workshop, Nice, France
Subjects: Computer Vision and Pattern Recognition (cs.CV); Audio and Speech Processing (eess.AS)
[54] arXiv:1909.01622 (cross-list from cs.SD) [pdf, other]
Title: Towards Interpretable Polyphonic Transcription with Invertible Neural Networks
Rainer Kelz, Gerhard Widmer
Comments: Published at the 20th International Society for Music Information Retrieval Conference, Delft, The Netherlands, 2019
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[55] arXiv:1909.01700 (cross-list from cs.CL) [pdf, other]
Title: DurIAN: Duration Informed Attention Network For Multimodal Synthesis
Chengzhu Yu, Heng Lu, Na Hu, Meng Yu, Chao Weng, Kun Xu, Peng Liu, Deyi Tuo, Shiyin Kang, Guangzhi Lei, Dan Su, Dong Yu
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[56] arXiv:1909.01904 (cross-list from cs.CR) [pdf, other]
Title: VoIPLoc: Passive VoIP call provenance via acoustic side-channels
Shishir Nagaraja, Ryan Shah
Comments: 12 pages, 8 figures, 5 tables
Subjects: Cryptography and Security (cs.CR); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[57] arXiv:1909.02764 (cross-list from cs.CL) [pdf, other]
Title: Towards Multimodal Emotion Recognition in German Speech Events in Cars using Transfer Learning
Deniz Cevher, Sebastian Zepf, Roman Klinger
Comments: 12 pages, 2 figures, accepted at KONVENS 2019
Subjects: Computation and Language (cs.CL); Human-Computer Interaction (cs.HC); Audio and Speech Processing (eess.AS)
[58] arXiv:1909.02853 (cross-list from cs.HC) [pdf, other]
Title: Automatic Speech Recognition Services: Deaf and Hard-of-Hearing Usability
Abraham Glasser
Comments: 6 pages, 4 figures
Subjects: Human-Computer Interaction (cs.HC); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[59] arXiv:1909.02924 (cross-list from cs.HC) [pdf, other]
Title: An Edge Computing Robot Experience for Automatic Elderly Mental Health Care Based on Voice
C. Yvanoff-Frenchin, V. Ramos, T. Belabed, C. Valderrama
Subjects: Human-Computer Interaction (cs.HC); Audio and Speech Processing (eess.AS)
[60] arXiv:1909.03030 (cross-list from cs.SD) [pdf, other]
Title: Neural Network-Based Modeling of Phonetic Durations
Xizi Wei, Melvyn Hunt, Adrian Skilling
Comments: 5 pages, 5 figures
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[61] arXiv:1909.03377 (cross-list from eess.SY) [pdf, other]
Title: Ultra-broadband local active noise control with remote acoustic sensing
Tong Xiao, Xiaojun Qiu, Benjamin Halkon
Journal-ref: Sci. Rep. 10 (2020)
Subjects: Systems and Control (eess.SY); Sound (cs.SD); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[62] arXiv:1909.03434 (cross-list from cs.LG) [pdf, other]
Title: Order-free Learning Alleviating Exposure Bias in Multi-label Classification
Che-Ping Tsai, Hung-Yi Lee
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[63] arXiv:1909.03522 (cross-list from cs.LG) [pdf, other]
Title: MIDI-Sandwich2: RNN-based Hierarchical Multi-modal Fusion Generation VAE networks for multi-track symbolic music generation
Xia Liang, Junmin Wu, Jing Cao
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[64] arXiv:1909.03642 (cross-list from cs.SD) [pdf, other]
Title: Impulse Response Data Augmentation and Deep Neural Networks for Blind Room Acoustic Parameter Estimation
Nicholas J. Bryan
Comments: Under Review
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[65] arXiv:1909.03650 (cross-list from cs.SD) [pdf, other]
Title: Real-time and interactive tools for vocal training based on an analytic signal with a cosine series envelope
Hideki Kawahara, Ken-Ichi Sakakibara, Eri Haneishi, Kaori Hagiwara
Comments: 4 pages, 6 figures, APSIPA ASC 2019
Journal-ref: 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Lanzhou, China, 2019, pp. 907-910
Subjects: Sound (cs.SD); Human-Computer Interaction (cs.HC); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[66] arXiv:1909.04198 (cross-list from cs.CR) [pdf, other]
Title: Preech: A System for Privacy-Preserving Speech Transcription
Shimaa Ahmed, Amrita Roy Chowdhury, Kassem Fawaz, Parmesh Ramanathan
Comments: 21 pages, 8 figures, 5 tables. The paper is accepted at the 29th USENIX Security Symposium
Subjects: Cryptography and Security (cs.CR); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[67] arXiv:1909.04425 (cross-list from cs.SD) [pdf, other]
Title: Automatic detection of estuarine dolphin whistles in spectrogram images
O. M. Serra, F. P. R. Martins, L. R. Padovese
Comments: 10 pages; 18 figures
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[68] arXiv:1909.05030 (cross-list from cs.SD) [pdf, other]
Title: Computer Assisted Composition in Continuous Time
Chamin Hewa Koneputugodage, Rhys Healy, Sean Lamont, Ian Mallett, Matt Brown, Matt Walters, Ushini Attanayake, Libo Zhang, Roger T. Dean, Alexander Hunter, Charles Gretton, Christian Walder
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[69] arXiv:1909.05645 (cross-list from cs.CL) [pdf, other]
Title: Learning Alignment for Multimodal Emotion Recognition from Speech
Haiyang Xu, Hui Zhang, Kun Han, Yun Wang, Yiping Peng, Xiangang Li
Comments: InterSpeech 2019
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[70] arXiv:1909.05882 (cross-list from cs.SD) [pdf, other]
Title: The emotions that we perceive in music: the influence of language and lyrics comprehension on agreement
Juan Sebastián Gómez Cañón, Perfecto Herrera, Emilia Gómez, Estefanía Cano
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[71] arXiv:1909.06317 (cross-list from cs.CL) [pdf, other]
Title: A Comparative Study on Transformer vs RNN in Speech Applications
Shigeki Karita, Nanxin Chen, Tomoki Hayashi, Takaaki Hori, Hirofumi Inaguma, Ziyan Jiang, Masao Someki, Nelson Enrique Yalta Soplin, Ryuichi Yamamoto, Xiaofei Wang, Shinji Watanabe, Takenori Yoshimura, Wangyou Zhang
Comments: Accepted at ASRU 2019
Journal-ref: IEEE Automatic Speech Recognition and Understanding Workshop 2019
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[72] arXiv:1909.06515 (cross-list from cs.CL) [pdf, other]
Title: Harnessing Indirect Training Data for End-to-End Automatic Speech Translation: Tricks of the Trade
Juan Pino, Liezl Puzon, Jiatao Gu, Xutai Ma, Arya D. McCarthy, Deepak Gopinath
Comments: IWSLT 2019
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[73] arXiv:1909.06654 (cross-list from cs.SD) [pdf, other]
Title: musicnn: Pre-trained convolutional neural networks for music audio tagging
Jordi Pons, Xavier Serra
Comments: Accepted to be presented at the Late-Breaking/Demo session of ISMIR 2019
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[74] arXiv:1909.07147 (cross-list from eess.IV) [pdf, other]
Title: Alternative Visual Units for an Optimized Phoneme-Based Lipreading System
Helen Bear, Richard Harvey
Comments: Accepted and published in Applied Sciences, 22 pages plus appendices and references
Journal-ref: Applied Sciences 2019, 9(18), 3870
Subjects: Image and Video Processing (eess.IV); Audio and Speech Processing (eess.AS)
[75] arXiv:1909.07208 (cross-list from cs.HC) [pdf, other]
Title: MFCC-based Recurrent Neural Network for Automatic Clinical Depression Recognition and Assessment from Speech
Emna Rejaibi, Ali Komaty, Fabrice Meriaudeau, Said Agrebi, Alice Othmani
Comments: 14 pages, 7 figures, 9 tables
Subjects: Human-Computer Interaction (cs.HC); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[76] arXiv:1909.07526 (cross-list from cs.CV) [pdf, other]
Title: Data-Efficient Classification of Birdcall Through Convolutional Neural Networks Transfer Learning
Dina B. Efremova, Mangalam Sankupellay, Dmitry A. Konovalov
Comments: Accepted for IEEE Digital Image Computing: Techniques and Applications, 2019 (DICTA 2019), 2-4 December 2019 in Perth, Australia
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
[77] arXiv:1909.07575 (cross-list from cs.CL) [pdf, other]
Title: Bridging the Gap between Pre-Training and Fine-Tuning for End-to-End Speech Translation
Chengyi Wang, Yu Wu, Shujie Liu, Zhenglu Yang, Ming Zhou
Comments: AAAI2020
Subjects: Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[78] arXiv:1909.08050 (cross-list from cs.SD) [pdf, other]
Title: A scalable noisy speech dataset and online subjective test framework
Chandan K. A. Reddy, Ebrahim Beyrami, Jamie Pool, Ross Cutler, Sriram Srinivasan, Johannes Gehrke
Comments: InterSpeech 2019
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[79] arXiv:1909.08103 (cross-list from cs.CL) [pdf, other]
Title: Simultaneous Speech Recognition and Speaker Diarization for Monaural Dialogue Recordings with Target-Speaker Acoustic Models
Naoyuki Kanda, Shota Horiguchi, Yusuke Fujita, Yawen Xue, Kenji Nagamatsu, Shinji Watanabe
Comments: Accepted to ASRU 2019
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[80] arXiv:1909.08444 (cross-list from cs.SD) [pdf, other]
Title: Musical Instrument Classification via Low-Dimensional Feature Vectors
Zishuo Zhao, Haoyun Wang
Comments: low quality as is my undergraduate work
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[81] arXiv:1909.08494 (cross-list from cs.SD) [pdf, other]
Title: Cutting Music Source Separation Some Slakh: A Dataset to Study the Impact of Training Data Quality and Quantity
Ethan Manilow, Gordon Wichern, Prem Seetharaman, Jonathan Le Roux
Comments: Accepted for publication at WASPAA 2019
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[82] arXiv:1909.08685 (cross-list from cs.CV) [pdf, other]
Title: Deep Latent Space Learning for Cross-modal Mapping of Audio and Visual Signals
Shah Nawaz, Muhammad Kamran Janjua, Ignazio Gallo, Arif Mahmood, Alessandro Calefati
Comments: Accepted to DICTA 2019
Subjects: Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[83] arXiv:1909.08723 (cross-list from cs.CL) [pdf, other]
Title: Espresso: A Fast End-to-end Neural Speech Recognition Toolkit
Yiming Wang, Tongfei Chen, Hainan Xu, Shuoyang Ding, Hang Lv, Yiwen Shao, Nanyun Peng, Lei Xie, Shinji Watanabe, Sanjeev Khudanpur
Comments: Accepted to ASRU 2019
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[84] arXiv:1909.08782 (cross-list from cs.CV) [pdf, other]
Title: Large-scale representation learning from visually grounded untranscribed speech
Gabriel Ilharco, Yuan Zhang, Jason Baldridge
Journal-ref: The SIGNLL Conference on Computational Natural Language Learning (CoNLL), 2019
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[85] arXiv:1909.09116 (cross-list from cs.CL) [pdf, other]
Title: Self-Training for End-to-End Speech Recognition
Jacob Kahn, Ann Lee, Awni Hannun
Comments: To be published in the 45th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2020
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[86] arXiv:1909.09235 (cross-list from cs.SD) [pdf, other]
Title: On the Impact of Ground Sound
Ante Qu, Doug L. James
Comments: 8 pages, 11 figures. In Proceedings of the 22nd International Conference on Digital Audio Effects (DAFx-19), Birmingham, UK, September 2-6, 2019. Audio examples can be downloaded publicly at this http URL
Subjects: Sound (cs.SD); Computational Engineering, Finance, and Science (cs.CE); Audio and Speech Processing (eess.AS)
[87] arXiv:1909.09347 (cross-list from cs.SD) [pdf, other]
Title: MIMII Dataset: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection
Harsh Purohit, Ryo Tanabe, Kenji Ichige, Takashi Endo, Yuki Nikaido, Kaori Suefusa, Yohei Kawaguchi
Comments: 5 pages, to appear in DCASE 2019 Workshop
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[88] arXiv:1909.09577 (cross-list from cs.LG) [pdf, other]
Title: NeMo: a toolkit for building AI applications using Neural Modules
Oleksii Kuchaiev, Jason Li, Huyen Nguyen, Oleksii Hrinchuk, Ryan Leary, Boris Ginsburg, Samuel Kriman, Stanislav Beliaev, Vitaly Lavrukhin, Jack Cook, Patrice Castonguay, Mariya Popova, Jocelyn Huang, Jonathan M. Cohen
Comments: 6 pages plus references
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[89] arXiv:1909.09585 (cross-list from cs.SD) [pdf, other]
Title: An extended two-dimensional vocal tract model for fast acoustic simulation of single-axis symmetric three-dimensional tubes
Debasish Ray Mohapatra, Victor Zappi, Sidney Fels
Comments: 5 pages, 2 figures, Interspeech 2019 submission
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[90] arXiv:1909.10407 (cross-list from cs.SD) [pdf, other]
Title: CochleaNet: A Robust Language-independent Audio-Visual Model for Speech Enhancement
Mandar Gogate, Kia Dashtipour, Ahsan Adeel, Amir Hussain
Comments: 34 pages, 11 figures, Submitted to Information Fusion
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[91] arXiv:1909.10861 (cross-list from cs.CL) [pdf, other]
Title: Learning ASR-Robust Contextualized Embeddings for Spoken Language Understanding
Chao-Wei Huang, Yun-Nung Chen
Comments: ICASSP 2020
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[92] arXiv:1909.11391 (cross-list from cs.SD) [pdf, other]
Title: HumanGAN: generative adversarial network with human-based discriminator and its evaluation in speech perception modeling
Kazuki Fujii, Yuki Saito, Shinnosuke Takamichi, Yukino Baba, Hiroshi Saruwatari
Comments: Submitted to IEEE ICASSP 2020
Subjects: Sound (cs.SD); Neural and Evolutionary Computing (cs.NE); Audio and Speech Processing (eess.AS)
[93] arXiv:1909.11430 (cross-list from cs.CL) [pdf, other]
Title: Breaking the Data Barrier: Towards Robust Speech Translation via Adversarial Stability Training
Qiao Cheng, Meiyuan Fang, Yaqian Han, Jin Huang, Yitao Duan
Comments: Accepted at the 16th International Workshop on Spoken Language Translation (IWSLT 2019)
Subjects: Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[94] arXiv:1909.11646 (cross-list from cs.SD) [pdf, other]
Title: High Fidelity Speech Synthesis with Adversarial Networks
Mikołaj Bińkowski, Jeff Donahue, Sander Dieleman, Aidan Clark, Erich Elsen, Norman Casagrande, Luis C. Cobo, Karen Simonyan
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[95] arXiv:1909.11699 (cross-list from cs.CL) [pdf, other]
Title: Speech Recognition with Augmented Synthesized Speech
Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Ye Jia, Pedro Moreno, Yonghui Wu, Zelin Wu
Comments: Accepted for publication at ASRU 2020
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[96] arXiv:1909.11909 (cross-list from cs.SD) [pdf, other]
Title: Multichannel Speech Enhancement by Raw Waveform-mapping using Fully Convolutional Networks
Chang-Le Liu, Sze-Wei Fu, You-Jin Li, Jen-Wei Huang, Hsin-Min Wang, Yu Tsao
Comments: Accepted to IEEE/ACM Transactions on Audio, Speech and Language Processing
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[97] arXiv:1909.11912 (cross-list from cs.SD) [pdf, other]
Title: Improving the Intelligibility of Electric and Acoustic Stimulation Speech Using Fully Convolutional Networks Based Speech Enhancement
Natalie Yu-Hsien Wang, Hsiao-Lan Sharon Wang, Tao-Wei Wang, Szu-Wei Fu, Xugan Lu, Yu Tsao, Hsin-Min Wang
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[98] arXiv:1909.11919 (cross-list from cs.SD) [pdf, other]
Title: A Study of Joint Effect on Denoising Techniques and Visual Cues to Improve Speech Intelligibility in Cochlear Implant Simulation
Rung-Yu Tseng, Tao-Wei Wang, Szu-Wei Fu, Chia-Ying Lee, Yu Tsao
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[99] arXiv:1909.12208 (cross-list from cs.CL) [pdf, other]
Title: An Investigation into the Effectiveness of Enhancement in ASR Training and Test for CHiME-5 Dinner Party Transcription
Catalin Zorila, Christoph Boeddeker, Rama Doddipatla, Reinhold Haeb-Umbach
Comments: Accepted for ASRU 2019
Subjects: Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[100] arXiv:1909.12232 (cross-list from cs.CL) [pdf, other]
Title: A Comparison of Hybrid and End-to-End Models for Syllable Recognition
Sebastian P. Bayerl, Korbinian Riedhammer
Comments: 22nd International Conference on Text, Speech and Dialogue, TSD 2019
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[101] arXiv:1909.12289 (cross-list from cs.LG) [pdf, other]
Title: Attention Forcing for Sequence-to-sequence Model Training
Qingyun Dou, Yiting Lu, Joshua Efiong, Mark J. F. Gales
Comments: 11 pages, 4 figures, conference
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[102] arXiv:1909.12408 (cross-list from cs.CL) [pdf, other]
Title: Optimizing Speech Recognition For The Edge
Yuan Shangguan, Jian Li, Qiao Liang, Raziel Alvarez, Ian McGraw
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[103] arXiv:1909.12415 (cross-list from cs.CL) [pdf, other]
Title: Improving RNN Transducer Modeling for End-to-End Speech Recognition
Jinyu Li, Rui Zhao, Hu Hu, Yifan Gong
Comments: Accepted by IEEE ASRU workshop, 2019
Subjects: Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[104] arXiv:1909.12681 (cross-list from cs.CL) [pdf, other]
Title: End-to-End Code-Switching ASR for Low-Resourced Language Pairs
Xianghu Yue, Grandee Lee, Emre Yılmaz, Fang Deng, Haizhou Li
Comments: Accepted for publication at IEEE ASRU Workshop 2019
Subjects: Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[105] arXiv:1909.12699 (cross-list from cs.SD) [pdf, other]
Title: Urban Sound Tagging using Convolutional Neural Networks
Sainath Adapa
Comments: 5 pages
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[106] arXiv:1909.12780 (cross-list from cs.CV) [pdf, other]
Title: Learning to Have an Ear for Face Super-Resolution
Givi Meishvili, Simon Jenni, Paolo Favaro
Subjects: Computer Vision and Pattern Recognition (cs.CV); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
[107] arXiv:1909.13070 (cross-list from cs.SD) [pdf, other]
Title: Emirati-Accented Speaker Identification in Stressful Talking Conditions
Ismail Shahin, Ali Bou Nassif
Comments: 6 pages, this work has been accepted in The International Conference on Electrical and Computing Technologies and Applications, 2019 (ICECTA 2019)
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[108] arXiv:1909.13244 (cross-list from cs.SD) [pdf, other]
Title: Speaker Verification in Emotional Talking Environments based on Third-Order Circular Suprasegmental Hidden Markov Model
Ismail Shahin, Ali Bou Nassif
Comments: 6 pages, accepted in The International Conference on Electrical and Computing Technologies and Applications, 2019 (ICECTA 2019). arXiv admin note: text overlap with arXiv:1903.09803
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[109] arXiv:1909.13287 (cross-list from cs.MM) [pdf, other]
Title: MG-VAE: Deep Chinese Folk Songs Generation with Specific Regional Style
Jing Luo, Xinyu Yang, Shulei Ji, Juan Li
Comments: Accepted by the 7th Conference on Sound and Music Technology, 2019, Harbin, China
Subjects: Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[110] arXiv:1909.13332 (cross-list from cs.CL) [pdf, other]
Title: Recent Advances in End-to-End Spoken Language Understanding
Natalia Tomashenko, Antoine Caubriere, Yannick Esteve, Antoine Laurent, Emmanuel Morin
Journal-ref: Statistical Language and Speech Processing. SLSP 2019
Subjects: Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[111] arXiv:1909.13537 (cross-list from cs.CL) [pdf, other]
Title: Embeddings for DNN speaker adaptive training
Joanna Rownicka, Peter Bell, Steve Renals
Comments: Accepted at ASRU 2019
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[112] arXiv:1909.13775 (cross-list from cs.HC) [pdf, other]
Title: Ephemeral instruments
Vincent Goudard (SU)
Comments: New Interfaces for Musical Expression, Jun 2019, Porto-Alegre, Brazil
Subjects: Human-Computer Interaction (cs.HC); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[113] arXiv:1909.13790 (cross-list from cs.CL) [pdf, other]
Title: Incremental processing of noisy user utterances in the spoken language understanding task
Stefan Constantin, Jan Niehues, Alex Waibel
Comments: 10 pages, 3 figures, 7 tables, forthcoming in W-NUT 2019
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)