Audio and Speech Processing

Authors and titles for March 2020

Total of 117 entries : 1-25 26-50 51-75 76-100 101-117
[26] arXiv:2003.03135 [pdf, other]
Title: Semi-supervised Development of ASR Systems for Multilingual Code-switched Speech in Under-resourced Languages
Astik Biswas, Emre Yılmaz, Febe de Wet, Ewald van der Westhuizen, Thomas Niesler
Comments: Conference
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[27] arXiv:2003.03375 [pdf, other]
Title: Multi-Time-Scale Convolution for Emotion Recognition from Speech Audio Signals
Eric Guizzo, Tillman Weyde, Jack Barnett Leveson
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[28] arXiv:2003.03432 [pdf, other]
Title: Lightweight Speaker Verification for Online Identification of New Speakers with Short Segments
Ivette Velez, Caleb Rascon, Gibran Fuentes-Pineda
Comments: This paper has been accepted for publication in Applied Soft Computing Journal
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[29] arXiv:2003.03927 [pdf, other]
Title: Enhancing End-to-End Multi-channel Speech Separation via Spatial Feature Learning
Rongzhi Gu, Shi-Xiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Yuexian Zou, Dong Yu
Comments: accepted in ICASSP 2020
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[30] arXiv:2003.03987 [pdf, other]
Title: Tackling real noisy reverberant meetings with all-neural source separation, counting, and diarization system
Keisuke Kinoshita, Marc Delcroix, Shoko Araki, Tomohiro Nakatani
Comments: 8 pages, to appear in ICASSP2020
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[31] arXiv:2003.03998 [pdf, other]
Title: Improving noise robust automatic speech recognition with single-channel time-domain enhancement network
Keisuke Kinoshita, Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani
Comments: 5 pages, to appear in ICASSP2020
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[32] arXiv:2003.04194 [pdf, other]
Title: Toward Cross-Domain Speech Recognition with End-to-End Models
Thai-Son Nguyen, Sebastian Stüker, Alex Waibel
Comments: Presented in Life-Long Learning for Spoken Language Systems Workshop - ASRU 2019
Subjects: Audio and Speech Processing (eess.AS); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Sound (cs.SD)
[33] arXiv:2003.04241 [pdf, other]
Title: Deep Neural Networks for Automatic Speech Processing: A Survey from Large Corpora to Limited Data
Vincent Roger, Jérôme Farinas, Julien Pinquier
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[34] arXiv:2003.04640 [pdf, other]
Title: Vowels and Prosody Contribution in Neural Network Based Voice Conversion Algorithm with Noisy Training Data
Olaide Agbolade
Comments: 5 pages
Journal-ref: European Journal of Engineering Research and Science, 5(3), pp.229-233 (2020)
Subjects: Audio and Speech Processing (eess.AS)
[35] arXiv:2003.04710 [pdf, other]
Title: Development of Automatic Speech Recognition for Kazakh Language using Transfer Learning
Amirgaliyev E.N., Kuanyshbay D.N., Baimuratov O
Comments: 9 pages, 3 fig., 1 table
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[36] arXiv:2003.04733 [pdf, other]
Title: Speaker Identification using EEG
Gautam Krishna, Co Tran, Mason Carnahan, Ahmed Tewfik
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Signal Processing (eess.SP); Machine Learning (stat.ML)
[37] arXiv:2003.05184 [pdf, other]
Title: Voice conversion using coefficient mapping and neural network
Olaide Ayodeji Agbolade, Samson A. Oyetunji
Comments: 5 pages
Journal-ref: In 2016 International Conference for Students on Applied Engineering (ICSAE) (pp. 479-483) IEEE
Subjects: Audio and Speech Processing (eess.AS)
[38] arXiv:2003.05223 [pdf, other]
Title: Robust Audio Watermarking Using Graph-based Transform and Singular Value Decomposition
Majid Farzaneh, Rahil Mahdian Toroghi
Comments: 5 pages, 5 images, 4 tables, submitted to an IRED conference
Subjects: Audio and Speech Processing (eess.AS)
[39] arXiv:2003.05897 [pdf, other]
Title: Bringing in the outliers: A sparse subspace clustering approach to learn a dictionary of mouse ultrasonic vocalizations
Jiaxi Wang, Karel Mundnich, Allison T. Knoll, Pat Levitt, Shrikanth Narayanan
Comments: 5 pages, 4 figures, conference paper, accepted in ICASSP 2020
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[40] arXiv:2003.06182 [pdf, other]
Title: A Wide Dataset of Ear Shapes and Pinna-Related Transfer Functions Generated by Random Ear Drawings
Corentin Guezenoc (IETR), Renaud Seguier (IETR)
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Classical Physics (physics.class-ph)
[41] arXiv:2003.06183 [pdf, other]
Title: HRTF Individualization: A Survey
Corentin Guezenoc (IETR), Renaud Seguier (IETR)
Comments: Audio Engineering Society Convention 145, Audio Engineering Society, Oct 2018, New York, United States
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[42] arXiv:2003.06226 [pdf, other]
Title: Quantifying Musical Style: Ranking Symbolic Music based on Similarity to a Style
Jeff Ens, Philippe Pasquier
Journal-ref: In Proceedings of the International Symposium on Music Information Retrieval. Vol. 20. 2019, 870-877
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[43] arXiv:2003.06227 [pdf, other]
Title: Unsupervised Style and Content Separation by Minimizing Mutual Information for Speech Synthesis
Ting-Yao Hu, Ashish Shrivastava, Oncel Tuzel, Chandra Dhir
Comments: Accepted at ICASSP 2020 (for presentation in a lecture session)
Subjects: Audio and Speech Processing (eess.AS); Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT); Machine Learning (cs.LG); Sound (cs.SD)
[44] arXiv:2003.06656 [pdf, other]
Title: Audio-Visual Spatial Alignment Requirements of Central and Peripheral Object Events
Davide Berghi, Hanne Stenzel, Marco Volino, Adrian Hilton, Philip J.B. Jackson
Comments: Two-page poster abstract
Journal-ref: IEEE VR 2020
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Image and Video Processing (eess.IV)
[45] arXiv:2003.06686 [pdf, other]
Title: Perception of prosodic variation for speech synthesis using an unsupervised discrete representation of F0
Zack Hodari, Catherine Lai, Simon King
Comments: Published to the 10th ISCA International Conference on Speech Prosody (SP2020)
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[46] arXiv:2003.06779 [pdf, other]
Title: A proto-object based audiovisual saliency map
Sudarshan Ramenahalli
Comments: 50 pages, 12 figures
Subjects: Audio and Speech Processing (eess.AS); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD)
[47] arXiv:2003.06894 [pdf, other]
Title: Exploring Gaussian mixture model framework for speaker adaptation of deep neural network acoustic models
Natalia Tomashenko, Yuri Khokhlov, Yannick Esteve
Comments: 36 pages; originally was submitted to CSL in February 2017
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[48] arXiv:2003.07032 [pdf, other]
Title: Multi-modal Multi-channel Target Speech Separation
Rongzhi Gu, Shi-Xiong Zhang, Yong Xu, Lianwu Chen, Yuexian Zou, Dong Yu
Comments: accepted in IEEE Journal of Selected Topics in Signal Processing
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Image and Video Processing (eess.IV)
[49] arXiv:2003.07393 [pdf, other]
Title: TensorFlow Audio Models in Essentia
Pablo Alonso-Jiménez, Dmitry Bogdanov, Jordi Pons, Xavier Serra
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[50] arXiv:2003.07482 [pdf, other]
Title: High-Accuracy and Low-Latency Speech Recognition with Two-Head Contextual Layer Trajectory LSTM Model
Jinyu Li, Rui Zhao, Eric Sun, Jeremy H. M. Wong, Amit Das, Zhong Meng, Yifan Gong
Comments: Accepted by ICASSP 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)