Audio and Speech Processing

Authors and titles for April 2022

Total of 320 entries; entries 101-125 shown
[101] arXiv:2204.11232 [pdf, other]
Title: Improving the Naturalness of Simulated Conversations for End-to-End Neural Diarization
Natsuo Yamashita, Shota Horiguchi, Takeshi Homma
Comments: Accepted to Speaker Odyssey 2022
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[102] arXiv:2204.11286 [pdf, other]
Title: Improved far-field speech recognition using Joint Variational Autoencoder
Shashi Kumar, Shakti P. Rath, Abhishek Pandey
Comments: 5 pages, 2 figures, 3 tables
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[103] arXiv:2204.11501 [pdf, other]
Title: Graph Convolutional Network Based Semi-Supervised Learning on Multi-Speaker Meeting Data
Fuchuan Tong, Siqi Zheng, Min Zhang, Yafeng Chen, Hongbin Suo, Qingyang Hong, Lin Li
Comments: Accepted by ICASSP 2022
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[104] arXiv:2204.11933 [pdf, other]
Title: Cleanformer: A multichannel array configuration-invariant neural enhancement frontend for ASR in smart speakers
Joseph Caroselli, Arun Narayanan, Nathan Howard, Tom O'Malley
Comments: Accepted to ICASSP 2023
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[105] arXiv:2204.12076 [pdf, other]
Title: ATST: Audio Representation Learning with Teacher-Student Transformer
Xian Li, Xiaofei Li
Comments: INTERSPEECH2022(Accepted)
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Sound (cs.SD)
[106] arXiv:2204.12092 [pdf, other]
Title: Mask scalar prediction for improving robust automatic speech recognition
Arun Narayanan, James Walker, Sankaran Panchapagesan, Nathan Howard, Yuma Koizumi
Comments: Submitted to Interspeech 2022
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[107] arXiv:2204.12260 [pdf, other]
Title: Masked Spectrogram Modeling using Masked Autoencoders for Learning General-purpose Audio Representation
Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Noboru Harada, Kunio Kashino
Comments: 22 pages, 8 figures. Under the review process
Journal-ref: HEAR: Holistic Evaluation of Audio Representations (NeurIPS 2021 Competition) PMLR 166 (2022) 1-24
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[108] arXiv:2204.12279 [pdf, other]
Title: Low-dimensional representation of infant and adult vocalization acoustics
Silvia Pagliarini, Sara Schneider, Christopher T. Kello, Anne S. Warlaumont
Comments: Under review at Interspeech 2022
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[109] arXiv:2204.12308 [pdf, other]
Title: Supervised Attention in Sequence-to-Sequence Models for Speech Recognition
Gene-Ping Yang, Hao Tang
Comments: Accepted at ICASSP 2022
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[110] arXiv:2204.12649 [pdf, other]
Title: Study on the Fairness of Speaker Verification Systems on Underrepresented Accents in English
Mariel Estevez, Luciana Ferrer
Comments: 5 pages, 2 figures, submitted to INTERSPEECH
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[111] arXiv:2204.12777 [pdf, other]
Title: Ultra Fast Speech Separation Model with Teacher Student Learning
Sanyuan Chen, Yu Wu, Zhuo Chen, Jian Wu, Takuya Yoshioka, Shujie Liu, Jinyu Li, Xiangzhan Yu
Comments: Accepted by interspeech 2021
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[112] arXiv:2204.13883 [pdf, other]
Title: Autonomous In-Situ Soundscape Augmentation via Joint Selection of Masker and Gain
Karn N. Watcharasupat, Kenneth Ooi, Bhan Lam, Trevor Wong, Zhen-Ting Ong, Woon-Seng Gan
Comments: Accepted to IEEE Signal Processing Letters. (c) 2022 IEEE
Journal-ref: IEEE Signal Processing Letters, Vol. 29, pp. 1749 - 1753, 2022
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[113] arXiv:2204.13890 [pdf, other]
Title: Deployment of an IoT System for Adaptive In-Situ Soundscape Augmentation
Trevor Wong, Karn N. Watcharasupat, Bhan Lam, Kenneth Ooi, Zhen-Ting Ong, Furi Andi Karnapi, Woon-Seng Gan
Comments: To be presented at the 51st International Congress and Exposition on Noise Control Engineering
Journal-ref: INTER-NOISE and NOISE-CON Congress and Conference Proceedings, Feb. 2022, vol. 265, no. 5, pp. 2013-2021
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Systems and Control (eess.SY)
[114] arXiv:2204.00061 (cross-list from cs.SD) [pdf, other]
Title: Data-augmented cross-lingual synthesis in a teacher-student framework
Marcel de Korte, Jaebok Kim, Aki Kunikoshi, Adaeze Adigwe, Esther Klabbers
Comments: Submitted to INTERSPEECH 2022
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[115] arXiv:2204.00088 (cross-list from cs.SD) [pdf, other]
Title: Speech and the n-Back task as a lens into depression. How combining both may allow us to isolate different core symptoms of depression
Salvatore Fara, Stefano Goria, Emilia Molimpakis, Nicholas Cummins
Comments: Submitted to Interspeech 2022
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Quantitative Methods (q-bio.QM)
[116] arXiv:2204.00094 (cross-list from cs.SD) [pdf, other]
Title: Perceptive, non-linear Speech Processing and Spiking Neural Networks
Jean Rouat, Ramin Pichevar, Stéphane Loiselle
Comments: preprint of the 2005 published paper: Perceptive, Non-linear Speech Processing and Spiking Neural Networks. In: Chollet, G., Esposito, A., Faundez-Zanuy, M., Marinaro, M. (eds) Nonlinear Speech Modeling and Applications. NN 2004. Lecture Notes in Computer Science, vol 3445. Springer, Berlin, Heidelberg
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Neurons and Cognition (q-bio.NC)
[117] arXiv:2204.00164 (cross-list from cs.CL) [pdf, other]
Title: Filter-based Discriminative Autoencoders for Children Speech Recognition
Chiang-Lin Tai, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang
Comments: Published in EUSIPCO 2022
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[118] arXiv:2204.00174 (cross-list from cs.CL) [pdf, other]
Title: InterAug: Augmenting Noisy Intermediate Predictions for CTC-based ASR
Yu Nakagome, Tatsuya Komatsu, Yusuke Fujita, Shuta Ichimura, Yusuke Kida
Comments: This paper was submitted to INTERSPEECH2022
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[119] arXiv:2204.00175 (cross-list from cs.CL) [pdf, other]
Title: Alternate Intermediate Conditioning with Syllable-level and Character-level Targets for Japanese ASR
Yusuke Fujita, Tatsuya Komatsu, Yusuke Kida
Comments: SLT 2022
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[120] arXiv:2204.00176 (cross-list from cs.CL) [pdf, other]
Title: Better Intermediates Improve CTC Inference
Tatsuya Komatsu, Yusuke Fujita, Jaesong Lee, Lukas Lee, Shinji Watanabe, Yusuke Kida
Comments: 5 pages, submitted INTERSPEECH2022
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[121] arXiv:2204.00212 (cross-list from cs.CL) [pdf, other]
Title: Effect and Analysis of Large-scale Language Model Rescoring on Competitive ASR Systems
Takuma Udagawa, Masayuki Suzuki, Gakuto Kurata, Nobuyasu Itoh, George Saon
Comments: Accepted to Interspeech 2022
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[122] arXiv:2204.00291 (cross-list from cs.CL) [pdf, other]
Title: Text-To-Speech Data Augmentation for Low Resource Speech Recognition
Rodolfo Zevallos
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[123] arXiv:2204.00311 (cross-list from cs.SD) [pdf, other]
Title: Speaker verification in mismatch training and testing conditions
Marcos Faundez-Zanuy, Adam Slupinski
Comments: 4 pages, published in 6th international conference on spoken language processing (ICSLP 2000), Vol. II, pp.322-325. ICSLP 2000, ISBN 7-80150-144-4/G.18 Beijing (China). October 16-20, 2000. arXiv admin note: substantial text overlap with arXiv:2203.00513
Journal-ref: 6th international conference on spoken language processing (ICSLP 2000), Vol. II, pp.322-325, 2000
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[124] arXiv:2204.00331 (cross-list from cs.SD) [pdf, other]
Title: Using segment-based features of jaw movements to recognize foraging activities in grazing cattle
José O. Chelotti, Sebastián R. Vanrell, Luciano S. Martinez-Rau, Julio R. Galli, Santiago A. Utsumi, Alejandra M. Planisich, Suyai A. Almirón, Diego H. Milone, Leonardo L. Giovanini, H. Leonardo Rufiner
Comments: Preprint submitted to journal
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[125] arXiv:2204.00348 (cross-list from cs.CL) [pdf, other]
Title: WavFT: Acoustic model finetuning with labelled and unlabelled data
Utkarsh Chauhan, Vikas Joshi, Rupesh R. Mehta
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)