Audio and Speech Processing

Authors and titles for April 2021

Total of 266 entries : 1-25 26-50 51-75 76-100 101-125 126-150 151-175 176-200 ... 251-266

[101] arXiv:2104.13069 [pdf, other]: Title: Visualization of Linear Operations in the Spherical Harmonics Domain

Maximilian Kentgens, Peter Jax

Comments: Pre-print/author version of paper presented at International Conference on Immersive and 3D Audio (I3DA), Sept. 2021

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[102] arXiv:2104.13168 [pdf, other]: Title: dEchorate: a Calibrated Room Impulse Response Database for Echo-aware Signal Processing

Diego Di Carlo, Pinchas Tandeitnik, Cédric Foy (UMRAE), Antoine Deleforge, Nancy Bertin, Sharon Gannot

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[103] arXiv:2104.13247 [pdf, other]: Title: IATos: AI-powered pre-screening tool for COVID-19 from cough audio samples

D. Trejo Pizzo, S. Esteban

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[104] arXiv:2104.13347 [pdf, other]: Title: BeamLearning: an end-to-end Deep Learning approach for the angular localization of sound sources using raw multichannel acoustic pressure data

Hadrien Pujol, Éric Bavu, Alexandre Garcia

Comments: The following article has been submitted to the special issue on Machine Learning in Acoustics in JASA. After it is published, it will be found at this http URL

Journal-ref: J. Acoust. Soc. Am. 149 (6), June 2021

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[105] arXiv:2104.13423 [pdf, other]: Title: DASEE A Synthetic Database of Domestic Acoustic Scenes and Events in Dementia Patients Environment

Abigail Copiaco, Christian Ritz, Stefano Fasciani, Nidhal Abdulaziz

Comments: 5 pages, 4 figures, 6 tables

Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Sound (cs.SD); Signal Processing (eess.SP)
[106] arXiv:2104.13553 [pdf, other]: Title: AMSS-Net: Audio Manipulation on User-Specified Sources with Textual Queries

Woosung Choi, Minseok Kim, Marco A. Martínez Ramírez, Jaehwa Chung, Soonyoung Jung

Comments: 10 pages, 8 figures, 3 tables, under review at ACMMM 21

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[107] arXiv:2104.13620 [pdf, other]: Title: IDMT-Traffic: An Open Benchmark Dataset for Acoustic Traffic Monitoring Research

Jakob Abeßer, Saichand Gourishetti, András Kátai, Tobias Clauß, Prachi Sharma, Judith Liebetrau

Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Sound (cs.SD)
[108] arXiv:2104.13970 [pdf, other]: Title: Personalized Keyphrase Detection using Speaker and Environment Information

Rajeev Rikhye, Quan Wang, Qiao Liang, Yanzhang He, Ding Zhao, Yiteng (Arden) Huang, Arun Narayanan, Ian McGraw

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[109] arXiv:2104.14264 [pdf, other]: Title: Hardware-Friendly Synaptic Orders and Timescales in Liquid State Machines for Speech Classification

Vivek Saraswat, Ajinkya Gorad, Anand Naik, Aakash Patil, Udayan Ganguly

Subjects: Audio and Speech Processing (eess.AS); Neural and Evolutionary Computing (cs.NE); Sound (cs.SD); Neurons and Cognition (q-bio.NC)
[110] arXiv:2104.14791 [pdf, other]: Title: Deformable TDNN with adaptive receptive fields for speech recognition

Keyu An, Yi Zhang, Zhijian Ou

Comments: 5 pages. submitted to Interspeech 2021

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[111] arXiv:2104.14921 [pdf, other]: Title: Crackle Detection In Lung Sounds Using Transfer Learning And Multi-Input Convolitional Neural Networks

Truc Nguyen, Franz Pernkopf

Comments: Under Review in Proceeding of EMBC 2021

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[112] arXiv:2104.00235 (cross-list from cs.CL) [pdf, other]: Title: Multilingual and code-switching ASR challenges for low resource Indian languages

Anuj Diwan, Rakesh Vaideeswaran, Sanket Shah, Ankita Singh, Srinivasa Raghavan, Shreya Khare, Vinit Unni, Saurabh Vyas, Akash Rajpuria, Chiranjeevi Yarra, Ashish Mittal, Prasanta Kumar Ghosh, Preethi Jyothi, Kalika Bali, Vivek Seshadri, Sunayana Sitaram, Samarth Bharadwaj, Jai Nanavati, Raoul Nanavati, Karthik Sankaranarayanan, Tejaswi Seeram, Basil Abraham

Comments: 6 pages

Subjects: Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[113] arXiv:2104.00239 (cross-list from cs.CV) [pdf, other]: Title: Positive Sample Propagation along the Audio-Visual Event Line

Jinxing Zhou, Liang Zheng, Yiran Zhong, Shijie Hao, Meng Wang

Comments: Accepted to CVPR 2021. Code is available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[114] arXiv:2104.00315 (cross-list from cs.CV) [pdf, other]: Title: Unsupervised Sound Localization via Iterative Contrastive Learning

Yan-Bo Lin, Hung-Yu Tseng, Hsin-Ying Lee, Yen-Yu Lin, Ming-Hsuan Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
[115] arXiv:2104.00355 (cross-list from cs.SD) [pdf, other]: Title: Speech Resynthesis from Discrete Disentangled Self-Supervised Representations

Adam Polyak, Yossi Adi, Jade Copet, Eugene Kharitonov, Kushal Lakhotia, Wei-Ning Hsu, Abdelrahman Mohamed, Emmanuel Dupoux

Comments: In Proceedings of Interspeech 2021

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[116] arXiv:2104.00437 (cross-list from cs.SD) [pdf, other]: Title: Enriched Music Representations with Multiple Cross-modal Contrastive Learning

Andres Ferraro, Xavier Favory, Konstantinos Drossos, Yuntae Kim, Dmitry Bogdanov

Comments: Accepted for publication to IEEE Signal Processing Letters

Subjects: Sound (cs.SD); Information Retrieval (cs.IR); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[117] arXiv:2104.00705 (cross-list from cs.SD) [pdf, other]: Title: Multi-rate attention architecture for fast streamable Text-to-speech spectrum modeling

Qing He, Zhiping Xiu, Thilo Koehler, Jilong Wu

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[118] arXiv:2104.00732 (cross-list from cs.SD) [pdf, other]: Title: Out of a hundred trials, how many errors does your speaker verifier make?

Niko Brümmer, Luciana Ferrer, Albert Swart

Comments: Submitted to Interspeech 2021

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[119] arXiv:2104.00824 (cross-list from cs.CL) [pdf, other]: Title: Tusom2021: A Phonetically Transcribed Speech Dataset from an Endangered Language for Universal Phone Recognition Experiments

David R. Mortensen, Jordan Picone, Xinjian Li, Kathleen Siminyu

Comments: 4 pages, 3 figures

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[120] arXiv:2104.01027 (cross-list from cs.SD) [pdf, other]: Title: Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training

Wei-Ning Hsu, Anuroop Sriram, Alexei Baevski, Tatiana Likhomanenko, Qiantong Xu, Vineel Pratap, Jacob Kahn, Ann Lee, Ronan Collobert, Gabriel Synnaeve, Michael Auli

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[121] arXiv:2104.01160 (cross-list from cs.SD) [pdf, other]: Title: PhyAug: Physics-Directed Data Augmentation for Deep Sensing Model Transfer in Cyber-Physical Systems

Wenjie Luo, Zhenyu Yan, Qun Song, Rui Tan

Subjects: Sound (cs.SD); Human-Computer Interaction (cs.HC); Audio and Speech Processing (eess.AS)
[122] arXiv:2104.01161 (cross-list from cs.SD) [pdf, other]: Title: An Audio-Based Deep Learning Framework For BBC Television Programme Classification

Lam Pham, Chris Baume, Qiuqiang Kong, Tassadaq Hussain, Wenwu Wang, Mark Plumbley

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[123] arXiv:2104.01271 (cross-list from cs.SD) [pdf, other]: Title: PATE-AAE: Incorporating Adversarial Autoencoder into Private Aggregation of Teacher Ensembles for Spoken Command Classification

Chao-Han Huck Yang, Sabato Marco Siniscalchi, Chin-Hui Lee

Comments: Accepted to Interspeech 2021

Journal-ref: Proc. Interspeech 2021

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Audio and Speech Processing (eess.AS)
[124] arXiv:2104.01304 (cross-list from cs.SD) [pdf, other]: Title: Diarization of Legal Proceedings. Identifying and Transcribing Judicial Speech from Recorded Court Audio

Jeffrey Tumminia, Amanda Kuznecov, Sophia Tsilerides, Ilana Weinstein, Brian McFee, Michael Picheny, Aaron R. Kaufman

Comments: Under review for InterSpeech 2021

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[125] arXiv:2104.01378 (cross-list from cs.CL) [pdf, other]: Title: speechocean762: An Open-Source Non-native English Speech Corpus For Pronunciation Assessment

Junbo Zhang, Zhiwen Zhang, Yongqing Wang, Zhiyong Yan, Qiong Song, Yukai Huang, Ke Li, Daniel Povey, Yujun Wang

Comments: Accepted in INTERSPEECH 2021

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
