Audio and Speech Processing

Authors and titles for April 2022

Total of 320 entries (entries 251-275 shown)

[251] arXiv:2204.07763 (cross-list from cs.SD)

Title: UFRC: A Unified Framework for Reliable COVID-19 Detection on Crowdsourced Cough Audio

Jiangeng Chang, Yucheng Ruan, Cui Shaoze, John Soong Tshon Yit, Mengling Feng

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[252] arXiv:2204.07848 (cross-list from cs.CL)

Title: STRATA: Word Boundaries & Phoneme Recognition From Continuous Urdu Speech using Transfer Learning, Attention, & Data Augmentation

Saad Naeem, Omer Beg

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[253] arXiv:2204.08026 (cross-list from cs.SD)

Title: Advances in Thunder Sound Synthesis

Eva Fineberg, Jack Walters, Joshua Reiss

Comments: 9 pages, 6 figures, conference paper accepted to the AES Europe Spring 2022 Audio Engineering 152nd Convention

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[254] arXiv:2204.08164 (cross-list from cs.SD)

Title: Robust End-to-end Speaker Diarization with Generic Neural Clustering

Chenyu Yang, Yu Wang

Comments: submitted to INTERSPEECH 2022

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[255] arXiv:2204.08269 (cross-list from cs.SD)

Title: Differentiable Time-Frequency Scattering on GPU

John Muradeli, Cyrus Vahidi, Changhong Wang, Han Han, Vincent Lostanlen, Mathieu Lagrange, George Fazekas

Comments: 8 pages, 6 figures. Submitted to the International Conference on Digital Audio Effects (DAFX) 2022

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[256] arXiv:2204.08345 (cross-list from cs.SD)

Title: Extracting Targeted Training Data from ASR Models, and How to Mitigate It

Ehsan Amid, Om Thakkar, Arun Narayanan, Rajiv Mathews, Françoise Beaufays

Comments: Accepted to appear at Interspeech'22

Subjects: Sound (cs.SD); Cryptography and Security (cs.CR); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[257] arXiv:2204.08409 (cross-list from cs.SD)

Title: Caption Feature Space Regularization for Audio Captioning

Yiming Zhang, Hong Yu, Ruoyi Du, Zhanyu Ma, Yuan Dong

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[258] arXiv:2204.08411 (cross-list from eess.SP)

Title: Robust, Nonparametric, Efficient Decomposition of Spectral Peaks under Distortion and Interference

Kaan Gokcesu, Hakan Gokcesu

Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Optimization and Control (math.OC); Machine Learning (stat.ML)
[259] arXiv:2204.08474 (cross-list from cs.SD)

Title: AB/BA analysis: A framework for estimating keyword spotting recall improvement while maintaining audio privacy

Raphael Petegrosso, Vasistakrishna Baderdinni, Thibaud Senechal, Benjamin L. Bullough

Comments: Accepted to NAACL 2022 Industry Track

Subjects: Sound (cs.SD); Cryptography and Security (cs.CR); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[260] arXiv:2204.08567 (cross-list from cs.SD)

Title: Automated Audio Captioning using Audio Event Clues

Ayşegül Özkaya Eren, Mustafa Sert

Comments: submitted to IEEE/ACM Transactions on Audio Speech and Language Processing

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[261] arXiv:2204.08625 (cross-list from cs.SD)

Title: Self Supervised Adversarial Domain Adaptation for Cross-Corpus and Cross-Language Speech Emotion Recognition

Siddique Latif, Rajib Rana, Sara Khalifa, Raja Jurdak, Björn Schuller

Comments: Accepted in IEEE Transactions on Affective Computing

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[262] arXiv:2204.08686 (cross-list from cs.SD)

Title: Audio-Visual Wake Word Spotting System For MISP Challenge 2021

Yanguang Xu, Jianwei Sun, Yang Han, Shuaijiang Zhao, Chaoyang Mei, Tingwei Guo, Shuran Zhou, Chuandong Xie, Wei Zou, Xiangang Li

Comments: Accepted to ICASSP 2022

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[263] arXiv:2204.08822 (cross-list from cs.SD)

Title: A Convolutional-Attentional Neural Framework for Structure-Aware Performance-Score Synchronization

Ruchit Agrawal, Daniel Wolff, Simon Dixon

Comments: Published in IEEE Signal Processing Letters, Volume 29, December 2021

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[264] arXiv:2204.08920 (cross-list from cs.CL)

Title: Blockwise Streaming Transformer for Spoken Language Understanding and Simultaneous Speech Translation

Keqi Deng, Shinji Watanabe, Jiatong Shi, Siddhant Arora

Comments: Submitted to Interspeech 2022

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[265] arXiv:2204.08977 (cross-list from cs.SD)

Title: Disappeared Command: Spoofing Attack On Automatic Speech Recognition Systems with Sound Masking

Jinghui Xu, Jifeng Zhu, Yong Yang

Comments: 13 pages, 4 figures. arXiv admin note: text overlap with arXiv:1903.10346 by other authors

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[266] arXiv:2204.09028 (cross-list from cs.CL)

Title: On the Locality of Attention in Direct Speech Translation

Belen Alastruey, Javier Ferrando, Gerard I. Gállego, Marta R. Costa-jussà

Comments: ACL-SRW 2022. Equal contribution between Belen Alastruey and Javier Ferrando

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[267] arXiv:2204.09224 (cross-list from cs.SD)

Title: ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers

Kaizhi Qian, Yang Zhang, Heting Gao, Junrui Ni, Cheng-I Lai, David Cox, Mark Hasegawa-Johnson, Shiyu Chang

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[268] arXiv:2204.09227 (cross-list from cs.CL)

Title: Cross-stitched Multi-modal Encoders

Karan Singla, Daniel Pressel, Ryan Price, Bhargav Srinivas Chinnari, Yeon-Jun Kim, Srinivas Bangalore

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[269] arXiv:2204.09381 (cross-list from cs.SD)

Title: Exploration strategies for articulatory synthesis of complex syllable onsets

Daniel R. van Niekerk, Anqi Xu, Branislav Gerazov, Paul K. Krug, Peter Birkholz, Yi Xu

Comments: Accepted at Interspeech 2022

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[270] arXiv:2204.09595 (cross-list from cs.CL)

Title: Exploring Continuous Integrate-and-Fire for Adaptive Simultaneous Speech Translation

Chih-Chiang Chang, Hung-yi Lee

Comments: INTERSPEECH 2022 camera ready

Journal-ref: Proc. Interspeech 2022, 5175-5179

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[271] arXiv:2204.09606 (cross-list from cs.CL)

Title: Detecting Unintended Memorization in Language-Model-Fused ASR

W. Ronny Huang, Steve Chien, Om Thakkar, Rajiv Mathews

Comments: Interspeech 2022

Subjects: Computation and Language (cs.CL); Cryptography and Security (cs.CR); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[272] arXiv:2204.09634 (cross-list from cs.SD)

Title: Clotho-AQA: A Crowdsourced Dataset for Audio Question Answering

Samuel Lipping, Parthasaarathy Sudarsanam, Konstantinos Drossos, Tuomas Virtanen

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[273] arXiv:2204.09647 (cross-list from eess.SP)

Title: Parametric Models for DOA Trajectory Localization

Ruchi Pandey, Santosh Nannuru

Subjects: Signal Processing (eess.SP); Audio and Speech Processing (eess.AS)
[274] arXiv:2204.09657 (cross-list from cs.CL)

Title: The MIT Voice Name System

Brian Subirana, Harry Levinson, Ferran Hueto, Prithvi Rajasekaran, Alexander Gaidis, Esteve Tarragó, Peter Oliveira-Soens

Comments: White Paper

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Audio and Speech Processing (eess.AS)
[275] arXiv:2204.09764 (cross-list from eess.SP)

Title: Delamination prediction in composite panels using unsupervised-feature learning methods with wavelet-enhanced guided wave representations

Mahindra Rautela, J. Senthilnath, Ernesto Monaco, S. Gopalakrishnan

Subjects: Signal Processing (eess.SP); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
