Sound

Authors and titles for April 2022

Total of 291 entries (showing 101-125)

[101] arXiv:2204.09883 [pdf, other]: Title: Layer-wise Fast Adaptation for End-to-End Multi-Accent Speech Recognition

Xun Gong, Yizhou Lu, Zhikai Zhou, Yanmin Qian

Comments: Accepted by Interspeech2021

Journal-ref: Proc. Interspeech 2021

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[102] arXiv:2204.09911 [pdf, other]: Title: STFT-Domain Neural Speech Enhancement with Very Low Algorithmic Latency

Zhong-Qiu Wang, Gordon Wichern, Shinji Watanabe, Jonathan Le Roux

Comments: in IEEE/ACM Transactions on Audio, Speech, and Language Processing

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[103] arXiv:2204.09917 [pdf, other]: Title: SinTra: Learning an inspiration model from a single multi-track music segment

Qingwei Song, Qiwei Sun, Dongsheng Guo, Haiyong Zheng

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[104] arXiv:2204.09976 [pdf, other]: Title: Baseline Systems for the First Spoofing-Aware Speaker Verification Challenge: Score and Embedding Fusion

Hye-jin Shim, Hemlata Tak, Xuechen Liu, Hee-Soo Heo, Jee-weon Jung, Joon Son Chung, Soo-Whan Chung, Ha-Jin Yu, Bong-Jin Lee, Massimiliano Todisco, Héctor Delgado, Kong Aik Lee, Md Sahidullah, Tomi Kinnunen, Nicholas Evans

Comments: 8 pages, accepted by Odyssey 2022

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[105] arXiv:2204.10125 [pdf, other]: Title: Physical Modeling using Recurrent Neural Networks with Fast Convolutional Layers

Julian D. Parker, Sebastian J. Schlecht, Rudolf Rabenstein, Maximilian Schäfer

Comments: Accepted to DAFx2022

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Computational Physics (physics.comp-ph)
[106] arXiv:2204.10523 [pdf, other]: Title: Unifying Cosine and PLDA Back-ends for Speaker Verification

Zhiyuan Peng, Xuanji He, Ke Ding, Tan Lee, Guanglu Wan

Comments: submitted to interspeech2022

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[107] arXiv:2204.10561 [pdf, other]: Title: Speaking-Rate-Controllable HiFi-GAN Using Feature Interpolation

Detai Xin, Shinnosuke Takamichi, Takuma Okamoto, Hisashi Kawai, Hiroshi Saruwatari

Comments: submitted to INTERSPEECH 2022

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[108] arXiv:2204.10581 [pdf, other]: Title: Fused Audio Instance and Representation for Respiratory Disease Detection

Tuan Truong, Matthias Lenga, Antoine Serrurier, Sadegh Mohammadi

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[109] arXiv:2204.10749 [pdf, other]: Title: E2E Segmenter: Joint Segmenting and Decoding for Long-Form ASR

W. Ronny Huang, Shuo-yiin Chang, David Rybach, Rohit Prabhavalkar, Tara N. Sainath, Cyril Allauzen, Cal Peyser, Zhiyun Lu

Comments: Interspeech 2022

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[110] arXiv:2204.11139 [pdf, other]: Title: Musical Stylistic Analysis: A Study of Intervallic Transition Graphs via Persistent Homology

Martín Mijangos, Alessandro Bravetti, Pablo Padilla

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Algebraic Topology (math.AT)
[111] arXiv:2204.11304 [pdf, other]: Title: Dictionary Attacks on Speaker Verification

Mirko Marras, Pawel Korus, Anubhav Jain, Nasir Memon

Comments: Accepted in IEEE Transactions on Information Forensics and Security

Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Audio and Speech Processing (eess.AS)
[112] arXiv:2204.11320 [pdf, other]: Title: Emotion-Aware Transformer Encoder for Empathetic Dialogue Generation

Raman Goel, Seba Susan, Sachin Vashisht, Armaan Dhanda

Comments: Accepted in 2021 9th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW)

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[113] arXiv:2204.11382 [pdf, other]: Title: Real-time Speech Emotion Recognition Based on Syllable-Level Feature Extraction

Abdul Rehman, Zhen-Tao Liu, Min Wu, Wei-Hua Cao, Cheng-Shan Jiang

Comments: Significant revisions

Subjects: Sound (cs.SD); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[114] arXiv:2204.11403 [pdf, other]: Title: Back-ends Selection for Deep Speaker Embeddings

Zhuo Li, Runqiu Xiao, Zihan Zhang, Zhenduo Zhao, Wenchao Wang, Pengyuan Zhang

Comments: submitted to interspeech2022

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[115] arXiv:2204.11437 [pdf, other]: Title: Understanding Audio Features via Trainable Basis Functions

Kwan Yee Heung, Kin Wai Cheuk, Dorien Herremans

Comments: under review in Interspeech 2022

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[116] arXiv:2204.11479 [pdf, other]: Title: End-to-End Audio Strikes Back: Boosting Augmentations Towards An Efficient Audio Classification Network

Avi Gazneli, Gadi Zimerman, Tal Ridnik, Gilad Sharir, Asaf Noy

Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Audio and Speech Processing (eess.AS)
[117] arXiv:2204.11792 [pdf, other]: Title: SyntaSpeech: Syntax-Aware Generative Adversarial Text-to-Speech

Zhenhui Ye, Zhou Zhao, Yi Ren, Fei Wu

Comments: Accepted by IJCAI-2022. 12 pages

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[118] arXiv:2204.11806 [pdf, html, other]: Title: Parallel Synthesis for Autoregressive Speech Generation

Po-chun Hsu, Da-rong Liu, Andy T. Liu, Hung-yi Lee

Comments: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[119] arXiv:2204.11942 [pdf, other]: Title: Meta-AF: Meta-Learning for Adaptive Filters

Jonah Casebeer, Nicholas J. Bryan, Paris Smaragdis

Comments: Accepted to ACM/IEEE TASLP. Source code and audio examples: this https URL

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[120] arXiv:2204.12112 [pdf, other]: Title: Reformulating Speaker Diarization as Community Detection With Emphasis On Topological Structure

Siqi Zheng, Hongbin Suo

Comments: ICASSP 2022

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[121] arXiv:2204.12177 [pdf, other]: Title: A Comparative Study on Approaches to Acoustic Scene Classification using CNNs

Ishrat Jahan Ananya, Sarah Suad, Shadab Hafiz Choudhury, Mohammad Ashrafuzzaman Khan

Comments: Presented at 2021 Mexican International Conference on Artificial Intelligence. Published in Advances in Computational Intelligence, MICAI 2021, Lecture Notes in Computer Science. 12 pages, 3 figures, 5 tables

Journal-ref: Advances in Computational Intelligence, MICAI 2021, Lecture Notes in Artificial Intelligence vol. 13067, pp. 81-91 (2021)

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Audio and Speech Processing (eess.AS)
[122] arXiv:2204.12290 [pdf, other]: Title: On Machine Learning-Driven Surrogates for Sound Transmission Loss Simulations

Barbara Cunha (LTDS), Abdel-Malek Zine (ICJ), Mohamed Ichchou (ECL), Christophe Droz (COSYS-SII), Stéphane Foulard

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Medical Physics (physics.med-ph)
[123] arXiv:2204.12486 [pdf, other]: Title: Measurement uncertainty and unicity of single number quantities describing the spatial decay of speech level in open-plan offices

Lucas Lenne (INRS (Vandoeuvre lès Nancy)), Patrick Chevret, Étienne Parizet

Journal-ref: Applied Acoustics, Elsevier, 2021, 182, pp.108269

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[124] arXiv:2204.12622 [pdf, other]: Title: Named Entity Recognition for Audio De-Identification

Guillaume Baril, Patrick Cardinal, Alessandro Lameiras Koerich

Comments: 8 pages

Subjects: Sound (cs.SD); Cryptography and Security (cs.CR); Audio and Speech Processing (eess.AS)
[125] arXiv:2204.12768 [pdf, other]: Title: Masked Spectrogram Prediction For Self-Supervised Audio Pre-Training

Dading Chong, Helin Wang, Peilin Zhou, Qingcheng Zeng

Comments: Submitted to INTERSPEECH 2022

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
