Audio and Speech Processing

Authors and titles for February 2021

Total of 208 entries

[101] arXiv:2102.01991 (cross-list from cs.SD) [pdf, other]: Title: Towards Natural and Controllable Cross-Lingual Voice Conversion Based on Neural TTS Model and Phonetic Posteriorgram

Shengkui Zhao, Hao Wang, Trung Hieu Nguyen, Bin Ma

Comments: 5 pages, 2 figures, 4 tables, accepted by ICASSP 2021

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[102] arXiv:2102.01993 (cross-list from cs.SD) [pdf, html, other]: Title: Monaural Speech Enhancement with Complex Convolutional Block Attention Module and Joint Time Frequency Losses

Shengkui Zhao, Trung Hieu Nguyen, Bin Ma

Comments: 5 pages, 4 figures, 2 tables, accepted by ICASSP 2021

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[103] arXiv:2102.02028 (cross-list from cs.SD) [pdf, other]: Title: Music source separation conditioned on 3D point clouds

Francesc Lluís, Vasileios Chatziioannou, Alex Hofmann

Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Audio and Speech Processing (eess.AS)
[104] arXiv:2102.02074 (cross-list from cs.SD) [pdf, other]: Title: Data Generation Using Pass-phrase-dependent Deep Auto-encoders for Text-Dependent Speaker Verification

Achintya Kumar Sarkar, Md Sahidullah, Zheng-Hua Tan

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[105] arXiv:2102.02270 (cross-list from cs.CL) [pdf, other]: Title: Confusion2vec 2.0: Enriching Ambiguous Spoken Language Representations with Subwords

Prashanth Gurunath Shivakumar, Panayiotis Georgiou, Shrikanth Narayanan

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[106] arXiv:2102.02282 (cross-list from cs.SD) [pdf, other]: Title: Downbeat Tracking with Tempo-Invariant Convolutional Neural Networks

Bruno Di Giorgi, Matthias Mauch, Mark Levy

Comments: 7 pages, 5 figures, Proceedings of the 21st International Society for Music Information Retrieval Conference, ISMIR 2020

Journal-ref: Proceedings of the 21st International Society for Music Information Retrieval Conference (2020) 216-222

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[107] arXiv:2102.02417 (cross-list from cs.SD) [pdf, other]: Title: Audio Adversarial Examples: Attacks Using Vocal Masks

Kai Yuan Tay, Lynnette Ng, Wei Han Chua, Lucerne Loke, Danqi Ye, Melissa Chua

Comments: 9 pages, 1 figure, 2 tables. Submitted to COLING2020

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[108] arXiv:2102.02640 (cross-list from cs.SD) [pdf, other]: Title: Low Bit-Rate Wideband Speech Coding: A Deep Generative Model based Approach

Gang Min, Xiongwei Zhang, Xia Zou, Xiangyang Liu

Comments: 6 pages

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[109] arXiv:2102.02964 (cross-list from cs.SD) [pdf, other]: Title: Diversity-Robust Acoustic Feature Signatures Based on Multiscale Fractal Dimension for Similarity Search of Environmental Sounds

Motohiro Sunouchi, Masaharu Yoshioka

Comments: 15 pages, 14 figures

Subjects: Sound (cs.SD); Information Retrieval (cs.IR); Audio and Speech Processing (eess.AS)
[110] arXiv:2102.03049 (cross-list from cs.SD) [pdf, other]: Title: Benchmarking of eight recurrent neural network variants for breath phase and adventitious sound detection on a self-developed open-access lung sound database-HF_Lung_V1

Fu-Shun Hsu, Shang-Ran Huang, Chien-Wen Huang, Chao-Jung Huang, Yuan-Ren Cheng, Chun-Chieh Chen, Jack Hsiao, Chung-Wei Chen, Li-Chin Chen, Yen-Chun Lai, Bi-Fang Hsu, Nian-Jhen Lin, Wan-Lin Tsai, Yi-Lin Wu, Tzu-Ling Tseng, Ching-Ting Tseng, Yi-Tsun Chen, Feipei Lai

Comments: 48 pages, 8 figures. Accepted by PLoS One

Journal-ref: PLoS ONE, 2021, 16(7): e0254134

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[111] arXiv:2102.03055 (cross-list from cs.SD) [pdf, other]: Title: Two-Stage Augmentation and Adaptive CTC Fusion for Improved Robustness of Multi-Stream End-to-End ASR

Ruizhi Li, Gregory Sell, Hynek Hermansky

Comments: Accepted at IEEE SLT 2021

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[112] arXiv:2102.03170 (cross-list from cs.SD) [pdf, other]: Title: White-box Audio VST Effect Programming

Christopher Mitcheltree, Hideki Koike

Comments: The latest version of the system is to appear at EvoMUSART 2021 as a full paper. Audio samples of the latest system can be listened to at this https URL

Journal-ref: 4th Workshop on Machine Learning for Creativity and Design at NeurIPS 2020, Vancouver, Canada

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[113] arXiv:2102.03207 (cross-list from cs.SD) [pdf, other]: Title: Real-time Denoising and Dereverberation with Tiny Recurrent U-Net

Hyeong-Seok Choi, Sungjin Park, Jie Hwan Lee, Hoon Heo, Dongsuk Jeon, Kyogu Lee

Comments: 5 pages, 2 figures, 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). arXiv admin note: text overlap with arXiv:2006.00687

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[114] arXiv:2102.03229 (cross-list from cs.SD) [pdf, other]: Title: Multi-Task Self-Supervised Pre-Training for Music Classification

Ho-Hsiang Wu, Chieh-Chi Kao, Qingming Tang, Ming Sun, Brian McFee, Juan Pablo Bello, Chao Wang

Comments: Copyright 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[115] arXiv:2102.03424 (cross-list from cs.CV) [pdf, other]: Title: Learning Audio-Visual Correlations from Variational Cross-Modal Generation

Ye Zhu, Yu Wu, Hugo Latapie, Yi Yang, Yan Yan

Comments: Accepted to ICASSP 2021

Subjects: Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
[116] arXiv:2102.03662 (cross-list from cs.CL) [pdf, other]: Title: A bandit approach to curriculum generation for automatic speech recognition

Anastasia Kuznetsova, Anurag Kumar, Francis M. Tyers

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[117] arXiv:2102.03868 (cross-list from cs.SD) [pdf, other]: Title: U-vectors: Generating clusterable speaker embedding from unlabeled data

M. F. Mridha, Abu Quwsar Ohi, Muhammad Mostafa Monowar, Md. Abdul Hamid, Md. Rashedul Islam, Yutaka Watanobe

Comments: 18 pages, 7 figures

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[118] arXiv:2102.03957 (cross-list from cs.SD) [pdf, other]: Title: Extracting the Auditory Attention in a Dual-Speaker Scenario from EEG using a Joint CNN-LSTM Model

Ivine Kuruvila, Jan Muncke, Eghart Fischer, Ulrich Hoppe

Comments: 18 pages, 6 figures

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[119] arXiv:2102.04040 (cross-list from cs.SD) [pdf, other]: Title: LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search

Renqian Luo, Xu Tan, Rui Wang, Tao Qin, Jinzhu Li, Sheng Zhao, Enhong Chen, Tie-Yan Liu

Comments: Accepted to ICASSP 21

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[120] arXiv:2102.04051 (cross-list from cs.HC) [pdf, other]: Title: HumanACGAN: conditional generative adversarial network with human-based auxiliary classifier and its evaluation in phoneme perception

Yota Ueda, Kazuki Fujii, Yuki Saito, Shinnosuke Takamichi, Yukino Baba, Hiroshi Saruwatari

Comments: 5 pages, 6 figures, to be published in 2021 IEEE International Conference on Acoustics, Speech and Signal Processing

Subjects: Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[121] arXiv:2102.04056 (cross-list from cs.SD) [pdf, other]: Title: Speaker and Direction Inferred Dual-channel Speech Separation

Chenxing Li, Jiaming Xu, Nima Mesgarani, Bo Xu

Comments: Accepted by ICASSP 2021

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[122] arXiv:2102.04062 (cross-list from cs.SD) [pdf, other]: Title: An Update on a Progressively Expanded Database for Automated Lung Sound Analysis

Fu-Shun Hsu, Shang-Ran Huang, Chien-Wen Huang, Yuan-Ren Cheng, Chun-Chieh Chen, Jack Hsiao, Chung-Wei Chen, Feipei Lai

Comments: Under review, 14 pages, 5 figures, 3 tables

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[123] arXiv:2102.04198 (cross-list from cs.SD) [pdf, other]: Title: ICASSP 2021 Deep Noise Suppression Challenge: Decoupling Magnitude and Phase Optimization with a Two-Stage Deep Network

Andong Li, Wenzhe Liu, Xiaoxue Luo, Chengshi Zheng, Xiaodong Li

Comments: 5 pages, 3 figures, accepted by ICASSP 2021

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[124] arXiv:2102.04254 (cross-list from cs.CE) [pdf, other]: Title: A Data-Driven Approach to Violin Making

Sebastian Gonzalez, Davide Salvi, Daniel Baeza, Fabio Antonacci, Augusto Sarti

Subjects: Computational Engineering, Finance, and Science (cs.CE); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[125] arXiv:2102.04429 (cross-list from cs.SD) [pdf, other]: Title: Federated Acoustic Modeling For Automatic Speech Recognition

Xiaodong Cui, Songtao Lu, Brian Kingsbury

Comments: Accepted by ICASSP 2021

Subjects: Sound (cs.SD); Distributed, Parallel, and Cluster Computing (cs.DC); Audio and Speech Processing (eess.AS)
[126] arXiv:2102.04488 (cross-list from cs.CL) [pdf, other]: Title: Wake Word Detection with Streaming Transformers

Yiming Wang, Hang Lv, Daniel Povey, Lei Xie, Sanjeev Khudanpur

Comments: Accepted at IEEE ICASSP 2021. 5 pages, 3 figures

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[127] arXiv:2102.04588 (cross-list from cs.SD) [pdf, other]: Title: A comparative study of two-dimensional vocal tract acoustic modeling based on Finite-Difference Time-Domain methods

Debasish Ray Mohapatra, Victor Zappi, Sidney Fels

Comments: 4 pages, 3 figures

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[128] arXiv:2102.04680 (cross-list from cs.SD) [pdf, other]: Title: TräumerAI: Dreaming Music with StyleGAN

Dasaem Jeong, Seungheon Doh, Taegyun Kwon

Comments: presented in NeurIPS Workshop 2020: Machine Learning for Creativity and Design

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[129] arXiv:2102.04740 (cross-list from stat.ME) [pdf, other]: Title: Principal components variable importance reconstruction (PC-VIR): Exploring predictive importance in multicollinear acoustic speech data

Christopher Carignan, Ander Egurtzegi

Comments: 10 pages, 3 figures, GitHub repository

Subjects: Methodology (stat.ME); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[130] arXiv:2102.04832 (cross-list from eess.SP) [pdf, other]: Title: Fast and Accurate Amplitude Demodulation of Wideband Signals

Mantas Gabrielaitis

Comments: Accepted for publication in IEEE Transactions on Signal Processing

Subjects: Signal Processing (eess.SP); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[131] arXiv:2102.04880 (cross-list from cs.SD) [pdf, other]: Title: Diagnosis of COVID-19 and Non-COVID-19 Patients by Classifying Only a Single Cough Sound

Masoud Maleki

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Optimization and Control (math.OC)
[132] arXiv:2102.04932 (cross-list from cs.LG) [pdf, other]: Title: Sparsification via Compressed Sensing for Automatic Speech Recognition

Kai Zhen (1 and 2), Hieu Duy Nguyen (2), Feng-Ju Chang (2), Athanasios Mouchtaris (2), Ariya Rastrow (2). ((1) Indiana University Bloomington, (2) Alexa Machine Learning, Amazon, USA)

Comments: 5 pages, accepted for publication in (ICASSP 2021) 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing. June 6-12, 2021. Location: Toronto, ON, Canada

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[133] arXiv:2102.04945 (cross-list from cs.SD) [pdf, other]: Title: On permutation invariant training for speech source separation

Xiaoyu Liu, Jordi Pons

Comments: In proceedings of ICASSP2021

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[134] arXiv:2102.04997 (cross-list from cs.LG) [pdf, other]: Title: Deep Neural Network based Cough Detection using Bed-mounted Accelerometer Measurements

Madhurananda Pahar, Igor Miranda, Andreas Diacon, Thomas Niesler

Comments: Accepted at ICASSP 2021. Copyright information is shown on the first page

Journal-ref: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021

Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[135] arXiv:2102.05151 (cross-list from cs.SD) [pdf, other]: Title: Enhancing Audio Augmentation Methods with Consistency Learning

Turab Iqbal, Karim Helwani, Arvindh Krishnaswamy, Wenwu Wang

Comments: Accepted to 46th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2021)

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[136] arXiv:2102.05225 (cross-list from cs.SD) [pdf, other]: Title: Exploring Automatic COVID-19 Diagnosis via voice and symptoms from Crowdsourced Data

Jing Han, Chloë Brown, Jagmohan Chauhan, Andreas Grammenos, Apinan Hasthanasombat, Dimitris Spathis, Tong Xia, Pietro Cicuta, Cecilia Mascolo

Comments: 5 pages, 3 figures, 2 tables, Accepted for publication at ICASSP 2021

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[137] arXiv:2102.05630 (cross-list from cs.SD) [pdf, other]: Title: Voice Cloning: a Multi-Speaker Text-to-Speech Synthesis Approach based on Transfer Learning

Giuseppe Ruggiero, Enrico Zovato, Luigi Di Caro, Vincent Pollet

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[138] arXiv:2102.05749 (cross-list from cs.SD) [pdf, other]: Title: Self-Supervised VQ-VAE for One-Shot Music Style Transfer

Ondřej Cífka, Alexey Ozerov, Umut Şimşekli, Gaël Richard

Comments: ICASSP 2021. Website: this https URL

Journal-ref: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (2021) 96-100

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[139] arXiv:2102.05872 (cross-list from cs.SD) [pdf, other]: Title: Onoma-to-wave: Environmental sound synthesis from onomatopoeic words

Yuki Okamoto, Keisuke Imoto, Shinnosuke Takamichi, Ryosuke Yamanishi, Takahiro Fukumori, Yoichi Yamashita

Comments: Accepted to APSIPA Transactions on Signal and Information Processing

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[140] arXiv:2102.05894 (cross-list from cs.SD) [pdf, other]: Title: CASA-Based Speaker Identification Using Cascaded GMM-CNN Classifier in Noisy and Emotional Talking Conditions

Ali Bou Nassif, Ismail Shahin, Shibani Hamsa, Nawel Nemmour, Keikichi Hirose

Comments: Published in Applied Soft Computing journal

Journal-ref: Applied Soft Computing, Elsevier, 2021

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[141] arXiv:2102.06003 (cross-list from cs.SD) [pdf, other]: Title: Language Independent Emotion Quantification using Non linear Modelling of Speech

Uddalok Sarkar, Sayan Nag, Chirayata Bhattacharya, Shankha Sanyal, Archi Banerjee, Ranjan Sengupta, Dipak Ghosh

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[142] arXiv:2102.06034 (cross-list from cs.SD) [pdf, other]: Title: Speech enhancement with mixture-of-deep-experts with clean clustering pre-training

Shlomo E. Chazan, Jacob Goldberger, Sharon Gannot

Comments: arXiv admin note: text overlap with arXiv:1703.09302

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[143] arXiv:2102.06038 (cross-list from cs.SD) [pdf, other]: Title: A Fractal Approach to Characterize Emotions in Audio and Visual Domain: A Study on Cross-Modal Interaction

Sayan Nag, Uddalok Sarkar, Shankha Sanyal, Archi Banerjee, Souparno Roy, Samir Karmakar, Ranjan Sengupta, Dipak Ghosh

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[144] arXiv:2102.06142 (cross-list from cs.SD) [pdf, other]: Title: Multichannel-based learning for audio object extraction

Daniel Arteaga, Jordi Pons

Comments: In proceedings of ICASSP2021. Appendix added

Journal-ref: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021, pp. 206-210

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[145] arXiv:2102.06269 (cross-list from eess.IV) [pdf, other]: Title: Disentanglement for audio-visual emotion recognition using multitask setup

Raghuveer Peri, Srinivas Parthasarathy, Charles Bradshaw, Shiva Sundaram

Comments: Accepted for ICASSP 2021, 5 pages

Subjects: Image and Video Processing (eess.IV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[146] arXiv:2102.06283 (cross-list from cs.CL) [pdf, other]: Title: Speech-language Pre-training for End-to-end Spoken Language Understanding

Yao Qian, Ximo Bian, Yu Shi, Naoyuki Kanda, Leo Shen, Zhen Xiao, Michael Zeng

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[147] arXiv:2102.06291 (cross-list from cs.SD) [pdf, other]: Title: A Multi-View Approach To Audio-Visual Speaker Verification

Leda Sarı, Kritika Singh, Jiatong Zhou, Lorenzo Torresani, Nayan Singhal, Yatharth Saraf

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
[148] arXiv:2102.06357 (cross-list from cs.SD) [pdf, other]: Title: Contrastive Unsupervised Learning for Speech Emotion Recognition

Mao Li, Bo Yang, Joshua Levy, Andreas Stolcke, Viktor Rozgic, Spyros Matsoukas, Constantinos Papayiannis, Daniel Bone, Chao Wang

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[149] arXiv:2102.06380 (cross-list from cs.CL) [pdf, other]: Title: Neural Inverse Text Normalization

Monica Sunkara, Chaitanya Shivade, Sravan Bodapati, Katrin Kirchhoff

Comments: 5 pages, accepted to ICASSP 2021

Subjects: Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[150] arXiv:2102.06393 (cross-list from eess.SP) [pdf, other]: Title: Mind the beat: detecting audio onsets from EEG recordings of music listening

Ashvala Vinay, Alexander Lerch, Grace Leslie

Comments: To be published in ICASSP 2021. 4 figures, 5 pages (4 pages of content + 1 page of references)

Subjects: Signal Processing (eess.SP); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[151] arXiv:2102.06431 (cross-list from cs.SD) [pdf, other]: Title: VARA-TTS: Non-Autoregressive Text-to-Speech Synthesis based on Very Deep VAE with Residual Attention

Peng Liu, Yuewen Cao, Songxiang Liu, Na Hu, Guangzhi Li, Chao Weng, Dan Su

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[152] arXiv:2102.06455 (cross-list from cs.SD) [pdf, other]: Title: Deep Sound Field Reconstruction in Real Rooms: Introducing the ISOBEL Sound Field Dataset

Miklas Strøm Kristoffersen, Martin Bo Møller, Pablo Martínez-Nuevo, Jan Østergaard

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[153] arXiv:2102.06467 (cross-list from cs.SD) [pdf, other]: Title: Content-Aware Speaker Embeddings for Speaker Diarisation

G. Sun, D. Liu, C. Zhang, P. C. Woodland

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
[154] arXiv:2102.06657 (cross-list from cs.CV) [pdf, other]: Title: End-to-end Audio-visual Speech Recognition with Conformers

Pingchuan Ma, Stavros Petridis, Maja Pantic

Comments: Accepted to ICASSP 2021

Subjects: Computer Vision and Pattern Recognition (cs.CV); Audio and Speech Processing (eess.AS)
[155] arXiv:2102.06750 (cross-list from cs.CL) [pdf, other]: Title: Do as I mean, not as I say: Sequence Loss Training for Spoken Language Understanding

Milind Rao, Pranav Dheram, Gautam Tiwari, Anirudh Raju, Jasha Droppo, Ariya Rastrow, Andreas Stolcke

Comments: Proc. IEEE ICASSP 2021

Subjects: Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[156] arXiv:2102.06930 (cross-list from cs.SD) [pdf, other]: Title: Deep Convolutional and Recurrent Networks for Polyphonic Instrument Classification from Monophonic Raw Audio Waveforms

Kleanthis Avramidis, Agelos Kratimenos, Christos Garoufis, Athanasia Zlatintsi, Petros Maragos

Comments: 5 pages, 4 figures, 6 tables, to be published in the Proc. of the 46th International Conference on Acoustics, Speech and Signal Processing (ICASSP 2021) @ Toronto, Ontario, Canada

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[157] arXiv:2102.06934 (cross-list from cs.SD) [pdf, other]: Title: Multi-Channel Speech Enhancement using Graph Neural Networks

Panagiotis Tzirakis, Anurag Kumar, Jacob Donley

Journal-ref: Proc. ICASSP 2021

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[158] arXiv:2102.07133 (cross-list from cs.SD) [pdf, other]: Title: Parametric Optimization of Violin Top Plates using Machine Learning

Davide Salvi, Sebastian Gonzalez, Fabio Antonacci, Augusto Sarti

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[159] arXiv:2102.07259 (cross-list from cs.SD) [pdf, other]: Title: Thank you for Attention: A survey on Attention-based Artificial Neural Networks for Automatic Speech Recognition

Priyabrata Karmakar, Shyh Wei Teng, Guojun Lu

Comments: Submitted to IEEE/ACM Trans. on Audio, Speech, and Language Processing

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[160] arXiv:2102.07307 (cross-list from cs.SD) [pdf, other]: Title: I-vector Based Within Speaker Voice Quality Identification on connected speech

Chuyao Feng, Eva van Leer, Mackenzie Lee Curtis, David V. Anderson

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[161] arXiv:2102.07594 (cross-list from cs.CL) [pdf, other]: Title: Fast End-to-End Speech Recognition via Non-Autoregressive Models and Cross-Modal Knowledge Transferring from BERT

Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengkun Tian, Zhengqi Wen, Shuai Zhang

Comments: 14 pages, 7 figures

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[162] arXiv:2102.07896 (cross-list from eess.SP) [pdf, other]: Title: A multispeaker dataset of raw and reconstructed speech production real-time MRI video and 3D volumetric images

Yongwan Lim, Asterios Toutios, Yannick Bliesener, Ye Tian, Sajan Goud Lingala, Colin Vaz, Tanner Sorensen, Miran Oh, Sarah Harper, Weiyi Chen, Yoonjeong Lee, Johannes Töger, Mairym Lloréns Montesserin, Caitlin Smith, Bianca Godinez, Louis Goldstein, Dani Byrd, Krishna S. Nayak, Shrikanth S. Narayanan

Comments: 27 pages, 6 figures, 5 tables, submitted to Nature Scientific Data

Subjects: Signal Processing (eess.SP); Sound (cs.SD); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
[163] arXiv:2102.07982 (cross-list from cs.SD) [pdf, other]: Title: Voice Gender Scoring and Independent Acoustic Characterization of Perceived Masculinity and Femininity

Fuling Chen, Roberto Togneri, Murray Maybery, Diana Tan

Comments: 24 pages, 7 figures, journal

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[164] arXiv:2102.07990 (cross-list from eess.SP) [pdf, other]: Title: Through-the-Wall Radar under Electromagnetic Complex Wall: A Deep Learning Approach

Fardin Ghorbani, Hossein Soleimani

Subjects: Signal Processing (eess.SP); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[165] arXiv:2102.08015 (cross-list from cs.SD) [pdf, other]: Title: Improving speech recognition models with small samples for air traffic control systems

Yi Lin, Qin Li, Bo Yang, Zhen Yan, Huachun Tan, Zhengmao Chen

Comments: This work has been accepted by Neurocomputing for publication

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[166] arXiv:2102.08074 (cross-list from cs.SD) [pdf, other]: Title: Semi Supervised Learning For Few-shot Audio Classification By Episodic Triplet Mining

Swapnil Bhosale, Rupayan Chakraborty, Sunil Kumar Kopparapu

Comments: 5 pages

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[167] arXiv:2102.08183 (cross-list from cs.SD) [pdf, other]: Title: Comparison of semi-supervised deep learning algorithms for audio classification

Léo Cances, Etienne Labbé, Thomas Pellegrini

Comments: 9 pages, 5 figures, 5 tables. This is the version 3 of the paper. Contains minor fixes compared to the EURASIP one (which is the version 2 of the paper)

Journal-ref: EURASIP Journal on Audio, Speech, and Music Processing, 2022

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[168] arXiv:2102.08359 (cross-list from cs.SD) [pdf, other]: Title: End-2-End COVID-19 Detection from Breath & Cough Audio

Harry Coppock, Alexander Gaskell, Panagiotis Tzirakis, Alice Baird, Lyn Jones, Björn W. Schuller

Comments: 5 pages

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[169] arXiv:2102.08535 (cross-list from cs.CL) [pdf, other]: Title: ATCSpeechNet: A multilingual end-to-end speech recognition framework for air traffic control systems

Yi Lin, Bo Yang, Linchao Li, Dongyue Guo, Jianwei Zhang, Hu Chen, Yi Zhang

Comments: An improved work based on our previous Interspeech 2020 paper (this https URL)

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[170] arXiv:2102.08551 (cross-list from cs.SD) [pdf, other]: Title: Weighted Recursive Least Square Filter and Neural Network based Residual Echo Suppression for the AEC-Challenge

Ziteng Wang, Yueyue Na, Zhang Liu, Biao Tian, Qiang Fu

Comments: 5 pages, 2 figures, accepted by ICASSP 2021

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[171] arXiv:2102.08575 (cross-list from cs.SD) [pdf, other]: Title: End-to-end lyrics Recognition with Voice to Singing Style Transfer

Sakya Basak, Shrutina Agarwal, Sriram Ganapathy, Naoya Takahashi

Comments: accepted at ICASSP 2021

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[172] arXiv:2102.08833 (cross-list from cs.SD) [pdf, other]: Title: DESED-FL and URBAN-FL: Federated Learning Datasets for Sound Event Detection

David S. Johnson, Wolfgang Lorenz, Michael Taenzer, Stylianos Mimilakis, Sascha Grollmisch, Jakob Abeßer, Hanna Lukashevich

Comments: To be published in EUSIPCO 2021

Subjects: Sound (cs.SD); Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[173] arXiv:2102.09202 (cross-list from cs.SD) [pdf, other]: Title: Low Resource Audio-to-Lyrics Alignment From Polyphonic Music Recordings

Emir Demirel, Sven Ahlbäck, Simon Dixon

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[174] arXiv:2102.09281 (cross-list from cs.LG) [pdf, other]: Title: DINO: A Conditional Energy-Based GAN for Domain Translation

Konstantinos Vougioukas, Stavros Petridis, Maja Pantic

Comments: Accepted to ICLR 2021

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[175] arXiv:2102.09607 (cross-list from cs.LG) [pdf, other]: Title: Modelling Paralinguistic Properties in Conversational Speech to Detect Bipolar Disorder and Borderline Personality Disorder

Bo Wang, Yue Wu, Nemanja Vaci, Maria Liakata, Terry Lyons, Kate E A Saunders

Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[176] arXiv:2102.09680 (cross-list from cs.CL) [pdf, other]: Title: Fixing Errors of the Google Voice Recognizer through Phonetic Distance Metrics

Diego Campos-Sobrino, Mario Campos-Soberanis, Iván Martínez-Chin, Víctor Uc-Cetina

Comments: 13 pages, 4 figures. This article is a translation of the paper "Corrección de errores del reconocedor de voz de Google usando métricas de distancia fonética" presented in COMIA 2018

Journal-ref: Research in Computing Science 148(1), 2019, pp. 57-70. ISSN 1870-4069

Subjects: Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[177] arXiv:2102.09737 (cross-list from cs.CV) [pdf, other]: Title: One Shot Audio to Animated Video Generation

Neeraj Kumar, Srishti Goel, Ankur Narang, Brejesh Lall, Mujtaba Hasan, Pranshu Agarwal, Dipankar Sarkar

Comments: arXiv admin note: substantial text overlap with arXiv:2012.07842, arXiv:2012.07304

Subjects: Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
[178] arXiv:2102.09763 (cross-list from cs.SD) [pdf, other]: Title: Frequency-Temporal Attention Network for Singing Melody Extraction

Shuai Yu, Xiaoheng Sun, Yi Yu, Wei Li

Comments: This paper has been accepted by ICASSP 2021

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[179] arXiv:2102.09794 (cross-list from cs.SD) [pdf, other]: Title: Hierarchical Recurrent Neural Networks for Conditional Melody Generation with Long-term Structure

Zixun Guo, Dimos Makris, Dorien Herremans

Journal-ref: Proc. of the International Joint Conference on Neural Networks (IJCNN), Shenzhen, China, 18-22 July 2021 (virtual)

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[180] arXiv:2102.09817 (cross-list from cs.SD) [pdf, other]: Title: Unit selection synthesis based data augmentation for fixed phrase speaker verification

Houjun Huang, Xu Xiang, Fei Zhao, Shuai Wang, Yanmin Qian

Comments: Accepted to ICASSP 2021

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[181] arXiv:2102.09828 (cross-list from cs.SD) [pdf, other]: Title: AISPEECH-SJTU accent identification system for the Accented English Speech Recognition Challenge

Houjun Huang, Xu Xiang, Yexin Yang, Rao Ma, Yanmin Qian

Comments: Accepted to ICASSP 2021

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[182] arXiv:2102.09914 (cross-list from cs.CL) [pdf, other]: Title: Alternate Endings: Improving Prosody for Incremental Neural TTS with Predicted Future Text Input

Brooke Stephenson, Thomas Hueber, Laurent Girin, Laurent Besacier

Comments: 4 pages

Subjects: Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[183] arXiv:2102.09966 (cross-list from cs.SD) [pdf, other]: Title: CatNet: music source separation system with mix-audio augmentation

Xuchen Song, Qiuqiang Kong, Xingjian Du, Yuxuan Wang

Comments: 5 pages

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[184] arXiv:2102.09971 (cross-list from cs.SD) [pdf, other]: Title: Speech enhancement with weakly labelled data from AudioSet

Qiuqiang Kong, Haohe Liu, Xingjian Du, Li Chen, Rui Xia, Yuxuan Wang

Comments: 5 pages

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[185] arXiv:2102.09978 (cross-list from cs.SD) [pdf, other]: Title: TransMask: A Compact and Fast Speech Separation Model Based on Transformer

Zining Zhang, Bingsheng He, Zhenjie Zhang

Comments: Accepted in ICASSP2021

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[186] arXiv:2102.10233 (cross-list from cs.SD) [pdf, other]: Title: The Accented English Speech Recognition Challenge 2020: Open Datasets, Tracks, Baselines, Results and Methods

Xian Shi, Fan Yu, Yizhou Lu, Yuhao Liang, Qiangze Feng, Daliang Wang, Yanmin Qian, Lei Xie

Comments: Accepted by ICASSP 2021

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[187] arXiv:2102.10236 (cross-list from cs.SD) [pdf, other]: Title: Singer Identification Using Deep Timbre Feature Learning with KNN-Net

Xulong Zhang, Jiale Qian, Yi Yu, Yifu Sun, Wei Li

Comments: Published as a conference paper at ICASSP 2021

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[188] arXiv:2102.10322 (cross-list from cs.SD) [pdf, other]: Title: Learnable MFCCs for Speaker Verification

Xuechen Liu, Md Sahidullah, Tomi Kinnunen

Comments: Accepted to ISCAS 2021

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[189] arXiv:2102.10331 (cross-list from q-bio.NC) [pdf, other]: Title: Separating Stimulus-Induced and Background Components of Dynamic Functional Connectivity in Naturalistic fMRI

Chee-Ming Ting, Jeremy I. Skipper, Steven L. Small, Hernando Ombao

Comments: Main paper: 10 pages, 8 figures. Supplemental file: 3 pages

Subjects: Neurons and Cognition (q-bio.NC); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV); Signal Processing (eess.SP); Applications (stat.AP)
[190] arXiv:2102.10515 (cross-list from cs.SD) [pdf, other]: Title: Anomaly Detection in Audio with Concept Drift using Adaptive Huffman Coding

Pratibha Kumari, Mukesh Saini

Comments: 22 pages, 8 figures

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[191] arXiv:2102.10905 (cross-list from cs.CL) [pdf, other]: Title: Joint Intent Detection And Slot Filling Based on Continual Learning Model

Yanfei Hui, Jianzong Wang, Ning Cheng, Fengying Yu, Tianbo Wu, Jing Xiao

Comments: Accepted to ICASSP 2021

Subjects: Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[192] arXiv:2102.11058 (cross-list from cs.SD) [pdf, other]: Title: Anyone GAN Sing

Shreeviknesh Sankaran, Sukavanan Nanjundan, G. Paavai Anand

Comments: 5 pages, 8 figures

Journal-ref: International Journal of Emerging Technologies and Innovative Research (www.jetir.org), ISSN: 2349-5162, Vol.7, Issue 5, page no. 25-29, May-2020

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[193] arXiv:2102.11114 (cross-list from cs.CL) [pdf, other]: Title: Generating Human Readable Transcript for Automatic Speech Recognition with Pre-trained Language Model

Junwei Liao, Yu Shi, Ming Gong, Linjun Shou, Sefik Eskimez, Liyang Lu, Hong Qu, Michael Zeng

Comments: Accepted in 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2021)

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[194] arXiv:2102.11420 (cross-list from cs.SD) [pdf, other]: Title: Investigating Deep Neural Structures and their Interpretability in the Domain of Voice Conversion

Samuel J. Broughton, Md Asif Jalal, Roger K. Moore

Comments: For demo, see this https URL

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[195] arXiv:2102.11457 (cross-list from cs.SD) [pdf, other]: Title: Investigating Local and Global Information for Automated Audio Captioning with Transfer Learning

Xuenan Xu, Heinrich Dinkel, Mengyue Wu, Zeyu Xie, Kai Yu

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[196] arXiv:2102.11474 (cross-list from cs.SD) [pdf, other]: Title: Text-to-Audio Grounding: Building Correspondence Between Captions and Sound Events

Xuenan Xu, Heinrich Dinkel, Mengyue Wu, Kai Yu

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[197] arXiv:2102.11488 (cross-list from cs.SD) [pdf, other]: Title: Senone-aware Adversarial Multi-task Training for Unsupervised Child to Adult Speech Adaptation

Richeng Duan, Nancy F. Chen

Comments: accepted for presentation at ICASSP-2021

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[198] arXiv:2102.11531 (cross-list from cs.SD) [pdf, other]: Title: Memory-efficient Speech Recognition on Smart Devices

Ganesh Venkatesh, Alagappan Valliappan, Jay Mahadeokar, Yuan Shangguan, Christian Fuegen, Michael L. Seltzer, Vikas Chandra

Journal-ref: ICASSP 2021

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[199] arXiv:2102.11588 (cross-list from cs.SD) [pdf, other]: Title: Data Fusion for Audiovisual Speaker Localization: Extending Dynamic Stream Weights to the Spatial Domain

Julio Wissing, Benedikt Boenninghoff, Dorothea Kolossa, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Tomohiro Nakatani, Shoko Araki, Christopher Schymura

Comments: 4 pages, 6 figures, ICASSP 2021

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
[200] arXiv:2102.11771 (cross-list from cs.SD) [pdf, other]: Title: Improving Deep Learning Sound Events Classifiers using Gram Matrix Feature-wise Correlations

Antonio Joia Neto, Andre G C Pacheco, Diogo C Luvizon

Comments: To appear on ICASSP 2021

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[201] arXiv:2102.12111 (cross-list from cs.SD) [pdf, other]: Title: Deep Learning Approach for Singer Voice Classification of Vietnamese Popular Music

Toan Pham Van, Ngoc N. Tran, Ta Minh Thanh

Comments: Published in SoICT 2019: Proceedings of the Tenth International Symposium on Information and Communication Technology

Journal-ref: SoICT 2019: Proceedings of the Tenth International Symposium on Information and Communication Technology

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[202] arXiv:2102.12289 (cross-list from cs.SD) [pdf, other]: Title: Automatic Feature Extraction for Heartbeat Anomaly Detection

Robert-George Colt, Csongor-Huba Várady, Riccardo Volpi, Luigi Malagò

Comments: 7 pages, 2 figures, Presented at PharML 2020 Workshop - European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD), see this https URL, source-code: this https URL

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[203] arXiv:2102.12564 (cross-list from cs.SD) [pdf, other]: Title: Triplet loss based embeddings for forensic speaker identification in Spanish

Emmanuel Maqueda, Javier Alvarez-Jimenez, Carlos Mena, Ivan Meza

Comments: Long Paper: Neural Computing and Applications, Special Issue on LatinX in AI Research (2021). 11 pages, 5 figures

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[204] arXiv:2102.12664 (cross-list from cs.CL) [pdf, other]: Title: MixSpeech: Data Augmentation for Low-resource Automatic Speech Recognition

Linghui Meng, Jin Xu, Xu Tan, Jindong Wang, Tao Qin, Bo Xu

Comments: To appear at ICASSP 2021

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[205] arXiv:2102.12841 (cross-list from cs.SD) [pdf, other]: Title: MaskCycleGAN-VC: Learning Non-parallel Voice Conversion with Filling in Frames

Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo

Comments: Accepted to ICASSP 2021. Project page: this http URL

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[206] arXiv:2102.13314 (cross-list from cs.LG) [pdf, other]: Title: Efficient Client Contribution Evaluation for Horizontal Federated Learning

Jie Zhao, Xinghua Zhu, Jianzong Wang, Jing Xiao

Comments: Accepted to ICASSP 2021

Subjects: Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[207] arXiv:2102.13479 (cross-list from cs.SD) [pdf, other]: Title: Towards Explaining Expressive Qualities in Piano Recordings: Transfer of Explanatory Features via Acoustic Domain Adaptation

Shreyan Chowdhury, Gerhard Widmer

Comments: 5 pages, 3 figures; accepted for IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2021)

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[208] arXiv:2102.13552 (cross-list from cs.SD) [pdf, other]: Title: The NPU System for the 2020 Personalized Voice Trigger Challenge

Jingyong Hou, Li Zhang, Yihui Fu, Qing Wang, Zhanheng Yang, Qijie Shao, Lei Xie

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
