Sound

Authors and titles for April 2022

Total of 291 entries : 1-25 26-50 51-75 76-100 101-125 126-150 ... 276-291

[51] arXiv:2204.03255 [pdf, other]: Title: Arabic Text-To-Speech (TTS) Data Preparation

Hala Al Masri, Muhy Eddin Za'ter

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[52] arXiv:2204.03307 [pdf, other]: Title: Genre-conditioned Acoustic Models for Automatic Lyrics Transcription of Polyphonic Music

Xiaoxue Gao, Chitralekha Gupta, Haizhou Li

Comments: 5 pages, 1 figure, accepted by IEEE ICASSP 2022

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[53] arXiv:2204.03398 [pdf, other]: Title: Linguistic-Acoustic Similarity Based Accent Shift for Accent Recognition

Qijie Shao, Jinghao Yan, Jian Kang, Pengcheng Guo, Xian Shi, Pengfei Hu, Lei Xie

Comments: Accepted by Interspeech 2022

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[54] arXiv:2204.03421 [pdf, other]: Title: Self-supervised learning for robust voice cloning

Konstantinos Klapsas, Nikolaos Ellinas, Karolos Nikitaras, Georgios Vamvoukakis, Panos Kakoulidis, Konstantinos Markopoulos, Spyros Raptis, June Sig Sung, Gunu Jho, Aimilios Chalamandaris, Pirros Tsiakoulis

Comments: Accepted to INTERSPEECH 2022

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[55] arXiv:2204.03594 [pdf, other]: Title: Heterogeneous Target Speech Separation

Efthymios Tzinis, Gordon Wichern, Aswin Subramanian, Paris Smaragdis, Jonathan Le Roux

Comments: Submitted to Interspeech 2022

Journal-ref: Interspeech 2022

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[56] arXiv:2204.03740 [pdf, other]: Title: Successes and critical failures of neural networks in capturing human-like speech recognition

Federico Adolfi, Jeffrey S. Bowers, David Poeppel

Journal-ref: Neural Networks, 162, 199-211 (2023)

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS); Neurons and Cognition (q-bio.NC)
[57] arXiv:2204.03847 [pdf, other]: Title: Enhanced exemplar autoencoder with cycle consistency loss in any-to-one voice conversion

Weida Liang, Lantian Li, Wenqiang Du, Dong Wang

Comments: submitted to INTERSPEECH 2022

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[58] arXiv:2204.03852 [pdf, other]: Title: Reliable Visualization for Deep Speaker Recognition

Pengqi Li, Lantian Li, Askar Hamdulla, Dong Wang

Comments: submitted to INTERSPEECH 2022

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[59] arXiv:2204.03889 [pdf, other]: Title: Adding Connectionist Temporal Summarization into Conformer to Improve Its Decoder Efficiency For Speech Recognition

Nick J.C. Wang, Zongfeng Quan, Shaojun Wang, Jing Xiao

Comments: Submitted to INTERSPEECH 2022 (5 pages, 2 figures)

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[60] arXiv:2204.03967 [pdf, other]: Title: The Sillwood Technologies System for the VoiceMOS Challenge 2022

Jiameng Gao

Comments: Submitted to Interspeech 2022

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[61] arXiv:2204.04166 [pdf, other]: Title: Self-supervised Speaker Diarization

Yehoshua Dissen, Felix Kreuk, Joseph Keshet

Comments: Submitted to Interspeech 2022

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[62] arXiv:2204.04464 [pdf, other]: Title: Multichannel Speech Separation with Narrow-band Conformer

Changsheng Quan, Xiaofei Li

Comments: accepted by INTERSPEECH 2022

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[63] arXiv:2204.04579 [pdf, other]: Title: Inferring Pitch from Coarse Spectral Features

Danni Ma, Neville Ryant, Mark Liberman

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[64] arXiv:2204.04645 [pdf, other]: Title: Self-Supervised Audio-and-Text Pre-training with Extremely Low-Resource Parallel Data

Yu Kang, Tianqiao Liu, Hang Li, Yang Hao, Wenbiao Ding

Comments: AAAI 2022

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[65] arXiv:2204.04646 [pdf, other]: Title: Deep Embeddings for Robust User-Based Amateur Vocal Percussion Classification

Alejandro Delgado, Emir Demirel, Vinod Subramanian, Charalampos Saitis, Mark Sandler

Comments: Accepted at Sound and Music Computing (SMC) conference 2022

Subjects: Sound (cs.SD); Information Retrieval (cs.IR); Audio and Speech Processing (eess.AS)
[66] arXiv:2204.04651 [pdf, other]: Title: Deep Conditional Representation Learning for Drum Sample Retrieval by Vocalisation

Alejandro Delgado, Charalampos Saitis, Emmanouil Benetos, Mark Sandler

Comments: Submitted to Interspeech 2022 (under review)

Subjects: Sound (cs.SD); Information Retrieval (cs.IR); Audio and Speech Processing (eess.AS)
[67] arXiv:2204.04756 [pdf, other]: Title: Towards Evaluation of Autonomously Generated Musical Compositions: A Comprehensive Survey

Daniel Kvak

Subjects: Sound (cs.SD); Computers and Society (cs.CY); Human-Computer Interaction (cs.HC); Audio and Speech Processing (eess.AS)
[68] arXiv:2204.04802 [pdf, other]: Title: On the pragmatism of using binary classifiers over data intensive neural network classifiers for detection of COVID-19 from voice

Ankit Shah, Hira Dhamyal, Yang Gao, Daniel Arancibia, Mario Arancibia, Bhiksha Raj, Rita Singh

Comments: Submitted to ICASSP 2022

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[69] arXiv:2204.04855 [pdf, other]: Title: Fusion of Self-supervised Learned Models for MOS Prediction

Zhengdong Yang, Wangjin Zhou, Chenhui Chu, Sheng Li, Raj Dabre, Raphael Rubino, Yi Zhao

Comments: MOS 2022 shared task system description paper

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[70] arXiv:2204.05070 [pdf, other]: Title: Fine-grained Noise Control for Multispeaker Speech Synthesis

Karolos Nikitaras, Georgios Vamvoukakis, Nikolaos Ellinas, Konstantinos Klapsas, Konstantinos Markopoulos, Spyros Raptis, June Sig Sung, Gunu Jho, Aimilios Chalamandaris, Pirros Tsiakoulis

Comments: Accepted to INTERSPEECH 2022

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[71] arXiv:2204.05082 [pdf, other]: Title: An approach to improving sound-based vehicle speed estimation

Nikola Bulatovic, Slobodan Djukanovic

Comments: Submitted to: 2022 Zooming Innovation in Consumer Technologies Conference (ZINC)

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[72] arXiv:2204.05156 [pdf, other]: Title: How to Listen? Rethinking Visual Sound Localization

Ho-Hsiang Wu, Magdalena Fuentes, Prem Seetharaman, Juan Pablo Bello

Comments: Submitted to INTERSPEECH 2022

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[73] arXiv:2204.05222 [pdf, other]: Title: INTERSPEECH 2022 Audio Deep Packet Loss Concealment Challenge

Lorenz Diener, Sten Sootla, Solomiya Branets, Ando Saabas, Robert Aichner, Ross Cutler

Comments: 4 pages + 1 page references, 1 figure, 2 tables. Submitted to INTERSPEECH 2022

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[74] arXiv:2204.05445 [pdf, other]: Title: Small Footprint Multi-channel ConvMixer for Keyword Spotting with Centroid Based Awareness

Dianwen Ng, Jin Hui Pang, Yang Xiao, Biao Tian, Qiang Fu, Eng Siong Chng

Comments: submitted to INTERSPEECH 2022

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[75] arXiv:2204.05571 [pdf, other]: Title: Speech Emotion Recognition with Global-Aware Fusion on Multi-scale Feature Representation

Wenjing Zhu, Xiang Li

Comments: 6 pages, 3 figures, ICASSP 2022

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
