Sound

Authors and titles for June 2021

Total of 249 entries : 1-25 ... 101-125 126-150 151-175 176-200 201-225 226-249

[176] arXiv:2106.08637 (cross-list from cs.CL) [pdf, other]: Title: Topic Classification on Spoken Documents Using Deep Acoustic and Linguistic Features

Tan Liu, Wu Guo, Bin Gu

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[177] arXiv:2106.08649 (cross-list from eess.AS) [pdf, other]: Title: Improving the expressiveness of neural vocoding with non-affine Normalizing Flows

Adam Gabryś, Yunlong Jiao, Viacheslav Klimkov, Daniel Korzekwa, Roberto Barra-Chicote

Comments: Accepted to Interspeech 2021, 5 pages, 3 figures

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[178] arXiv:2106.08672 (cross-list from eess.AS) [pdf, other]: Title: DCCRN+: Channel-wise Subband DCCRN with SNR Estimation for Speech Enhancement

Shubo Lv, Yanxin Hu, Shimin Zhang, Lei Xie

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[179] arXiv:2106.08686 (cross-list from cs.CL) [pdf, other]: Title: Do Acoustic Word Embeddings Capture Phonological Similarity? An Empirical Study

Badr M. Abdullah, Marius Mosbach, Iuliia Zaitova, Bernd Möbius, Dietrich Klakow

Comments: Accepted in Interspeech 2021

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[180] arXiv:2106.08689 (cross-list from cs.CL) [pdf, other]: Title: Alzheimer's Disease Detection from Spontaneous Speech through Combining Linguistic Complexity and (Dis)Fluency Features with Pretrained Language Models

Yu Qiao, Xuefeng Yin, Daniel Wiechmann, Elma Kerz

Comments: Accepted at Interspeech 2021

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[181] arXiv:2106.08706 (cross-list from eess.IV) [pdf, other]: Title: Silent Speech and Emotion Recognition from Vocal Tract Shape Dynamics in Real-Time MRI

Laxmi Pandey, Ahmed Sabbir Arif

Comments: 8 pages

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[182] arXiv:2106.08741 (cross-list from eess.AS) [pdf, other]: Title: Enriching Source Style Transfer in Recognition-Synthesis based Non-Parallel Voice Conversion

Zhichao Wang, Xinyong Zhou, Fengyu Yang, Tao Li, Hongqiang Du, Lei Xie, Wendong Gan, Haitao Chen, Hai Li

Comments: Accepted by Interspeech 2021

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[183] arXiv:2106.08859 (cross-list from cs.CL) [pdf, other]: Title: Attention-Based Keyword Localisation in Speech using Visual Grounding

Kayode Olaleye, Herman Kamper

Comments: Accepted to Interspeech 2021

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[184] arXiv:2106.08922 (cross-list from eess.AS) [pdf, other]: Title: Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition

Yosuke Higuchi, Niko Moritz, Jonathan Le Roux, Takaaki Hori

Comments: Accepted to Interspeech 2021

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[185] arXiv:2106.08960 (cross-list from cs.CL) [pdf, other]: Title: Collaborative Training of Acoustic Encoders for Speech Recognition

Varun Nagaraja, Yangyang Shi, Ganesh Venkatesh, Ozlem Kalinli, Michael L. Seltzer, Vikas Chandra

Comments: INTERSPEECH 2021

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[186] arXiv:2106.09008 (cross-list from eess.AS) [pdf, other]: Title: A Flow-Based Neural Network for Time Domain Speech Enhancement

Martin Strauss, Bernd Edler

Comments: Accepted to ICASSP 2021

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[187] arXiv:2106.09009 (cross-list from cs.CL) [pdf, other]: Title: End-to-End Spoken Language Understanding for Generalized Voice Assistants

Michael Saxon, Samridhi Choudhary, Joseph P. McKenna, Athanasios Mouchtaris

Comments: Accepted to Interspeech 2021; 5 pages, 2 tables, 1 figure

Journal-ref: Proc. Interspeech 2021, 4738-4742

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[188] arXiv:2106.09093 (cross-list from eess.AS) [pdf, other]: Title: A Hands-on Comparison of DNNs for Dialog Separation Using Transfer Learning from Music Source Separation

Martin Strauss, Jouni Paulus, Matteo Torcoli, Bernd Edler

Comments: Accepted at INTERSPEECH 2021

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[189] arXiv:2106.09171 (cross-list from cs.LG) [pdf, other]: Title: LiRA: Learning Visual Speech Representations from Audio through Self-supervision

Pingchuan Ma, Rodrigo Mira, Stavros Petridis, Björn W. Schuller, Maja Pantic

Comments: Accepted for publication at Interspeech 2021

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[190] arXiv:2106.09216 (cross-list from eess.AS) [pdf, other]: Title: Layer Pruning on Demand with Intermediate CTC

Jaesong Lee, Jingu Kang, Shinji Watanabe

Comments: Interspeech 2021

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[191] arXiv:2106.09296 (cross-list from cs.LG) [pdf, other]: Title: Voice2Series: Reprogramming Acoustic Models for Time Series Classification

Chao-Han Huck Yang, Yun-Yun Tsai, Pin-Yu Chen

Comments: Updated version with a correction. The full draft was submitted in Jan 2021. The Voice2Series project initially was launched in Sep 2020. Accepted to ICML 2021, 16 Pages

Journal-ref: Proceedings of the 38th International Conference on Machine Learning 2021

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[192] arXiv:2106.09317 (cross-list from cs.CL) [pdf, other]: Title: EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model

Chenye Cui, Yi Ren, Jinglin Liu, Feiyang Chen, Rongjie Huang, Ming Lei, Zhou Zhao

Comments: Accepted by Interspeech 2021

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[193] arXiv:2106.09488 (cross-list from eess.AS) [pdf, other]: Title: Scaling Laws for Acoustic Models

Jasha Droppo, Oguz Elibol

Comments: Submitted to Interspeech 2021

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[194] arXiv:2106.09532 (cross-list from eess.AS) [pdf, other]: Title: ASR Adaptation for E-commerce Chatbots using Cross-Utterance Context and Multi-Task Language Modeling

Ashish Shenoy, Sravan Bodapati, Katrin Kirchhoff

Comments: Accepted at ACL-IJCNLP 2021 Workshop on e-Commerce and NLP (ECNLP)

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[195] arXiv:2106.09539 (cross-list from eess.AS) [pdf, other]: Title: Automatic Analysis of the Emotional Content of Speech in Daylong Child-Centered Recordings from a Neonatal Intensive Care Unit

Einari Vaaras, Sari Ahlqvist-Björkroth, Konstantinos Drossos, Okko Räsänen

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[196] arXiv:2106.09545 (cross-list from eess.AS) [pdf, other]: Title: STAN: A stuttering therapy analysis helper

Sebastian P. Bayerl, Marc Wenninger, Jochen Schmidt, Alexander Wolff von Gudenberg, Korbinian Riedhammer

Journal-ref: Demo presented at 2021 IEEE Spoken Language Technology Workshop (SLT)

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[197] arXiv:2106.09574 (cross-list from eess.AS) [pdf, other]: Title: Localization based on enhanced low frequency interaural level difference

Metin Calis, Steven van de Par, Richard Heusdens, Richard C. Hendriks

Comments: 15 pages, 8 figures, preprint for a journal submission, paper in review, not yet accepted

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[198] arXiv:2106.09622 (cross-list from eess.AS) [pdf, other]: Title: Extracting Different Levels of Speech Information from EEG Using an LSTM-Based Model

Mohammad Jalilpour Monesi, Bernd Accou, Tom Francart, Hugo Van Hamme

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[199] arXiv:2106.09660 (cross-list from eess.AS) [pdf, other]: Title: WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis

Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, Najim Dehak, William Chan

Comments: Proceedings of INTERSPEECH

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[200] arXiv:2106.09760 (cross-list from eess.AS) [pdf, other]: Title: Multi-mode Transformer Transducer with Stochastic Future Context

Kwangyoun Kim, Felix Wu, Prashant Sridhar, Kyu J. Han, Shinji Watanabe

Comments: Accepted to Interspeech 2021

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)