Sound

Authors and titles for July 2019

Total of 125 entries : 1-50 51-100 101-125

[51] arXiv:1907.01164 (cross-list from cs.LG) [pdf, other]: Title: Learning to Traverse Latent Spaces for Musical Score Inpainting

Ashis Pati, Alexander Lerch, Gaëtan Hadjeres

Comments: 20th International Society for Music Information Retrieval Conference (ISMIR), 2019, Delft, The Netherlands; 6 pages, 8 figures

Journal-ref: 20th International Society for Music Information Retrieval Conference (ISMIR), 2019, Delft, The Netherlands

Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[52] arXiv:1907.01277 (cross-list from eess.AS) [pdf, other]: Title: Conditioned-U-Net: Introducing a Control Mechanism in the U-Net for Multiple Source Separations

Gabriel Meseguer-Brocal, Geoffroy Peeters

Journal-ref: Proceedings of the 20th International Society for Music Information Retrieval Conference, ISMIR, Delft, Netherlands, 2019

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[53] arXiv:1907.01367 (cross-list from eess.AS) [pdf, other]: Title: Lipper: Synthesizing Thy Speech using Multi-View Lipreading

Yaman Kumar, Rohit Jain, Khwaja Mohd. Salik, Rajiv Ratn Shah, Yifang Yin, Roger Zimmermann

Comments: Accepted at AAAI 2019

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[54] arXiv:1907.01369 (cross-list from eess.AS) [pdf, other]: Title: Analyzing Verbal and Nonverbal Features for Predicting Group Performance

Uliyana Kubasova, Gabriel Murray, McKenzie Braley

Comments: Accepted to INTERSPEECH 2019 (Graz, Austria)

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[55] arXiv:1907.01372 (cross-list from eess.AS) [pdf, other]: Title: Improving Performance of End-to-End ASR on Numeric Sequences

Cal Peyser, Hao Zhang, Tara N. Sainath, Zelin Wu

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[56] arXiv:1907.01409 (cross-list from eess.AS) [pdf, other]: Title: Comparison of Lattice-Free and Lattice-Based Sequence Discriminative Training Criteria for LVCSR

Wilfried Michel, Ralf Schlüter, Hermann Ney

Comments: Submitted to Interspeech 2019

Journal-ref: Interspeech 2019, 20th Annual Conference of the International Speech Communication Association, Graz, Austria, 15-19 September 2019, pp. 1601-1605

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[57] arXiv:1907.01413 (cross-list from eess.AS) [pdf, other]: Title: Speaker-independent classification of phonetic segments from raw ultrasound in child speech

Manuel Sam Ribeiro, Aciel Eshky, Korin Richmond, Steve Renals

Comments: 5 pages, 4 figures, published in ICASSP2019 (IEEE International Conference on Acoustics, Speech and Signal Processing, 2019)

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Sound (cs.SD); Image and Video Processing (eess.IV)
[58] arXiv:1907.01448 (cross-list from eess.AS) [pdf, other]: Title: Sub-band Convolutional Neural Networks for Small-footprint Spoken Term Classification

Chieh-Chi Kao, Ming Sun, Yixin Gao, Shiv Vitaladevuni, Chao Wang

Comments: Accepted by Interspeech 2019

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[59] arXiv:1907.01607 (cross-list from eess.AS) [pdf, other]: Title: MIDI-Sandwich: Multi-model Multi-task Hierarchical Conditional VAE-GAN networks for Symbolic Single-track Music Generation

Xia Liang, Junmin Wu, Yan Yin

Comments: cast KSEM2019 on May 3, 2019 (weak rejected)

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD)
[60] arXiv:1907.01640 (cross-list from cs.IR) [pdf, other]: Title: SeER: An Explainable Deep Learning MIDI-based Hybrid Song Recommender System

Khalil Damak, Olfa Nasraoui

Comments: 8 pages, 6 figures; added offline validation of explainability method

Subjects: Information Retrieval (cs.IR); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[61] arXiv:1907.01803 (cross-list from cs.LG) [pdf, other]: Title: The Receptive Field as a Regularizer in Deep Convolutional Neural Networks for Acoustic Scene Classification

Khaled Koutini, Hamid Eghbal-zadeh, Matthias Dorfer, Gerhard Widmer

Comments: IEEE EUSIPCO 2019

Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[62] arXiv:1907.01914 (cross-list from eess.AS) [pdf, other]: Title: Attention model for articulatory features detection

Ievgen Karaulov, Dmytro Tkanov

Comments: Interspeech 2019, 5 pages, 2 figures

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[63] arXiv:1907.01957 (cross-list from eess.AS) [pdf, other]: Title: End-to-End Speech Recognition with High-Frame-Rate Features Extraction

Cong-Thanh Do

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[64] arXiv:1907.02404 (cross-list from eess.SP) [pdf, other]: Title: Blind Audio Source Separation with Minimum-Volume Beta-Divergence NMF

Valentin Leplat, Nicolas Gillis, Man Shun Ang

Comments: 24 pages, 10 figures, 3 tables

Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[65] arXiv:1907.02477 (cross-list from cs.LG) [pdf, other]: Title: Adversarial Attacks in Sound Event Classification

Vinod Subramanian, Emmanouil Benetos, Ning Xu, SKoT McDonald, Mark Sandler

Comments: Fixed Freesound data reference to FSDKaggle2018

Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[66] arXiv:1907.02663 (cross-list from eess.AS) [pdf, other]: Title: The DKU Replay Detection System for the ASVspoof 2019 Challenge: On Data Augmentation, Feature Representation, Classification, and Fusion

Weicheng Cai, Haiwei Wu, Danwei Cai, Ming Li

Comments: Accepted for INTERSPEECH 2019

Subjects: Audio and Speech Processing (eess.AS); Cryptography and Security (cs.CR); Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD)
[67] arXiv:1907.02670 (cross-list from cs.LG) [pdf, other]: Title: Zero-shot Learning for Audio-based Music Classification and Tagging

Jeong Choi, Jongpil Lee, Jiyoung Park, Juhan Nam

Comments: 20th International Society for Music Information Retrieval Conference (ISMIR), 2019

Subjects: Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[68] arXiv:1907.02784 (cross-list from eess.AS) [pdf, other]: Title: A Methodology for Controlling the Emotional Expressiveness in Synthetic Speech -- a Deep Learning approach

Noé Tits

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[69] arXiv:1907.03233 (cross-list from cs.CL) [pdf, other]: Title: NIESR: Nuisance Invariant End-to-end Speech Recognition

I-Hung Hsu, Ayush Jaiswal, Premkumar Natarajan

Comments: To appear in Proceedings of Interspeech 2019

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[70] arXiv:1907.04224 (cross-list from cs.CL) [pdf, other]: Title: Analyzing Phonetic and Graphemic Representations in End-to-End Automatic Speech Recognition

Yonatan Belinkov, Ahmed Ali, James Glass

Comments: Corrected dataset statistics

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[71] arXiv:1907.04258 (cross-list from cs.NE) [pdf, other]: Title: Melody Generation using an Interactive Evolutionary Algorithm

Majid Farzaneh, Rahil Mahdian Toroghi

Comments: 5 pages, 4 images, submitted to MEDPRAI2019 conference

Subjects: Neural and Evolutionary Computing (cs.NE); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[72] arXiv:1907.04294 (cross-list from cs.IR) [pdf, other]: Title: An Attention Mechanism for Musical Instrument Recognition

Siddharth Gururani, Mohit Sharma, Alexander Lerch

Comments: To appear in: Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), Delft, 2019

Subjects: Information Retrieval (cs.IR); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[73] arXiv:1907.04355 (cross-list from cs.CL) [pdf, other]: Title: Transfer Learning from Audio-Visual Grounding to Speech Recognition

Wei-Ning Hsu, David Harwath, James Glass

Comments: Accepted to Interspeech 2019. 4 pages, 2 figures

Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[74] arXiv:1907.04448 (cross-list from cs.CL) [pdf, other]: Title: Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning

Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Zhifeng Chen, RJ Skerry-Ryan, Ye Jia, Andrew Rosenberg, Bhuvana Ramabhadran

Comments: 5 pages, submitted to Interspeech 2019

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[75] arXiv:1907.04462 (cross-list from cs.CL) [pdf, other]: Title: Multi-Speaker End-to-End Speech Synthesis

Jihyun Park, Kexin Zhao, Kainan Peng, Wei Ping

Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[76] arXiv:1907.04536 (cross-list from cs.LG) [pdf, other]: Title: Multi-layer Attention Mechanism for Speech Keyword Recognition

Ruisen Luo, Tianran Sun, Chen Wang, Miao Du, Zuodong Tang, Kai Zhou, Xiaofeng Gong, Xiaomei Yang

Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[77] arXiv:1907.04655 (cross-list from eess.SP) [pdf, other]: Title: Audio-Based Search and Rescue with a Drone: Highlights from the IEEE Signal Processing Cup 2019 Student Competition

Antoine Deleforge, Diego Di Carlo, Martin Strauss, Romain Serizel, Lucio Marcenaro

Journal-ref: IEEE Signal Processing Magazine, Institute of Electrical and Electronics Engineers, In press

Subjects: Signal Processing (eess.SP); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[78] arXiv:1907.04743 (cross-list from eess.AS) [pdf, other]: Title: Interpretable Deep Learning Model for the Detection and Reconstruction of Dysarthric Speech

Daniel Korzekwa, Roberto Barra-Chicote, Bozena Kostek, Thomas Drugman, Mateusz Lajszczak

Comments: 5 pages, 5 figures, Accepted for Interspeech 2019

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[79] arXiv:1907.04887 (cross-list from eess.AS) [pdf, other]: Title: Large-Scale Mixed-Bandwidth Deep Neural Network Acoustic Modeling for Automatic Speech Recognition

Khoi-Nguyen C. Mac, Xiaodong Cui, Wei Zhang, Michael Picheny

Comments: Interspeech 2019

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[80] arXiv:1907.04916 (cross-list from eess.AS) [pdf, other]: Title: Listen, Attend, Spell and Adapt: Speaker Adapted Sequence-to-Sequence ASR

Felix Weninger, Jesús Andrés-Ferrer, Xinwei Li, Puming Zhan

Comments: To appear in INTERSPEECH 2019

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[81] arXiv:1907.04926 (cross-list from eess.AS) [pdf, other]: Title: Synchronizing Audio-Visual Film Stimuli in Unity (version 5.5.1f1): Game Engines as a Tool for Research

Javier Sanz, Andreas Wulff-Abramsson, Carlos Aguilar-Paredes, Luis Emilio Bruni, Lydia Sanchez

Comments: 13 Pages

Subjects: Audio and Speech Processing (eess.AS); Multimedia (cs.MM); Sound (cs.SD); Image and Video Processing (eess.IV)
[82] arXiv:1907.04927 (cross-list from eess.AS) [pdf, other]: Title: Speech bandwidth extension with WaveNet

Archit Gupta, Brendan Shillingford, Yannis Assael, Thomas C. Walters

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Signal Processing (eess.SP)
[83] arXiv:1907.04928 (cross-list from eess.AS) [pdf, other]: Title: Bag-of-Audio-Words based on Autoencoder Codebook for Continuous Emotion Prediction

Mohammed Senoussaoui, Patrick Cardinal, Alessandro Lameiras Koerich

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[84] arXiv:1907.04975 (cross-list from cs.CV) [pdf, other]: Title: My lips are concealed: Audio-visual speech enhancement through obstructions

Triantafyllos Afouras, Joon Son Chung, Andrew Zisserman

Comments: Accepted to Interspeech 2019

Subjects: Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[85] arXiv:1907.05122 (cross-list from eess.AS) [pdf, other]: Title: Polyphonic Sound Event and Sound Activity Detection: A Multi-task approach

Arjun Pankajakshan, Helen L. Bear, Emmanouil Benetos

Comments: Accepted to WASPAA 2019

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[86] arXiv:1907.05337 (cross-list from cs.CL) [pdf, other]: Title: Joint Speech Recognition and Speaker Diarization via Sequence Transduction

Laurent El Shafey, Hagen Soltau, Izhak Shafran

Journal-ref: Proc. Interspeech 2019

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[87] arXiv:1907.05351 (cross-list from eess.SP) [pdf, other]: Title: Optimized Sharing of Coefficients in Parallel Filter Banks

M. Tunç Arslan, Onur Yorulmaz, Erdinç L. Atılgan

Comments: 10 pages, submitted to IEEE Transactions on Signal Processing

Subjects: Signal Processing (eess.SP); Sound (cs.SD); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
[88] arXiv:1907.05599 (cross-list from eess.AS) [pdf, other]: Title: Effective Incorporation of Speaker Information in Utterance Encoding in Dialog

Tianyu Zhao, Tatsuya Kawahara

Comments: 8+1 pages, 3 figures, and 5 tables. Rejected by SIGDIAL 2019

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[89] arXiv:1907.05698 (cross-list from eess.AS) [pdf, other]: Title: Teach an all-rounder with experts in different domains

Zhao You, Dan Su, Dong Yu

Comments: 5 pages and 2 figures; accepted by 2019 IEEE International Conference on Acoustics, Speech and Signal Processing

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[90] arXiv:1907.05701 (cross-list from eess.AS) [pdf, other]: Title: A Highly Efficient Distributed Deep Learning System For Automatic Speech Recognition

Wei Zhang, Xiaodong Cui, Ulrich Finkler, George Saon, Abdullah Kayi, Alper Buyuktosunoglu, Brian Kingsbury, David Kung, Michael Picheny

Journal-ref: INTERSPEECH 2019

Subjects: Audio and Speech Processing (eess.AS); Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[91] arXiv:1907.05708 (cross-list from eess.AS) [pdf, other]: Title: Deep auscultation: Predicting respiratory anomalies and diseases via recurrent neural networks

Diego Perna, Andrea Tagarelli

Comments: Paper accepted for publication with Procs. of the 32nd IEEE CBMS International Symposium on Computer-Based Medical Systems (CBMS 2019)

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Signal Processing (eess.SP)
[92] arXiv:1907.05905 (cross-list from eess.AS) [pdf, other]: Title: Voice Pathology Detection Using Deep Learning: a Preliminary Study

Pavol Harar, Jesus B. Alonso-Hernandez, Jiri Mekyska, Zoltan Galaz, Radim Burget, Zdenek Smekal

Comments: 4 pages, 1 figure, 5 tables

Journal-ref: In 2017 international conference and workshop on bioinspired intelligence (IWOBI), pp. 1-4. IEEE, 2017

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[93] arXiv:1907.06111 (cross-list from eess.AS) [pdf, other]: Title: Speaker Recognition with Random Digit Strings Using Uncertainty Normalized HMM-based i-vectors

Nooshin Maghsoodi, Hossein Sameti, Hossein Zeinali, Themos Stafylakis

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[94] arXiv:1907.06112 (cross-list from eess.AS) [pdf, other]: Title: BUT VOiCES 2019 System Description

Hossein Zeinali, Pavel Matějka, Ladislav Mošner, Oldřich Plchot, Anna Silnova, Ondřej Novotný, Ján Profant, Ondřej Glembek, Lukáš Burget

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[95] arXiv:1907.06286 (cross-list from q-bio.NC) [pdf, other]: Title: Autoencoding sensory substitution

Viktor Tóth, Lauri Parkkonen

Subjects: Neurons and Cognition (q-bio.NC); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[96] arXiv:1907.06342 (cross-list from cs.CL) [pdf, other]: Title: Joint Language Identification of Code-Switching Speech using Attention based E2E Network

Sreeram Ganji, Kunal Dhawan, Kumar Priyadarshi, Rohit Sinha

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[97] arXiv:1907.06639 (cross-list from eess.AS) [pdf, other]: Title: Integrating the Data Augmentation Scheme with Various Classifiers for Acoustic Scene Modeling

Hangting Chen, Zuozhen Liu, Zongming Liu, Pengyuan Zhang, Yonghong Yan

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[98] arXiv:1907.06859 (cross-list from eess.AS) [pdf, other]: Title: Towards Adapting NMF Dictionaries Using Total Variability Modeling for Noise-Robust Acoustic Features

Kunal Dhawan, Colin Vaz, Ruchir Travadi, Shrikanth Narayanan

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[99] arXiv:1907.07127 (cross-list from eess.AS) [pdf, other]: Title: Acoustic Scene Classification Using Fusion of Attentive Convolutional Neural Networks for DCASE2019 Challenge

Hossein Zeinali, Lukáš Burget, Jan "Honza" Černocký

Comments: arXiv admin note: text overlap with arXiv:1810.04273

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[100] arXiv:1907.07564 (cross-list from cs.HC) [pdf, other]: Title: Conversational Help for Task Completion and Feature Discovery in Personal Assistants

Madan Gopal Jhawar, Vipindeep Vangala, Nishchay Sharma, Ankur Hayatnagarkar, Mansi Saxena, Swati Valecha

Subjects: Human-Computer Interaction (cs.HC); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
