Sound

Authors and titles for July 2019

Total of 125 entries : 1-25 26-50 51-75 76-100 101-125
[51] arXiv:1907.01164 (cross-list from cs.LG) [pdf, other]
Title: Learning to Traverse Latent Spaces for Musical Score Inpainting
Ashis Pati, Alexander Lerch, Gaëtan Hadjeres
Comments: 20th International Society for Music Information Retrieval Conference (ISMIR), 2019, Delft, The Netherlands; 6 pages, 8 figures
Journal-ref: 20th International Society for Music Information Retrieval Conference (ISMIR), 2019, Delft, The Netherlands
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[52] arXiv:1907.01277 (cross-list from eess.AS) [pdf, other]
Title: Conditioned-U-Net: Introducing a Control Mechanism in the U-Net for Multiple Source Separations
Gabriel Meseguer-Brocal, Geoffroy Peeters
Journal-ref: Proceedings of the 20th International Society for Music Information Retrieval Conference, ISMIR, Delft, Netherlands, 2019
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[53] arXiv:1907.01367 (cross-list from eess.AS) [pdf, other]
Title: Lipper: Synthesizing Thy Speech using Multi-View Lipreading
Yaman Kumar, Rohit Jain, Khwaja Mohd. Salik, Rajiv Ratn Shah, Yifang Yin, Roger Zimmermann
Comments: Accepted at AAAI 2019
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[54] arXiv:1907.01369 (cross-list from eess.AS) [pdf, other]
Title: Analyzing Verbal and Nonverbal Features for Predicting Group Performance
Uliyana Kubasova, Gabriel Murray, McKenzie Braley
Comments: Accepted to INTERSPEECH 2019 (Graz, Austria)
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[55] arXiv:1907.01372 (cross-list from eess.AS) [pdf, other]
Title: Improving Performance of End-to-End ASR on Numeric Sequences
Cal Peyser, Hao Zhang, Tara N. Sainath, Zelin Wu
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[56] arXiv:1907.01409 (cross-list from eess.AS) [pdf, other]
Title: Comparison of Lattice-Free and Lattice-Based Sequence Discriminative Training Criteria for LVCSR
Wilfried Michel, Ralf Schlüter, Hermann Ney
Comments: Submitted to Interspeech 2019
Journal-ref: Interspeech 2019, 20th Annual Conference of the International Speech Communication Association, Graz, Austria, 15-19 September 2019, pp. 1601-1605
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[57] arXiv:1907.01413 (cross-list from eess.AS) [pdf, other]
Title: Speaker-independent classification of phonetic segments from raw ultrasound in child speech
Manuel Sam Ribeiro, Aciel Eshky, Korin Richmond, Steve Renals
Comments: 5 pages, 4 figures, published in ICASSP2019 (IEEE International Conference on Acoustics, Speech and Signal Processing, 2019)
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Sound (cs.SD); Image and Video Processing (eess.IV)
[58] arXiv:1907.01448 (cross-list from eess.AS) [pdf, other]
Title: Sub-band Convolutional Neural Networks for Small-footprint Spoken Term Classification
Chieh-Chi Kao, Ming Sun, Yixin Gao, Shiv Vitaladevuni, Chao Wang
Comments: Accepted by Interspeech 2019
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[59] arXiv:1907.01607 (cross-list from eess.AS) [pdf, other]
Title: MIDI-Sandwich: Multi-model Multi-task Hierarchical Conditional VAE-GAN networks for Symbolic Single-track Music Generation
Xia Liang, Junmin Wu, Yan Yin
Comments: Submitted to KSEM 2019 on May 3, 2019 (weak reject)
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD)
[60] arXiv:1907.01640 (cross-list from cs.IR) [pdf, other]
Title: SeER: An Explainable Deep Learning MIDI-based Hybrid Song Recommender System
Khalil Damak, Olfa Nasraoui
Comments: 8 pages, 6 figures; added offline validation of explainability method
Subjects: Information Retrieval (cs.IR); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[61] arXiv:1907.01803 (cross-list from cs.LG) [pdf, other]
Title: The Receptive Field as a Regularizer in Deep Convolutional Neural Networks for Acoustic Scene Classification
Khaled Koutini, Hamid Eghbal-zadeh, Matthias Dorfer, Gerhard Widmer
Comments: IEEE EUSIPCO 2019
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[62] arXiv:1907.01914 (cross-list from eess.AS) [pdf, other]
Title: Attention model for articulatory features detection
Ievgen Karaulov, Dmytro Tkanov
Comments: Interspeech 2019, 5 pages, 2 figures
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[63] arXiv:1907.01957 (cross-list from eess.AS) [pdf, other]
Title: End-to-End Speech Recognition with High-Frame-Rate Features Extraction
Cong-Thanh Do
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[64] arXiv:1907.02404 (cross-list from eess.SP) [pdf, other]
Title: Blind Audio Source Separation with Minimum-Volume Beta-Divergence NMF
Valentin Leplat, Nicolas Gillis, Man Shun Ang
Comments: 24 pages, 10 figures, 3 tables
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[65] arXiv:1907.02477 (cross-list from cs.LG) [pdf, other]
Title: Adversarial Attacks in Sound Event Classification
Vinod Subramanian, Emmanouil Benetos, Ning Xu, SKoT McDonald, Mark Sandler
Comments: Fixed Freesound data reference to FSDKaggle2018
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[66] arXiv:1907.02663 (cross-list from eess.AS) [pdf, other]
Title: The DKU Replay Detection System for the ASVspoof 2019 Challenge: On Data Augmentation, Feature Representation, Classification, and Fusion
Weicheng Cai, Haiwei Wu, Danwei Cai, Ming Li
Comments: Accepted for INTERSPEECH 2019
Subjects: Audio and Speech Processing (eess.AS); Cryptography and Security (cs.CR); Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD)
[67] arXiv:1907.02670 (cross-list from cs.LG) [pdf, other]
Title: Zero-shot Learning for Audio-based Music Classification and Tagging
Jeong Choi, Jongpil Lee, Jiyoung Park, Juhan Nam
Comments: 20th International Society for Music Information Retrieval Conference (ISMIR), 2019
Subjects: Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[68] arXiv:1907.02784 (cross-list from eess.AS) [pdf, other]
Title: A Methodology for Controlling the Emotional Expressiveness in Synthetic Speech -- a Deep Learning approach
Noé Tits
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[69] arXiv:1907.03233 (cross-list from cs.CL) [pdf, other]
Title: NIESR: Nuisance Invariant End-to-end Speech Recognition
I-Hung Hsu, Ayush Jaiswal, Premkumar Natarajan
Comments: To appear in Proceedings of Interspeech 2019
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[70] arXiv:1907.04224 (cross-list from cs.CL) [pdf, other]
Title: Analyzing Phonetic and Graphemic Representations in End-to-End Automatic Speech Recognition
Yonatan Belinkov, Ahmed Ali, James Glass
Comments: Corrected dataset statistics
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[71] arXiv:1907.04258 (cross-list from cs.NE) [pdf, other]
Title: Melody Generation using an Interactive Evolutionary Algorithm
Majid Farzaneh, Rahil Mahdian Toroghi
Comments: 5 pages, 4 images, submitted to MEDPRAI2019 conference
Subjects: Neural and Evolutionary Computing (cs.NE); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[72] arXiv:1907.04294 (cross-list from cs.IR) [pdf, other]
Title: An Attention Mechanism for Musical Instrument Recognition
Siddharth Gururani, Mohit Sharma, Alexander Lerch
Comments: To appear in: Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), Delft, 2019
Subjects: Information Retrieval (cs.IR); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[73] arXiv:1907.04355 (cross-list from cs.CL) [pdf, other]
Title: Transfer Learning from Audio-Visual Grounding to Speech Recognition
Wei-Ning Hsu, David Harwath, James Glass
Comments: Accepted to Interspeech 2019. 4 pages, 2 figures
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[74] arXiv:1907.04448 (cross-list from cs.CL) [pdf, other]
Title: Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning
Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Zhifeng Chen, RJ Skerry-Ryan, Ye Jia, Andrew Rosenberg, Bhuvana Ramabhadran
Comments: 5 pages, submitted to Interspeech 2019
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[75] arXiv:1907.04462 (cross-list from cs.CL) [pdf, other]
Title: Multi-Speaker End-to-End Speech Synthesis
Jihyun Park, Kexin Zhao, Kainan Peng, Wei Ping
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)