Sound

Authors and titles for July 2019

Total of 125 entries : 1-25 26-50 51-75 76-100 ... 101-125
Showing up to 25 entries per page: fewer | more | all
[1] arXiv:1907.01160 [pdf, other]
Title: WHAM!: Extending Speech Separation to Noisy Environments
Gordon Wichern, Joe Antognini, Michael Flynn, Licheng Richard Zhu, Emmett McQuinn, Dwight Crow, Ethan Manilow, Jonathan Le Roux
Comments: Accepted for publication at Interspeech 2019
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[2] arXiv:1907.01169 [pdf, other]
Title: Can a Robot Hear the Shape and Dimensions of a Room?
Linh Nguyen, Jaime Valls Miro, Xiaojun Qiu
Comments: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS2019)
Subjects: Sound (cs.SD); Robotics (cs.RO); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[3] arXiv:1907.01195 [pdf, other]
Title: Kite: Automatic speech recognition for unmanned aerial vehicles
Dan Oneata, Horia Cucu
Comments: 5 pages, accepted at Interspeech 2019
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Audio and Speech Processing (eess.AS)
[4] arXiv:1907.01742 [pdf, other]
Title: Supervised Classifiers for Audio Impairments with Noisy Labels
Chandan K A Reddy, Ross Cutler, Johannes Gehrke
Comments: To appear in INTERSPEECH 2019
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[5] arXiv:1907.01813 [pdf, other]
Title: A Case Study of Deep-Learned Activations via Hand-Crafted Audio Features
Olga Slizovskaia, Emilia Gómez, Gloria Haro
Comments: The 2018 Joint Workshop on Machine Learning for Music, The Federated Artificial Intelligence Meeting (FAIM), Joint workshop program of ICML, IJCAI/ECAI, and AAMAS, Stockholm, Sweden, Saturday, July 14th, 2018
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[6] arXiv:1907.01824 [pdf, other]
Title: Cover Detection using Dominant Melody Embeddings
Guillaume Doras, Geoffroy Peeters
Journal-ref: 20th International Society for Music Information Retrieval Conference, Delft, The Netherlands, 2019
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Machine Learning (stat.ML)
[7] arXiv:1907.02230 [pdf, other]
Title: Attention based Convolutional Recurrent Neural Network for Environmental Sound Classification
Zhichao Zhang, Shugong Xu, Tianhao Qiao, Shunqing Zhang, Shan Cao
Comments: Accepted to Chinese Conference on Pattern Recognition and Computer Vision (PRCV) 2019
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[8] arXiv:1907.02265 [pdf, other]
Title: Supervised Symbolic Music Style Translation Using Synthetic Data
Ondřej Cífka, Umut Şimşekli, Gaël Richard
Comments: ISMIR 2019 camera-ready
Journal-ref: Proceedings of the 20th International Society for Music Information Retrieval Conference (2019) 588-595
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[9] arXiv:1907.02526 [pdf, other]
Title: Convolutional Neural Network-based Speech Enhancement for Cochlear Implant Recipients
Nursadul Mamun, Soheil Khorram, John H.L. Hansen
Comments: Interspeech 2019
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[10] arXiv:1907.02637 [pdf, other]
Title: Neural Drum Machine : An Interactive System for Real-time Synthesis of Drum Sounds
Cyran Aouameur, Philippe Esling, Gaëtan Hadjeres
Comments: 8 pages, accepted at the International Conference on Computational Creativity 2019
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[11] arXiv:1907.02698 [pdf, other]
Title: A Bi-directional Transformer for Musical Chord Recognition
Jonggwon Park, Kyoyun Choi, Sungwook Jeon, Dokyun Kim, Jonghun Park
Comments: 20th International Society for Music Information Retrieval Conference (ISMIR), Delft, The Netherlands, 2019
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[12] arXiv:1907.02864 [pdf, other]
Title: Deep Neural Baselines for Computational Paralinguistics
Daniel Elsner, Stefan Langer, Fabian Ritz, Robert Müller, Steffen Illium
Comments: 5 pages, 3 figures; This paper was accepted at INTERSPEECH 2019, Graz, 15-19th September 2019. DOI will be added after publication of the accepted paper
Journal-ref: Proc. Interspeech 2019, 2388-2392
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[13] arXiv:1907.03572 [pdf, other]
Title: Towards Explainable Music Emotion Recognition: The Route via Mid-level Features
Shreyan Chowdhury, Andreu Vall, Verena Haunschmid, Gerhard Widmer
Comments: International Society for Music Information Retrieval Conference, Delft, The Netherlands, 2019
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Machine Learning (stat.ML)
[14] arXiv:1907.03988 [pdf, other]
Title: Improving Reverberant Speech Training Using Diffuse Acoustic Simulation
Zhenyu Tang, Lianwu Chen, Bo Wu, Dong Yu, Dinesh Manocha
Comments: Accepted to ICASSP 2020, impulse response generation code at this https URL
Journal-ref: 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 6969-6973)
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[15] arXiv:1907.04292 [pdf, other]
Title: Evolution of the Informational Complexity of Contemporary Western Music
Thomas Parmer, Yong-Yeol Ahn
Comments: 8 pages, 6 figures; added supplementary materials
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[16] arXiv:1907.04352 [pdf, other]
Title: Exploring Conditioning for Generative Music Systems with Human-Interpretable Controls
Nicholas Meade, Nicholas Barreyre, Scott C. Lowe, Sageev Oore
Journal-ref: International Conference on Computational Creativity, 2019
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[17] arXiv:1907.04868 [pdf, other]
Title: LakhNES: Improving multi-instrumental music generation with cross-domain pre-training
Chris Donahue, Huanru Henry Mao, Yiting Ethan Li, Garrison W. Cottrell, Julian McAuley
Comments: Published as a conference paper at ISMIR 2019
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[18] arXiv:1907.04984 [pdf, other]
Title: Multichannel Loss Function for Supervised Speech Source Separation by Mask-based Beamforming
Yoshiki Masuyama, Masahito Togami, Tatsuya Komatsu
Comments: 5 pages, Accepted at INTERSPEECH 2019
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[19] arXiv:1907.05208 [pdf, other]
Title: Explicitly Conditioned Melody Generation: A Case Study with Interdependent RNNs
Benjamin Genchel, Ashis Pati, Alexander Lerch
Comments: In Proceedings of the 7th International Workshop on Musical Meta-creation (MUME). Charlotte, North Carolina 2019
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[20] arXiv:1907.05584 [pdf, other]
Title: Toeplitz Inverse Covariance based Robust Speaker Clustering for Naturalistic Audio Streams
Harishchandra Dubey, Abhijeet Sangwan, John Hansen
Comments: 6 Pages, 3 Figures, 5 Equations
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[21] arXiv:1907.05982 [pdf, other]
Title: Learning Complex Basis Functions for Invariant Representations of Audio
Stefan Lattner, Monika Dörfler, Andreas Arzt
Comments: Paper accepted at the 20th International Society for Music Information Retrieval Conference, ISMIR 2019, Delft, The Netherlands, November 4-8; 8 pages, 4 figures, 4 tables
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[22] arXiv:1907.06078 [pdf, other]
Title: Multi-Task Semi-Supervised Adversarial Autoencoding for Speech Emotion Recognition
Siddique Latif, Rajib Rana, Sara Khalifa, Raja Jurdak, Julien Epps, Björn W. Schuller
Comments: Accepted in IEEE Transactions on Affective Computing
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[23] arXiv:1907.06083 [pdf, other]
Title: Unsupervised Adversarial Domain Adaptation for Cross-Lingual Speech Emotion Recognition
Siddique Latif, Junaid Qadir, Muhammad Bilal
Comments: Accepted in Affective Computing & Intelligent Interaction (ACII 2019)
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[24] arXiv:1907.06129 [pdf, other]
Title: Towards Robust Voice Pathology Detection
Pavol Harar, Zoltan Galaz, Jesus B. Alonso-Hernandez, Jiri Mekyska, Radim Burget, Zdenek Smekal
Comments: 11 pages, 1 figure, 10 tables. Keywords: Voice pathology detection, deep learning, gradient boosting, anomaly detection
Journal-ref: Neural Computing and Applications (2018): 1-11
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[25] arXiv:1907.06637 [pdf, other]
Title: The Bach Doodle: Approachable music composition with machine learning at scale
Cheng-Zhi Anna Huang, Curtis Hawthorne, Adam Roberts, Monica Dinculescu, James Wexler, Leon Hong, Jacob Howcroft
Comments: Proceedings of the 20th International Society for Music Information Retrieval Conference, ISMIR 2019
Subjects: Sound (cs.SD); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)