Sound

Authors and titles for July 2019

Total of 125 entries : 1-25 26-50 51-75 76-100 101-125
[26] arXiv:1907.07398 [pdf, other]
Title: HODGEPODGE: Sound event detection based on ensemble of semi-supervised learning methods
Ziqiang Shi, Liu Liu, Huibin Lin, Rujie Liu, Anyan Shi
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[27] arXiv:1907.08506 [pdf, other]
Title: Language Modelling for Sound Event Detection with Teacher Forcing and Scheduled Sampling
Konstantinos Drossos, Shayan Gharib, Paul Magron, Tuomas Virtanen
Comments: Fixed the display of URLs at footnote, updated the results
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[28] arXiv:1907.08520 [pdf, other]
Title: Data Augmentation for Instrument Classification Robust to Audio Effects
António Ramires, Xavier Serra
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[29] arXiv:1907.08698 [pdf, other]
Title: Leveraging Knowledge Bases And Parallel Annotations For Music Genre Translation
Elena V. Epure, Anis Khlif, Romain Hennequin
Comments: Published in ISMIR 2019
Subjects: Sound (cs.SD); Information Retrieval (cs.IR); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[30] arXiv:1907.09238 [pdf, other]
Title: Crowdsourcing a Dataset of Audio Captions
Samuel Lipping, Konstantinos Drossos, Tuomas Virtanen
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[31] arXiv:1907.09884 [pdf, other]
Title: Discriminative Learning for Monaural Speech Separation Using Deep Embedding Features
Cunhang Fan, Bin Liu, Jianhua Tao, Jiangyan Yi, Zhengqi Wen
Comments: 5 pages, 1 figure, accepted by INTERSPEECH 2019
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[32] arXiv:1907.09936 [pdf, other]
Title: Log Complex Color for Visual Pattern Recognition of Total Sound
Stephen Wedekind, P. Fraundorf
Comments: 6 pages, 5 figures, 28 references, cf. this http URL
Journal-ref: Audio Engineering Society Convention 141 (2016) paper 9647; Subject of US patent 10,341,795
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[33] arXiv:1907.11238 [pdf, other]
Title: Interactive Lungs Auscultation with Reinforcement Learning Agent
Tomasz Grzywalski, Riccardo Belluzzo, Szymon Drgas, Agnieszka Cwalinska, Honorata Hafke-Dys
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[34] arXiv:1907.11956 [pdf, other]
Title: Dilated FCN: Listening Longer to Hear Better
Shuyu Gong, Zhewei Wang, Tao Sun, Yuanhang Zhang, Charles D. Smith, Li Xu, Jundong Liu
Comments: 5 pages; will appear in WASPAA conference
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[35] arXiv:1907.12279 [pdf, other]
Title: StarGAN-VC2: Rethinking Conditional Methods for StarGAN-Based Voice Conversion
Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo
Comments: Accepted to Interspeech 2019. Project page: this http URL
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[36] arXiv:1907.13188 [pdf, other]
Title: Marine Mammal Species Classification using Convolutional Neural Networks and a Novel Acoustic Representation
Mark Thomas, Bruce Martin, Katie Kowarski, Briand Gaudet, Stan Matwin
Comments: 16 pages, To appear in ECML-PKDD 2019
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[37] arXiv:1907.00112 (cross-list from cs.CL) [pdf, other]
Title: Leveraging Acoustic Cues and Paralinguistic Embeddings to Detect Expression from Voice
Vikramjit Mitra, Sue Booker, Erik Marchi, David Scott Farrar, Ute Dorothea Peitz, Bridget Cheng, Ermine Teves, Anuj Mehta, Devang Naik
Comments: 5 pages, 6 figures
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[38] arXiv:1907.00443 (cross-list from cs.CL) [pdf, other]
Title: Multilingual Bottleneck Features for Query by Example Spoken Term Detection
Dhananjay Ram, Lesly Miculicich, Hervé Bourlard
Subjects: Computation and Language (cs.CL); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[39] arXiv:1907.00457 (cross-list from cs.CL) [pdf, other]
Title: BERTphone: Phonetically-Aware Encoder Representations for Utterance-Level Speaker and Language Recognition
Shaoshi Ling, Julian Salazar, Yuzong Liu, Katrin Kirchhoff
Comments: Odyssey 2020 camera-ready (presented Nov. 2020)
Journal-ref: Proc. the Speaker and Language Recognition Workshop (Odyssey 2020), 9-16
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[40] arXiv:1907.00477 (cross-list from cs.CL) [pdf, other]
Title: Analyzing Utility of Visual Context in Multimodal Speech Recognition Under Noisy Conditions
Tejas Srinivasan, Ramon Sanabria, Florian Metze
Comments: Accepted to How2 Workshop, ICML 2019
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[41] arXiv:1907.00758 (cross-list from cs.CL) [pdf, other]
Title: Synchronising audio and ultrasound by learning cross-modal embeddings
Aciel Eshky, Manuel Sam Ribeiro, Korin Richmond, Steve Renals
Comments: 5 pages, 1 figure, 4 tables; Interspeech 2019 with the following edits: 1) Loss and accuracy upon convergence were accidentally reported from an older model. Now updated with model described throughout the paper. All other results remain unchanged. 2) Max true offset in the training data corrected from 179ms to 1789ms. 3) Detectability "boundary/range" renamed to detectability "thresholds"
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
[42] arXiv:1907.00772 (cross-list from eess.AS) [pdf, other]
Title: Analysis by Adversarial Synthesis -- A Novel Approach for Speech Vocoding
Ahmed Mustafa, Arijit Biswas, Christian Bergler, Julia Schottenhamml, Andreas Maier
Comments: Accepted to Interspeech 2019
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[43] arXiv:1907.00797 (cross-list from eess.AS) [pdf, other]
Title: Quasi-Periodic WaveNet Vocoder: A Pitch Dependent Dilated Convolution Model for Parametric Speech Generation
Yi-Chiao Wu, Tomoki Hayashi, Patrick Lumban Tobing, Kazuhiro Kobayashi, Tomoki Toda
Comments: 5 pages, 4 figures, Proc. Interspeech, 2019
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[44] arXiv:1907.00818 (cross-list from eess.AS) [pdf, other]
Title: Ultrasound tongue imaging for diarization and alignment of child speech therapy sessions
Manuel Sam Ribeiro, Aciel Eshky, Korin Richmond, Steve Renals
Comments: 5 pages, 3 figures, Accepted for publication at Interspeech 2019
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD); Image and Video Processing (eess.IV)
[45] arXiv:1907.00824 (cross-list from cs.HC) [pdf, other]
Title: Designing Deep Reinforcement Learning for Human Parameter Exploration
Hugo Scurto, Bavo Van Kerrebroeck, Baptiste Caramiaux, Frédéric Bevilacqua
Comments: Author's version of the work. The definitive Version of Record was published in ACM Transactions on Computer-Human Interaction (TOCHI)
Journal-ref: ACM Trans. Comput.-Hum. Interact. 28, 1, Article 1 (January 2021), 35 pages (2021)
Subjects: Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[46] arXiv:1907.00835 (cross-list from cs.CL) [pdf, other]
Title: UltraSuite: A Repository of Ultrasound and Acoustic Data from Child Speech Therapy Sessions
Aciel Eshky, Manuel Sam Ribeiro, Joanne Cleland, Korin Richmond, Zoe Roxburgh, James Scobbie, Alan Wrench
Comments: 5 pages, 1 figure, 3 tables; accepted to Interspeech 2018: 19th Annual Conference of the International Speech Communication Association (ISCA)
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
[47] arXiv:1907.00873 (cross-list from eess.AS) [pdf, other]
Title: Compression of Acoustic Event Detection Models With Quantized Distillation
Bowen Shi, Ming Sun, Chieh-Chi Kao, Viktor Rozgic, Spyros Matsoukas, Chao Wang
Comments: Interspeech 2019
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[48] arXiv:1907.00971 (cross-list from cs.LG) [pdf, other]
Title: Universal audio synthesizer control with normalizing flows
Philippe Esling, Naotake Masuda, Adrien Bardet, Romeo Despres, Axel Chemla--Romeu-Santos
Comments: DaFX 2019
Subjects: Machine Learning (cs.LG); Human-Computer Interaction (cs.HC); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[49] arXiv:1907.01030 (cross-list from eess.AS) [pdf, other]
Title: LSTM Language Models for LVCSR in First-Pass Decoding and Lattice-Rescoring
Eugen Beck, Wei Zhou, Ralf Schlüter, Hermann Ney
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[50] arXiv:1907.01154 (cross-list from cs.MM) [pdf, other]
Title: Adaptive Music Composition for Games
Patrick Hutchings, Jon McCormack
Comments: Preprint. Accepted for publication in IEEE Transactions on Games, 2019
Subjects: Multimedia (cs.MM); Artificial Intelligence (cs.AI); Sound (cs.SD); Audio and Speech Processing (eess.AS)