Sound

Authors and titles for July 2019

Total of 125 entries (showing 26-50)

[26] arXiv:1907.07398 [pdf, other]: Title: HODGEPODGE: Sound event detection based on ensemble of semi-supervised learning methods

Ziqiang Shi, Liu Liu, Huibin Lin, Rujie Liu, Anyan Shi

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[27] arXiv:1907.08506 [pdf, other]: Title: Language Modelling for Sound Event Detection with Teacher Forcing and Scheduled Sampling

Konstantinos Drossos, Shayan Gharib, Paul Magron, Tuomas Virtanen

Comments: Fixed the display of URLs at footnote, updated the results

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[28] arXiv:1907.08520 [pdf, other]: Title: Data Augmentation for Instrument Classification Robust to Audio Effects

António Ramires, Xavier Serra

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[29] arXiv:1907.08698 [pdf, other]: Title: Leveraging Knowledge Bases And Parallel Annotations For Music Genre Translation

Elena V. Epure, Anis Khlif, Romain Hennequin

Comments: Published in ISMIR 2019

Subjects: Sound (cs.SD); Information Retrieval (cs.IR); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[30] arXiv:1907.09238 [pdf, other]: Title: Crowdsourcing a Dataset of Audio Captions

Samuel Lipping, Konstantinos Drossos, Tuomas Virtanen

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[31] arXiv:1907.09884 [pdf, other]: Title: Discriminative Learning for Monaural Speech Separation Using Deep Embedding Features

Cunhang Fan, Bin Liu, Jianhua Tao, Jiangyan Yi, Zhengqi Wen

Comments: 5 pages, 1 figure, accepted by INTERSPEECH 2019

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[32] arXiv:1907.09936 [pdf, other]: Title: Log Complex Color for Visual Pattern Recognition of Total Sound

Stephen Wedekind, P. Fraundorf

Comments: 6 pages, 5 figures, 28 references, cf. this http URL

Journal-ref: Audio Engineering Society Convention 141 (2016) paper 9647; Subject of US patent 10,341,795

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[33] arXiv:1907.11238 [pdf, other]: Title: Interactive Lungs Auscultation with Reinforcement Learning Agent

Tomasz Grzywalski, Riccardo Belluzzo, Szymon Drgas, Agnieszka Cwalinska, Honorata Hafke-Dys

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[34] arXiv:1907.11956 [pdf, other]: Title: Dilated FCN: Listening Longer to Hear Better

Shuyu Gong, Zhewei Wang, Tao Sun, Yuanhang Zhang, Charles D. Smith, Li Xu, Jundong Liu

Comments: 5 pages; will appear in WASPAA conference

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[35] arXiv:1907.12279 [pdf, other]: Title: StarGAN-VC2: Rethinking Conditional Methods for StarGAN-Based Voice Conversion

Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo

Comments: Accepted to Interspeech 2019. Project page: this http URL

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[36] arXiv:1907.13188 [pdf, other]: Title: Marine Mammal Species Classification using Convolutional Neural Networks and a Novel Acoustic Representation

Mark Thomas, Bruce Martin, Katie Kowarski, Briand Gaudet, Stan Matwin

Comments: 16 pages, To appear in ECML-PKDD 2019

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[37] arXiv:1907.00112 (cross-list from cs.CL) [pdf, other]: Title: Leveraging Acoustic Cues and Paralinguistic Embeddings to Detect Expression from Voice

Vikramjit Mitra, Sue Booker, Erik Marchi, David Scott Farrar, Ute Dorothea Peitz, Bridget Cheng, Ermine Teves, Anuj Mehta, Devang Naik

Comments: 5 pages, 6 figures

Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[38] arXiv:1907.00443 (cross-list from cs.CL) [pdf, other]: Title: Multilingual Bottleneck Features for Query by Example Spoken Term Detection

Dhananjay Ram, Lesly Miculicich, Hervé Bourlard

Subjects: Computation and Language (cs.CL); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[39] arXiv:1907.00457 (cross-list from cs.CL) [pdf, other]: Title: BERTphone: Phonetically-Aware Encoder Representations for Utterance-Level Speaker and Language Recognition

Shaoshi Ling, Julian Salazar, Yuzong Liu, Katrin Kirchhoff

Comments: Odyssey 2020 camera-ready (presented Nov. 2020)

Journal-ref: Proc. the Speaker and Language Recognition Workshop (Odyssey 2020), 9-16

Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[40] arXiv:1907.00477 (cross-list from cs.CL) [pdf, other]: Title: Analyzing Utility of Visual Context in Multimodal Speech Recognition Under Noisy Conditions

Tejas Srinivasan, Ramon Sanabria, Florian Metze

Comments: Accepted to How2 Workshop, ICML 2019

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[41] arXiv:1907.00758 (cross-list from cs.CL) [pdf, other]: Title: Synchronising audio and ultrasound by learning cross-modal embeddings

Aciel Eshky, Manuel Sam Ribeiro, Korin Richmond, Steve Renals

Comments: 5 pages, 1 figure, 4 tables; Interspeech 2019 with the following edits: 1) Loss and accuracy upon convergence were accidentally reported from an older model. Now updated with model described throughout the paper. All other results remain unchanged. 2) Max true offset in the training data corrected from 179ms to 1789ms. 3) Detectability "boundary/range" renamed to detectability "thresholds"

Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
[42] arXiv:1907.00772 (cross-list from eess.AS) [pdf, other]: Title: Analysis by Adversarial Synthesis -- A Novel Approach for Speech Vocoding

Ahmed Mustafa, Arijit Biswas, Christian Bergler, Julia Schottenhamml, Andreas Maier

Comments: Accepted to Interspeech 2019

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[43] arXiv:1907.00797 (cross-list from eess.AS) [pdf, other]: Title: Quasi-Periodic WaveNet Vocoder: A Pitch Dependent Dilated Convolution Model for Parametric Speech Generation

Yi-Chiao Wu, Tomoki Hayashi, Patrick Lumban Tobing, Kazuhiro Kobayashi, Tomoki Toda

Comments: 5 pages, 4 figures, Proc. Interspeech, 2019

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[44] arXiv:1907.00818 (cross-list from eess.AS) [pdf, other]: Title: Ultrasound tongue imaging for diarization and alignment of child speech therapy sessions

Manuel Sam Ribeiro, Aciel Eshky, Korin Richmond, Steve Renals

Comments: 5 pages, 3 figures, Accepted for publication at Interspeech 2019

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD); Image and Video Processing (eess.IV)
[45] arXiv:1907.00824 (cross-list from cs.HC) [pdf, other]: Title: Designing Deep Reinforcement Learning for Human Parameter Exploration

Hugo Scurto, Bavo Van Kerrebroeck, Baptiste Caramiaux, Frédéric Bevilacqua

Comments: Author's version of the work. The definitive Version of Record was published in ACM Transactions on Computer-Human Interaction (TOCHI)

Journal-ref: ACM Trans. Comput.-Hum. Interact. 28, 1, Article 1 (January 2021), 35 pages (2021)

Subjects: Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[46] arXiv:1907.00835 (cross-list from cs.CL) [pdf, other]: Title: UltraSuite: A Repository of Ultrasound and Acoustic Data from Child Speech Therapy Sessions

Aciel Eshky, Manuel Sam Ribeiro, Joanne Cleland, Korin Richmond, Zoe Roxburgh, James Scobbie, Alan Wrench

Comments: 5 pages, 1 figure, 3 tables; accepted to Interspeech 2018: 19th Annual Conference of the International Speech Communication Association (ISCA)

Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
[47] arXiv:1907.00873 (cross-list from eess.AS) [pdf, other]: Title: Compression of Acoustic Event Detection Models With Quantized Distillation

Bowen Shi, Ming Sun, Chieh-Chi Kao, Viktor Rozgic, Spyros Matsoukas, Chao Wang

Comments: Interspeech 2019

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[48] arXiv:1907.00971 (cross-list from cs.LG) [pdf, other]: Title: Universal audio synthesizer control with normalizing flows

Philippe Esling, Naotake Masuda, Adrien Bardet, Romeo Despres, Axel Chemla--Romeu-Santos

Comments: DaFX 2019

Subjects: Machine Learning (cs.LG); Human-Computer Interaction (cs.HC); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[49] arXiv:1907.01030 (cross-list from eess.AS) [pdf, other]: Title: LSTM Language Models for LVCSR in First-Pass Decoding and Lattice-Rescoring

Eugen Beck, Wei Zhou, Ralf Schlüter, Hermann Ney

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[50] arXiv:1907.01154 (cross-list from cs.MM) [pdf, other]: Title: Adaptive Music Composition for Games

Patrick Hutchings, Jon McCormack

Comments: Preprint. Accepted for publication in IEEE Transactions on Games, 2019

Subjects: Multimedia (cs.MM); Artificial Intelligence (cs.AI); Sound (cs.SD); Audio and Speech Processing (eess.AS)
