Audio and Speech Processing

Authors and titles for June 2018

Total of 67 entries (showing 1-25)
[1] arXiv:1806.00273 [pdf, other]
Title: Sparse Pursuit and Dictionary Learning for Blind Source Separation in Polyphonic Music Recordings
Sören Schulze, Emily J. King
Journal-ref: J. Audio Speech Music Proc. (2021) 2021:6
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[2] arXiv:1806.00511 [pdf, other]
Title: Performance Based Cost Functions for End-to-End Speech Separation
Shrikant Venkataramani, Ryley Higa, Paris Smaragdis
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[3] arXiv:1806.00516 [pdf, other]
Title: DNN Based Speech Enhancement for Unseen Noises Using Monte Carlo Dropout
Nazreen P M, A G Ramakrishnan
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[4] arXiv:1806.03209 [pdf, other]
Title: Analysis of Length Normalization in End-to-End Speaker Verification System
Weicheng Cai, Jinkun Chen, Ming Li
Comments: Accepted for Interspeech 2018
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[5] arXiv:1806.03464 [pdf, other]
Title: Angular Softmax Loss for End-to-end Speaker Verification
Yutian Li, Feng Gao, Zhijian Ou, Jiasong Sun
Subjects: Audio and Speech Processing (eess.AS)
[6] arXiv:1806.04096 [pdf, other]
Title: Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models
Fanny Roche (1 and 2), Thomas Hueber (1), Samuel Limier (2), Laurent Girin (1 and 3) ((1) Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, Grenoble, France, (2) Arturia, Meylan, France, (3) INRIA, Perception Team, Montbonnot, France)
Comments: SMC 2019
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[7] arXiv:1806.04885 [pdf, other]
Title: Model-based Speech Enhancement for Intelligibility Improvement in Binaural Hearing Aids
Mathew Shaji Kavalekalam, Jesper K. Nielsen, Jesper B. Boldt, Mads G. Christensen
Comments: after revision
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[8] arXiv:1806.05059 [pdf, other]
Title: Multilingual End-to-End Speech Recognition with A Single Transformer on Low-Resource Languages
Shiyu Zhou, Shuang Xu, Bo Xu
Comments: arXiv admin note: text overlap with arXiv:1805.06239
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[9] arXiv:1806.05296 [pdf, other]
Title: Multi-View Networks for Denoising of Arbitrary Numbers of Channels
Jonah Casebeer, Brian Luc, Paris Smaragdis
Comments: 5 pages, 6 figures, IWAENC 2018
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[10] arXiv:1806.06779 [pdf, other]
Title: A Weighted Superposition of Functional Contours Model for Modelling Contextual Prominence of Elementary Prosodic Contours
Branislav Gerazov, Gérard Bailly, Yi Xu
Comments: Accepted for publication at INTERSPEECH 2018
Journal-ref: Proceedings of Interspeech 2018
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[11] arXiv:1806.08086 [pdf, other]
Title: Towards Automated Single Channel Source Separation using Neural Networks
Arpita Gang, Pravesh Biyani, Akshay Soni
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[12] arXiv:1806.08619 [pdf, other]
Title: Multi-task WaveNet: A Multi-task Generative Model for Statistical Parametric Speech Synthesis without Fundamental Frequency Conditions
Yu Gu, Yongguo Kang
Comments: Accepted by Interspeech 2018
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[13] arXiv:1806.08685 [pdf, other]
Title: A Variational Prosody Model for Mapping the Context-Sensitive Variation of Functional Prosodic Prototypes
Branislav Gerazov, Gérard Bailly, Omar Mohammed, Yi Xu, Philip N. Garner
Comments: Updated with recurrent version of contour generators, unified prosodic latent space, and performance evaluation with baseline
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[14] arXiv:1806.09169 [pdf, other]
Title: Perceptually Relevant Preservation of Interaural Time Differences in Binaural Hearing Aids
Fábio P. Itturriet, Márcio H. Costa
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[15] arXiv:1806.09276 [pdf, other]
Title: EMPHASIS: An Emotional Phoneme-based Acoustic Model for Speech Synthesis System
Hao Li, Yongguo Kang, Zhenyu Wang
Comments: Accepted by Interspeech 2018
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[16] arXiv:1806.09411 [pdf, other]
Title: Convolutional Neural Networks to Enhance Coded Speech
Ziyue Zhao, Huijun Liu, Tim Fingscheidt
Comments: More analysis is added for version 4
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[17] arXiv:1806.10307 [pdf, other]
Title: Independent Deeply Learned Matrix Analysis for Multichannel Audio Source Separation
Shinichi Mogami, Hayato Sumino, Daichi Kitamura, Norihiro Takamune, Shinnosuke Takamichi, Hiroshi Saruwatari, Nobutaka Ono
Comments: 5 pages, 4 figures, To appear in the Proceedings of the 26th European Signal Processing Conference (EUSIPCO 2018)
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[18] arXiv:1806.10522 [pdf, other]
Title: Speech Denoising with Deep Feature Losses
Francois G. Germain, Qifeng Chen, Vladlen Koltun
Comments: Code can be found at this https URL . Sound examples can be found at this https URL
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[19] arXiv:1806.00195 (cross-list from stat.ML) [pdf, other]
Title: Learning a Latent Space of Multitrack Measures
Ian Simon, Adam Roberts, Colin Raffel, Jesse Engel, Curtis Hawthorne, Douglas Eck
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[20] arXiv:1806.00927 (cross-list from cs.SD) [pdf, other]
Title: Voice Imitating Text-to-Speech Neural Networks
Younggun Lee, Taesu Kim, Soo-Young Lee
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[21] arXiv:1806.00984 (cross-list from cs.SD) [pdf, other]
Title: DNN-HMM based Speaker Adaptive Emotion Recognition using Proposed Epoch and MFCC Features
Md. Shah Fahad, Jainath Yadav, Gyadhar Pradhan, Akshay Deepak
Journal-ref: Circuits, Systems, and Signal Processing 2020
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[22] arXiv:1806.01145 (cross-list from cs.SD) [pdf, other]
Title: Machines hear better when they have ears
Deepak Baby, Sarah Verhulst
Comments: 6 pages
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[23] arXiv:1806.01180 (cross-list from cs.SD) [pdf, other]
Title: Revisiting Singing Voice Detection: a Quantitative Review and the Future Outlook
Kyungyun Lee, Keunwoo Choi, Juhan Nam
Comments: Accepted to the 19th International Society of Music Information Retrieval (ISMIR) Conference, Paris, France, 2018
Subjects: Sound (cs.SD); Information Retrieval (cs.IR); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[24] arXiv:1806.01506 (cross-list from cs.SD) [pdf, other]
Title: Attention Based Fully Convolutional Network for Speech Emotion Recognition
Yuanyuan Zhang, Jun Du, Zirui Wang, Jianshu Zhang
Journal-ref: 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[25] arXiv:1806.01665 (cross-list from cs.SD) [pdf, other]
Title: Singing voice phoneme segmentation by hierarchically inferring syllable and phoneme onset positions
Rong Gong, Xavier Serra
Comments: Interspeech 2018
Subjects: Sound (cs.SD); Information Retrieval (cs.IR); Audio and Speech Processing (eess.AS)