Sound

Authors and titles for June 2019

Total of 132 entries : 1-50 51-100 101-132


[101] arXiv:1906.08407 (cross-list from eess.AS) [pdf, other]: Title: Parameter Enhancement for MELP Speech Codec in Noisy Communication Environment

Min-Jae Hwang, Hong-Goo Kang

Comments: Accepted to the conference of INTERSPEECH 2019

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[102] arXiv:1906.08556 (cross-list from cs.LG) [pdf, other]: Title: Unleashing the Unused Potential of I-Vectors Enabled by GPU Acceleration

Ville Vestman, Kong Aik Lee, Tomi H. Kinnunen, Takafumi Koshinaka

Comments: Accepted to Interspeech 2019

Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[103] arXiv:1906.08647 (cross-list from cs.CL) [pdf, other]: Title: Semi-supervised acoustic model training for five-lingual code-switched ASR

Astik Biswas, Emre Yılmaz, Febe de Wet, Ewald van der Westhuizen, Thomas Niesler

Comments: Accepted for publication at Interspeech 2019

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[104] arXiv:1906.08847 (cross-list from eess.AS) [pdf, other]: Title: A Signal Subspace Rotation Method for Localization of Multiple Wideband Sound Sources

Kainan Chen, Wenyu Jin, Bharadwaj Desikan

Comments: 5 pages, 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[105] arXiv:1906.08871 (cross-list from eess.AS) [pdf, other]: Title: Advancing Speech Recognition With No Speech Or With Noisy Speech

Gautam Krishna, Co Tran, Mason Carnahan, Ahmed H Tewfik

Comments: Extended version of our accepted IEEE EUSIPCO 2019 paper with additional results for CTC model based recognition. arXiv admin note: substantial text overlap with arXiv:1906.08045, arXiv:1906.08044

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[106] arXiv:1906.09292 (cross-list from cs.CL) [pdf, other]: Title: Phoneme-Based Contextualization for Cross-Lingual Speech Recognition in End-to-End Models

Ke Hu, Antoine Bruguier, Tara N. Sainath, Rohit Prabhavalkar, Golan Pundak

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[107] arXiv:1906.09322 (cross-list from cs.CL) [pdf, other]: Title: A Syllable-Structured, Contextually-Based Conditionally Generation of Chinese Lyrics

Xu Lu, Jie Wang, Bojin Zhuang, Shaojun Wang, Jing Xiao

Comments: accepted by The 16th Pacific Rim International Conference on AI

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Sound (cs.SD)
[108] arXiv:1906.09426 (cross-list from eess.AS) [pdf, other]: Title: End-to-End ASR for Code-switched Hindi-English Speech

Brij Mohan Lal Srivastava, Basil Abraham, Sunayana Sitaram, Rupesh Mehta, Preethi Jyothi

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[109] arXiv:1906.09825 (cross-list from cs.CL) [pdf, other]: Title: SylNet: An Adaptable End-to-End Syllable Count Estimator for Speech

Shreyas Seshadri, Okko Räsänen

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[110] arXiv:1906.09832 (cross-list from cs.CL) [pdf, other]: Title: A computational model of early language acquisition from audiovisual experiences of young infants

Okko Räsänen, Khazar Khorrami

Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[111] arXiv:1906.10198 (cross-list from cs.CL) [pdf, other]: Title: Multimodal and Multi-view Models for Emotion Recognition

Gustavo Aguilar, Viktor Rozgić, Weiran Wang, Chao Wang

Comments: ACL 2019

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[112] arXiv:1906.10199 (cross-list from cs.LG) [pdf, other]: Title: Neural Transfer Learning for Cry-based Diagnosis of Perinatal Asphyxia

Charles C. Onu, Jonathan Lebensold, William L. Hamilton, Doina Precup

Comments: Accepted at INTERSPEECH 2019

Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[113] arXiv:1906.10369 (cross-list from eess.AS) [pdf, other]: Title: Acoustic Modeling for Automatic Lyrics-to-Audio Alignment

Chitralekha Gupta, Emre Yılmaz, Haizhou Li

Comments: Accepted for publication at Interspeech 2019

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[114] arXiv:1906.10508 (cross-list from eess.AS) [pdf, other]: Title: Non-Parallel Sequence-to-Sequence Voice Conversion with Disentangled Linguistic and Speaker Representations

Jing-Xuan Zhang, Zhen-Hua Ling, Li-Rong Dai

Comments: Accepted by IEEE/ACM Transactions on Audio, Speech and Language Processing

Journal-ref: IEEE/ACM Transactions on Audio, Speech and Language Processing vol 28 no 1 (2020) 540-552

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[115] arXiv:1906.10606 (cross-list from eess.AS) [pdf, other]: Title: DALI: a large Dataset of synchronized Audio, LyrIcs and notes, automatically created using teacher-student machine learning paradigm

Gabriel Meseguer-Brocal, Alice Cohen-Hadria, Geoffroy Peeters

Journal-ref: Proceedings of the 19th International Society for Music Information Retrieval Conference, ISMIR, Paris, France, pp. 431-437, 2018

Subjects: Audio and Speech Processing (eess.AS); Databases (cs.DB); Machine Learning (cs.LG); Sound (cs.SD)
[116] arXiv:1906.10623 (cross-list from cs.LG) [pdf, other]: Title: Emotion Recognition Using Fusion of Audio and Video Features

Juan D. S. Ortega, Patrick Cardinal, Alessandro L. Koerich

Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[117] arXiv:1906.10834 (cross-list from cs.CL) [pdf, other]: Title: Essence Knowledge Distillation for Speech Recognition

Zhenchuan Yang, Chun Zhang, Weibin Zhang, Jianxiu Jin, Dongpeng Chen

Comments: 5 pages, 2 figures

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[118] arXiv:1906.10859 (cross-list from eess.AS) [pdf, other]: Title: End-to-End Emotional Speech Synthesis Using Style Tokens and Semi-Supervised Training

Peng-fei Wu, Zhen-hua Ling, Li-juan Liu, Yuan Jiang, Hong-chuan Wu, Li-rong Dai

Comments: 5 pages, 2 figures submitted to APSIPA2019

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[119] arXiv:1906.10876 (cross-list from cs.CL) [pdf, other]: Title: Auxiliary Interference Speaker Loss for Target-Speaker Speech Recognition

Naoyuki Kanda, Shota Horiguchi, Ryoichi Takashima, Yusuke Fujita, Kenji Nagamatsu, Shinji Watanabe

Comments: Accepted to INTERSPEECH 2019

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[120] arXiv:1906.10996 (cross-list from cs.IR) [pdf, other]: Title: Learning Soft-Attention Models for Tempo-invariant Audio-Sheet Music Retrieval

Stefan Balke, Matthias Dorfer, Luis Carvalho, Andreas Arzt, Gerhard Widmer

Comments: Accepted for publication at ISMIR 2019

Subjects: Information Retrieval (cs.IR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[121] arXiv:1906.11018 (cross-list from eess.AS) [pdf, other]: Title: Integration of TensorFlow based Acoustic Model with Kaldi WFST Decoder

Minkyu Lim, Ji-Hwan Kim

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[122] arXiv:1906.11047 (cross-list from eess.AS) [pdf, other]: Title: Multi-Span Acoustic Modelling using Raw Waveform Signals

Patrick von Platen, Chao Zhang, Philip Woodland

Comments: To appear in INTERSPEECH 2019

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[123] arXiv:1906.11049 (cross-list from eess.AS) [pdf, other]: Title: Unsupervised Phoneme and Word Discovery from Multiple Speakers using Double Articulation Analyzer and Neural Network with Parametric Bias

Ryo Nakashima, Ryo Ozaki, Tadahiro Taniguchi

Comments: 21 pages. Submitted

Journal-ref: Front. Robot. AI, 2019, 6:92

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[124] arXiv:1906.11521 (cross-list from cs.CL) [pdf, other]: Title: Lattice-Based Unsupervised Test-Time Adaptation of Neural Network Acoustic Models

Ondrej Klejch, Joachim Fainberg, Peter Bell, Steve Renals

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[125] arXiv:1906.11571 (cross-list from cs.HC) [pdf, other]: Title: Sensitivity to Haptic-Audio Envelope Asynchrony

Alfonso Balandra, Shoichi Hasegawa

Comments: (The reported results are wrong, we are currently working on this publication, the article contents will be fixed.) Work in progress paper for World Haptics 2019

Journal-ref: World Haptics 2019 (Work in Progress)

Subjects: Human-Computer Interaction (cs.HC); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[126] arXiv:1906.11604 (cross-list from cs.CL) [pdf, other]: Title: Gated Embeddings in End-to-End Speech Recognition for Conversational-Context Fusion

Suyoun Kim, Siddharth Dalmia, Florian Metze

Comments: ACL 2019

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[127] arXiv:1906.11620 (cross-list from eess.AS) [pdf, other]: Title: Audio-Based Music Classification with DenseNet And Data Augmentation

Wenhao Bian, Jie Wang, Bojin Zhuang, Jiankui Yang, Shaojun Wang, Jing Xiao

Comments: accepted by The 16th Pacific Rim International Conference on AI

Subjects: Audio and Speech Processing (eess.AS); Multimedia (cs.MM); Sound (cs.SD)
[128] arXiv:1906.11645 (cross-list from eess.AS) [pdf, other]: Title: RUSLAN: Russian Spoken Language Corpus for Speech Synthesis

Lenar Gabdrakhmanov, Rustem Garaev, Evgenii Razinkov

Comments: Accepted to SPECOM'2019

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[129] arXiv:1906.11759 (cross-list from q-bio.NC) [pdf, other]: Title: Low-dimensional Embodied Semantics for Music and Language

Francisco Afonso Raposo, David Martins de Matos, Ricardo Ribeiro

Comments: 6 pages, 1 figure, 1 table

Subjects: Neurons and Cognition (q-bio.NC); Information Retrieval (cs.IR); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[130] arXiv:1906.11783 (cross-list from cs.IR) [pdf, other]: Title: Representation Learning of Music Using Artist, Album, and Track Information

Jongpil Lee, Jiyoung Park, Juhan Nam

Comments: International Conference on Machine Learning (ICML) 2019, Machine Learning for Music Discovery Workshop

Subjects: Information Retrieval (cs.IR); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[131] arXiv:1906.12170 (cross-list from cs.CV) [pdf, other]: Title: LipReading with 3D-2D-CNN BLSTM-HMM and word-CTC models

Dilip Kumar Margam, Rohith Aralikatti, Tanay Sharma, Abhinav Thanda, Pujitha A K, Sharad Roy, Shankar M Venkatesan

Comments: Submitted to Interspeech 2019

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
[132] arXiv:1906.12286 (cross-list from cs.LG) [pdf, other]: Title: RecurSIA-RRT: Recursive translatable point-set pattern discovery with removal of redundant translators

David Meredith

Comments: Submitted to 12th International Workshop on Machine Learning and Music

Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
