Sound

Authors and titles for December 2022

Total of 137 entries : 1-25 26-50 51-75 76-100 ... 126-137

[1] arXiv:2212.00369 [pdf, other]: Title: Deep neural network techniques for monaural speech enhancement: state of the art analysis

Peter Ochieng

Comments: conference

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[2] arXiv:2212.00973 [pdf, other]: Title: A Domain-Knowledge-Inspired Music Embedding Space and a Novel Attention Mechanism for Symbolic Music Modeling

Z. Guo, J. Kang, D. Herremans

Comments: This paper is accepted at AAAI 2023

Journal-ref: Proceedings of AAAI 2023

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[3] arXiv:2212.01033 [pdf, other]: Title: Sonus Texere! Automated Dense Soundtrack Construction for Books using Movie Adaptations

Jaidev Shriram, Makarand Tapaswi, Vinoo Alluri

Comments: Accepted to ISMIR 2022. Project page: this https URL

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[4] arXiv:2212.01042 [pdf, other]: Title: AccEar: Accelerometer Acoustic Eavesdropping with Unconstrained Vocabulary

Pengfei Hu, Hui Zhuang, Panneer Selvam Santhalingam, Riccardo Spolaor, Parth Pathak, Guoming Zhang, Xiuzhen Cheng

Comments: 2022 IEEE Symposium on Security and Privacy (SP)

Journal-ref: 2022 IEEE Symposium on Security and Privacy (SP)

Subjects: Sound (cs.SD); Cryptography and Security (cs.CR); Audio and Speech Processing (eess.AS)
[5] arXiv:2212.01457 [pdf, other]: Title: NEAL: An open-source tool for audio annotation

Anthony Gibbons, Ian Donohue, Courtney E. Gorman, Emma King, Andrew Parnell

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[6] arXiv:2212.01546 [pdf, other]: Title: UniSyn: An End-to-End Unified Model for Text-to-Speech and Singing Voice Synthesis

Yi Lei, Shan Yang, Xinsheng Wang, Qicong Xie, Jixun Yao, Lei Xie, Dan Su

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[7] arXiv:2212.01775 [pdf, other]: Title: Generative Models for Improved Naturalness, Intelligibility, and Voicing of Whispered Speech

Dominik Wagner, Sebastian P. Bayerl, Hector A. Cordourier Maruri, Tobias Bocklet

Comments: Accepted at SLT 2022

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[8] arXiv:2212.01884 [pdf, other]: Title: Melody transcription via generative pre-training

Chris Donahue, John Thickstun, Percy Liang

Comments: Published as a conference paper at ISMIR 2022

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[9] arXiv:2212.01911 [pdf, other]: Title: Speech MOS multi-task learning and rater bias correction

Haleh Akrami, Hannes Gamper

Comments: Submitted to ICASSP 2023

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[10] arXiv:2212.02076 [pdf, other]: Title: NBC2: Multichannel Speech Separation with Revised Narrow-band Conformer

Changsheng Quan, Xiaofei Li

Comments: submitted to TASLP

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[11] arXiv:2212.02084 [pdf, other]: Title: End-to-end Recording Device Identification Based on Deep Representation Learning

Chunyan Zeng, Dongliang Zhu, Zhifeng Wang, Minghu Wu, Wei Xiong, Nan Zhao

Comments: 20 pages, 5 figures, recording device identification

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[12] arXiv:2212.02339 [pdf, other]: Title: DeAR: A Deep-learning-based Audio Re-recording Resilient Watermarking

Chang Liu, Jie Zhang, Han Fang, Zehua Ma, Weiming Zhang, Nenghai Yu

Comments: Accepted by AAAI2023

Subjects: Sound (cs.SD); Cryptography and Security (cs.CR); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[13] arXiv:2212.02508 [pdf, other]: Title: MAP-Music2Vec: A Simple and Effective Baseline for Self-Supervised Music Audio Representation Learning

Yizhi Li, Ruibin Yuan, Ge Zhang, Yinghao Ma, Chenghua Lin, Xingran Chen, Anton Ragni, Hanzhi Yin, Zhijie Hu, Haoyu He, Emmanouil Benetos, Norbert Gyenge, Ruibo Liu, Jie Fu

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[14] arXiv:2212.02610 [pdf, other]: Title: Audio Latent Space Cartography

Nicolas Jonason, Bob L.T. Sturm

Comments: Late Breaking / Demo, ISMIR 2022 (this https URL)

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[15] arXiv:2212.03039 [pdf, other]: Title: Covariance Regularization for Probabilistic Linear Discriminant Analysis

Zhiyuan Peng, Mingjie Shao, Xuanji He, Xu Li, Tan Lee, Ke Ding, Guanglu Wan

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[16] arXiv:2212.03090 [pdf, other]: Title: Label-free Knowledge Distillation with Contrastive Loss for Light-weight Speaker Recognition

Zhiyuan Peng, Xuanji He, Ke Ding, Tan Lee, Guanglu Wan

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[17] arXiv:2212.03435 [pdf, other]: Title: Improve Bilingual TTS Using Dynamic Language and Phonology Embedding

Fengyu Yang, Jian Luan, Yujun Wang

Comments: Submitted to ICASSP2023

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[18] arXiv:2212.05294 [pdf, other]: Title: Variational Speech Waveform Compression to Catalyze Semantic Communications

Shengshi Yao, Zixuan Xiao, Sixian Wang, Jincheng Dai, Kai Niu, Ping Zhang

Subjects: Sound (cs.SD); Information Theory (cs.IT); Audio and Speech Processing (eess.AS)
[19] arXiv:2212.05301 [pdf, other]: Title: Leveraging Modality-specific Representations for Audio-visual Speech Recognition via Reinforcement Learning

Chen Chen, Yuchen Hu, Qiang Zhang, Heqing Zou, Beier Zhu, Eng Siong Chng

Comments: Accepted by AAAI2023

Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[20] arXiv:2212.05335 [pdf, other]: Title: A Comparison of Audio Preprocessing Techniques and Deep Learning Algorithms for Raga Recognition

Devayani Hebbar, Vandana Jagtap

Comments: 7 pages, 6 figures, 7 tables

Subjects: Sound (cs.SD); Information Retrieval (cs.IR); Audio and Speech Processing (eess.AS)
[21] arXiv:2212.06387 [pdf, other]: Title: Towards trustworthy phoneme boundary detection with autoregressive model and improved evaluation metric

Hyeongju Kim, Hyeong-Seok Choi

Comments: 5 pages, submitted to ICASSP 2023

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[22] arXiv:2212.06397 [pdf, other]: Title: Style-Label-Free: Cross-Speaker Style Transfer by Quantized VAE and Speaker-wise Normalization in Speech Synthesis

Chunyu Qiang, Peng Yang, Hao Che, Xiaorui Wang, Zhongyuan Wang

Comments: Published to ISCSLP 2022

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[23] arXiv:2212.06972 [pdf, other]: Title: Disentangling Prosody Representations with Unsupervised Speech Reconstruction

Leyuan Qu, Taihao Li, Cornelius Weber, Theresa Pekarek-Rosin, Fuji Ren, Stefan Wermter

Comments: Accepted by IEEE/ACM Transactions on Audio, Speech, and Language Processing

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[24] arXiv:2212.07065 [pdf, other]: Title: CLIPSep: Learning Text-queried Sound Separation with Noisy Unlabeled Videos

Hao-Wen Dong, Naoya Takahashi, Yuki Mitsufuji, Julian McAuley, Taylor Berg-Kirkpatrick

Comments: Accepted by ICLR 2023. Audio samples can be found at this https URL

Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[25] arXiv:2212.07163 [pdf, other]: Title: Multi-Scale Feature Fusion Transformer Network for End-to-End Single Channel Speech Separation

Yinhao Xu, Jian Zhou, Liang Tao, Hon Keung Kwan

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
