Sound

Authors and titles for December 2022

Total of 137 entries (showing 26-50)

[26] arXiv:2212.07738 [pdf, other]: Title: A large-scale and PCR-referenced vocal audio dataset for COVID-19

Jobie Budd, Kieran Baker, Emma Karoune, Harry Coppock, Selina Patel, Ana Tendero Cañadas, Alexander Titcomb, Richard Payne, David Hurley, Sabrina Egglestone, Lorraine Butler, Jonathon Mellor, George Nicholson, Ivan Kiskin, Vasiliki Koutra, Radka Jersakova, Rachel A. McKendry, Peter Diggle, Sylvia Richardson, Björn W. Schuller, Steven Gilmour, Davide Pigoli, Stephen Roberts, Josef Packham, Tracey Thornley, Chris Holmes

Comments: 39 pages, 4 figures

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[27] arXiv:2212.08348 [pdf, other]: Title: Towards Unified All-Neural Beamforming for Time and Frequency Domain Speech Separation

Rongzhi Gu, Shi-Xiong Zhang, Yuexian Zou, Dong Yu

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[28] arXiv:2212.08570 [pdf, other]: Title: Audio-based AI classifiers show no evidence of improved COVID-19 screening over simple symptoms checkers

Harry Coppock, George Nicholson, Ivan Kiskin, Vasiliki Koutra, Kieran Baker, Jobie Budd, Richard Payne, Emma Karoune, David Hurley, Alexander Titcomb, Sabrina Egglestone, Ana Tendero Cañadas, Lorraine Butler, Radka Jersakova, Jonathon Mellor, Selina Patel, Tracey Thornley, Peter Diggle, Sylvia Richardson, Josef Packham, Björn W. Schuller, Davide Pigoli, Steven Gilmour, Stephen Roberts, Chris Holmes

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[29] arXiv:2212.08571 [pdf, other]: Title: Statistical Design and Analysis for Robust Machine Learning: A Case Study from COVID-19

Davide Pigoli, Kieran Baker, Jobie Budd, Lorraine Butler, Harry Coppock, Sabrina Egglestone, Steven G. Gilmour, Chris Holmes, David Hurley, Radka Jersakova, Ivan Kiskin, Vasiliki Koutra, Jonathon Mellor, George Nicholson, Joe Packham, Selina Patel, Richard Payne, Stephen J. Roberts, Björn W. Schuller, Ana Tendero-Cañadas, Tracey Thornley, Alexander Titcomb

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Applications (stat.AP)
[30] arXiv:2212.08601 [pdf, other]: Title: Source Tracing: Detecting Voice Spoofing

Tinglong Zhu, Xingming Wang, Xiaoyi Qin, Ming Li

Comments: Accepted by APSIPA ASC

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[31] arXiv:2212.08952 [pdf, other]: Title: Learning from Taxonomy: Multi-label Few-Shot Classification for Everyday Sound Recognition

Jinhua Liang, Huy Phan, Emmanouil Benetos

Comments: Submitted to ICASSP 2023

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[32] arXiv:2212.09006 [pdf, other]: Title: A Review of Speech-centric Trustworthy Machine Learning: Privacy, Safety, and Fairness

Tiantian Feng, Rajat Hebbar, Nicholas Mehlman, Xuan Shi, Aditya Kommineni, and Shrikanth Narayanan

Journal-ref: APSIPA Transactions on Signal and Information Processing, vol. 12, no. 3, 2023

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[33] arXiv:2212.09090 [pdf, other]: Title: Exploring Workplace Behaviors through Speaking Patterns using Large-scale Multimodal Wearable Recordings: A Study of Healthcare Providers

Tiantian Feng, Shrikanth Narayanan

Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[34] arXiv:2212.09730 [pdf, other]: Title: Speaking Style Conversion in the Waveform Domain Using Discrete Self-Supervised Units

Gallil Maimon, Yossi Adi

Comments: Accepted at EMNLP 2023

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[35] arXiv:2212.10092 [pdf, other]: Title: Exploring Effective Fusion Algorithms for Speech Based Self-Supervised Learning Models

Changli Tang, Yujin Wang, Xie Chen, Wei-Qiang Zhang

Comments: Accepted by NCMMSC2022

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[36] arXiv:2212.10093 [pdf, other]: Title: Visual Transformers for Primates Classification and Covid Detection

Steffen Illium, Robert Müller, Andreas Sedlmeier, Claudia Linnhoff-Popien

Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[37] arXiv:2212.10103 [pdf, other]: Title: VSVC: Backdoor attack against Keyword Spotting based on Voiceprint Selection and Voice Conversion

Hanbo Cai, Pengcheng Zhang, Hai Dong, Yan Xiao, Shunhui Ji

Comments: 7 pages, 5 figures

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[38] arXiv:2212.10191 [pdf, other]: Title: Emotion Selectable End-to-End Text-based Speech Editing

Tao Wang, Jiangyan Yi, Ruibo Fu, Jianhua Tao, Zhengqi Wen, Chu Yuan Zhang

Comments: Under review, 12 pages, 11 figures, demo page is available at this https URL

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[39] arXiv:2212.10370 [pdf, other]: Title: Hopf Physical Reservoir Computer for Reconfigurable Sound Recognition

Md Raf E Ul Shougat, XiaoFu Li, Siyao Shao, Kathleen Walden McGarvey, Edmon Perkins

Comments: 21 pages, 11 figures

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[40] arXiv:2212.10744 [pdf, html, other]: Title: An Audio-Visual Speech Separation Model Inspired by Cortico-Thalamo-Cortical Circuits

Kai Li, Fenghua Xie, Hang Chen, Kexin Yuan, Xiaolin Hu

Comments: Accepted by TPAMI 2024

Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV)
[41] arXiv:2212.10818 [pdf, other]: Title: 4D ASR: Joint modeling of CTC, Attention, Transducer, and Mask-Predict decoders

Yui Sudo, Muhammad Shakeel, Brian Yan, Jiatong Shi, Shinji Watanabe

Comments: Accepted by INTERSPEECH 2023

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[42] arXiv:2212.10901 [pdf, other]: Title: ALCAP: Alignment-Augmented Music Captioner

Zihao He, Weituo Hao, Wei-Tsung Lu, Changyou Chen, Kristina Lerman, Xuchen Song

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Information Retrieval (cs.IR); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[43] arXiv:2212.11054 [pdf, other]: Title: Polytopic Analysis of Music

Axel Marmoret, Jérémy E. Cohen, Frédéric Bimbot

Comments: Work document

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[44] arXiv:2212.11134 [pdf, other]: Title: Generating music with sentiment using Transformer-GANs

Pedro Neves, Jose Fornari, João Florindo

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[45] arXiv:2212.11277 [pdf, other]: Title: Audio Denoising for Robust Audio Fingerprinting

Kamil Akesbi

Comments: 63 pages, master thesis

Subjects: Sound (cs.SD); Information Retrieval (cs.IR); Machine Learning (cs.LG)
[46] arXiv:2212.12151 [pdf, other]: Title: EarSpy: Spying Caller Speech and Identity through Tiny Vibrations of Smartphone Ear Speakers

Ahmed Tanvir Mahdad, Cong Shi, Zhengkun Ye, Tianming Zhao, Yan Wang, Yingying Chen, Nitesh Saxena

Subjects: Sound (cs.SD); Cryptography and Security (cs.CR); Audio and Speech Processing (eess.AS)
[47] arXiv:2212.13369 [pdf, other]: Title: Feature Selection Approaches for Optimising Music Emotion Recognition Methods

Le Cai, Sam Ferguson, Haiyan Lu, Gengfa Fang

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[48] arXiv:2212.13581 [pdf, other]: Title: Voice conversion with limited data and limitless data augmentations

Olga Slizovskaia, Jordi Janer, Pritish Chandna, Oscar Mayor

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[49] arXiv:2212.14490 [pdf, other]: Title: Multi-modal deep learning system for depression and anxiety detection

Brian Diep, Marija Stanojevic, Jekaterina Novikova

Comments: accepted to the PAI4MH workshop at NeurIPS 2022

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[50] arXiv:2212.14597 [pdf, other]: Title: Defense Against Adversarial Attacks on Audio DeepFake Detection

Piotr Kawa, Marcin Plata, Piotr Syga

Comments: Accepted to INTERSPEECH 2023

Subjects: Sound (cs.SD); Cryptography and Security (cs.CR); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
