Audio and Speech Processing

Authors and titles for June 2022

Total of 268 entries; showing entries 226-250
[226] arXiv:2206.12759 (cross-list from cs.CL) [pdf, other]
Title: Low-resource Accent Classification in Geographically-proximate Settings: A Forensic and Sociophonetics Perspective
Qingcheng Zeng, Dading Chong, Peilin Zhou, Jie Yang
Comments: INTERSPEECH 2022
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[227] arXiv:2206.12772 (cross-list from cs.CV) [pdf, other]
Title: Exploiting Transformation Invariance and Equivariance for Self-supervised Sound Localisation
Jinxiang Liu, Chen Ju, Weidi Xie, Ya Zhang
Comments: Camera-ready Version for ACMMM 2022, Project page is this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[228] arXiv:2206.12829 (cross-list from cs.SD) [pdf, other]
Title: On Comparison of Encoders for Attention based End to End Speech Recognition in Standalone and Rescoring Mode
Raviraj Joshi, Subodh Kumar
Comments: Accepted at SPCOM 2022
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[229] arXiv:2206.12879 (cross-list from cs.CL) [pdf, other]
Title: Data Augmentation for Dementia Detection in Spoken Language
Anna Hlédiková, Dominika Woszczyk, Alican Akman, Soteris Demetriou, Björn Schuller
Comments: Accepted to INTERSPEECH 2022
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[230] arXiv:2206.12931 (cross-list from cs.CL) [pdf, other]
Title: Annotated Speech Corpus for Low Resource Indian Languages: Awadhi, Bhojpuri, Braj and Magahi
Ritesh Kumar, Siddharth Singh, Shyam Ratan, Mohit Raj, Sonal Sinha, Bornini Lahiri, Vivek Seshadri, Kalika Bali, Atul Kr. Ojha
Comments: Speech for Social Good Workshop, 2022, Interspeech 2022
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[231] arXiv:2206.12955 (cross-list from cs.CL) [pdf, other]
Title: Improving the Training Recipe for a Robust Conformer-based Hybrid Model
Mohammad Zeineldeen, Jingjing Xu, Christoph Lüscher, Ralf Schlüter, Hermann Ney
Comments: Accepted at INTERSPEECH 2022
Subjects: Computation and Language (cs.CL); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[232] arXiv:2206.13021 (cross-list from cs.SD) [pdf, other]
Title: Speak Like a Professional: Increasing Speech Intelligibility by Mimicking Professional Announcer Voice with Voice Conversion
Tuan Vu Ho, Maori Kobayashi, Masato Akagi
Comments: Accepted at INTERSPEECH 2022
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[233] arXiv:2206.13071 (cross-list from cs.SD) [pdf, other]
Title: Uncertainty Calibration for Deep Audio Classifiers
Tong Ye, Shijing Si, Jianzong Wang, Ning Cheng, Jing Xiao
Comments: Accepted by InterSpeech 2022, the first two authors contributed equally
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[234] arXiv:2206.13085 (cross-list from cs.SD) [pdf, other]
Title: Sound Model Factory: An Integrated System Architecture for Generative Audio Modelling
Lonce Wyse, Purnima Kamath, Chitralekha Gupta
Journal-ref: International Conference on Computational Intelligence in Music, Sound, Art and Design (Part of EvoStar) (pp. 308-322). Springer, Cham. 2022
Subjects: Sound (cs.SD); Neural and Evolutionary Computing (cs.NE); Audio and Speech Processing (eess.AS)
[235] arXiv:2206.13101 (cross-list from cs.SD) [pdf, other]
Title: SpeechEQ: Speech Emotion Recognition based on Multi-scale Unified Datasets and Multitask Learning
Zuheng Kang, Junqing Peng, Jianzong Wang, Jing Xiao
Comments: This paper is accepted by Interspeech 2022
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[236] arXiv:2206.13110 (cross-list from cs.SD) [pdf, other]
Title: Sequence-level Speaker Change Detection with Difference-based Continuous Integrate-and-fire
Zhiyun Fan, Linhao Dong, Meng Cai, Zejun Ma, Bo Xu
Comments: Signal Processing Letters 2022
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[237] arXiv:2206.13135 (cross-list from cs.CL) [pdf, other]
Title: TALCS: An Open-Source Mandarin-English Code-Switching Corpus and a Speech Recognition Baseline
Chengfei Li, Shuhao Deng, Yaoping Wang, Guangjing Wang, Yaguang Gong, Changbin Chen, Jinfeng Bai
Comments: accepted by INTERSPEECH 2022
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[238] arXiv:2206.13136 (cross-list from cs.SD) [pdf, other]
Title: A two-stage full-band speech enhancement model with effective spectral compression mapping
Zhongshu Hou, Qinwen Hu, Kai Chen, Jing Lu
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[239] arXiv:2206.13390 (cross-list from cs.CV) [pdf, other]
Title: A Comprehensive Survey on Video Saliency Detection with Auditory Information: the Audio-visual Consistency Perceptual is the Key!
Chenglizhao Chen, Mengke Song, Wenfeng Song, Li Guo, Muwei Jian
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[240] arXiv:2206.13415 (cross-list from cs.CL) [pdf, other]
Title: Is the Language Familiarity Effect gradual? A computational modelling approach
Maureen de Seyssel, Guillaume Wisniewski, Emmanuel Dupoux
Comments: 8 pages, 2 figures, accepted at CogSci 2022
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[241] arXiv:2206.13476 (cross-list from cs.SD) [pdf, other]
Title: Impact of Acoustic Event Tagging on Scene Classification in a Multi-Task Learning Framework
Rahil Parikh, Harshavardhan Sundar, Ming Sun, Chao Wang, Spyros Matsoukas
Comments: Accepted at ISCA Interspeech 2022
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[242] arXiv:2206.13611 (cross-list from cs.SD) [pdf, other]
Title: ClearBuds: Wireless Binaural Earbuds for Learning-Based Speech Enhancement
Ishan Chatterjee, Maruchi Kim, Vivek Jayaram, Shyamnath Gollakota, Ira Kemelmacher-Shlizerman, Shwetak Patel, Steven M. Seitz
Comments: 12 pages, Published in Mobisys 2022
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[243] arXiv:2206.13689 (cross-list from cs.SD) [pdf, other]
Title: Tiny-Sepformer: A Tiny Time-Domain Transformer Network for Speech Separation
Jian Luo, Jianzong Wang, Ning Cheng, Edward Xiao, Xulong Zhang, Jing Xiao
Comments: Accepted by Interspeech 2022
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[244] arXiv:2206.13691 (cross-list from cs.SD) [pdf, other]
Title: Dummy Prototypical Networks for Few-Shot Open-Set Keyword Spotting
Byeonggeun Kim, Seunghan Yang, Inseop Chung, Simyung Chang
Comments: Proceedings of INTERSPEECH 2022
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[245] arXiv:2206.13700 (cross-list from cs.SD) [pdf, other]
Title: Domain Agnostic Few-shot Learning for Speaker Verification
Seunghan Yang, Debasmit Das, Janghoon Cho, Hyoungwoo Park, Sungrack Yun
Comments: Proceedings of INTERSPEECH 2022
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[246] arXiv:2206.13708 (cross-list from cs.SD) [pdf, other]
Title: Personalized Keyword Spotting through Multi-task Learning
Seunghan Yang, Byeonggeun Kim, Inseop Chung, Simyung Chang
Comments: Proceedings of INTERSPEECH 2022
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[247] arXiv:2206.13758 (cross-list from cs.LG) [pdf, other]
Title: Exploring linguistic feature and model combination for speech recognition based automatic AD detection
Yi Wang, Tianzi Wang, Zi Ye, Lingwei Meng, Shoukang Hu, Xixin Wu, Xunying Liu, Helen Meng
Comments: Accepted by INTERSPEECH 2022
Subjects: Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[248] arXiv:2206.13817 (cross-list from cs.SD) [pdf, other]
Title: Comparison of Speech Representations for the MOS Prediction System
Aki Kunikoshi, Jaebok Kim, Wonsuk Jun, Kåre Sjölander (ReadSpeaker)
Comments: 5 pages, 4 figures
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[249] arXiv:2206.13909 (cross-list from cs.SD) [pdf, other]
Title: QTI Submission to DCASE 2021: residual normalization for device-imbalanced acoustic scene classification with efficient design
Byeonggeun Kim, Seunghan Yang, Jangho Kim, Simyung Chang
Comments: tech report; won 1st place in DCASE2021 challenge. arXiv admin note: substantial text overlap with arXiv:2111.06531
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[250] arXiv:2206.14009 (cross-list from cs.CV) [pdf, other]
Title: Show Me Your Face, And I'll Tell You How You Speak
Christen Millerdurai, Lotfy Abdel Khaliq, Timon Ulrich
Subjects: Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)