Sound

Authors and titles for August 2022

Total of 138 entries : 1-25 26-50 51-75 76-100 101-125 126-138
[51] arXiv:2208.10659 [pdf, other]
Title: Fall Detection from Audios with Audio Transformers
Prabhjot Kaur, Qifan Wang, Weisong Shi
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Robotics (cs.RO); Audio and Speech Processing (eess.AS)
[52] arXiv:2208.11308 [pdf, other]
Title: Deep model with built-in cross-attention alignment for acoustic echo cancellation
Evgenii Indenbom, Nicolae-Cătălin Ristea, Ando Saabas, Tanel Pärnamaa, Jegor Gužvin
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Audio and Speech Processing (eess.AS)
[53] arXiv:2208.11402 [pdf, other]
Title: Improved Zero-Shot Audio Tagging & Classification with Patchout Spectrogram Transformers
Paul Primus, Gerhard Widmer
Comments: published in EUSIPCO 2022
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[54] arXiv:2208.11460 [pdf, other]
Title: Improving Natural-Language-based Audio Retrieval with Transfer Learning and Audio & Text Augmentations
Paul Primus, Gerhard Widmer
Comments: accepted at DCASE Workshop 2022
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[55] arXiv:2208.11671 [pdf, other]
Title: Interpreting Song Lyrics with an Audio-Informed Pre-trained Language Model
Yixiao Zhang, Junyan Jiang, Gus Xia, Simon Dixon
Comments: Accepted to ISMIR 2022
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[56] arXiv:2208.11920 [pdf, other]
Title: Digital Audio Tampering Detection Based on ENF Spatio-temporal Features Representation Learning
Chunyan Zeng, Shuai Kong, Zhifeng Wang, Xiangkui Wan, Yunfan Chen
Comments: 19 pages, 6 figures
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[57] arXiv:2208.12086 [pdf, other]
Title: A Study on Broadcast Networks for Music Genre Classification
Ahmed Heakl, Abdelrahman Abdelgawad, Victor Parque
Comments: accepted for oral presentation at the World Congress on Computational Intelligence (WCCI 2022) - International Joint Conference on Neural Networks (IJCNN 2022)
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Multimedia (cs.MM); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[58] arXiv:2208.12208 [pdf, other]
Title: Contrastive Audio-Language Learning for Music
Ilaria Manco, Emmanouil Benetos, Elio Quinton, György Fazekas
Comments: Accepted to ISMIR 2022
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[59] arXiv:2208.12387 [pdf, other]
Title: Music Separation Enhancement with Generative Modeling
Noah Schaffer, Boaz Cogan, Ethan Manilow, Max Morrison, Prem Seetharaman, Bryan Pardo
Comments: Accepted to ISMIR 2022
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[60] arXiv:2208.12410 [pdf, other]
Title: Leveraging Symmetrical Convolutional Transformer Networks for Speech to Singing Voice Style Transfer
Shrutina Agarwal, Sriram Ganapathy, Naoya Takahashi
Comments: accepted to INTERSPEECH 2022
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[61] arXiv:2208.12485 [pdf, other]
Title: Concept-Based Techniques for "Musicologist-friendly" Explanations in a Deep Music Classifier
Francesco Foscarin, Katharina Hoedt, Verena Praher, Arthur Flexer, Gerhard Widmer
Comments: In Proceedings of the 23rd International Society for Music Information Retrieval Conference (ISMIR 2022), Bengaluru, India
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[62] arXiv:2208.12753 [pdf, other]
Title: Spatio-Temporal Representation Learning Enhanced Source Cell-phone Recognition from Speech Recordings
Chunyan Zeng, Shixiong Feng, Zhifeng Wang, Xiangkui Wan, Yunfan Chen, Nan Zhao
Comments: 29 pages, 4 figures
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[63] arXiv:2208.12782 [pdf, other]
Title: Mel Spectrogram Inversion with Stable Pitch
Bruno Di Giorgi, Mark Levy, Richard Sharp
Comments: 7 pages, 5 figures, Proceedings of the 23rd International Society for Music Information Retrieval Conference, ISMIR 2022
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[64] arXiv:2208.13066 [pdf, other]
Title: SA: Sliding attack for synthetic speech detection with resistance to clipping and self-splicing
Deng JiaCheng, Dong Li, Yan Diqun, Wang Rangding, Zeng Jiaming
Comments: Updated description and formula
Subjects: Sound (cs.SD); Cryptography and Security (cs.CR); Audio and Speech Processing (eess.AS)
[65] arXiv:2208.13183 [pdf, other]
Title: Training Text-To-Speech Systems From Synthetic Data: A Practical Approach For Accent Transfer Tasks
Lev Finkelstein, Heiga Zen, Norman Casagrande, Chun-an Chan, Ye Jia, Tom Kenter, Alexey Petelin, Jonathan Shen, Vincent Wan, Yu Zhang, Yonghui Wu, Rob Clark
Comments: To be published in Interspeech 2022
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[66] arXiv:2208.13191 [pdf, other]
Title: Towards Disentangled Speech Representations
Cal Peyser, Ronny Huang, Andrew Rosenberg, Tara N. Sainath, Michael Picheny, Kyunghyun Cho
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[67] arXiv:2208.13285 [pdf, other]
Title: Computing with Hypervectors for Efficient Speaker Identification
Ping-Chen Huang, Denis Kleyko, Jan M. Rabaey, Bruno A. Olshausen, Pentti Kanerva
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[68] arXiv:2208.14017 [pdf, other]
Title: Gridless 3D Recovery of Image Sources from Room Impulse Responses
Tom Sprunck (IRMA, TONUS), Yannick Privat (IRMA, TONUS), Cédric Foy (UMRAE), Antoine Deleforge (MULTISPEECH)
Comments: IEEE Signal Processing Letters, 2022
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP); Classical Physics (physics.class-ph)
[69] arXiv:2208.14339 [pdf, other]
Title: HPPNet: Modeling the Harmonic Structure and Pitch Invariance in Piano Transcription
Weixing Wei, Peilin Li, Yi Yu, Wei Li
Comments: Accepted to ISMIR 2022
Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[70] arXiv:2208.14345 [pdf, other]
Title: MeloForm: Generating Melody with Musical Form based on Expert Systems and Neural Networks
Peiling Lu, Xu Tan, Botao Yu, Tao Qin, Sheng Zhao, Tie-Yan Liu
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[71] arXiv:2208.14355 [pdf, other]
Title: Towards robust music source separation on loud commercial music
Chang-Bin Jeon, Kyogu Lee
Comments: Accepted to ISMIR 2022
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[72] arXiv:2208.14717 [pdf, other]
Title: A Real-Time Tempo and Meter Tracking System for Rhythmic Improvisation
Filippo Carnovalini, Antonio Rodà
Journal-ref: In Audio Mostly (AM'19), September 18-20, 2019, Nottingham, UK. ACM, New York, NY, USA, 8 pages
Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[73] arXiv:2208.14734 [pdf, other]
Title: Open Challenges in Musical Metacreation
Filippo Carnovalini
Journal-ref: In EAI International Conference on Smart Objects and Technologies for Social Good (GoodTechs '19), September 25-27, 2019, Valencia, Spain. ACM, New York, NY, USA, 2 pages
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[74] arXiv:2208.14747 [pdf, other]
Title: A New Corpus for Computational Music Research and A Novel Method for Musical Structure Analysis
Filippo Carnovalini, Antonio Rodà, Nicholas Harley, Steven T. Homer, Geraint A. Wiggins
Journal-ref: In Audio Mostly 2021 (AM '21), September 1-3, 2021, virtual/Trento, Italy. ACM, New York, NY, USA, 4 pages
Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[75] arXiv:2208.14750 [pdf, other]
Title: Harmonization and Evaluation; Tweaking the Parameters on Human Listeners
Filippo Carnovalini, Alessandro Pelizzo, Antonio Rodà, Sergio Canazza
Comments: Accepted for publication in 9th International Conference on Kansei Engineering and Emotion Research 2022
Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)