Showing 1–3 of 3 results for author: Schreiber, H

Searching in archive eess.
  1. arXiv:2408.06370 [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Lyrics Transcription for Humans: A Readability-Aware Benchmark

    Authors: Ondřej Cífka, Hendrik Schreiber, Luke Miner, Fabian-Robert Stöter

    Abstract: Writing down lyrics for human consumption involves not only accurately capturing word sequences, but also incorporating punctuation and formatting for clarity and to convey contextual information. This includes song structure, emotional emphasis, and contrast between lead and background vocals. While automatic lyrics transcription (ALT) systems have advanced beyond producing unstructured strings o…

    Submitted 30 July, 2024; originally announced August 2024.

    Comments: ISMIR 2024 camera-ready. 6 pages + references + supplementary material. Website: https://audioshake.github.io/jam-alt/; data: https://huggingface.co/datasets/audioshake/jam-alt; code: https://github.com/audioshake/alt-eval/. arXiv admin note: text overlap with arXiv:2311.13987

  2. arXiv:2311.13987 [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Jam-ALT: A Formatting-Aware Lyrics Transcription Benchmark

    Authors: Ondřej Cífka, Constantinos Dimitriou, Cheng-i Wang, Hendrik Schreiber, Luke Miner, Fabian-Robert Stöter

    Abstract: Current automatic lyrics transcription (ALT) benchmarks focus exclusively on word content and ignore the finer nuances of written lyrics including formatting and punctuation, which leads to a potential misalignment with the creative products of musicians and songwriters as well as listeners' experiences. For example, line breaks are important in conveying information about rhythm, emotional emphas…

    Submitted 23 November, 2023; originally announced November 2023.

    Comments: 6 pages (3 pages main content); website: https://audioshake.github.io/jam-alt/; data: https://huggingface.co/datasets/audioshake/jam-alt; code: https://github.com/audioshake/alt-eval/

  3. arXiv:1903.10839 [pdf, other]

    cs.SD cs.LG eess.AS

    Musical Tempo and Key Estimation using Convolutional Neural Networks with Directional Filters

    Authors: Hendrik Schreiber, Meinard Müller

    Abstract: In this article we explore how the different semantics of spectrograms' time and frequency axes can be exploited for musical tempo and key estimation using Convolutional Neural Networks (CNN). By addressing both tasks with the same network architectures ranging from shallow, domain-specific approaches to deep variants with directional filters, we show that axis-aligned architectures perform simila…

    Submitted 26 March, 2019; originally announced March 2019.

    Comments: Sound & Music Computing Conference (SMC), Málaga, Spain, May 2019