Showing 1–14 of 14 results for author: Foscarin, F

  1. arXiv:2503.11373

    cs.SD cs.LG eess.AS

    Exploring Performance-Complexity Trade-Offs in Sound Event Detection Models

    Authors: Tobias Morocutti, Florian Schmid, Jonathan Greif, Francesco Foscarin, Gerhard Widmer

    Abstract: We target the problem of developing new low-complexity networks for the sound event detection task. Our goal is to meticulously analyze the performance-complexity trade-off, aiming to be competitive with the large state-of-the-art models, at a fraction of the computational requirements. We find that low-complexity convolutional models previously proposed for audio tagging can be effectively adapte…

    Submitted 12 June, 2025; v1 submitted 14 March, 2025; originally announced March 2025.

    Comments: In Proceedings of the 33rd European Signal Processing Conference (EUSIPCO 2025), Palermo, Italy

  2. arXiv:2409.09546

    eess.AS cs.SD

    Effective Pre-Training of Audio Transformers for Sound Event Detection

    Authors: Florian Schmid, Tobias Morocutti, Francesco Foscarin, Jan Schlüter, Paul Primus, Gerhard Widmer

    Abstract: We propose a pre-training pipeline for audio spectrogram transformers for frame-level sound event detection tasks. On top of common pre-training steps, we add a meticulously designed training routine on AudioSet frame-level annotations. This includes a balanced sampler, aggressive data augmentation, and ensemble knowledge distillation. For five transformers, we obtain a substantial performance imp…

    Submitted 28 November, 2024; v1 submitted 14 September, 2024; originally announced September 2024.

    Comments: Submitted to ICASSP'25. Source code available: https://github.com/fschmid56/PretrainedSED

  3. arXiv:2407.21658

    cs.SD cs.LG eess.AS

    Beat this! Accurate beat tracking without DBN postprocessing

    Authors: Francesco Foscarin, Jan Schlüter, Gerhard Widmer

    Abstract: We propose a system for tracking beats and downbeats with two objectives: generality across a diverse music range, and high accuracy. We achieve generality by training on multiple datasets -- including solo instrument recordings, pieces with time signature changes, and classical music with high tempo variations -- and by removing the commonly used Dynamic Bayesian Network (DBN) postprocessing, whi…

    Submitted 31 July, 2024; originally announced July 2024.

    Comments: Accepted at the 25th International Society for Music Information Retrieval Conference (ISMIR), 2024

  4. arXiv:2407.21030

    eess.AS cs.AI cs.LG

    Cluster and Separate: a GNN Approach to Voice and Staff Prediction for Score Engraving

    Authors: Francesco Foscarin, Emmanouil Karystinaios, Eita Nakamura, Gerhard Widmer

    Abstract: This paper approaches the problem of separating the notes from a quantized symbolic music piece (e.g., a MIDI file) into multiple voices and staves. This is a fundamental part of the larger task of music score engraving (or score typesetting), which aims to produce readable musical scores for human performers. We focus on piano music and support homophonic voices, i.e., voices that can contain cho…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Accepted at the 25th International Society for Music Information Retrieval Conference (ISMIR), 2024

  5. arXiv:2405.09241

    cs.SD eess.AS

    SMUG-Explain: A Framework for Symbolic Music Graph Explanations

    Authors: Emmanouil Karystinaios, Francesco Foscarin, Gerhard Widmer

    Abstract: In this work, we present Score MUsic Graph (SMUG)-Explain, a framework for generating and visualizing explanations of graph neural networks applied to arbitrary prediction tasks on musical scores. Our system allows the user to visualize the contribution of input notes (and note features) to the network output, directly in the context of the musical score. We provide an interactive interface based…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: In Proceedings of the Sound and Music Computing Conference 2024 (SMC2024), Porto, Portugal

  6. arXiv:2405.09224

    cs.SD cs.AI cs.LG eess.AS

    Perception-Inspired Graph Convolution for Music Understanding Tasks

    Authors: Emmanouil Karystinaios, Francesco Foscarin, Gerhard Widmer

    Abstract: We propose a new graph convolutional block, called MusGConv, specifically designed for the efficient processing of musical score data and motivated by general perceptual principles. It focuses on two fundamental dimensions of music, pitch and rhythm, and considers both relative and absolute representations of these components. We evaluate our approach on four different musical understanding proble…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: Accepted at the 33rd International Joint Conference on Artificial Intelligence (IJCAI-24)

  7. arXiv:2310.14952

    cs.SD eess.AS

    8+8=4: Formalizing Time Units to Handle Symbolic Music Durations

    Authors: Emmanouil Karystinaios, Francesco Foscarin, Florent Jacquemard, Masahiko Sakai, Satoshi Tojo, Gerhard Widmer

    Abstract: This paper focuses on the nominal durations of musical events (notes and rests) in a symbolic musical score, and on how to conveniently handle these in computer applications. We propose the usage of a temporal unit that is directly related to the graphical symbols in musical scores and pair this with a set of operations that cover typical computations in music applications. We formalize this time…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: In Proceedings of the International Symposium on Computer Music Multidisciplinary Research (CMMR 2023), Tokyo, Japan

  8. arXiv:2306.16955

    cs.SD cs.CL eess.AS

    Predicting Music Hierarchies with a Graph-Based Neural Decoder

    Authors: Francesco Foscarin, Daniel Harasim, Gerhard Widmer

    Abstract: This paper describes a data-driven framework to parse musical sequences into dependency trees, which are hierarchical structures used in music cognition research and music analysis. The parsing involves two steps. First, the input sequence is passed through a transformer encoder to enrich it with contextual information. Then, a classifier filters the graph of all possible dependency arcs to produc…

    Submitted 29 June, 2023; originally announced June 2023.

    Comments: To be published in the Proceedings of the International Society for Music Information Retrieval Conference (ISMIR)

  9. arXiv:2304.14848

    cs.SD cs.AI cs.LG eess.AS

    Musical Voice Separation as Link Prediction: Modeling a Musical Perception Task as a Multi-Trajectory Tracking Problem

    Authors: Emmanouil Karystinaios, Francesco Foscarin, Gerhard Widmer

    Abstract: This paper targets the perceptual task of separating the different interacting voices, i.e., monophonic melodic streams, in a polyphonic musical piece. We target symbolic music, where notes are explicitly encoded, and model this task as a Multi-Trajectory Tracking (MTT) problem from discrete observations, i.e., notes in a pitch-time space. Our approach builds a graph from a musical piece, by creat…

    Submitted 28 April, 2023; originally announced April 2023.

    Comments: Accepted at the 32nd International Joint Conference on Artificial Intelligence (IJCAI-23)

  10. arXiv:2304.12939

    cs.SD cs.HC eess.AS

    The ACCompanion: Combining Reactivity, Robustness, and Musical Expressivity in an Automatic Piano Accompanist

    Authors: Carlos Cancino-Chacón, Silvan Peter, Patricia Hu, Emmanouil Karystinaios, Florian Henkel, Francesco Foscarin, Nimrod Varga, Gerhard Widmer

    Abstract: This paper introduces the ACCompanion, an expressive accompaniment system. Similarly to a musician who accompanies a soloist playing a given musical piece, our system can produce a human-like rendition of the accompaniment part that follows the soloist's choices in terms of tempo, dynamics, and articulation. The ACCompanion works in the symbolic domain, i.e., it needs a musical instrument capable…

    Submitted 30 May, 2023; v1 submitted 24 April, 2023; originally announced April 2023.

    Comments: In Proceedings of the 32nd International Joint Conference on Artificial Intelligence (IJCAI-23), Macao, China. The differences/extensions with the previous version include a technical appendix, added missing links, and minor text updates. 10 pages, 4 figures

  11. arXiv:2208.12485

    cs.SD cs.AI eess.AS

    Concept-Based Techniques for "Musicologist-friendly" Explanations in a Deep Music Classifier

    Authors: Francesco Foscarin, Katharina Hoedt, Verena Praher, Arthur Flexer, Gerhard Widmer

    Abstract: Current approaches for explaining deep learning systems applied to musical data provide results in a low-level feature space, e.g., by highlighting potentially relevant time-frequency bins in a spectrogram or time-pitch bins in a piano roll. This can be difficult to understand, particularly for musicologists without technical knowledge. To address this issue, we focus on more human-friendly explan…

    Submitted 29 August, 2022; v1 submitted 26 August, 2022; originally announced August 2022.

    Comments: In Proceedings of the 23rd International Society for Music Information Retrieval Conference (ISMIR 2022), Bengaluru, India

  12. arXiv:2206.01104

    cs.SD cs.DL eess.AS

    The match file format: Encoding Alignments between Scores and Performances

    Authors: Francesco Foscarin, Emmanouil Karystinaios, Silvan David Peter, Carlos Cancino-Chacón, Maarten Grachten, Gerhard Widmer

    Abstract: This paper presents the specifications of match: a file format that extends a MIDI human performance with note-, beat-, and downbeat-level alignments to a corresponding musical score. This enables advanced analyses of the performance that are relevant for various tasks, such as expressive performance modeling, score following, music transcription, and performer classification. The match file inclu…

    Submitted 2 June, 2022; originally announced June 2022.

    Journal ref: Proceedings of the Music Encoding Conference (MEC), 2022, Halifax, Canada

  13. arXiv:2206.01071

    cs.SD cs.DL eess.AS

    Partitura: A Python Package for Symbolic Music Processing

    Authors: Carlos Cancino-Chacón, Silvan David Peter, Emmanouil Karystinaios, Francesco Foscarin, Maarten Grachten, Gerhard Widmer

    Abstract: Partitura is a lightweight Python package for handling symbolic musical information. It provides easy access to features commonly used in music information retrieval tasks, like note arrays (lists of timed pitched events) and 2D piano roll matrices, as well as other score elements such as time and key signatures, performance directives, and repeat structures. Partitura can load musical scores (in…

    Submitted 2 June, 2022; originally announced June 2022.

    Journal ref: Proceedings of the Music Encoding Conference (MEC), 2022, Halifax, Canada

  14. arXiv:2107.14009

    cs.SD eess.AS

    PKSpell: Data-Driven Pitch Spelling and Key Signature Estimation

    Authors: Francesco Foscarin, Nicolas Audebert, Raphaël Fournier-S'Niehotta

    Abstract: We present PKSpell: a data-driven approach for the joint estimation of pitch spelling and key signatures from MIDI files. Both elements are fundamental for the production of a full-fledged musical score and facilitate many MIR tasks such as harmonic analysis, section identification, melodic similarity, and search in a digital music library. We design a deep recurrent neural network model that only…

    Submitted 27 July, 2021; originally announced July 2021.

    Comments: International Society for Music Information Retrieval Conference (ISMIR), Nov 2021, Online, India