
Showing 1–6 of 6 results for author: Pasa, L

  1. arXiv:2502.14586  [pdf, other]

    cs.LG cs.CR

    Moshi Moshi? A Model Selection Hijacking Adversarial Attack

    Authors: Riccardo Petrucci, Luca Pajola, Francesco Marchiori, Luca Pasa, Mauro Conti

    Abstract: Model selection is a fundamental task in Machine Learning (ML), focusing on selecting the most suitable model from a pool of candidates by evaluating their performance on specific metrics. This process ensures optimal performance, computational efficiency, and adaptability to diverse tasks and environments. Despite its critical role, its security from the perspective of adversarial ML remains unex…

    Submitted 20 February, 2025; originally announced February 2025.

  2. arXiv:2401.14296  [pdf, other]

    cs.CR cs.LG cs.SI

    "All of Me": Mining Users' Attributes from their Public Spotify Playlists

    Authors: Pier Paolo Tricomi, Luca Pajola, Luca Pasa, Mauro Conti

    Abstract: In the age of digital music streaming, playlists on platforms like Spotify have become an integral part of individuals' musical experiences. People create and publicly share their own playlists to express their musical tastes, promote the discovery of their favorite artists, and foster social connections. These publicly accessible playlists transcend the boundaries of mere musical preferences: the…

    Submitted 25 January, 2024; originally announced January 2024.

  3. arXiv:2106.05809  [pdf, other]

    cs.LG

    Simple Graph Convolutional Networks

    Authors: Luca Pasa, Nicolò Navarin, Wolfgang Erb, Alessandro Sperduti

    Abstract: Many neural networks for graphs are based on the graph convolution operator, proposed more than a decade ago. Since then, many alternative definitions have been proposed that tend to add complexity (and non-linearity) to the model. In this paper, we follow the opposite direction by proposing simple graph convolution operators that can be implemented in single-layer graph convolutional networks.…

    Submitted 10 June, 2021; originally announced June 2021.

  4. arXiv:1912.02671  [pdf, other]

    eess.AS cs.LG eess.IV

    Audio-Visual Target Speaker Enhancement on Multi-Talker Environment using Event-Driven Cameras

    Authors: Ander Arriandiaga, Giovanni Morrone, Luca Pasa, Leonardo Badino, Chiara Bartolozzi

    Abstract: We propose a method to address audio-visual target speaker enhancement in multi-talker environments using event-driven cameras. State-of-the-art audio-visual speech separation methods show that the crucial information is the movement of the facial landmarks related to speech production. However, all approaches proposed so far work offline, using frame-based video input, making it difficult to process…

    Submitted 22 February, 2021; v1 submitted 5 December, 2019; originally announced December 2019.

    Comments: Accepted at ISCAS 2021

  5. arXiv:1904.08248  [pdf, ps, other]

    eess.AS cs.CL cs.SD stat.ML

    An Analysis of Speech Enhancement and Recognition Losses in Limited Resources Multi-talker Single Channel Audio-Visual ASR

    Authors: Luca Pasa, Giovanni Morrone, Leonardo Badino

    Abstract: In this paper, we analyzed how audio-visual speech enhancement can help to perform the ASR task in a cocktail party scenario. To this end, we considered two simple end-to-end LSTM-based models that perform single-channel audio-visual speech enhancement and phone recognition, respectively. Then, we studied how the two models interact, and how training them jointly affects the final result. We analyzed…

    Submitted 27 November, 2019; v1 submitted 16 April, 2019; originally announced April 2019.

  6. arXiv:1811.02480  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Face Landmark-based Speaker-Independent Audio-Visual Speech Enhancement in Multi-Talker Environments

    Authors: Giovanni Morrone, Luca Pasa, Vadim Tikhanoff, Sonia Bergamaschi, Luciano Fadiga, Leonardo Badino

    Abstract: In this paper, we address the problem of enhancing the speech of a speaker of interest in a cocktail party scenario when visual information of the speaker of interest is available. Contrary to most previous studies, we do not learn visual features on the typically small audio-visual datasets, but use an already available face landmark detector (trained on a separate image dataset). The landmarks a…

    Submitted 2 May, 2019; v1 submitted 6 November, 2018; originally announced November 2018.

    Comments: Proceedings of 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)