Showing 1–21 of 21 results for author: Attanasio, G

  1. arXiv:2506.06275  [pdf, ps, other]

    cs.CV cs.CL cs.LG

    Movie Facts and Fibs (MF$^2$): A Benchmark for Long Movie Understanding

    Authors: Emmanouil Zaranis, António Farinhas, Saul Santos, Beatriz Canaverde, Miguel Moura Ramos, Aditya K Surikuchi, André Viveiros, Baohao Liao, Elena Bueno-Benito, Nithin Sivakumaran, Pavlo Vasylenko, Shoubin Yu, Sonal Sannigrahi, Wafaa Mohammed, Ben Peters, Danae Sánchez Villegas, Elias Stengel-Eskin, Giuseppe Attanasio, Jaehong Yoon, Stella Frank, Alessandro Suglia, Chrysoula Zerva, Desmond Elliott, Mariella Dimiccoli, Mohit Bansal, et al. (6 additional authors not shown)

    Abstract: Despite recent progress in vision-language models (VLMs), holistic understanding of long-form video content remains a significant challenge, partly due to limitations in current benchmarks. Many focus on peripheral, "needle-in-a-haystack" details, encouraging context-insensitive retrieval over deep comprehension. Others rely on large-scale, semi-automatically generated questions (often produced…

    Submitted 6 June, 2025; originally announced June 2025.

    Comments: Under Review

  2. arXiv:2506.02172  [pdf, ps, other]

    cs.CL

    Different Speech Translation Models Encode and Translate Speaker Gender Differently

    Authors: Dennis Fucci, Marco Gaido, Matteo Negri, Luisa Bentivogli, Andre Martins, Giuseppe Attanasio

    Abstract: Recent studies on interpreting the hidden states of speech models have shown their ability to capture speaker-specific features, including gender. Does this finding also hold for speech translation (ST) models? If so, what are the implications for the speaker's gender assignment in translation? We address these questions from an interpretability perspective, using probing methods to assess gender…

    Submitted 2 June, 2025; originally announced June 2025.

    Comments: Accepted at ACL 2025

  3. arXiv:2505.03926  [pdf, ps, other]

    physics.optics

    Broadband acousto-optic modulators on Silicon Nitride

    Authors: Scott E. Kenning, Tzu-Han Chang, Alaina G. Attanasio, Warren Jin, Avi Feshali, Yu Tian, Mario Paniccia, Sunil A. Bhave

    Abstract: Stress-optic modulators are emerging as a necessary building block of photonic integrated circuits tasked with controlling and manipulating classical and quantum optical systems. While photonic platforms such as lithium niobate and silicon on insulator have well developed modulator ecosystems, silicon nitride so far does not. As silicon nitride has favorable optical properties, such as ultra-low-l…

    Submitted 6 May, 2025; originally announced May 2025.

    Comments: 11 pages, 5 figures

  4. arXiv:2501.10057  [pdf, other]

    cs.CL

    MSTS: A Multimodal Safety Test Suite for Vision-Language Models

    Authors: Paul Röttger, Giuseppe Attanasio, Felix Friedrich, Janis Goldzycher, Alicia Parrish, Rishabh Bhardwaj, Chiara Di Bonaventura, Roman Eng, Gaia El Khoury Geagea, Sujata Goswami, Jieun Han, Dirk Hovy, Seogyeong Jeong, Paloma Jeretič, Flor Miriam Plaza-del-Arco, Donya Rooein, Patrick Schramowski, Anastassia Shaitarova, Xudong Shen, Richard Willats, Andrea Zugarini, Bertie Vidgen

    Abstract: Vision-language models (VLMs), which process image and text inputs, are increasingly integrated into chat assistants and other consumer AI applications. Without proper safeguards, however, VLMs may give harmful advice (e.g. how to self-harm) or encourage unsafe behaviours (e.g. to consume drugs). Despite these clear hazards, little work so far has evaluated VLM safety and the novel risks created b…

    Submitted 17 January, 2025; originally announced January 2025.

    Comments: under review

  5. arXiv:2410.10995  [pdf, ps, other]

    cs.CL

    Watching the Watchers: Exposing Gender Disparities in Machine Translation Quality Estimation

    Authors: Emmanouil Zaranis, Giuseppe Attanasio, Sweta Agrawal, André F. T. Martins

    Abstract: Quality estimation (QE), the automatic assessment of translation quality, has recently become crucial across several stages of the translation pipeline, from data curation to training and decoding. While QE metrics have been optimized to align with human judgments, whether they encode social biases has been largely overlooked. Biased QE risks favoring certain demographic groups over others, e.g., by…

    Submitted 2 June, 2025; v1 submitted 14 October, 2024; originally announced October 2024.

    Comments: ACL 2025

  6. arXiv:2406.06131  [pdf, other]

    cs.CL

    Building Bridges: A Dataset for Evaluating Gender-Fair Machine Translation into German

    Authors: Manuel Lardelli, Giuseppe Attanasio, Anne Lauscher

    Abstract: The translation of gender-neutral person-referring terms (e.g., the students) is often non-trivial. Translating from English into German poses an interesting case: in German, person-referring nouns are usually gender-specific, and if the gender of the referent(s) is unknown or diverse, the generic masculine (die Studenten (m.)) is commonly used. This solution, however, reduces the visibility of…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Accepted to Findings of ACL 2024. Code and data at https://github.com/g8a9/building-bridges-gender-fair-german-mt

  7. arXiv:2403.04445  [pdf, other]

    cs.CL

    Classist Tools: Social Class Correlates with Performance in NLP

    Authors: Amanda Cercas Curry, Giuseppe Attanasio, Zeerak Talat, Dirk Hovy

    Abstract: Since the foundational work of William Labov on the social stratification of language (Labov, 1964), linguistics has made concentrated efforts to explore the links between sociodemographic characteristics and language production and perception. But while there is strong evidence for socio-demographic characteristics in language, they are infrequently used in Natural Language Processing (NLP). Age…

    Submitted 7 March, 2024; originally announced March 2024.

  8. arXiv:2402.17954  [pdf, other]

    cs.CL

    Twists, Humps, and Pebbles: Multilingual Speech Recognition Models Exhibit Gender Performance Gaps

    Authors: Giuseppe Attanasio, Beatrice Savoldi, Dennis Fucci, Dirk Hovy

    Abstract: Current automatic speech recognition (ASR) models are designed to be used across many languages and tasks without substantial changes. However, this broad language coverage hides performance gaps within languages, for example, across genders. Our study systematically evaluates the performance of two widely used multilingual ASR models on three datasets, encompassing 19 languages from eight languag…

    Submitted 3 October, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: Accepted at EMNLP 2024. Code and artifacts at https://github.com/g8a9/multilingual-asr-gender-gap

  9. arXiv:2310.12127  [pdf, other]

    cs.CL cs.LG

    A Tale of Pronouns: Interpretability Informs Gender Bias Mitigation for Fairer Instruction-Tuned Machine Translation

    Authors: Giuseppe Attanasio, Flor Miriam Plaza-del-Arco, Debora Nozza, Anne Lauscher

    Abstract: Recent instruction fine-tuned models can solve multiple NLP tasks when prompted to do so, with machine translation (MT) being a prominent use case. However, current research often focuses on standard performance benchmarks, leaving compelling fairness and ethical considerations behind. In MT, this might lead to misgendered translations, resulting, among other harms, in the perpetuation of stereoty…

    Submitted 25 October, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: Accepted at EMNLP 2023. Code and data at https://github.com/MilaNLProc/interpretability-mt-gender-bias

  10. arXiv:2309.07875  [pdf, other]

    cs.CL

    Safety-Tuned LLaMAs: Lessons From Improving the Safety of Large Language Models that Follow Instructions

    Authors: Federico Bianchi, Mirac Suzgun, Giuseppe Attanasio, Paul Röttger, Dan Jurafsky, Tatsunori Hashimoto, James Zou

    Abstract: Training large language models to follow instructions makes them perform better on a wide range of tasks and generally become more helpful. However, a perfectly helpful model will follow even the most malicious instructions and readily generate harmful content. In this paper, we raise concerns over the safety of models that only emphasize helpfulness, not harmlessness, in their instruction-tuning.…

    Submitted 19 March, 2024; v1 submitted 14 September, 2023; originally announced September 2023.

  11. arXiv:2309.07733  [pdf, other]

    cs.CL cs.SD eess.AS

    Explaining Speech Classification Models via Word-Level Audio Segments and Paralinguistic Features

    Authors: Eliana Pastor, Alkis Koudounas, Giuseppe Attanasio, Dirk Hovy, Elena Baralis

    Abstract: Recent advances in eXplainable AI (XAI) have provided new insights into how models for vision, language, and tabular data operate. However, few approaches exist for understanding speech models. Existing work focuses on a few spoken language understanding (SLU) tasks, and explanations are difficult to interpret for most users. We introduce a new approach to explain speech classification models. We…

    Submitted 14 September, 2023; originally announced September 2023.

    Comments: 8 pages

  12. arXiv:2309.02311  [pdf, other]

    cs.CL

    Weigh Your Own Words: Improving Hate Speech Counter Narrative Generation via Attention Regularization

    Authors: Helena Bonaldi, Giuseppe Attanasio, Debora Nozza, Marco Guerini

    Abstract: Recent computational approaches for combating online hate speech involve the automatic generation of counter narratives by adapting Pretrained Transformer-based Language Models (PLMs) with human-curated data. This process, however, can produce in-domain overfitting, resulting in models generating acceptable narratives only for hatred similar to training data, with little portability to other targe…

    Submitted 5 September, 2023; originally announced September 2023.

    Comments: To appear at CS4OA workshop (INLG-SIGDial)

  13. arXiv:2308.01263  [pdf, other]

    cs.CL cs.AI

    XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models

    Authors: Paul Röttger, Hannah Rose Kirk, Bertie Vidgen, Giuseppe Attanasio, Federico Bianchi, Dirk Hovy

    Abstract: Without proper safeguards, large language models will readily follow malicious instructions and generate toxic content. This risk motivates safety efforts such as red-teaming and large-scale feedback learning, which aim to make models both helpful and harmless. However, there is a tension between these two objectives, since harmlessness requires models to refuse to comply with unsafe prompts, and…

    Submitted 1 April, 2024; v1 submitted 2 August, 2023; originally announced August 2023.

    Comments: Accepted at NAACL 2024 (Main Conference)

  14. ITALIC: An Italian Intent Classification Dataset

    Authors: Alkis Koudounas, Moreno La Quatra, Lorenzo Vaiani, Luca Colomba, Giuseppe Attanasio, Eliana Pastor, Luca Cagliero, Elena Baralis

    Abstract: Recent large-scale Spoken Language Understanding datasets focus predominantly on English and do not account for language-specific phenomena such as particular phonemes or words in different lects. We introduce ITALIC, the first large-scale speech dataset designed for intent classification in Italian. The dataset comprises 16,521 crowdsourced audio samples recorded by 70 speakers from various Itali…

    Submitted 14 June, 2023; originally announced June 2023.

    Comments: Accepted at INTERSPEECH 2023. Data and code at https://github.com/RiTA-nlp/ITALIC

  15. arXiv:2304.10621  [pdf, other]

    cs.IR

    E Pluribus Unum: Guidelines on Multi-Objective Evaluation of Recommender Systems

    Authors: Patrick John Chia, Giuseppe Attanasio, Jacopo Tagliabue, Federico Bianchi, Ciro Greco, Gabriel de Souza P. Moreira, Davide Eynard, Fahd Husain

    Abstract: Recommender Systems today are still mostly evaluated in terms of accuracy, with other aspects beyond the immediate relevance of recommendations, such as diversity, long-term user retention and fairness, often taking a back seat. Moreover, reconciling multiple performance perspectives is by definition indeterminate, presenting a stumbling block to those in the pursuit of rounded evaluation of Recom…

    Submitted 20 April, 2023; originally announced April 2023.

    Comments: 15 pages, under submission

  16. arXiv:2210.07365  [pdf, other]

    cs.CL

    Is It Worth the (Environmental) Cost? Limited Evidence for Temporal Adaptation via Continuous Training

    Authors: Giuseppe Attanasio, Debora Nozza, Federico Bianchi, Dirk Hovy

    Abstract: Language is constantly changing and evolving, leaving language models to become quickly outdated. Consequently, we should continuously update our models with new data to expose them to new events and facts. However, that requires additional computing, which means new carbon emissions. Do any measurable benefits justify this cost? This paper looks for empirical evidence to support continuous traini…

    Submitted 4 May, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: 8 pages

  17. ferret: a Framework for Benchmarking Explainers on Transformers

    Authors: Giuseppe Attanasio, Eliana Pastor, Chiara Di Bonaventura, Debora Nozza

    Abstract: As Transformers are increasingly relied upon to solve complex NLP problems, there is an increased need for their decisions to be humanly interpretable. While several explainable AI (XAI) techniques for interpreting the outputs of transformer-based models have been proposed, there is still a lack of easy access to using and comparing them. We introduce ferret, a Python library to simplify the use a…

    Submitted 2 March, 2023; v1 submitted 2 August, 2022; originally announced August 2022.

    Comments: 11 pages, 3 figures. Accepted to EACL 2023 (System Demonstration). More details at https://github.com/g8a9/ferret

  18. arXiv:2207.05772  [pdf, ps, other]

    cs.IR

    EvalRS: a Rounded Evaluation of Recommender Systems

    Authors: Jacopo Tagliabue, Federico Bianchi, Tobias Schnabel, Giuseppe Attanasio, Ciro Greco, Gabriel de Souza P. Moreira, Patrick John Chia

    Abstract: Much of the complexity of Recommender Systems (RSs) comes from the fact that they are used as part of more complex applications and affect user experience through a varied range of user interfaces. However, research has focused almost exclusively on the ability of RSs to produce accurate item rankings while giving little attention to the evaluation of RS behavior in real-world scenarios. Such narrow f…

    Submitted 12 August, 2022; v1 submitted 12 July, 2022; originally announced July 2022.

    Comments: CIKM 2022 Data Challenge Paper

  19. arXiv:2204.03972  [pdf, other]

    cs.IR cs.CL

    Contrastive language and vision learning of general fashion concepts

    Authors: Patrick John Chia, Giuseppe Attanasio, Federico Bianchi, Silvia Terragni, Ana Rita Magalhães, Diogo Goncalves, Ciro Greco, Jacopo Tagliabue

    Abstract: The steady rise of online shopping goes hand in hand with the development of increasingly complex ML and NLP models. While most use cases are cast as specialized supervised learning problems, we argue that practitioners would greatly benefit from more transferable representations of products. In this work, we build on recent developments in contrastive learning to train FashionCLIP, a CLIP-like mo…

    Submitted 18 April, 2023; v1 submitted 8 April, 2022; originally announced April 2022.

    Comments: Latest version available at https://www.nature.com/articles/s41598-022-23052-9; model available at https://huggingface.co/patrickjohncyh/fashion-clip

  20. arXiv:2203.09192  [pdf, other]

    cs.CL

    Entropy-based Attention Regularization Frees Unintended Bias Mitigation from Lists

    Authors: Giuseppe Attanasio, Debora Nozza, Dirk Hovy, Elena Baralis

    Abstract: Natural Language Processing (NLP) models risk overfitting to specific terms in the training data, thereby reducing their performance, fairness, and generalizability. For example, neural hate speech detection models are strongly influenced by identity terms like gay or women, resulting in false positives, severe unintended bias, and lower performance. Most mitigation techniques use lists of identity term…

    Submitted 17 March, 2022; originally announced March 2022.

    Comments: Accepted to Findings of ACL 2022

  21. arXiv:2108.08688  [pdf, other]

    cs.CL cs.CV

    Contrastive Language-Image Pre-training for the Italian Language

    Authors: Federico Bianchi, Giuseppe Attanasio, Raphael Pisoni, Silvia Terragni, Gabriele Sarti, Sri Lakshmi

    Abstract: CLIP (Contrastive Language-Image Pre-training) is a very recent multi-modal model that jointly learns representations of images and texts. The model is trained on a massive amount of English data and shows impressive performance on zero-shot classification tasks. Training the same model on a different language is not trivial, since data in other languages might not be enough and the model needs hi…

    Submitted 19 August, 2021; originally announced August 2021.