-
Explaining YOLO: Leveraging Grad-CAM to Explain Object Detections
Authors:
Armin Kirchknopf,
Djordje Slijepcevic,
Ilkay Wunderlich,
Michael Breiter,
Johannes Traxler,
Matthias Zeppelzauer
Abstract:
We investigate the problem of explainability for visual object detectors. Specifically, we demonstrate on the example of the YOLO object detector how to integrate Grad-CAM into the model architecture and analyze the results. We show how to compute attribution-based explanations for individual detections and find that the normalization of the results has a great impact on their interpretation.
We investigate the problem of explainability for visual object detectors. Specifically, we demonstrate on the example of the YOLO object detector how to integrate Grad-CAM into the model architecture and analyze the results. We show how to compute attribution-based explanations for individual detections and find that the normalization of the results has a great impact on their interpretation.
△ Less
Submitted 22 November, 2022;
originally announced November 2022.
-
Automatic Sexism Detection with Multilingual Transformer Models
Authors:
Mina Schütz,
Jaqueline Boeck,
Daria Liakhovets,
Djordje Slijepčević,
Armin Kirchknopf,
Manuel Hecht,
Johannes Bogensperger,
Sven Schlarb,
Alexander Schindler,
Matthias Zeppelzauer
Abstract:
Sexism has become an increasingly major problem on social networks during the last years. The first shared task on sEXism Identification in Social neTworks (EXIST) at IberLEF 2021 is an international competition in the field of Natural Language Processing (NLP) with the aim to automatically identify sexism in social media content by applying machine learning methods. Thereby sexism detection is fo…
▽ More
Sexism has become an increasingly major problem on social networks during the last years. The first shared task on sEXism Identification in Social neTworks (EXIST) at IberLEF 2021 is an international competition in the field of Natural Language Processing (NLP) with the aim to automatically identify sexism in social media content by applying machine learning methods. Thereby sexism detection is formulated as a coarse (binary) classification problem and a fine-grained classification task that distinguishes multiple types of sexist content (e.g., dominance, stereotyping, and objectification). This paper presents the contribution of the AIT_FHSTP team at the EXIST2021 benchmark for both tasks. To solve the tasks we applied two multilingual transformer models, one based on multilingual BERT and one based on XLM-R. Our approach uses two different strategies to adapt the transformers to the detection of sexist content: first, unsupervised pre-training with additional data and second, supervised fine-tuning with additional and augmented data. For both tasks our best model is XLM-R with unsupervised pre-training on the EXIST data and additional datasets and fine-tuning on the provided dataset. The best run for the binary classification (task 1) achieves a macro F1-score of 0.7752 and scores 5th rank in the benchmark; for the multiclass classification (task 2) our best submission scores 6th rank with a macro F1-score of 0.5589.
△ Less
Submitted 8 February, 2022; v1 submitted 9 June, 2021;
originally announced June 2021.
-
Multimodal Detection of Information Disorder from Social Media
Authors:
Armin Kirchknopf,
Djordje Slijepcevic,
Matthias Zeppelzauer
Abstract:
Social media is accompanied by an increasing proportion of content that provides fake information or misleading content, known as information disorder. In this paper, we study the problem of multimodal fake news detection on a largescale multimodal dataset. We propose a multimodal network architecture that enables different levels and types of information fusion. In addition to the textual and vis…
▽ More
Social media is accompanied by an increasing proportion of content that provides fake information or misleading content, known as information disorder. In this paper, we study the problem of multimodal fake news detection on a largescale multimodal dataset. We propose a multimodal network architecture that enables different levels and types of information fusion. In addition to the textual and visual content of a posting, we further leverage secondary information, i.e. user comments and metadata. We fuse information at multiple levels to account for the specific intrinsic structure of the modalities. Our results show that multimodal analysis is highly effective for the task and all modalities contribute positively when fused properly.
△ Less
Submitted 31 May, 2021;
originally announced May 2021.