Search | arXiv e-print repository

doi 10.1145/3726302.3730255

The Effects of Demographic Instructions on LLM Personas

Authors: Angel Felipe Magnossão de Paula, J. Shane Culpepper, Alistair Moffat, Sachin Pathiyan Cherumanal, Falk Scholer, Johanne Trippas

Abstract: Social media platforms must filter sexist content in compliance with governmental regulations. Current machine learning approaches can reliably detect sexism based on standardized definitions, but often neglect the subjective nature of sexist language and fail to consider individual users' perspectives. To address this gap, we adopt a perspectivist approach, retaining diverse annotations rather th… ▽ More Social media platforms must filter sexist content in compliance with governmental regulations. Current machine learning approaches can reliably detect sexism based on standardized definitions, but often neglect the subjective nature of sexist language and fail to consider individual users' perspectives. To address this gap, we adopt a perspectivist approach, retaining diverse annotations rather than enforcing gold-standard labels or their aggregations, allowing models to account for personal or group-specific views of sexism. Using demographic data from Twitter, we employ large language models (LLMs) to personalize the identification of sexism. △ Less

Submitted 16 May, 2025; originally announced May 2025.

Comments: Accepted at SIGIR'25, Padua, Italy

arXiv:2401.07216 [pdf, other]

doi 10.1145/3627508.3638309

Walert: Putting Conversational Search Knowledge into Action by Building and Evaluating a Large Language Model-Powered Chatbot

Authors: Sachin Pathiyan Cherumanal, Lin Tian, Futoon M. Abushaqra, Angel Felipe Magnossao de Paula, Kaixin Ji, Danula Hettiachchi, Johanne R. Trippas, Halil Ali, Falk Scholer, Damiano Spina

Abstract: Creating and deploying customized applications is crucial for operational success and enriching user experiences in the rapidly evolving modern business world. A prominent facet of modern user experiences is the integration of chatbots or voice assistants. The rapid evolution of Large Language Models (LLMs) has provided a powerful tool to build conversational applications. We present Walert, a cus… ▽ More Creating and deploying customized applications is crucial for operational success and enriching user experiences in the rapidly evolving modern business world. A prominent facet of modern user experiences is the integration of chatbots or voice assistants. The rapid evolution of Large Language Models (LLMs) has provided a powerful tool to build conversational applications. We present Walert, a customized LLM-based conversational agent able to answer frequently asked questions about computer science degrees and programs at RMIT University. Our demo aims to showcase how conversational information-seeking researchers can effectively communicate the benefits of using best practices to stakeholders interested in developing and deploying LLM-based chatbots. These practices are well-known in our community but often overlooked by practitioners who may not have access to this knowledge. The methodology and resources used in this demo serve as a bridge to facilitate knowledge transfer from experts, address industry professionals' practical needs, and foster a collaborative environment. The data and code of the demo are available at https://github.com/rmit-ir/walert. △ Less

Submitted 14 January, 2024; originally announced January 2024.

Comments: Accepted at 2024 ACM SIGIR CHIIR

arXiv:2307.03385 [pdf, other]

AI-UPV at EXIST 2023 -- Sexism Characterization Using Large Language Models Under The Learning with Disagreements Regime

Authors: Angel Felipe Magnossão de Paula, Giulia Rizzi, Elisabetta Fersini, Damiano Spina

Abstract: With the increasing influence of social media platforms, it has become crucial to develop automated systems capable of detecting instances of sexism and other disrespectful and hateful behaviors to promote a more inclusive and respectful online environment. Nevertheless, these tasks are considerably challenging considering different hate categories and the author's intentions, especially under the… ▽ More With the increasing influence of social media platforms, it has become crucial to develop automated systems capable of detecting instances of sexism and other disrespectful and hateful behaviors to promote a more inclusive and respectful online environment. Nevertheless, these tasks are considerably challenging considering different hate categories and the author's intentions, especially under the learning with disagreements regime. This paper describes AI-UPV team's participation in the EXIST (sEXism Identification in Social neTworks) Lab at CLEF 2023. The proposed approach aims at addressing the task of sexism identification and characterization under the learning with disagreements paradigm by training directly from the data with disagreements, without using any aggregated label. Yet, performances considering both soft and hard evaluations are reported. The proposed system uses large language models (i.e., mBERT and XLM-RoBERTa) and ensemble strategies for sexism identification and classification in English and Spanish. In particular, our system is articulated in three different pipelines. The ensemble approach outperformed the individual large language models obtaining the best performances both adopting a soft and a hard label evaluation. This work describes the participation in all the three EXIST tasks, considering a soft evaluation, it obtained fourth place in Task 2 at EXIST and first place in Task 3, with the highest ICM-Soft of -2.32 and a normalized ICM-Soft of 0.79. The source code of our approaches is publicly available at https://github.com/AngelFelipeMP/Sexism-LLM-Learning-With-Disagreement. △ Less

Submitted 7 July, 2023; originally announced July 2023.

Comments: 15 pages, 9 tables, 1 figures, conference

arXiv:2307.03377 [pdf, ps, other]

Mitigating Negative Transfer with Task Awareness for Sexism, Hate Speech, and Toxic Language Detection

Authors: Angel Felipe Magnossão de Paula, Paolo Rosso, Damiano Spina

Abstract: This paper proposes a novelty approach to mitigate the negative transfer problem. In the field of machine learning, the common strategy is to apply the Single-Task Learning approach in order to train a supervised model to solve a specific task. Training a robust model requires a lot of data and a significant amount of computational resources, making this solution unfeasible in cases where data are… ▽ More This paper proposes a novelty approach to mitigate the negative transfer problem. In the field of machine learning, the common strategy is to apply the Single-Task Learning approach in order to train a supervised model to solve a specific task. Training a robust model requires a lot of data and a significant amount of computational resources, making this solution unfeasible in cases where data are unavailable or expensive to gather. Therefore another solution, based on the sharing of information between tasks, has been developed: Multi-Task Learning (MTL). Despite the recent developments regarding MTL, the problem of negative transfer has still to be solved. Negative transfer is a phenomenon that occurs when noisy information is shared between tasks, resulting in a drop in performance. This paper proposes a new approach to mitigate the negative transfer problem based on the task awareness concept. The proposed approach results in diminishing the negative transfer together with an improvement of performance over classic MTL solution. Moreover, the proposed approach has been implemented in two unified architectures to detect Sexism, Hate Speech, and Toxic Language in text comments. The proposed architectures set a new state-of-the-art both in EXIST-2021 and HatEval-2019 benchmarks. △ Less

Submitted 7 July, 2023; originally announced July 2023.

Comments: 8 pages, 2 figures, 5 tables, IJCNN 2023 conference

arXiv:2303.09823 [pdf, other]

Transformers and Ensemble methods: A solution for Hate Speech Detection in Arabic languages

Authors: Angel Felipe Magnossão de Paula, Imene Bensalem, Paolo Rosso, Wajdi Zaghouani

Abstract: This paper describes our participation in the shared task of hate speech detection, which is one of the subtasks of the CERIST NLP Challenge 2022. Our experiments evaluate the performance of six transformer models and their combination using 2 ensemble approaches. The best results on the training set, in a five-fold cross validation scenario, were obtained by using the ensemble approach based on t… ▽ More This paper describes our participation in the shared task of hate speech detection, which is one of the subtasks of the CERIST NLP Challenge 2022. Our experiments evaluate the performance of six transformer models and their combination using 2 ensemble approaches. The best results on the training set, in a five-fold cross validation scenario, were obtained by using the ensemble approach based on the majority vote. The evaluation of this approach on the test set resulted in an F1-score of 0.60 and an Accuracy of 0.86. △ Less

Submitted 17 March, 2023; originally announced March 2023.

Comments: 7 pages, 3 tables

arXiv:2112.06080 [pdf, other]

UPV at TREC Health Misinformation Track 2021 Ranking with SBERT and Quality Estimators

Authors: Ipek Baris Schlicht, Angel Felipe Magnossão de Paula, Paolo Rosso

Abstract: Health misinformation on search engines is a significant problem that could negatively affect individuals or public health. To mitigate the problem, TREC organizes a health misinformation track. This paper presents our submissions to this track. We use a BM25 and a domain-specific semantic search engine for retrieving initial documents. Later, we examine a health news schema for quality assessment… ▽ More Health misinformation on search engines is a significant problem that could negatively affect individuals or public health. To mitigate the problem, TREC organizes a health misinformation track. This paper presents our submissions to this track. We use a BM25 and a domain-specific semantic search engine for retrieving initial documents. Later, we examine a health news schema for quality assessment and apply it to re-rank documents. We merge the scores from the different components by using reciprocal rank fusion. Finally, we discuss the results and conclude with future works. △ Less

Submitted 11 December, 2021; originally announced December 2021.

Comments: 6 pages; presented at the TREC 2021

arXiv:2111.04551 [pdf, other]

Sexism Prediction in Spanish and English Tweets Using Monolingual and Multilingual BERT and Ensemble Models

Authors: Angel Felipe Magnossão de Paula, Roberto Fray da Silva, Ipek Baris Schlicht

Abstract: The popularity of social media has created problems such as hate speech and sexism. The identification and classification of sexism in social media are very relevant tasks, as they would allow building a healthier social environment. Nevertheless, these tasks are considerably challenging. This work proposes a system to use multilingual and monolingual BERT and data points translation and ensemble… ▽ More The popularity of social media has created problems such as hate speech and sexism. The identification and classification of sexism in social media are very relevant tasks, as they would allow building a healthier social environment. Nevertheless, these tasks are considerably challenging. This work proposes a system to use multilingual and monolingual BERT and data points translation and ensemble strategies for sexism identification and classification in English and Spanish. It was conducted in the context of the sEXism Identification in Social neTworks shared 2021 (EXIST 2021) task, proposed by the Iberian Languages Evaluation Forum (IberLEF). The proposed system and its main components are described, and an in-depth hyperparameters analysis is conducted. The main results observed were: (i) the system obtained better results than the baseline model (multilingual BERT); (ii) ensemble models obtained better results than monolingual models; and (iii) an ensemble model considering all individual models and the best standardized values obtained the best accuracies and F1-scores for both tasks. This work obtained first place in both tasks at EXIST, with the highest accuracies (0.780 for task 1 and 0.658 for task 2) and F1-scores (F1-binary of 0.780 for task 1 and F1-macro of 0.579 for task 2). △ Less

Submitted 8 November, 2021; originally announced November 2021.

Comments: 18 pages, presented at IberLEF: http://ceur-ws.org/Vol-2943/exist_paper2.pdf, the best scoring system at EXIST

arXiv:2111.04530 [pdf, other]

AI-UPV at IberLEF-2021 DETOXIS task: Toxicity Detection in Immigration-Related Web News Comments Using Transformers and Statistical Models

Authors: Angel Felipe Magnossão de Paula, Ipek Baris Schlicht

Abstract: This paper describes our participation in the DEtection of TOXicity in comments In Spanish (DETOXIS) shared task 2021 at the 3rd Workshop on Iberian Languages Evaluation Forum. The shared task is divided into two related classification tasks: (i) Task 1: toxicity detection and; (ii) Task 2: toxicity level detection. They focus on the xenophobic problem exacerbated by the spread of toxic comments p… ▽ More This paper describes our participation in the DEtection of TOXicity in comments In Spanish (DETOXIS) shared task 2021 at the 3rd Workshop on Iberian Languages Evaluation Forum. The shared task is divided into two related classification tasks: (i) Task 1: toxicity detection and; (ii) Task 2: toxicity level detection. They focus on the xenophobic problem exacerbated by the spread of toxic comments posted in different online news articles related to immigration. One of the necessary efforts towards mitigating this problem is to detect toxicity in the comments. Our main objective was to implement an accurate model to detect xenophobia in comments about web news articles within the DETOXIS shared task 2021, based on the competition's official metrics: the F1-score for Task 1 and the Closeness Evaluation Metric (CEM) for Task 2. To solve the tasks, we worked with two types of machine learning models: (i) statistical models and (ii) Deep Bidirectional Transformers for Language Understanding (BERT) models. We obtained our best results in both tasks using BETO, an BERT model trained on a big Spanish corpus. We obtained the 3rd place in Task 1 official ranking with the F1-score of 0.5996, and we achieved the 6th place in Task 2 official ranking with the CEM of 0.7142. Our results suggest: (i) BERT models obtain better results than statistical models for toxicity detection in text comments; (ii) Monolingual BERT models have an advantage over multilingual BERT models in toxicity detection in text comments in their pre-trained language. △ Less

Submitted 8 November, 2021; originally announced November 2021.

Comments: 20 pages. Presented at IberLEF. See http://ceur-ws.org/Vol-2943/detoxis_paper2.pdf

arXiv:2109.09233 [pdf, other]

Unified and Multilingual Author Profiling for Detecting Haters

Authors: Ipek Baris Schlicht, Angel Felipe Magnossão de Paula

Abstract: This paper presents a unified user profiling framework to identify hate speech spreaders by processing their tweets regardless of the language. The framework encodes the tweets with sentence transformers and applies an attention mechanism to select important tweets for learning user profiles. Furthermore, the attention layer helps to explain why a user is a hate speech spreader by producing attent… ▽ More This paper presents a unified user profiling framework to identify hate speech spreaders by processing their tweets regardless of the language. The framework encodes the tweets with sentence transformers and applies an attention mechanism to select important tweets for learning user profiles. Furthermore, the attention layer helps to explain why a user is a hate speech spreader by producing attention weights at both token and post level. Our proposed model outperformed the state-of-the-art multilingual transformer models. △ Less

Submitted 19 September, 2021; originally announced September 2021.

Comments: 9 pages, 2 figures, see the original paper: http://ceur-ws.org/Vol-2936/paper-157.pdf

Journal ref: Published at the CLEF 2021

arXiv:2109.09232 [pdf, other]

UPV at CheckThat! 2021: Mitigating Cultural Differences for Identifying Multilingual Check-worthy Claims

Authors: Ipek Baris Schlicht, Angel Felipe Magnossão de Paula, Paolo Rosso

Abstract: Identifying check-worthy claims is often the first step of automated fact-checking systems. Tackling this task in a multilingual setting has been understudied. Encoding inputs with multilingual text representations could be one approach to solve the multilingual check-worthiness detection. However, this approach could suffer if cultural bias exists within the communities on determining what is che… ▽ More Identifying check-worthy claims is often the first step of automated fact-checking systems. Tackling this task in a multilingual setting has been understudied. Encoding inputs with multilingual text representations could be one approach to solve the multilingual check-worthiness detection. However, this approach could suffer if cultural bias exists within the communities on determining what is check-worthy.In this paper, we propose a language identification task as an auxiliary task to mitigate unintended bias.With this purpose, we experiment joint training by using the datasets from CLEF-2021 CheckThat!, that contain tweets in English, Arabic, Bulgarian, Spanish and Turkish. Our results show that joint training of language identification and check-worthy claim detection tasks can provide performance gains for some of the selected languages. △ Less

Submitted 19 September, 2021; originally announced September 2021.

Comments: 11 pages, 2 figures. Link to the original paper: http://ceur-ws.org/Vol-2936/paper-36.pdf

ACM Class: I.7; J.4

Journal ref: published at CLEF 2021

arXiv:2107.13461 [pdf, other]

Marine Vehicles Localization Using Grid Cells for Path Integration

Authors: Ignacio Carlucho, Manuel F. Bailey, Mariano De Paula, Corina Barbalata

Abstract: Autonomous Underwater Vehicles (AUVs) are platforms used for research and exploration of marine environments. However, these types of vehicles face many challenges that hinder their widespread use in the industry. One of the main limitations is obtaining accurate position estimation, due to the lack of GPS signal underwater. This estimation is usually done with Kalman filters. However, new develop… ▽ More Autonomous Underwater Vehicles (AUVs) are platforms used for research and exploration of marine environments. However, these types of vehicles face many challenges that hinder their widespread use in the industry. One of the main limitations is obtaining accurate position estimation, due to the lack of GPS signal underwater. This estimation is usually done with Kalman filters. However, new developments in the neuroscience field have shed light on the mechanisms by which mammals are able to obtain a reliable estimation of their current position based on external and internal motion cues. A new type of neuron, called Grid cells, has been shown to be part of path integration system in the brain. In this article, we show how grid cells can be used for obtaining a position estimation of underwater vehicles. The model of grid cells used requires only the linear velocities together with heading orientation and provides a reliable estimation of the vehicle's position. We provide simulation results for an AUV which show the feasibility of our proposed methodology. △ Less

Submitted 9 August, 2021; v1 submitted 28 July, 2021; originally announced July 2021.

arXiv:2011.12360 [pdf, other]

doi 10.1109/IEEECONF38699.2020.9389378

A reinforcement learning control approach for underwater manipulation under position and torque constraints

Authors: Ignacio Carlucho, Mariano De Paula, Gerardo G. Acosta, Corina Barbalata

Abstract: In marine operations underwater manipulators play a primordial role. However, due to uncertainties in the dynamic model and disturbances caused by the environment, low-level control methods require great capabilities to adapt to change. Furthermore, under position and torque constraints the requirements for the control system are greatly increased. Reinforcement learning is a data driven control t… ▽ More In marine operations underwater manipulators play a primordial role. However, due to uncertainties in the dynamic model and disturbances caused by the environment, low-level control methods require great capabilities to adapt to change. Furthermore, under position and torque constraints the requirements for the control system are greatly increased. Reinforcement learning is a data driven control technique that can learn complex control policies without the need of a model. The learning capabilities of these type of agents allow for great adaptability to changes in the operative conditions. In this article we present a novel reinforcement learning low-level controller for the position control of an underwater manipulator under torque and position constraints. The reinforcement learning agent is based on an actor-critic architecture using sensor readings as state information. Simulation results using the Reach Alpha 5 underwater manipulator show the advantages of the proposed control strategy. △ Less

Submitted 24 November, 2020; originally announced November 2020.

Journal ref: Global Oceans 2020: Singapore - U.S. Gulf Coast

arXiv:1808.06044 [pdf, ps, other]

Maximising Throughput in a Complex Coal Export System

Authors: Mateus Rocha de Paula, Natashia Boland, Andreas Ernst, Alexandre Mendes, Martin Savelsbergh

Abstract: The Port of Newcastle features three coal export terminals, operating primarily in cargo assembly mode, that share a rail network on their inbound side, and a channel on their outbound side. Maximising throughput at a single coal terminal, taking into account its layout, its equipment, and its operating policies, is already challenging, but maximising throughput of the Hunter Valley coal export sy… ▽ More The Port of Newcastle features three coal export terminals, operating primarily in cargo assembly mode, that share a rail network on their inbound side, and a channel on their outbound side. Maximising throughput at a single coal terminal, taking into account its layout, its equipment, and its operating policies, is already challenging, but maximising throughput of the Hunter Valley coal export system as a whole requires that terminals and inbound and outbound shared resources be considered simultaneously. Existing approaches to do so either lack realism or are too computationally demanding to be useful as an everyday planning tool. We present a parallel genetic algorithm to optimise the integrated system. The algorithm models activities in continuous time, can handle practical planning horizons efficiently, and generates solutions that match or improve solutions obtained with the state-of-the-art solvers, whilst vastly outperforming them both in memory usage and running time. △ Less

Submitted 22 August, 2018; v1 submitted 18 August, 2018; originally announced August 2018.

Showing 1–13 of 13 results for author: De Paula, M