-
BrightCookies at SemEval-2025 Task 9: Exploring Data Augmentation for Food Hazard Classification
Authors:
Foteini Papadopoulou,
Osman Mutlu,
Neris Özen,
Bas H. M. van der Velden,
Iris Hendrickx,
Ali Hürriyetoğlu
Abstract:
This paper presents our system developed for the SemEval-2025 Task 9: The Food Hazard Detection Challenge. The shared task's objective is to evaluate explainable classification systems that classify hazards and products at two levels of granularity from food recall incident reports. In this work, we propose text augmentation techniques as a way to improve poor performance on minority classes and compare their effect for each category on various transformer and machine learning models. We explore three word-level data augmentation techniques, namely synonym replacement, random word swapping, and contextual word insertion. The results show that transformer models tend to perform better overall. None of the three augmentation techniques consistently improved overall performance for classifying hazards and products. When comparing the baseline with each augmented model using BERT, we observed a statistically significant improvement (P < 0.05) in the fine-grained categories. Compared to the baseline, contextual word insertion improved the accuracy of predictions for the minority hazard classes by 6%. This suggests that targeted augmentation of minority classes can improve the performance of transformer models.
Submitted 29 April, 2025;
originally announced April 2025.
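As an illustration of the random word swapping and minority-class targeting mentioned in the abstract above, here is a minimal sketch. It is not the authors' implementation: the function names `random_swap` and `augment_minority`, the `threshold` parameter, and the example texts and labels are all assumptions made for this example.

```python
import random
from collections import Counter
from typing import List, Optional, Tuple


def random_swap(text: str, n_swaps: int = 1, seed: Optional[int] = None) -> str:
    """Return a copy of text with n_swaps random pairs of words exchanged."""
    rng = random.Random(seed)
    words = text.split()
    if len(words) < 2:
        return text  # nothing to swap
    for _ in range(n_swaps):
        # Pick two distinct positions and exchange the words at them.
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return " ".join(words)


def augment_minority(
    texts: List[str], labels: List[str], threshold: int, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Append one swapped copy of every sample whose class has fewer
    than `threshold` examples (hypothetical targeting rule)."""
    counts = Counter(labels)
    out_texts, out_labels = list(texts), list(labels)
    for k, (text, label) in enumerate(zip(texts, labels)):
        if counts[label] < threshold:
            out_texts.append(random_swap(text, seed=seed + k))
            out_labels.append(label)
    return out_texts, out_labels


# Invented example data, for illustration only.
texts = ["milk undeclared in cookies", "glass found in jam", "peanut traces in bars"]
labels = ["allergens", "foreign bodies", "allergens"]
aug_texts, aug_labels = augment_minority(texts, labels, threshold=2)
```

Only the class with a single example ("foreign bodies" here) receives an augmented copy; the swapped copy contains the same words in a different order, so the label is preserved.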
-
An Extended Benchmarking of Multi-Agent Reinforcement Learning Algorithms in Complex Fully Cooperative Tasks
Authors:
George Papadopoulos,
Andreas Kontogiannis,
Foteini Papadopoulou,
Chaido Poulianou,
Ioannis Koumentis,
George Vouros
Abstract:
Multi-Agent Reinforcement Learning (MARL) has recently emerged as a significant area of research. However, MARL evaluation often lacks systematic diversity, hindering a comprehensive understanding of algorithms' capabilities. In particular, cooperative MARL algorithms are predominantly evaluated on benchmarks such as SMAC and GRF, which primarily feature team game scenarios and do not adequately assess various aspects of agents' capabilities required in fully cooperative real-world tasks such as multi-robot cooperation, warehouse and resource management, search and rescue, and human-AI cooperation. Moreover, MARL algorithms are mainly evaluated on low-dimensional state spaces, and thus their performance on high-dimensional (e.g., image) observations is not well studied. To fill this gap, this paper highlights the crucial need for expanding systematic evaluation across a wider array of existing benchmarks. To this end, we conduct extensive evaluation and comparisons of well-known MARL algorithms on complex fully cooperative benchmarks, including tasks with images as agents' observations. Interestingly, our analysis shows that many algorithms, hailed as state-of-the-art on SMAC and GRF, may underperform standard MARL baselines on fully cooperative benchmarks. Finally, towards more systematic and better evaluation of cooperative MARL algorithms, we have open-sourced PyMARLzoo+, an extension of the widely used (E)PyMARL libraries, which addresses an open challenge from [TBG++21], facilitating seamless integration with and support for all benchmarks of PettingZoo, as well as Overcooked, PressurePlate, Capture Target and Box Pushing.
Submitted 3 July, 2025; v1 submitted 7 February, 2025;
originally announced February 2025.
-
Evaluation of 50 Greek Science and Engineering University Departments using Google Scholar
Authors:
Marina Pitsolanti,
Fotini Papadopoulou,
Nikolaos Tselios
Abstract:
In this paper, the scientometric evaluation of faculty members of 50 Greek Science and Engineering University Departments is presented. In total, 1978 academics were examined. The number of papers, citations, h-index and i10-index were collected for each academic, department, school and university using Google Scholar and the citation analysis program Publish or Perish. Analysis of the collected data showed that departments of the same academic discipline differ significantly in their scientific output. In addition, in the majority of the evaluated departments, a significant difference in h-index was observed between academics who report scientific activity on the department's website and those who do not. Moreover, academics who earned their PhD in the USA demonstrate higher indices than scholars who obtained their PhD in Europe or in Greece. Finally, the correlation between academic rank and the scholars' h-index (or the number of their citations) is quite low in some departments, which, under specific circumstances, could be an indication of a lack of meritocracy.
Submitted 27 July, 2017; v1 submitted 13 March, 2017;
originally announced March 2017.
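For readers unfamiliar with the two indices named in the abstract above, the following sketch computes them from a list of per-paper citation counts using their standard definitions: the h-index is the largest h such that h papers have at least h citations each, and the i10-index is the number of papers with at least 10 citations. The function names and the sample citation counts are assumptions for illustration; this is not the tooling used in the paper, which relied on Google Scholar and Publish or Perish.

```python
from typing import Iterable


def h_index(citations: Iterable[int]) -> int:
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break  # counts only decrease from here, so h cannot grow
    return h


def i10_index(citations: Iterable[int]) -> int:
    """Number of papers with at least 10 citations."""
    return sum(1 for count in citations if count >= 10)


# Invented citation counts for one hypothetical academic.
sample = [10, 8, 5, 4, 3]
print(h_index(sample), i10_index(sample))
```

Aggregating to the department level could then be done by applying the same functions to the pooled citation counts of its members, although the abstract does not say how the paper aggregates.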