-
LLM-as-a-qualitative-judge: automating error analysis in natural language generation
Authors:
Nadezhda Chirkova,
Tunde Oluwaseyi Ajayi,
Seth Aycock,
Zain Muhammad Mujahid,
Vladana Perlić,
Ekaterina Borisova,
Markarit Vartampetian
Abstract:
Prompting large language models (LLMs) to evaluate generated text, known as LLM-as-a-judge, has become a standard evaluation approach in natural language generation (NLG), but is primarily used as a quantitative tool, i.e. with numerical scores as main outputs. In this work, we propose LLM-as-a-qualitative-judge, an LLM-based evaluation approach with the main output being a structured report of common issue types in the NLG system outputs. Our approach is targeted at providing developers with meaningful insights on what improvements can be done to a given NLG system and consists of two main steps, namely open-ended per-instance issue analysis and clustering of the discovered issues using an intuitive cumulative algorithm. We also introduce a strategy for evaluating the proposed approach, coupled with ~300 annotations of issues in instances from 12 NLG datasets. Our results show that LLM-as-a-qualitative-judge correctly recognizes instance-specific issues in 2/3 cases and is capable of producing error type reports resembling the reports composed by human annotators. Our code and data are publicly available at https://github.com/tunde-ajayi/llm-as-a-qualitative-judge.
Submitted 10 June, 2025;
originally announced June 2025.
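The abstract above describes clustering discovered issues with a cumulative algorithm but does not spell it out. The following is a minimal sketch of one possible cumulative scheme, not the authors' implementation: issues are processed one at a time and either join the best-matching existing cluster or open a new one. The function names, the `threshold` parameter, and the token-overlap (Jaccard) matcher are all assumptions made for illustration; in an LLM-based pipeline the matching decision would presumably be made by the model itself.

```python
def jaccard(a, b):
    """Token-overlap similarity between two short issue descriptions."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def cumulative_cluster(issues, threshold=0.5):
    """Hypothetical cumulative clustering: one pass, clusters grow as issues arrive.

    Each cluster is a dict with a "label" (its first issue) and its "members".
    The Jaccard matcher is a stand-in for whatever judge decides that two
    issues are of the same type.
    """
    clusters = []
    for issue in issues:
        # Find the existing cluster whose label is most similar to this issue.
        best = max(clusters, key=lambda c: jaccard(issue, c["label"]), default=None)
        if best is not None and jaccard(issue, best["label"]) >= threshold:
            best["members"].append(issue)
        else:
            # No sufficiently similar cluster yet: start a new issue type.
            clusters.append({"label": issue, "members": [issue]})
    return clusters


report = cumulative_cluster([
    "answer in wrong language",
    "answer in the wrong language",
    "hallucinated citation",
])
```

Sorting the resulting clusters by member count would give a simple report of the most common issue types.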
-
Deep Ensemble approach for Enhancing Brain Tumor Segmentation in Resource-Limited Settings
Authors:
Jeremiah Fadugba,
Isabel Lieberman,
Olabode Ajayi,
Mansour Osman,
Solomon Oluwole Akinola,
Tinashe Mustvangwa,
Dong Zhang,
Udunna C Anazondo,
Raymond Confidence
Abstract:
Segmentation of brain tumors is a critical step in treatment planning, yet manual segmentation is both time-consuming and subjective, relying heavily on the expertise of radiologists. In Sub-Saharan Africa, this challenge is magnified by overburdened medical systems and limited access to advanced imaging modalities and expert radiologists. Automating brain tumor segmentation using deep learning offers a promising solution. Convolutional Neural Networks (CNNs), especially the U-Net architecture, have shown significant potential. However, a major challenge remains: achieving generalizability across different datasets. This study addresses this gap by developing a deep learning ensemble that integrates UNet3D, V-Net, and MSA-VNet models for the semantic segmentation of gliomas. By initially training on the BraTS-GLI dataset and fine-tuning with the BraTS-SSA dataset, we enhance model performance. Our ensemble approach significantly outperforms individual models, achieving DICE scores of 0.8358 for Tumor Core, 0.8521 for Whole Tumor, and 0.8167 for Enhancing Tumor. These results underscore the potential of ensemble methods in improving the accuracy and reliability of automated brain tumor segmentation, particularly in resource-limited settings.
Submitted 4 February, 2025;
originally announced February 2025.
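For readers unfamiliar with the DICE scores quoted in the abstract above, here is a minimal sketch of the standard Dice coefficient, 2|A∩B| / (|A| + |B|), on flattened binary masks. The function name and the empty-mask convention are assumptions for illustration; the paper's own evaluation code may differ (for example in how 3D volumes or empty regions are handled).

```python
def dice_score(pred, target):
    """Dice coefficient between two equal-length binary masks (flattened)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Voxels labelled positive in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total positive voxels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


score = dice_score([1, 1, 0, 0], [1, 0, 0, 0])
```

A score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they do not overlap at all.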
-
Bayesian Networks and Machine Learning for COVID-19 Severity Explanation and Demographic Symptom Classification
Authors:
Oluwaseun T. Ajayi,
Yu Cheng
Abstract:
With the prevailing efforts to combat the coronavirus disease 2019 (COVID-19) pandemic, there are still uncertainties that are yet to be discovered about its spread, future impact, and resurgence. In this paper, we present a three-stage data-driven approach to distill the hidden information about COVID-19. The first stage employs a Bayesian network structure learning method to identify the causal relationships among COVID-19 symptoms and their intrinsic demographic variables. In the second stage, the output from the Bayesian network structure learning serves as a useful guide to train an unsupervised machine learning (ML) algorithm that uncovers the similarities in patients' symptoms through clustering. The final stage then leverages the labels obtained from clustering to train a demographic symptom identification (DSID) model which predicts a patient's symptom class and the corresponding demographic probability distribution. We applied our method to the COVID-19 dataset obtained from the Centers for Disease Control and Prevention (CDC) in the United States. Results from the experiments show a testing accuracy of 99.99%, compared with the 41.15% accuracy of a heuristic ML method. This strongly supports the viability of our Bayesian network and ML approach in understanding the relationship between the virus symptoms, and in providing insights on patient stratification towards reducing the severity of the virus.
Submitted 17 June, 2024; v1 submitted 16 June, 2024;
originally announced June 2024.
-
Smart Cities and Villages: Concept Review and Implementation Perspectives in Developing Cities
Authors:
Kamiba I. Kabuya,
Olasupo O. Ajayi,
Antoine B. Bagula
Abstract:
The "Smart City" (SC) concept has been around for decades with deployment scenarios revealed in major cities of developed countries. However, while SC has enhanced the living conditions of city dwellers in the developed world, the concept is still either missing or poorly deployed in the developing world. This paper presents a review of the SC concept from the perspective of its application to cities in developing nations, the opportunities it avails, and challenges related to its applicability to these cities. Building upon a systematic review of literature, this paper shows that there are no canonical definitions, models, or reference frameworks for the SC concept. This paper also aims to bridge the gap between the "smart city" and "smart village" concepts, with the expectation of providing a holistic approach to solving common issues in cities around the world. Drawing inspiration from other authors, we propose a conceptual model for an SC initiative in Africa and demonstrate the need to prioritize research and capacity development. We also discuss the potential opportunities for such SC implementations in sub-Saharan Africa. As a case study, we consider the city of Lubumbashi in the Democratic Republic of Congo and discuss ways of making it a smart city by building around successful smart city initiatives. It is our belief that for Lubumbashi, as with any other city in sub-Saharan Africa, the first step to developing a smart city is to build knowledge and create intellectual capital.
Submitted 14 February, 2024;
originally announced February 2024.
-
AfriQA: Cross-lingual Open-Retrieval Question Answering for African Languages
Authors:
Odunayo Ogundepo,
Tajuddeen R. Gwadabe,
Clara E. Rivera,
Jonathan H. Clark,
Sebastian Ruder,
David Ifeoluwa Adelani,
Bonaventure F. P. Dossou,
Abdou Aziz DIOP,
Claytone Sikasote,
Gilles Hacheme,
Happy Buzaaba,
Ignatius Ezeani,
Rooweither Mabuya,
Salomey Osei,
Chris Emezue,
Albert Njoroge Kahira,
Shamsuddeen H. Muhammad,
Akintunde Oladipo,
Abraham Toluwase Owodunni,
Atnafu Lambebo Tonja,
Iyanuoluwa Shode,
Akari Asai,
Tunde Oluwaseyi Ajayi,
Clemencia Siro,
Steven Arthur
, et al. (27 additional authors not shown)
Abstract:
African languages have far less in-language content available digitally, making it challenging for question answering systems to satisfy the information needs of users. Cross-lingual open-retrieval question answering (XOR QA) systems -- those that retrieve answer content from other languages while serving people in their native language -- offer a means of filling this gap. To this end, we create AfriQA, the first cross-lingual QA dataset with a focus on African languages. AfriQA includes 12,000+ XOR QA examples across 10 African languages. While previous datasets have focused primarily on languages where cross-lingual QA augments coverage from the target language, AfriQA focuses on languages where cross-lingual answer content is the only high-coverage source of answer content. Because of this, we argue that African languages are one of the most important and realistic use cases for XOR QA. Our experiments demonstrate the poor performance of automatic translation and multilingual retrieval methods. Overall, AfriQA proves challenging for state-of-the-art QA models. We hope that the dataset enables the development of more equitable QA technology.
Submitted 11 May, 2023;
originally announced May 2023.
-
Evaluating the Robustness of Machine Reading Comprehension Models to Low Resource Entity Renaming
Authors:
Clemencia Siro,
Tunde Oluwaseyi Ajayi
Abstract:
Question answering (QA) models have shown compelling results in the task of Machine Reading Comprehension (MRC). Recently these systems have proved to perform better than humans on held-out test sets of datasets such as SQuAD, but their robustness is not guaranteed. The brittleness of QA models is exposed by a performance drop when they are evaluated on adversarially generated examples. In this study, we explore the robustness of MRC models to entity renaming, with entities from low-resource regions such as Africa. We propose EntSwap, a method for test-time perturbations, to create a test set whose entities have been renamed. In particular, we rename entities of type country, person, nationality, location, organization, and city to create AfriSQuAD2. Using the perturbed test set, we evaluate the robustness of three popular MRC models. We find that large models perform comparatively well on novel entities relative to base models. Furthermore, our analysis indicates that entities of type person are especially challenging for the MRC models.
Submitted 16 April, 2024; v1 submitted 6 April, 2023;
originally announced April 2023.
-
Hypergraphs for multiscale cycles in structured data
Authors:
Agnese Barbensi,
Iris H. R. Yoon,
Christian Degnbol Madsen,
Deborah O. Ajayi,
Michael P. H. Stumpf,
Heather A. Harrington
Abstract:
Scientific data has been growing in both size and complexity across the modern physical, engineering, life and social sciences. Spatial structure, for example, is a hallmark of many of the most important real-world complex systems, but its analysis is fraught with statistical challenges. Topological data analysis can provide a powerful computational window on complex systems. Here we present a framework to extend and interpret persistent homology summaries to analyse spatial data across multiple scales. We introduce hyperTDA, a topological pipeline that unifies local (e.g. geodesic) and global (e.g. Euclidean) metrics without losing spatial information, even in the presence of noise. Homology generators offer an elegant and flexible description of spatial structures and can capture the information computed by persistent homology in an interpretable way. Here the information computed by persistent homology is transformed into a weighted hypergraph, where hyperedges correspond to homology generators. We consider different choices of generators (e.g. matroid or minimal) and find that centrality and community detection are robust to either choice. We compare hyperTDA to existing geometric measures and validate its robustness to noise. We demonstrate the power of computing higher-order topological structures on spatial curves arising frequently in ecology, biophysics, and biology, but also in high-dimensional financial datasets. We find that hyperTDA can select between synthetic trajectories from the landmark 2020 AnDi challenge and quantifies movements of different animal species, even when data is limited.
Submitted 14 October, 2022;
originally announced October 2022.
-
Image Augmentation for Satellite Images
Authors:
Oluwadara Adedeji,
Peter Owoade,
Opeyemi Ajayi,
Olayiwola Arowolo
Abstract:
This study proposes the use of generative adversarial networks (GANs) for augmenting the EuroSAT dataset for the Land Use and Land Cover (LULC) classification task. We used DCGAN and WGAN-GP to generate images for each class in the dataset. We then explored the effect on model performance of augmenting the original dataset by about 10% in each case. The choice of GAN architecture seems to have no apparent effect on the model performance. However, a combination of geometric augmentation and GAN-generated images improved baseline results. Our study shows that GAN-based augmentation can improve the generalizability of deep classification models on satellite images.
Submitted 29 July, 2022;
originally announced July 2022.
-
A Few Thousand Translations Go a Long Way! Leveraging Pre-trained Models for African News Translation
Authors:
David Ifeoluwa Adelani,
Jesujoba Oluwadara Alabi,
Angela Fan,
Julia Kreutzer,
Xiaoyu Shen,
Machel Reid,
Dana Ruiter,
Dietrich Klakow,
Peter Nabende,
Ernie Chang,
Tajuddeen Gwadabe,
Freshia Sackey,
Bonaventure F. P. Dossou,
Chris Chinenye Emezue,
Colin Leong,
Michael Beukman,
Shamsuddeen Hassan Muhammad,
Guyo Dub Jarso,
Oreen Yousuf,
Andre Niyongabo Rubungo,
Gilles Hacheme,
Eric Peter Wairagala,
Muhammad Umair Nasir,
Benjamin Ayoade Ajibade,
Tunde Oluwaseyi Ajayi
, et al. (20 additional authors not shown)
Abstract:
Recent advances in the pre-training of language models leverage large-scale datasets to create multilingual models. However, low-resource languages are mostly left out in these datasets. This is primarily because many widely spoken languages are not well represented on the web and therefore excluded from the large-scale crawls used to create datasets. Furthermore, downstream users of these models are restricted to the selection of languages originally chosen for pre-training. This work investigates how to optimally leverage existing pre-trained models to create low-resource translation systems for 16 African languages. We focus on two questions: 1) How can pre-trained models be used for languages not included in the initial pre-training? and 2) How can the resulting translation models effectively transfer to new domains? To answer these questions, we create a new African news corpus covering 16 languages, of which eight languages are not part of any existing evaluation dataset. We demonstrate that the most effective strategy for transferring both to additional languages and to additional domains is to fine-tune large pre-trained models on small quantities of high-quality translation data.
Submitted 22 August, 2022; v1 submitted 4 May, 2022;
originally announced May 2022.
-
Modelling DDoS Attacks in IoT Networks using Machine Learning
Authors:
Pheeha Machaka,
Olasupo Ajayi,
Hloniphani Maluleke,
Ferdinand Kahenga,
Antoine Bagula,
Kyandoghere Kyamakya
Abstract:
In current Internet-of-Things (IoT) deployments, a mix of traditional IP networking and IoT-specific protocols, both relying on the TCP protocol, can be used to transport data from a source to a destination. Therefore, TCP-specific attacks, such as the Distributed Denial of Service (DDoS) using the TCP SYN attack, are among the most plausible tools that attackers can use on Cyber-Physical Systems (CPS). This may be done by launching an attack from its IoT subsystem, here referred to as the "CPS-IoT", with potential propagation to the different servers located in both the fog and the cloud infrastructures of the CPS. This study compares the effectiveness of supervised, unsupervised, and semi-supervised machine learning algorithms for detecting DDoS attacks in CPS-IoT, particularly during data transmission between the physical space and the cyber space via the Internet. The algorithms considered are broadly grouped into two: i) detection algorithms, which include Logistic Regression (LGR), K-Means, and Artificial Neural Networks (ANN), together with semi-supervised hybrid learning models, which use unsupervised K-Means to label data and then feed the output to a supervised learning model for attack detection; and ii) prediction algorithms, namely LGR, Kernel Ridge Regression (KRR) and Support Vector Regression (SVR), which were used to predict imminent attacks. Experimental tests were carried out, and the results obtained showed that the hybrid model was able to achieve 100% accuracy with zero false positives, while all the prediction models were able to achieve over 94% attack prediction accuracy.
Submitted 20 June, 2022; v1 submitted 10 December, 2021;
originally announced December 2021.
-
Africa 3: A Continental Network Model to Enable the African Fourth Industrial Revolution
Authors:
Olasupo O. Ajayi,
Antoine B. Bagula,
Hloniphani M. Maluleke
Abstract:
It is widely recognised that collaboration can help fast-track the development of countries in Africa. By leveraging the fourth industrial revolution, Africa can achieve accelerated development in health care services, educational systems and socio-economic infrastructures. While a number of conceptual frameworks have been proposed for the African continent, many have discounted the Cloud infrastructure used for data storage and processing, as well as the underlying network infrastructure upon which such frameworks would be built. This work therefore presents a continental network model for interconnecting nations in Africa through its data centres. The proposed model is based on a multilayer network engineering approach, which first groups African countries into clusters of data centres using a hybrid combination of clustering techniques, then utilizes Ant Colony Optimization with Stench Pheromone, modified to support variable evaporation rates, to find the ideal network path(s) across the clusters and the continent as a whole. The proposed model takes into consideration the geo-spatial location, population size, data centre count and intercontinental submarine cable landings of each African country when clustering and routing. For benchmarking purposes, the path selection algorithm was tested on both the obtained clusters and the African Union's regional clusters.
Submitted 14 October, 2020;
originally announced October 2020.
-
Quality of Service (QoS) Modelling in Federated Cloud Computing
Authors:
Kun Ma,
Antoine Bagula,
Olasupo Ajayi
Abstract:
Building around the idea of a large-scale server infrastructure with a potentially large number of tailored resources, which are capable of interacting to facilitate the deployment, adaptation, and support of services, cloud computing needs to frequently reschedule and manage various application tasks in order to accommodate the requests of a wide range and number of users. One of the challenges of cloud computing is to support and manage Quality-of-Service (QoS) by designing efficient techniques for the allocation of tasks between users and the cloud virtual resources, as well as assigning virtual resources to the cloud physical resources. The migration of virtual resources across physical resources is another challenge that requires considerable attention, especially in federated cloud computing environments wherein providers might be willing to offer their unused resources as a service to the federation (cooperative allocation) and pull back these resources for their own use when they are needed (competitive allocation). This paper revisits the issue of QoS in cloud computing by formulating and presenting i) a multi-QoS task allocation model for the assignment of tasks to virtual machines and ii) a virtual machine migration model for a federated cloud computing environment by considering cases where resource providers are operating in cooperative or competitive mode. A new differential evolution (DE) based binding policy for task allocation and a novel virtual machine model are proposed as solutions for the problem of QoS support in federated cloud environments. The experimental results show that the proposed solutions improved the quality of service in the cloud computing environment and reveal the relative advantages of operating a mixed cooperation and competition model in a federated cloud environment.
Submitted 8 November, 2019;
originally announced November 2019.
-
Fourth Industrial Revolution for Development: The Relevance of Cloud Federation in Healthcare Support
Authors:
Olasupo O. Ajayi,
Antoine B. Bagula,
Kun Ma
Abstract:
Inefficient healthcare is a major concern among many African nations and can be mitigated by building world-class infrastructure connecting different medical facilities for collaboration and resource sharing. Such infrastructure should support collection and exchange of medical data for the purpose of accessing expertise not available locally. It should be equipped with modern technologies of the fourth industrial revolution, providing decision support to doctors and thereby enabling African nations to leapfrog from poorly equipped to medically prepared. Sadly, world-class healthcare infrastructure is a missing piece in the African public health ecosystem. Medical facilities are either non-existent or prohibitively expensive when they exist. Federated cloud computing can provide a solution to this challenge. Being a model that allows collaboration between multiple Cloud service providers through resource pooling, it allows for the execution of tasks on computing resources flexibly and cost-efficiently. This paper aims to connect unconnected medical facilities in Africa by proposing a Cloud federation for healthcare using cooperative and competitive collaboration models. Simulations were carried out to test the efficacy of these models using five different workload allocation schemes, namely First-Fit-Descending (FFD), Best-Fit-Descending (BFD), Binary-Search-Best-Fit (BSBF), a Genetic Algorithm meta-heuristic and a Stable Roommate Allocation economic model, for both light and heavy workloads. Results of the simulations revealed that the cooperative model resulted in lower delays but higher resource utilisation, while the competitive model provided faster service delivery and better quality of service. BSBF and BFD resulted in the best resource utilisation and energy conservation. Finally, deployment considerations and potential business models for a federated Cloud for African healthcare were presented.
Submitted 5 November, 2019;
originally announced November 2019.
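Of the allocation schemes named in the abstract above, First-Fit-Descending (FFD) is the simplest to illustrate. The sketch below shows the textbook FFD heuristic for packing workloads onto hosts of equal capacity; it is not the simulation code from the paper, and the function name, the single-dimension load, and the uniform `capacity` are assumptions made for illustration.

```python
def first_fit_descending(loads, capacity):
    """Place each load, largest first, on the first host with enough room.

    Returns a list of hosts, each a list of the loads assigned to it.
    """
    hosts = []  # loads assigned to each host
    free = []   # remaining capacity of each host
    for load in sorted(loads, reverse=True):
        if load > capacity:
            raise ValueError(f"load {load} exceeds host capacity {capacity}")
        for i, room in enumerate(free):
            if load <= room:
                hosts[i].append(load)
                free[i] -= load
                break
        else:
            # No existing host can take this load: open a new one.
            hosts.append([load])
            free.append(capacity - load)
    return hosts


placement = first_fit_descending([4, 8, 1, 4, 2, 1], capacity=10)
```

Best-Fit-Descending differs only in the inner loop: instead of taking the first host with room, it picks the host that would be left with the least free capacity.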