-
UrduLLaMA 1.0: Dataset Curation, Preprocessing, and Evaluation in Low-Resource Settings
Authors:
Layba Fiaz,
Munief Hassan Tahir,
Sana Shams,
Sarmad Hussain
Abstract:
Multilingual Large Language Models (LLMs) often provide suboptimal performance on low-resource languages like Urdu. This paper introduces UrduLLaMA 1.0, a model derived from the open-source Llama-3.1-8B-Instruct architecture and continually pre-trained on 128 million Urdu tokens, capturing the rich diversity of the language. To enhance instruction-following and translation capabilities, we leverage Low-Rank Adaptation (LoRA) to fine-tune the model on 41,000 Urdu instructions and approximately 50,000 English-Urdu translation pairs. Evaluation across three machine translation datasets demonstrates significant performance improvements compared to state-of-the-art (SOTA) models, establishing a new benchmark for Urdu LLMs. These findings underscore the potential of targeted adaptation strategies with limited data and computational resources to address the unique challenges of low-resource languages.
Submitted 24 February, 2025;
originally announced February 2025.
-
Benchmarking the Performance of Pre-trained LLMs across Urdu NLP Tasks
Authors:
Munief Hassan Tahir,
Sana Shams,
Layba Fiaz,
Farah Adeeba,
Sarmad Hussain
Abstract:
Large Language Models (LLMs) pre-trained on multilingual data have revolutionized natural language processing research by shifting from language- and task-specific model pipelines to a single model adapted to a variety of tasks. However, the majority of existing multilingual NLP benchmarks for LLMs provide evaluation data in only a few languages with little linguistic diversity. In addition, these benchmarks lack quality assessment against the respective state-of-the-art models. This study presents an in-depth examination of 7 prominent LLMs: GPT-3.5-turbo, Llama 2-7B-Chat, Llama 3.1-8B, Bloomz 3B, Bloomz 7B1, Ministral-8B and Whisper (large, medium and small variants) across 17 tasks using 22 datasets and 13.8 hours of speech in a zero-shot setting, and compares and analyzes their performance against state-of-the-art (SOTA) models. Our experiments show that SOTA models currently outperform encoder-decoder models in the majority of Urdu NLP tasks under zero-shot settings. However, comparing Llama 3.1-8B with its predecessor Llama 2-7B-Chat, we can deduce that with improved language coverage, LLMs can surpass these SOTA models. Our results emphasize that models with fewer parameters but richer language-specific data, like Llama 3.1-8B, often outperform larger models with lower language diversity, such as GPT-3.5, in several tasks.
Submitted 31 December, 2024; v1 submitted 24 May, 2024;
originally announced May 2024.
-
Continuous-time quantum walks for MAX-CUT are hot
Authors:
Robert J. Banks,
Ehsan Haque,
Farah Nazef,
Fatima Fethallah,
Fatima Ruqaya,
Hamza Ahsan,
Het Vora,
Hibah Tahir,
Ibrahim Ahmad,
Isaac Hewins,
Ishaq Shah,
Krish Baranwal,
Mannan Arora,
Mateen Asad,
Mubasshirah Khan,
Nabian Hasan,
Nuh Azad,
Salgai Fedaiee,
Shakeel Majeed,
Shayam Bhuyan,
Tasfia Tarannum,
Yahya Ali,
Dan E. Browne,
P. A. Warburton
Abstract:
By exploiting the link between time-independent Hamiltonians and thermalisation, heuristic predictions on the performance of continuous-time quantum walks for MAX-CUT are made. The resulting predictions depend on the number of triangles in the underlying MAX-CUT graph. We extend these results to the time-dependent setting with multi-stage quantum walks and Floquet systems. The approach followed here provides a novel way of understanding the role of unitary dynamics in tackling combinatorial optimisation problems with continuous-time quantum algorithms.
Submitted 7 February, 2024; v1 submitted 17 June, 2023;
originally announced June 2023.
-
Improved Fitness Dependent Optimizer for Solving Economic Load Dispatch Problem
Authors:
Barzan Hussein Tahir,
Tarik A. Rashid,
Hafiz Tayyab Rauf,
Nebojsa Bacanin,
Amit Chhabra,
S. Vimal,
Zaher Mundher Yaseen
Abstract:
Economic Load Dispatch plays a fundamental role in the operation of power systems, as it decreases the environmental load, minimizes the operating cost, and preserves energy resources. The optimal solution to Economic Load Dispatch problems under various constraints can be obtained using several evolutionary and swarm-based algorithms. The major drawback of swarm-based algorithms is premature convergence towards an optimal solution. Fitness Dependent Optimizer (FDO) is a novel optimization algorithm inspired by the decision-making and reproductive process of bee swarming. FDO examines the search space based on the searching approach of Particle Swarm Optimization. To calculate the pace, the fitness function is utilized to generate weights that direct the search agents in the phases of exploitation and exploration. In this research, the authors have applied Fitness Dependent Optimizer to solve the Economic Load Dispatch problem by reducing fuel cost, emission allocation, and transmission loss. Moreover, the authors have proposed an enhanced variant of Fitness Dependent Optimizer, which incorporates novel population initialization techniques and dynamically employed sine maps to select the weight factor for Fitness Dependent Optimizer. The enhanced population initialization approach incorporates a quasi-random Sobol sequence to generate the initial solution in the multi-dimensional search space. A standard 24-unit system is employed for experimental evaluation with different power demands. Empirical results obtained using the enhanced variant of the Fitness Dependent Optimizer demonstrate superior performance in terms of low transmission loss, low fuel cost, and low emission allocation compared to the conventional Fitness Dependent Optimizer. The experimental study obtained 7.94E-12.
Submitted 14 July, 2022;
originally announced September 2022.
-
Deep Learning Methods for Credit Card Fraud Detection
Authors:
Thanh Thi Nguyen,
Hammad Tahir,
Mohamed Abdelrazek,
Ali Babar
Abstract:
Credit card frauds are occurring at an ever-increasing rate and have become a major problem in the financial sector. Because of these frauds, card users are hesitant to make purchases, and both merchants and financial institutions bear heavy losses. Some major challenges in credit card fraud detection involve the availability of public data, high class imbalance in data, the changing nature of frauds and the high number of false alarms. Machine learning techniques have been used to detect credit card frauds, but no fraud detection system has been able to offer great efficiency to date. Recent developments in deep learning have been applied to solve complex problems in various areas. This paper presents a thorough study of deep learning methods for the credit card fraud detection problem and compares their performance with various machine learning algorithms on three different financial datasets. Experimental results show great performance of the proposed deep learning methods against traditional machine learning models and imply that the proposed approaches can be implemented effectively for real-world credit card fraud detection systems.
Submitted 7 December, 2020;
originally announced December 2020.
-
Modelling pathogen spread in a healthcare network: indirect patient movements
Authors:
M. J. Piotrowska,
K. Sakowski,
A. Karch,
H. Tahir,
J. Horn,
M. E. Kretzschmar,
R. T. Mikolajczyk
Abstract:
A hybrid network–deterministic model for simulation of multiresistant pathogen spread in a healthcare system is presented. The model accounts for two paths of pathogen transmission between the healthcare facilities: inter-hospital patient transfers (direct transfers) and readmission of colonized patients (indirect transfers). In the latter case, the patients may be readmitted to the same facility or to a different one. Intra-hospital pathogen transmission is governed by an SIS model expressed by a system of ordinary differential equations.
Using a network model created for a Lower Saxony region (Germany), we showed that the proposed model reproduces the basic properties of healthcare-associated pathogen spread. Moreover, it shows the important contribution of the readmission of colonized patients to the prevalence in individual hospitals as well as in the overall healthcare system: it can increase the overall prevalence by a factor of 4 as compared to inter-hospital transfers only. The final prevalence in individual healthcare facilities was shown to depend on the average length of stay through a non-linear concave function.
Finally, we demonstrated that the network parameters of the model may be derived from administrative admission/discharge records. In particular, they are sufficient to obtain inter-hospital transfer probabilities, and to express the patients' transfer as a Markov process.
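The intra-hospital SIS dynamics mentioned in this abstract can be illustrated with a minimal sketch. It assumes a single isolated facility with a normalized patient population and the textbook SIS equation dI/dt = beta*I*(1 - I) - gamma*I. The parameter names `beta` and `gamma`, their values, and the forward Euler integrator are illustrative assumptions, not the authors' model, which also couples facilities through transfers and readmissions.

```python
def simulate_sis(beta, gamma, i0, days, dt=0.01):
    """Integrate dI/dt = beta*I*(1 - I) - gamma*I with forward Euler.

    I is the colonized fraction of patients in one facility (assumed
    normalized population); beta is a hypothetical transmission rate and
    gamma a hypothetical decolonization/discharge rate, both per day.
    """
    steps = int(round(days / dt))
    i = i0
    trajectory = [i]
    for _ in range(steps):
        di = beta * i * (1.0 - i) - gamma * i
        i += dt * di
        trajectory.append(i)
    return trajectory


def endemic_prevalence(beta, gamma):
    """Stable equilibrium of the textbook SIS equation: 1 - gamma/beta when
    beta > gamma, otherwise 0 (the pathogen dies out)."""
    return max(0.0, 1.0 - gamma / beta)


# Hypothetical parameter values chosen only for illustration.
traj = simulate_sis(beta=0.3, gamma=0.1, i0=0.01, days=200)
```

In this simplified setting the simulated prevalence settles at the closed-form equilibrium, which gives a quick sanity check on the integrator before any network coupling is added.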
Submitted 15 January, 2020;
originally announced January 2020.
-
Enhanced AODV route discovery and route establishment for QoS provision for real time transmission in MANET
Authors:
Iftikhar Ahmad,
Uzma Ashraf,
Sadia Anum,
Hira Tahir
Abstract:
A MANET is a temporary connection of mobile nodes via wireless links with no centralized base station. We developed a protocol with an enhanced route discovery mechanism that avoids the pre-transmission delay. When a source node wants to communicate with another node, it broadcasts an RREQ. EAODV gives priority to the source node of a real-time transmission. When an RREQ packet is sent to a neighbor node for real-time transmission, the node accepts the route request on a priority basis and the packet drop ratio decreases. Throughput then increases because more packets are received at the destination, and the delivery ratio also increases. Through these changes, QoS is improved.
Submitted 19 April, 2014;
originally announced April 2014.