
The Use of Synthetic Data to Train AI Models: Opportunities and Risks for Sustainable Development

Authors: Tshilidzi Marwala, Eleonore Fournier-Tombs, Serge Stinckwich

Abstract: In the current data driven era, synthetic data, artificially generated data that resembles the characteristics of real world data without containing actual personal information, is gaining prominence. This is due to its potential to safeguard privacy, increase the availability of data for research, and reduce bias in machine learning models. This paper investigates the policies governing the creation, utilization, and dissemination of synthetic data. Synthetic data can be a powerful instrument for protecting the privacy of individuals, but it also presents challenges, such as ensuring its quality and authenticity. A well crafted synthetic data policy must strike a balance between privacy concerns and the utility of data, ensuring that it can be utilized effectively without compromising ethical or legal standards. Organizations and institutions must develop standardized guidelines and best practices in order to capitalize on the benefits of synthetic data while addressing its inherent challenges.

Submitted 31 August, 2023; originally announced September 2023.

arXiv:2304.03952 [pdf, ps, other]

MphayaNER: Named Entity Recognition for Tshivenda

Authors: Rendani Mbuvha, David I. Adelani, Tendani Mutavhatsindi, Tshimangadzo Rakhuhu, Aluwani Mauda, Tshifhiwa Joshua Maumela, Andisani Masindi, Seani Rananga, Vukosi Marivate, Tshilidzi Marwala

Abstract: Named Entity Recognition (NER) plays a vital role in various Natural Language Processing tasks such as information retrieval, text classification, and question answering. However, NER can be challenging, especially in low-resource languages with limited annotated datasets and tools. This paper adds to the effort of addressing these challenges by introducing MphayaNER, the first Tshivenda NER corpus in the news domain. We establish NER baselines by fine-tuning state-of-the-art models on MphayaNER. The study also explores zero-shot transfer between Tshivenda and other related Bantu languages, with chiShona and Kiswahili showing the best results. Augmenting MphayaNER with chiShona data was also found to improve model performance significantly. Both MphayaNER and the baseline models are made publicly available.

Submitted 8 April, 2023; originally announced April 2023.

Comments: Accepted at AfricaNLP Workshop at ICLR 2023

arXiv:2211.11576 [pdf, other]

Imputation of Missing Streamflow Data at Multiple Gauging Stations in Benin Republic

Authors: Rendani Mbuvha, Julien Yise Peniel Adounkpe, Wilson Tsakane Mongwe, Mandela Houngnibo, Nathaniel Newlands, Tshilidzi Marwala

Abstract: Streamflow observation data is vital for flood monitoring, agricultural, and settlement planning. However, such streamflow data are commonly plagued with missing observations due to various causes such as harsh environmental conditions and constrained operational resources. This problem is often more pervasive in under-resourced areas such as Sub-Saharan Africa. In this work, we reconstruct streamflow time series data through bias correction of the GEOGloWS ECMWF streamflow service (GESS) forecasts at ten river gauging stations in Benin Republic. We perform bias correction by fitting Quantile Mapping, Gaussian Process, and Elastic Net regression in a constrained training period. We show by simulating missingness in a testing period that GESS forecasts have a significant bias that results in low predictive skill over the ten Beninese stations. Our findings suggest that overall bias correction by Elastic Net and Gaussian Process regression achieves superior skill relative to traditional imputation by Random Forest, k-Nearest Neighbour, and GESS lookup. The findings of this work provide a basis for integrating global GESS streamflow data into operational early-warning decision-making systems (e.g., flood alert) in countries vulnerable to drought and flooding due to extreme weather events.

Submitted 17 November, 2022; originally announced November 2022.

Comments: AAAI 2022 Fall Symposium: The Role of AI in Responding to Climate Challenges, Nov 17-19, 2022

arXiv:2110.04755 [pdf]

Nano Version Control and Robots of Robots: Data Driven, Regenerative Production Code

Authors: Lukasz Machowski, Tshilidzi Marwala

Abstract: A reflection on the Corona pandemic highlights the need for more sustainable production systems using automation. The goal is to retain automation of repetitive tasks while allowing complex parts to come together. We recognize the fragility of traditional automation and how hard it is to create. We introduce a method which converts one really hard problem of producing sustainable production code into three simpler problems: data, patterns and working prototypes. We use developer seniority as a metric to measure whether the proposed method is easier. By using agent-based simulation and NanoVC repos for agent arbitration, we are able to create a simulated environment where patterns developed by people are used to transform working prototypes into templates that data can be fed through to create the robots that create the production code. Having two layers of robots allows early implementation choices to be replaced as we gather more feedback from the working system. Several benefits of this approach have been discovered, with the most notable being that the Robot of Robots encodes a legacy of the person that designed it in the form of the 3 ingredients (data, patterns and working prototypes). This method allows us to achieve our goal of reducing the fragility of the production code while removing the difficulty of getting there.

Submitted 10 October, 2021; originally announced October 2021.

Comments: Presented at the 3rd Electrical Engineering Postgraduate Symposium

arXiv:2107.02070 [pdf, other]

Antithetic Riemannian Manifold And Quantum-Inspired Hamiltonian Monte Carlo

Authors: Wilson Tsakane Mongwe, Rendani Mbuvha, Tshilidzi Marwala

Abstract: Markov Chain Monte Carlo inference of target posterior distributions in machine learning is predominately conducted via Hamiltonian Monte Carlo and its variants. This is due to the ability of Hamiltonian Monte Carlo based samplers to suppress random-walk behaviour. As with other Markov Chain Monte Carlo methods, Hamiltonian Monte Carlo produces auto-correlated samples, which results in high variance in the estimators and low effective sample size rates in the generated samples. Adding antithetic sampling to Hamiltonian Monte Carlo has been previously shown to produce higher effective sample rates compared to vanilla Hamiltonian Monte Carlo. In this paper, we present new algorithms which are antithetic versions of Riemannian Manifold Hamiltonian Monte Carlo and Quantum-Inspired Hamiltonian Monte Carlo. The Riemannian Manifold Hamiltonian Monte Carlo algorithm improves on Hamiltonian Monte Carlo by taking into account the local geometry of the target, which is beneficial for target densities that may exhibit strong correlations in the parameters. Quantum-Inspired Hamiltonian Monte Carlo is based on quantum particles that can have random mass. Quantum-Inspired Hamiltonian Monte Carlo uses a random mass matrix which results in better sampling than Hamiltonian Monte Carlo on spiky and multi-modal distributions such as jump diffusion processes. The analysis is performed on jump diffusion processes using real world financial market data, as well as on real world benchmark classification tasks using Bayesian logistic regression.

Submitted 5 July, 2021; originally announced July 2021.

arXiv:2106.06805 [pdf, other]

Predicting Higher Education Throughput in South Africa Using a Tree-Based Ensemble Technique

Authors: Rendani Mbuvha, Patience Zondo, Aluwani Mauda, Tshilidzi Marwala

Abstract: We use gradient boosting machines and logistic regression to predict academic throughput at a South African university. The results highlight the significant influence of socio-economic factors and field of study as predictors of throughput. We further find that socio-economic factors become less of a predictor relative to the field of study as the time to completion increases. We provide recommendations on interventions to counteract the identified effects, which include academic, psychosocial and financial support.

Submitted 12 June, 2021; originally announced June 2021.

arXiv:2102.07106 [pdf, other]

Healing Products of Gaussian Processes

Authors: Samuel Cohen, Rendani Mbuvha, Tshilidzi Marwala, Marc Peter Deisenroth

Abstract: Gaussian processes (GPs) are nonparametric Bayesian models that have been applied to regression and classification problems. One of the approaches to alleviate their cubic training cost is the use of local GP experts trained on subsets of the data. In particular, product-of-expert models combine the predictive distributions of local experts through a tractable product operation. While these expert models allow for massively distributed computation, their predictions typically suffer from erratic behaviour of the mean or uncalibrated uncertainty quantification. By calibrating predictions via a tempered softmax weighting, we provide a solution to these problems for multiple product-of-expert models, including the generalised product of experts and the robust Bayesian committee machine. Furthermore, we leverage the optimal transport literature and propose a new product-of-expert model that combines predictions of local experts by computing their Wasserstein barycenter, which can be applied to both regression and classification.

Submitted 14 February, 2021; originally announced February 2021.

Comments: ICML 2020

arXiv:2001.01765 [pdf, other]

An Automatic Relevance Determination Prior Bayesian Neural Network for Controlled Variable Selection

Authors: Rendani Mbuvha, Illyes Boulkaibet, Tshilidzi Marwala

Abstract: We present an Automatic Relevance Determination prior Bayesian Neural Network (BNN-ARD) weight l2-norm measure as a feature importance statistic for the model-x knockoff filter. We show on both simulated data and the Norwegian wind farm dataset that the proposed feature importance statistic yields statistically significant improvements relative to similar feature importance measures in both variable selection power and predictive performance on a real world dataset.

Submitted 6 January, 2020; originally announced January 2020.

arXiv:1910.09544 [pdf, other]

Relative Net Utility and the Saint Petersburg Paradox

Authors: Daniel Muller, Tshilidzi Marwala

Abstract: The famous Saint Petersburg Paradox (St. Petersburg Paradox) shows that the theory of expected value does not capture the real-world economics of decision-making problems. Over the years, many economic theories were developed to resolve the paradox and explain gaps in the economic value theory in the evaluation of economic decisions, the subjective utility of the expected outcomes, and risk aversion as observed in the game of the St. Petersburg Paradox. In this paper, we use the concept of the relative net utility to resolve the St. Petersburg Paradox. Because the net utility concept is able to explain both behavioral economics and the St. Petersburg Paradox, it is deemed to be a universal approach to handling utility. This paper shows how the information content of the notion of net utility value allows us to capture a broader context of the impact of a decision's possible achievements. It discusses the necessary conditions that the utility function has to conform to in order to avoid the paradox. Combining these necessary conditions allows us to define the theorem of indifference in the evaluation of economic decisions and to present the role of the relative net utility and net utility polarity in a value rational decision-making process.
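For readers unfamiliar with the paradox named in this abstract, the following minimal sketch illustrates the textbook version of the game, not the paper's relative net utility method. It assumes the usual convention that the payoff is 2^k when the first head appears on toss k (probability 2^-k), and that a game capped at `max_rounds` tosses pays nothing if no head appears. The function names are hypothetical, and the logarithmic utility is Bernoulli's classical resolution, used here only as a point of comparison.

```python
import math


def truncated_expected_payoff(max_rounds: int) -> float:
    """Expected payoff of the game capped at max_rounds tosses.

    Each round k contributes (1/2**k) * 2**k = 1, so the sum grows
    linearly with the cap and diverges as the cap is removed.
    """
    return sum((0.5 ** k) * (2 ** k) for k in range(1, max_rounds + 1))


def truncated_expected_log_utility(max_rounds: int) -> float:
    """Expected log-utility of the capped game (Bernoulli's classical fix).

    The terms k * ln(2) / 2**k shrink quickly, so this sum converges
    (to 2 * ln 2) even though the expected payoff does not.
    """
    return sum((0.5 ** k) * math.log(2 ** k) for k in range(1, max_rounds + 1))


if __name__ == "__main__":
    for cap in (10, 20, 40):
        print(cap, truncated_expected_payoff(cap), truncated_expected_log_utility(cap))
```

The expected payoff equals the cap itself, which is why a pure expected-value rule would value the uncapped game without bound, while any utility function whose weighted terms are summable yields a finite valuation.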

Submitted 18 May, 2020; v1 submitted 21 October, 2019; originally announced October 2019.

Comments: extension of the discussion about the paradox, additional examples, and proofreading

arXiv:1906.06382 [pdf, other]

Automatic Relevance Determination Bayesian Neural Networks for Credit Card Default Modelling

Authors: Rendani Mbuvha, Illyes Boulkaibet, Tshilidzi Marwala

Abstract: Credit risk modelling is an integral part of the global financial system. While there has been great attention paid to neural network models for credit default prediction, such models often lack the required interpretation mechanisms and measures of the uncertainty around their predictions. This work develops and compares Bayesian Neural Networks (BNNs) for credit card default modelling. This includes BNNs trained by Gaussian approximation and the first implementation of BNNs trained by Hybrid Monte Carlo (HMC) in credit risk modelling. The results on the Taiwan Credit Dataset show that BNNs with Automatic Relevance Determination (ARD) outperform normal BNNs without ARD. The results also show that BNNs trained by Gaussian approximation display similar predictive performance to those trained by HMC. The results further show that BNNs with ARD can be used to draw inferences about the relative importance of different features, thus critically aiding decision makers in explaining model output to consumers. The robustness of this result is reinforced by high levels of congruence between the features identified as important using the two different approaches for training BNNs.

Submitted 14 June, 2019; originally announced June 2019.

arXiv:1902.04832 [pdf]

Relative rationality: Is machine rationality subjective?

Authors: Tshilidzi Marwala

Abstract: Rational decision making in its linguistic description means making logical decisions. In essence, a rational agent optimally processes all relevant information to achieve its goal. Rationality has two elements: the use of relevant information and the efficient processing of such information. In reality, relevant information is incomplete and imperfect, and the processing engine, which is a brain for humans, is suboptimal. Humans are risk averse rather than utility maximizers. In the real world, problems are predominantly non-convex, and this makes the idea of rational decision-making fundamentally unachievable; Herbert Simon called this bounded rationality. There is a trade-off between the amount of information used for decision-making and the complexity of the decision model used. This paper explores whether machine rationality is subjective and concludes that indeed it is.

Submitted 13 February, 2019; originally announced February 2019.

arXiv:1812.10144 [pdf]

Can rationality be measured?

Authors: Tshilidzi Marwala

Abstract: This paper studies whether rationality can be computed. Rationality is defined as the use of complete information, which is processed with a perfect biological or physical brain, in an optimized fashion. To compute rationality one needs to quantify how complete is the information, how perfect is the physical or biological brain and how optimized is the entire decision making system. The rationality of a model (i.e. physical or biological brain) is measured by the expected accuracy of the model. The rationality of the optimization procedure is measured as the ratio of the achieved objective (i.e. utility) to the global objective. The overall rationality of a decision is measured as the product of the rationality of the model and the rationality of the optimization procedure. The conclusion reached is that rationality can be computed for convex optimization problems.
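The measure described in this abstract can be written compactly as follows. This is only a sketch of the verbal definition: the symbols R_model, R_opt, U_achieved and U_global are notation assumed here for illustration, not taken from the paper, and the numbers in the worked line are made up.

```latex
\[
  R_{\text{model}} = \mathbb{E}[\text{accuracy}], \qquad
  R_{\text{opt}} = \frac{U_{\text{achieved}}}{U_{\text{global}}}, \qquad
  R = R_{\text{model}} \cdot R_{\text{opt}} .
\]
% Illustrative numbers only: a model with expected accuracy 0.9 whose
% optimizer reaches a utility of 80 when the global optimum is 100 gives
\[
  R = 0.9 \times \frac{80}{100} = 0.72 .
\]
```

Under this reading, evaluating R requires knowing the global objective, which is consistent with the abstract's conclusion that the measure is computable for convex problems, where the global optimum can be found.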

Submitted 25 December, 2018; originally announced December 2018.

arXiv:1812.06510 [pdf]

The limit of artificial intelligence: Can machines be rational?

Authors: Tshilidzi Marwala

Abstract: This paper studies the question of whether machines can be rational. It observes the existing reasons why humans are not rational, which are imperfect and limited information, limited and inconsistent processing power through the brain, and the inability to optimize decisions and achieve maximum utility. It studies whether these limitations of humans are transferred to the limitations of machines. The conclusion reached is that even though machines are not rational, advances in technological developments make these machines more rational. It also concludes that machines can be more rational than humans.

Submitted 16 December, 2018; originally announced December 2018.

arXiv:1808.01666 [pdf]

On Robot Revolution and Taxation

Authors: Tshilidzi Marwala

Abstract: Advances in artificial intelligence are resulting in the rapid automation of the work force. The tools that are used to automate are called robots. Bill Gates proposed that in order to deal with the problem of the loss of jobs and reduction of the tax revenue we ought to tax the robots. The problem with taxing the robots is that it is not easy to know what a robot is. This article studies the definition of a robot and the implication of advances in robotics on taxation. It is evident from this article that it is a difficult task to establish what a robot is and what is not a robot. It concludes that taxing robots is the same as increasing corporate tax.

Submitted 5 August, 2018; originally announced August 2018.

arXiv:1807.08195 [pdf]

Creativity and Artificial Intelligence: A Digital Art Perspective

Authors: Bo Xing, Tshilidzi Marwala

Abstract: This paper describes the application of artificial intelligence to the creation of digital art. AI is a computational paradigm that codifies intelligence into machines. There are generally three types of artificial intelligence: machine learning, evolutionary programming and soft computing. Machine learning is the statistical approach to building intelligent systems. Evolutionary programming is the use of natural evolutionary systems to design intelligent machines. Evolutionary programming systems include the genetic algorithm, which is inspired by the principles of evolution, and swarm optimization, which is inspired by the swarming of birds, fish, ants etc. Soft computing includes techniques such as agent based modelling and fuzzy logic. Opportunities for applying these to digital art are explored.

Submitted 21 July, 2018; originally announced July 2018.

Comments: 5 pages

arXiv:1804.06159 [pdf, other]

doi 10.1016/j.cnsns.2018.07.008

Precise Detection of Speech Endpoints Dynamically: A Wavelet Convolution based approach

Authors: Tanmoy Roy, Tshilidzi Marwala, Snehashish Chakraverty

Abstract: Precise detection of speech endpoints is an important factor which affects the performance of systems where speech utterances need to be extracted from the speech signal, such as Automatic Speech Recognition (ASR) systems. Existing endpoint detection (EPD) methods mostly use Short-Term Energy (STE) and Zero-Crossing Rate (ZCR) based approaches and their variants. But STE and ZCR based EPD algorithms often fail in the presence of Non-speech Sound Artifacts (NSAs) produced by the speakers. Algorithms based on pattern recognition and classification techniques have also been proposed but require labeled data for training. A new algorithm termed Wavelet Convolution based Speech Endpoint Detection (WCSEPD) is proposed in this article to extract speech endpoints. WCSEPD decomposes the speech signal into high-frequency and low-frequency components using wavelet convolution and computes entropy based thresholds for the two frequency components. The low-frequency thresholds are used to extract voiced speech segments, whereas the high-frequency thresholds are used to extract the unvoiced speech segments by filtering out the NSAs. WCSEPD does not require any labeled data for training and can automatically extract speech segments. Experiment results show that the proposed algorithm precisely extracts speech endpoints in the presence of NSAs.

Submitted 17 April, 2018; originally announced April 2018.

Comments: 25 Pages

arXiv:1802.04451 [pdf]

Blockchain and Artificial Intelligence

Authors: Tshilidzi Marwala, Bo Xing

Abstract: It is undeniable that artificial intelligence (AI) and blockchain concepts are spreading at a phenomenal rate. Both technologies have distinct degrees of technological complexity and multi-dimensional business implications. However, a common misunderstanding about the blockchain concept, in particular, is that blockchain is decentralized and is not controlled by anyone. But the underlying development of a blockchain system is still attributed to a cluster of core developers. Take a smart contract as an example: it is essentially a collection of codes (or functions) and data (or states) that are programmed and deployed on a blockchain (say, Ethereum) by different human programmers. It is thus, unfortunately, less likely to be free of loopholes and flaws. In this article, through a brief overview of how artificial intelligence could be used to deliver bug-free smart contracts so as to achieve the goal of blockchain 2.0, we emphasize that the blockchain implementation can be assisted or enhanced via various AI techniques. The alliance of AI and blockchain is expected to create numerous possibilities.

Submitted 23 October, 2018; v1 submitted 12 February, 2018; originally announced February 2018.

arXiv:1711.00462 [pdf, other]

Early prediction of the duration of protests using probabilistic Latent Dirichlet Allocation and Decision Trees

Authors: Satyakama Paul, Madhur Hasija, Tshilidzi Marwala

Abstract: Protests and agitations are an integral part of every democratic civil society. In recent years, South Africa has seen a large increase in its protests. The objective of this paper is to provide an early prediction of the duration of protests from their free flowing English text descriptions. Free flowing descriptions of the protests help us in capturing their various nuances such as multiple causes, courses of actions etc. Next we use a combination of unsupervised learning (topic modeling) and supervised learning (decision trees) to predict the duration of the protests. Our results show a high degree (close to 90%) of accuracy in early prediction of the duration of protests. We expect the work to help police and other security services in planning and managing their resources to better handle protests in future.

Submitted 18 September, 2017; originally announced November 2017.

Comments: This paper is to appear in the 4th IEEE Latin American Conference on Computational Intelligence LA-CCI. This paper was written by Satyakama and Madhur and supervised by Tshilidzi

arXiv:1710.09486 [pdf]

A Differential Evaluation Markov Chain Monte Carlo algorithm for Bayesian Model Updating

Authors: M. Sherri, I. Boulkaibet, T. Marwala, M. I. Friswell

Abstract: The use of Bayesian tools in system identification and model updating paradigms has increased in the last ten years. Usually, the Bayesian techniques can be implemented to incorporate the uncertainties associated with measurements as well as the prediction made by the finite element model (FEM) into the FEM updating procedure. In this case, the posterior distribution function describes the uncertainty in the FE model prediction and the experimental data. Due to the complexity of the modeled systems, the analytical solution for the posterior distribution function may not exist. This leads to the use of numerical methods, such as Markov Chain Monte Carlo techniques, to obtain approximate solutions for the posterior distribution function. In this paper, a Differential Evaluation Markov Chain Monte Carlo (DE-MC) method is used to approximate the posterior function and update FEMs. The main idea of the DE-MC approach is to combine the Differential Evolution, which is an effective global optimization algorithm over real parameter space, with Markov Chain Monte Carlo (MCMC) techniques to generate samples from the posterior distribution function. In this paper, the DE-MC method is discussed in detail while the performance and the accuracy of this algorithm are investigated by updating two structural examples.

Submitted 25 October, 2017; originally announced October 2017.

Comments: To be published in the IMAC XXXVI, Florida, USA

arXiv:1703.10098 [pdf]

Rational Choice and Artificial Intelligence

Authors: Tshilidzi Marwala

Abstract: The theory of rational choice assumes that when people make decisions they do so in order to maximize their utility. In order to achieve this goal they ought to use all the information available and consider all the choices available to choose an optimal choice. This paper investigates what happens when decisions are made by artificially intelligent machines in the market rather than human beings. Firstly, the expectations of the future are more consistent if they are made by an artificially intelligent machine and the decisions are more rational and thus marketplace becomes more rational.

Submitted 29 March, 2017; originally announced March 2017.

arXiv:1703.09643 [pdf]

Implications of the Fourth Industrial Age on Higher Education

Authors: Bo Xing, Tshilidzi Marwala

Abstract: Higher education in the fourth industrial revolution, HE 4.0, is a complex, dialectical and exciting opportunity which can potentially transform society for the better. The fourth industrial revolution is powered by artificial intelligence and it will transform the workplace from tasks based characteristics to the human centred characteristics. Because of the convergence of man and machine, it will reduce the subject distance between humanities and social science as well as science and technology. This will necessarily require much more interdisciplinary teaching, research and innovation. This paper explores the impact of HE 4.0 on the mission of a university which is teaching, research (including innovation) and service.

Submitted 17 March, 2017; originally announced March 2017.

Comments: Submitted to The Thinker

arXiv:1703.06597 [pdf]

Artificial Intelligence and Economic Theories

Authors: Tshilidzi Marwala, Evan Hurwitz

Abstract: The advent of artificial intelligence has changed many disciplines such as engineering, social science and economics. Artificial intelligence is a computational technique which is inspired by natural intelligence such as the swarming of birds, the working of the brain and the pathfinding of the ants. These techniques have impact on economic theories. This book studies the impact of artificial intelligence on economic theories, a subject that has not been extensively studied. The theories that are considered are: demand and supply, asymmetrical information, pricing, rational choice, rational expectation, game theory, efficient market hypotheses, mechanism design, prospect, bounded rationality, portfolio theory, rational counterfactual and causality. The benefit of this book is that it evaluates existing theories of economics and update them based on the developments in artificial intelligence field.

Submitted 20 March, 2017; originally announced March 2017.

Comments: Marwala, T. and Hurwitz, E. (2017) Artificial Intelligence and Economic Theory: Skynet in the Market. Springer. (Accepted)

arXiv:1701.00833 [pdf]

Fuzzy finite element model updating using metaheuristic optimization algorithms

Authors: I. Boulkaibet, T. Marwala, M. I. Friswell, H. Haddad Khodaparast, S. Adhikari

Abstract: In this paper, a non-probabilistic method based on fuzzy logic is used to update finite element models (FEMs). Model updating techniques use the measured data to improve the accuracy of numerical models of structures. However, the measured data are contaminated with experimental noise and the models are inaccurate due to randomness in the parameters. This kind of aleatory uncertainty is irreducible, and may decrease the accuracy of the finite element model updating process. However, uncertainty quantification methods can be used to identify the uncertainty in the updating parameters. In this paper, the uncertainties associated with the modal parameters are defined as fuzzy membership functions, while the model updating procedure is defined as an optimization problem at each α-cut level. To determine the membership functions of the updated parameters, an objective function is defined and minimized using two metaheuristic optimization algorithms: ant colony optimization (ACO) and particle swarm optimization (PSO). A structural example is used to investigate the accuracy of the fuzzy model updating strategy using the PSO and ACO algorithms. Furthermore, the results obtained by the fuzzy finite element model updating are compared with the Bayesian model updating results.

Submitted 3 January, 2017; originally announced January 2017.

Comments: This article was accepted by the 2017 International Modal Analysis Conference

arXiv:1607.00136 [pdf, other]

Missing Data Estimation in High-Dimensional Datasets: A Swarm Intelligence-Deep Neural Network Approach

Authors: Collins Leke, Tshilidzi Marwala

Abstract: In this paper, we examine the problem of missing data in high-dimensional datasets by taking into consideration the Missing Completely at Random and Missing at Random mechanisms, as well as the Arbitrary missing pattern. Additionally, this paper employs a methodology based on Deep Learning and Swarm Intelligence algorithms in order to provide reliable estimates for missing data. The deep learning technique is used to extract features from the input data via an unsupervised learning approach by modeling the data distribution based on the input. This deep learning technique is then used as part of the objective function for the swarm intelligence technique in order to estimate the missing data after a supervised fine-tuning phase by minimizing an error function based on the interrelationship and correlation between features in the dataset. The investigated methodology in this paper therefore has longer running times, however, the promising potential outcomes justify the trade-off. Also, basic knowledge of statistics is presumed.

Submitted 1 July, 2016; originally announced July 2016.

Comments: 12 pages, 3 figures

arXiv:1512.01362 [pdf, other]

Proposition of a Theoretical Model for Missing Data Imputation using Deep Learning and Evolutionary Algorithms

Authors: Collins Leke, Tshilidzi Marwala, Satyakama Paul

Abstract: In the last couple of decades, there has been major advancements in the domain of missing data imputation. The techniques in the domain include amongst others: Expectation Maximization, Neural Networks with Evolutionary Algorithms or optimization techniques and K-Nearest Neighbor approaches to solve the problem. The presence of missing data entries in databases render the tasks of decision-making and data analysis nontrivial. As a result this area has attracted a lot of research interest with the aim being to yield accurate and time efficient and sensitive missing data imputation techniques especially when time sensitive applications are concerned like power plants and winding processes. In this article, considering arbitrary and monotone missing data patterns, we hypothesize that the use of deep neural networks built using autoencoders and denoising autoencoders in conjunction with genetic algorithms, swarm intelligence and maximum likelihood estimator methods as novel data imputation techniques will lead to better imputed values than existing techniques. Also considered are the missing at random, missing completely at random and missing not at random missing data mechanisms. We also intend to use fuzzy logic in tandem with deep neural networks to perform the missing data imputation tasks, as well as different building blocks for the deep neural networks like Stacked Restricted Boltzmann Machines and Deep Belief Networks to test our hypothesis. The motivation behind this article is the need for missing data imputation techniques that lead to better imputed values than existing methods with higher accuracies and lower errors.

Submitted 4 December, 2015; originally announced December 2015.

Comments: 14 Pages, 4 figures, journal, experiments will be added testing the hypotheses

arXiv:1510.04632 [pdf]

Monte Carlo Dynamically Weighted Importance Sampling For Finite Element Model Updating

Authors: Daniel J Joubert, Tshilidzi Marwala

Abstract: The Finite Element Method (FEM) is generally unable to accurately predict natural frequencies and mode shapes of structures (eigenvalues and eigenvectors). Engineers develop numerical methods and a variety of techniques to compensate for this misalignment of modal properties, between experimentally measured data and the computed result from the FEM of structures. In this paper we compare two indirect methods of updating namely, the Adaptive Metropolis Hastings and a newly applied algorithm called Monte Carlo Dynamically Weighted Importance Sampling (MCDWIS). The approximation of a posterior predictive distribution is based on Bayesian inference of continuous multivariate Gaussian probability density functions, defining the variability of physical properties affected by forced vibration. The motivation behind applying MCDWIS is in the complexity of computing normalizing constants in higher dimensional or multimodal systems. The MCDWIS accounts for this intractability by analytically computing importance sampling estimates at each time step of the algorithm. In addition, a dynamic weighting step with an Adaptive Pruned Enriched Population Control Scheme (APEPCS) allows for further control over weighted samples and population size. The performance of the MCDWIS simulation is graphically illustrated for all algorithm dependent parameters and show unbiased, stable sample estimates.

Submitted 15 October, 2015; originally announced October 2015.

Comments: Submitted to the IMAC-XXXIV

arXiv:1510.02867 [pdf]

Artificial Intelligence and Asymmetric Information Theory

Authors: Tshilidzi Marwala, Evan Hurwitz

Abstract: When human agents come together to make decisions, it is often the case that one human agent has more information than the other. This phenomenon is called information asymmetry and this distorts the market. Often if one human agent intends to manipulate a decision in its favor the human agent can signal wrong or right information. Alternatively, one human agent can screen for information to reduce the impact of asymmetric information on decisions. With the advent of artificial intelligence, signaling and screening have been made easier. This paper studies the impact of artificial intelligence on the theory of asymmetric information. It is surmised that artificial intelligent agents reduce the degree of information asymmetry and thus the market where these agents are deployed become more efficient. It is also postulated that the more artificial intelligent agents there are deployed in the market the less is the volume of trades in the market. This is because for many trades to happen the asymmetry of information on goods and services to be traded should exist, creating a sense of arbitrage.

Submitted 14 October, 2015; v1 submitted 9 October, 2015; originally announced October 2015.

arXiv:1509.04904 [pdf, other]

Causal Model Analysis using Collider v-structure with Negative Percentage Mapping

Authors: Pramod Kumar Parida, Tshilidzi Marwala, Snehashish Chakraverty

Abstract: A major problem of causal inference is the arrangement of dependent nodes in a directed acyclic graph (DAG) with path coefficients and observed confounders. Path coefficients do not provide the units to measure the strength of information flowing from one node to the other. Here we proposed the method of causal structure learning using collider v-structures (CVS) with Negative Percentage Mapping (NPM) to get selective thresholds of information strength, to direct the edges and subjective confounders in a DAG. The NPM is used to scale the strength of information passed through nodes in units of percentage from interval from 0 to 1. The causal structures are constructed by bottom up approach using path coefficients, causal directions and confounders, derived implementing collider v-structure and NPM. The method is self-sufficient to observe all the latent confounders present in the causal model and capable of detecting every responsible causal direction. The results are tested for simulated datasets of non-Gaussian distributions and compared with DirectLiNGAM and ICA-LiNGAM to check efficiency of the proposed method.

Submitted 16 September, 2015; originally announced September 2015.

arXiv:1509.01213 [pdf]

Impact of Artificial Intelligence on Economic Theory

Authors: Tshilidzi Marwala

Abstract: Artificial intelligence has impacted many aspects of human life. This paper studies the impact of artificial intelligence on economic theory. In particular we study the impact of artificial intelligence on the theory of bounded rationality, efficient market hypothesis and prospect theory.

Submitted 1 July, 2015; originally announced September 2015.

arXiv:1404.2116 [pdf]

Rational Counterfactuals

Authors: Tshilidzi Marwala

Abstract: This paper introduces the concept of rational countefactuals which is an idea of identifying a counterfactual from the factual (whether perceived or real) that maximizes the attainment of the desired consequent. In counterfactual thinking if we have a factual statement like: Saddam Hussein invaded Kuwait and consequently George Bush declared war on Iraq then its counterfactuals is: If Saddam Hussein did not invade Kuwait then George Bush would not have declared war on Iraq. The theory of rational counterfactuals is applied to identify the antecedent that gives the desired consequent necessary for rational decision making. The rational countefactual theory is applied to identify the values of variables Allies, Contingency, Distance, Major Power, Capability, Democracy, as well as Economic Interdependency that gives the desired consequent Peace.

Submitted 8 April, 2014; originally announced April 2014.

Comments: To appear in Artificial Intelligence for Rational Decision Making (Springer-Verlag)

arXiv:1403.5488 [pdf]

Missing Data Prediction and Classification: The Use of Auto-Associative Neural Networks and Optimization Algorithms

Authors: Collins Leke, Bhekisipho Twala, T. Marwala

Abstract: This paper presents methods which are aimed at finding approximations to missing data in a dataset by using optimization algorithms to optimize the network parameters after which prediction and classification tasks can be performed. The optimization methods that are considered are genetic algorithm (GA), simulated annealing (SA), particle swarm optimization (PSO), random forest (RF) and negative selection (NS) and these methods are individually used in combination with auto-associative neural networks (AANN) for missing data estimation and the results obtained are compared. The methods suggested use the optimization algorithms to minimize an error function derived from training the auto-associative neural network during which the interrelationships between the inputs and the outputs are obtained and stored in the weights connecting the different layers of the network. The error function is expressed as the square of the difference between the actual observations and predicted values from an auto-associative neural network. In the event of missing data, all the values of the actual observations are not known hence, the error function is decomposed to depend on the known and unknown variable values. Multi-layer perceptron (MLP) neural network is employed to train the neural networks using the scaled conjugate gradient (SCG) method. Prediction accuracy is determined by mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), and correlation coefficient (r) computations. Accuracy in classification is obtained by plotting ROC curves and calculating the areas under these. Analysis of results depicts that the approach using RF with AANN produces the most accurate predictions and classifications while on the other end of the scale is the approach which entails using NS with AANN.

Submitted 21 March, 2014; originally announced March 2014.

arXiv:1308.2309 [pdf]

Applying the Negative Selection Algorithm for Merger and Acquisition Target Identification

Authors: Satyakama Paul, Andreas Janecek, Fernando Buarque de Lima Neto, Tshilidzi Marwala

Abstract: In this paper, we propose a new methodology based on the Negative Selection Algorithm that belongs to the field of Computational Intelligence, specifically, Artificial Immune Systems to identify takeover targets. Although considerable research based on customary statistical techniques and some contemporary Computational Intelligence techniques have been devoted to identify takeover targets, most of the existing studies are based upon multiple previous mergers and acquisitions. Contrary to previous research, the novelty of this proposal lies in its ability to suggest takeover targets for novice firms that are at the beginning of their merger and acquisition spree. We first discuss the theoretical perspective and then provide a case study with details for practical implementation, both capitalizing from unique generalization capabilities of artificial immune systems algorithms.

Submitted 10 August, 2013; originally announced August 2013.

Comments: To appear in the proceedings of the 1st BRICS Countries & 11th CBIC Brazilian Congress on Computational Intelligence

arXiv:1308.2307 [pdf]

Finite Element Model Updating Using Fish School Search Optimization Method

Authors: I. Boulkabeit, L. Mthembu, T. Marwala, F. De Lima Neto

Abstract: A recent nature inspired optimization algorithm, Fish School Search (FSS) is applied to the finite element model (FEM) updating problem. This method is tested on a GARTEUR SM-AG19 aeroplane structure. The results of this algorithm are compared with two other metaheuristic algorithms; Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). It is observed that on average, the FSS and PSO algorithms give more accurate results than the GA. A minor modification to the FSS is proposed. This modification improves the performance of FSS on the FEM updating problem which has a constrained search space.

Submitted 10 August, 2013; originally announced August 2013.

Comments: To appear in the 1st BRICS Countries & 11th CBIC Brazilian Congress on Computational Intelligence

arXiv:1306.2025 [pdf]

Flexibly-bounded Rationality and Marginalization of Irrationality Theories for Decision Making

Authors: Tshilidzi Marwala

Abstract: In this paper the theory of flexibly-bounded rationality which is an extension to the theory of bounded rationality is revisited. Rational decision making involves using information which is almost always imperfect and incomplete together with some intelligent machine which if it is a human being is inconsistent to make decisions. In bounded rationality, this decision is made irrespective of the fact that the information to be used is incomplete and imperfect and that the human brain is inconsistent and thus this decision that is to be made is taken within the bounds of these limitations. In the theory of flexibly-bounded rationality, advanced information analysis is used, the correlation machine is applied to complete missing information and artificial intelligence is used to make more consistent decisions. Therefore flexibly-bounded rationality expands the bounds within which rationality is exercised. Because human decision making is essentially irrational, this paper proposes the theory of marginalization of irrationality in decision making to deal with the problem of satisficing in the presence of irrationality.

Submitted 9 June, 2013; originally announced June 2013.

Comments: 17 pages, submitted to Springer-Verlag. arXiv admin note: substantial text overlap with arXiv:1305.6037

arXiv:1305.6037 [pdf]

Semi-bounded Rationality: A model for decision making

Authors: Tshilidzi Marwala

Abstract: In this paper the theory of semi-bounded rationality is proposed as an extension of the theory of bounded rationality. In particular, it is proposed that a decision making process involves two components and these are the correlation machine, which estimates missing values, and the causal machine, which relates the cause to the effect. Rational decision making involves using information which is almost always imperfect and incomplete as well as some intelligent machine which if it is a human being is inconsistent to make decisions. In the theory of bounded rationality this decision is made irrespective of the fact that the information to be used is incomplete and imperfect and the human brain is inconsistent and thus this decision that is to be made is taken within the bounds of these limitations. In the theory of semi-bounded rationality, signal processing is used to filter noise and outliers in the information and the correlation machine is applied to complete the missing information and artificial intelligence is used to make more consistent decisions.

Submitted 26 May, 2013; originally announced May 2013.

arXiv:1208.4429 [pdf]

Common Mistakes when Applying Computational Intelligence and Machine Learning to Stock Market modelling

Authors: E. Hurwitz, T. Marwala

Abstract: For a number of reasons, computational intelligence and machine learning methods have been largely dismissed by the professional community. The reasons for this are numerous and varied, but inevitably amongst the reasons given is that the systems designed often do not perform as expected by their designers. The reasons for this lack of performance is a direct result of mistakes that are commonly seen in market-prediction systems. This paper examines some of the more common mistakes, namely dataset insufficiency; inappropriate scaling; time-series tracking; inappropriate target quantification and inappropriate measures of performance. The rationale that leads to each of these mistakes is examined, as well as the nature of the errors they introduce to the analysis / design. Alternative ways of performing each task are also recommended in order to avoid perpetuating these mistakes, and hopefully to aid in clearing the way for the use of these powerful techniques in industry.

Submitted 22 August, 2012; originally announced August 2012.

Comments: 5 pages

arXiv:1206.0908 [pdf]

Soft Computing in Product Recovery: A Survey Focusing on Remanufacturing System

Authors: Bo Xing, Wen-Jing Gao, Fulufhelo V. Nelwamondo, Kimberly Battle, Tshilidzi Marwala

Abstract: This paper focuses on the application of soft computing in remanufacturing system, in which end-of-life products are disassembled into basic components and then remanufactured for both economic and environmental reasons. The disassembly activities include disassembly sequencing and planning, while the remanufacturing process is composed of product design, production planning & scheduling, and inventory management. This paper presents a review of the related articles and suggests the corresponding further research directions.

Submitted 5 June, 2012; originally announced June 2012.

arXiv:1110.4296 [pdf]

Organizational adaptation to Complexity: A study of the South African Insurance Market as a Complex Adaptive System through Statistical Risk Analysis

Authors: Satyakama Paul, Bhekisipho Twala, Tshilidzi Marwala

Abstract: South Africa assumes a significant position in the insurance landscape of Africa. The present research based upon qualitative and quantitative analysis, shows that it shows the characteristics of a Complex Adaptive System. In addition, a statistical analysis of risk measures through Value at risk and Conditional tail expectation is carried out to show how an individual insurance company copes under external complexities. The authors believe that an explanation of the coping strategies, and the subsequent managerial implications would enrich our understanding of complexity in business.

Submitted 19 October, 2011; originally announced October 2011.

Comments: Paper Presented The 2nd International Conference on Complexity Science Management & Intelligent Information System will be held in October 14, 2011

arXiv:1110.3385 [pdf]

Fuzzy Inference Systems Optimization

Authors: Pretesh Patel, Tshilidzi Marwala

Abstract: This paper compares various optimization methods for fuzzy inference system optimization. The optimization methods compared are genetic algorithm, particle swarm optimization and simulated annealing. When these techniques were implemented it was observed that the performance of each technique within the fuzzy inference system classification was context dependent.

Submitted 15 October, 2011; originally announced October 2011.

Comments: Paper Submitted to INTECH

arXiv:1110.3383 [pdf]

Suitability of using technical indicators as potential strategies within intelligent trading systems

Authors: Evan Hurwitz, Tshilidzi Marwala

Abstract: The potential of machine learning to automate and control nonlinear, complex systems is well established. These same techniques have always presented potential for use in the investment arena, specifically for the managing of equity portfolios. In this paper, the opportunity for such exploitation is investigated through analysis of potential simple trading strategies that can then be meshed together for the machine learning system to switch between. It is the eligibility of these strategies that is being investigated in this paper, rather than application. In order to accomplish this, the underlying assumptions of each trading system are explored, and data is created in order to evaluate the efficacy of these systems when trading on data with the underlying patterns that they expect. The strategies are tested against a buy-and-hold strategy to determine whether the act of trading has actually produced any worthwhile results or whether these are simply facets of the underlying prices. These results are then used to produce targeted returns based upon either a desired return or a desired risk, as both are required within the portfolio-management industry. Results show a very viable opportunity for exploitation within the aforementioned industry, with the strategies performing well within their narrow assumptions, and the intelligent system combining them to perform without assumptions.

Submitted 15 October, 2011; originally announced October 2011.

Comments: Paper Presented at the 2011 IEEE International Conference on Systems, Man, and Cybernetics, Anchorage, Alaska, USA

arXiv:1110.3382 [pdf]

Sampling Techniques in Bayesian Finite Element Model Updating

Authors: I. Boulkaibet, T. Marwala, L. Mthembu, M. I. Friswell, S. Adhikari

Abstract: Recent papers in the field of Finite Element Model (FEM) updating have highlighted the benefits of Bayesian techniques. The Bayesian approaches are designed to deal with the uncertainties associated with complex systems, which is the main problem in the development and updating of FEMs. This paper highlights the complexities and challenges of implementing any Bayesian method when the analysis involves a complicated structural dynamic model. In such systems a Bayesian formulation might not be available in analytic form; this leads to the use of numerical methods, i.e. sampling methods. The main challenge then is to determine an efficient sampling of the model parameter space. In this paper, three sampling techniques, the Metropolis-Hastings (MH) algorithm, Slice Sampling and the Hybrid Monte Carlo (HMC) technique, are tested by updating a structural beam model. The efficiency and limitations of each technique are investigated when the FEM updating problem is implemented using the Bayesian approach. Both the MH and HMC techniques are found to perform better than Slice Sampling when Young's modulus is chosen as the updating parameter. The HMC method gives better results than the MH and Slice Sampling techniques when the area moments of inertia and section areas are updated.

Submitted 15 October, 2011; originally announced October 2011.

Comments: Paper Accepted in the 25th International Modal Analysis Conference, 2012

arXiv:1108.5250 [pdf]

doi 10.1109/IEMBS.2011.6091552

Single-trial EEG Discrimination between Wrist and Finger Movement Imagery and Execution in a Sensorimotor BCI

Authors: A. K. Mohamed, T. Marwala, L. R. John

Abstract: A brain-computer interface (BCI) may be used to control a prosthetic or orthotic hand using neural activity from the brain. The core of this sensorimotor BCI lies in the interpretation of the neural information extracted from electroencephalogram (EEG). It is desired to improve on the interpretation of EEG to allow people with neuromuscular disorders to perform daily activities. This paper investigates the possibility of discriminating between the EEG associated with wrist and finger movements. The EEG was recorded from test subjects as they executed and imagined five essential hand movements using both hands. Independent component analysis (ICA) and time-frequency techniques were used to extract spectral features based on event-related (de)synchronisation (ERD/ERS), while the Bhattacharyya distance (BD) was used for feature reduction. Mahalanobis distance (MD) clustering and artificial neural networks (ANN) were used as classifiers and obtained average accuracies of 65% and 71% respectively. This shows that EEG discrimination between wrist and finger movements is possible. The research introduces a new combination of motor tasks to BCI research.

Submitted 26 August, 2011; originally announced August 2011.

Comments: 33rd Annual International IEEE EMBS Conference 2011

arXiv:1108.4618 [pdf]

Artificial Neural Network and Rough Set for HV Bushings Condition Monitoring

Authors: LJ Mpanza, T. Marwala

Abstract: Most transformer failures are attributed to bushing failures. Hence it is necessary to monitor the condition of bushings. In this paper three methods are developed to monitor the condition of oil-filled bushings. Multi-layer perceptron (MLP), Radial basis function (RBF) and Rough Set (RS) models are developed and combined through majority voting to form a committee. The MLP performs better than the RBF and the RS in terms of classification accuracy. The RBF is the fastest to train. The committee performs better than the individual models. The diversity of models is measured to evaluate their similarity when used in the committee.

Submitted 23 August, 2011; originally announced August 2011.

Comments: IEEE INES 2011

arXiv:1108.4551 [pdf]

Improving the performance of the ripper in insurance risk classification : A comparitive study using feature selection

Authors: Mlungisi Duma, Bhekisipho Twala, Tshilidzi Marwala

Abstract: The Ripper algorithm is designed to generate rule sets for large datasets with many features. However, it was shown that the algorithm struggles with classification performance in the presence of missing data. The algorithm struggles to classify instances when the quality of the data deteriorates as a result of increasing missing data. In this paper, a feature selection technique is used to help improve the classification performance of the Ripper model. Principal component analysis and evidence automatic relevance determination techniques are used to improve the performance. A comparison is done to see which technique helps the algorithm improve the most. Training datasets with completely observable data were used to construct the model and testing datasets with missing values were used for measuring accuracy. The results showed that principal component analysis is the better feature selection technique for improving the classification performance of the Ripper.

Submitted 23 August, 2011; originally announced August 2011.

Comments: ICINCO 2011: 8th International Conference on Informatics in Control, Automation and Robotics

arXiv:1108.4548 [pdf]

Ant Colony Optimization of Rough Set for HV Bushings Fault Detection

Authors: J. L. Mpanza, T. Marwala

Abstract: Most transformer failures are attributed to bushing failures. Hence it is necessary to monitor the condition of bushings. In this paper three methods are developed to monitor the condition of oil-filled bushings. Multi-layer perceptron (MLP), Radial basis function (RBF) and Rough Set (RS) models are developed and combined through majority voting to form a committee. The MLP performs better than the RBF and the RS in terms of classification accuracy. The RBF is the fastest to train. The committee performs better than the individual models. The diversity of models is measured to evaluate their similarity when used in the committee.

Submitted 23 August, 2011; originally announced August 2011.

Comments: Fourth International Workshop on Advanced Computational Intelligence (IWACI 2011)

arXiv:1108.4545 [pdf]

The fuzzy gene filter: A classifier performance assesment

Authors: Meir Perez, Tshilidzi Marwala

Abstract: The Fuzzy Gene Filter (FGF) is an optimised Fuzzy Inference System designed to rank genes in order of differential expression, based on expression data generated in a microarray experiment. This paper examines the effectiveness of the FGF for feature selection using various classification architectures. The FGF is compared to three of the most common gene ranking algorithms: t-test, Wilcoxon test and ROC curve analysis. Four classification schemes are used to compare the performance of the FGF vis-a-vis the standard approaches: K Nearest Neighbour (KNN), Support Vector Machine (SVM), Naive Bayesian Classifier (NBC) and Artificial Neural Network (ANN). A nested stratified Leave-One-Out Cross Validation scheme is used to identify the optimal number of top-ranking genes, as well as the optimal classifier parameters. Two microarray data sets are used for the comparison: a prostate cancer data set and a lymphoma data set.

Submitted 23 August, 2011; originally announced August 2011.

Comments: Intelligent Systems and Control / 742: Computational Bioscience (ISC 2011) July 11 - 13, 2011 Cambridge, United Kingdom Editor(s): J.F. Whidborne, P. Willis, G. Montana

arXiv:1012.4046 [pdf]

Artificial Intelligence in Reverse Supply Chain Management: The State of the Art

Authors: Bo Xing, Wen-Jing Gao, Kimberly Battle, Tshildzi Marwala, Fulufhelo V. Nelwamondo

Abstract: Product take-back legislation forces manufacturers to bear the costs of collection and disposal of products that have reached the end of their useful lives. In order to reduce these costs, manufacturers can consider reuse, remanufacturing and/or recycling of components as an alternative to disposal. The implementation of such alternatives usually requires appropriate reverse supply chain management. With the concepts of reverse supply chain gaining popularity in practice, the use of artificial intelligence approaches in these areas is also becoming popular. As a result, the purpose of this paper is to give an overview of the recent publications concerning the application of artificial intelligence techniques to reverse supply chain, with emphasis on certain types of product returns.

Submitted 17 December, 2010; originally announced December 2010.

Comments: Proceedings of the Twenty-First Annual Symposium of the Pattern Recognition Association of South Africa 22-23 November 2010 Stellenbosch, South Africa, pp. 305-310

arXiv:1012.4045 [pdf]

Application of Global and One-Dimensional Local Optimization to Operating System Scheduler Tuning

Authors: George Anderson, Tshilidzi Marwala, Fulufhelo Vincent Nelwamondo

Abstract: This paper describes a comparative study of global and one-dimensional local optimization methods applied to operating system scheduler tuning. The operating system scheduler we use is the Linux 2.6.23 Completely Fair Scheduler (CFS) running in a simulator (LinSched). We have ported the Hackbench scheduler benchmark to this simulator and use this as the workload. The global optimization approach we use is Particle Swarm Optimization (PSO). We make use of Response Surface Methodology (RSM) to specify optimal parameters for our PSO implementation. The one-dimensional local optimization approach we use is the Golden Section method. In order to use this approach, we convert the scheduler tuning problem from one involving setting of three parameters to one involving the manipulation of one parameter. Our results show that the global optimization approach yields better response but the one-dimensional optimization approach converges to a solution faster than the global optimization approach.

Submitted 17 December, 2010; originally announced December 2010.

Comments: Proceedings of the Twenty-First Annual Symposium of the Pattern Recognition Association of South Africa 22-23 November 2010 Stellenbosch, South Africa, pp. 7-11

arXiv:1011.1735 [pdf]

Use of Data Mining in Scheduler Optimization

Authors: George Anderson, Tshilidzi Marwala, Fulufhelo V. Nelwamondo

Abstract: The operating system's role in a computer system is to manage the various resources. One of these resources is the Central Processing Unit. It is managed by a component of the operating system called the CPU scheduler. Schedulers are optimized for typical workloads expected to run on the platform. However, a single scheduler may not be appropriate for all workloads. That is, a scheduler may schedule a workload such that the completion time is minimized, but when another type of workload is run on the platform, scheduling and therefore completion time will not be optimal; a different scheduling algorithm, or a different set of parameters, may work better. Several approaches to solving this problem have been proposed. The objective of this survey is to summarize the approaches based on data mining, which are available in the literature. In addition to solutions that can be directly utilized for solving this problem, we are interested in data mining research in related areas that have potential for use in operating system scheduling. We also explain general technical issues involved in scheduling in modern computers, including parallel scheduling issues related to multi-core CPUs. We propose a taxonomy that classifies the scheduling approaches we discuss into different categories.

Submitted 8 November, 2010; originally announced November 2010.

Comments: 10 pages

arXiv:0910.2276 [pdf]

State of the Art Review for Applying Computational Intelligence and Machine Learning Techniques to Portfolio Optimisation

Authors: Evan Hurwitz, Tshilidzi Marwala

Abstract: Computational techniques have shown much promise in the field of Finance, owing to their ability to extract sense out of dauntingly complex systems. This paper reviews the most promising of these techniques, from traditional computational intelligence methods to their machine learning siblings, with a particular view to their application in optimising the management of a portfolio of financial instruments. The current state of the art is assessed, and prospective further work is assessed and recommended.

Submitted 13 October, 2009; originally announced October 2009.

Comments: 9 pages

Showing 1–50 of 108 results for author: Marwala, T