-
Modelling the Relationship Between Post Encroachment Time and Signal Timings Using UAV Video data
Authors:
Zubayer Islam,
Mohamed Abdel-Aty,
Amrita Goswamy,
Amr Abdelraouf,
Ou Zheng
Abstract:
Intersection safety often relies on the correct modelling of signal phasing and timing parameters. A slight increase in yellow time or red time can have a significant impact on rear-end crashes or conflicts. This paper aims to identify the relationship between surrogate safety measures and signal phasing. Unmanned Aerial Vehicle (UAV) video data has been used to study an intersection. Post Encroachment Time (PET) between vehicles was calculated from the video data, as well as speed, heading and relevant signal timing parameters such as all red time, red clearance time, yellow time, etc. A Random Parameter Ordered Logit Model was used to model the relationship between PET and these signal timing parameters. Overall, the results showed that yellow time and red clearance time are positively related to PETs. The model was also able to identify certain signal phases that could be a potential safety hazard and would need to be retimed by considering the PETs. The odds ratios from the models also indicate that increasing the yellow and red clearance times by one second can improve the PET levels by 16% and 3% respectively.
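As a rough illustration of how an odds ratio from a logit-type model is read, the sketch below assumes that the reported 16% and 3% figures correspond to odds ratios of 1.16 and 1.03 per additional second. That reading is an interpretation of the abstract, not a value taken from the paper, and the function names are hypothetical.

```python
import math

# Assumed odds ratios per one-second increase (an interpretation of the
# 16% and 3% figures in the abstract, not values quoted from the paper).
ODDS_RATIO_YELLOW = 1.16
ODDS_RATIO_RED_CLEARANCE = 1.03


def coefficient_from_odds_ratio(odds_ratio: float) -> float:
    """In a logit model the odds ratio equals exp(beta), so beta = ln(OR)."""
    return math.log(odds_ratio)


def odds_multiplier(odds_ratio: float, delta_seconds: float) -> float:
    """Multiplicative change in the odds for a change of delta_seconds."""
    return odds_ratio ** delta_seconds


if __name__ == "__main__":
    for name, ratio in [("yellow", ODDS_RATIO_YELLOW),
                        ("red clearance", ODDS_RATIO_RED_CLEARANCE)]:
        beta = coefficient_from_odds_ratio(ratio)
        two_sec = odds_multiplier(ratio, 2.0)
        print(f"{name}: beta={beta:.4f}, odds multiplier for +2 s={two_sec:.4f}")
```

Because odds ratios compound multiplicatively, a two-second increase under these assumptions scales the odds by the square of the one-second ratio rather than by twice the percentage.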
Submitted 10 October, 2022;
originally announced October 2022.
-
Signal Classification using Smooth Coefficients of Multiple wavelets
Authors:
Paul Grant,
Md Zahidul Islam
Abstract:
Classification of time series signals has become an important construct and has many practical applications. With existing classifiers we may be able to accurately classify signals; however, that accuracy may decline when a reduced number of attributes is used. Transforming the data and then reducing its dimensionality may improve the quality of the data analysis, decrease the time required for classification and simplify models. We propose an approach that chooses suitable wavelets to transform the data, then combines the output from these transforms to construct a dataset to which ensemble classifiers are applied. We demonstrate this on different data sets, across different classifiers, and use differing evaluation methods. Our experimental results demonstrate the effectiveness of the proposed technique compared to approaches that use either raw signal data or a single wavelet transform.
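A minimal sketch of the general idea of combining smooth (approximation) coefficients from more than one wavelet is given below. It is not the authors' procedure: the choice of the Haar and four-tap Daubechies filters, the single decomposition level, the periodic boundary handling, and all function names are assumptions made for illustration.

```python
import math

SQRT2 = math.sqrt(2.0)
SQRT3 = math.sqrt(3.0)

# Low-pass (scaling) filters; filtering with these and downsampling by two
# yields the smooth/approximation coefficients of one decomposition level.
HAAR_LOWPASS = [1 / SQRT2, 1 / SQRT2]
DAUB4_LOWPASS = [  # four-tap Daubechies filter
    (1 + SQRT3) / (4 * SQRT2),
    (3 + SQRT3) / (4 * SQRT2),
    (3 - SQRT3) / (4 * SQRT2),
    (1 - SQRT3) / (4 * SQRT2),
]


def smooth_coefficients(signal, lowpass):
    """One-level smooth coefficients with periodic boundary extension."""
    n = len(signal)
    if n == 0 or n % 2 != 0:
        raise ValueError("signal length must be a positive even number")
    return [
        sum(h * signal[(2 * i + k) % n] for k, h in enumerate(lowpass))
        for i in range(n // 2)
    ]


def multi_wavelet_features(signal):
    """Concatenate smooth coefficients from both (assumed) wavelets."""
    return (smooth_coefficients(signal, HAAR_LOWPASS)
            + smooth_coefficients(signal, DAUB4_LOWPASS))


if __name__ == "__main__":
    print(multi_wavelet_features([1.0, 3.0, 5.0, 7.0, 6.0, 4.0, 2.0, 0.0]))
```

Each transform halves the number of attributes per signal, so the concatenated feature vector here has the same length as the raw signal while carrying two smoothed views of it; a classifier of choice would then be trained on these rows.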
Submitted 21 September, 2021;
originally announced September 2021.
-
Detecting Autism Spectrum Disorder using Machine Learning
Authors:
Md Delowar Hossain,
Muhammad Ashad Kabir,
Adnan Anwar,
Md Zahidul Islam
Abstract:
Autism Spectrum Disorder (ASD), which is a neurodevelopmental disorder, is often accompanied by sensory issues such as over-sensitivity or under-sensitivity to sounds, smells or touch. Although its main cause is genetic in nature, early detection and treatment can help to improve the condition. In recent years, machine learning based intelligent diagnosis has evolved to complement the traditional clinical methods, which can be time consuming and expensive. The focus of this paper is to find the most significant traits and automate the diagnosis process using available classification techniques in order to improve diagnosis. We have analyzed ASD datasets of Toddler, Child, Adolescent and Adult. We determine the best performing classifier for these binary datasets using the evaluation metrics recall, precision, F-measures and classification errors. Our findings show that the Sequential minimal optimization (SMO) based Support Vector Machines (SVM) classifier outperforms all other benchmark machine learning algorithms in terms of accuracy during the detection of ASD cases and produces fewer classification errors compared to other algorithms. Also, we find that the Relief Attributes algorithm is the best at identifying the most significant attributes in ASD datasets.
Submitted 30 September, 2020;
originally announced September 2020.
-
FastForest: Increasing Random Forest Processing Speed While Maintaining Accuracy
Authors:
Darren Yates,
Md Zahidul Islam
Abstract:
Random Forest remains one of Data Mining's most enduring ensemble algorithms, achieving well-documented levels of accuracy and processing speed, as well as regularly appearing in new research. However, with data mining now reaching the domain of hardware-constrained devices such as smartphones and Internet of Things (IoT) devices, there is a continued need for further research into algorithm efficiency to deliver greater processing speed without sacrificing accuracy. Our proposed FastForest algorithm delivers an average 24% increase in processing speed compared with Random Forest while matching (and frequently exceeding) its classification accuracy over tests involving 45 datasets. FastForest achieves this result through a combination of three optimising components - Subsample Aggregating ('Subbagging'), Logarithmic Split-Point Sampling and Dynamic Restricted Subspacing. Moreover, detailed testing of Subbagging sizes has found an optimal scalar delivering a positive mix of processing performance and accuracy.
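Of the three components, Subbagging is the most widely known: each ensemble member is trained on a subsample drawn without replacement, rather than on a bootstrap sample. The sketch below shows only that sampling step plus a plurality vote. It is a generic illustration, not FastForest itself; the `fraction` value is hypothetical (the abstract does not state the optimal scalar), and the function names are assumptions.

```python
import random
from collections import Counter


def subbag_indices(n_rows, fraction, n_models, seed=0):
    """Draw one subsample of row indices per model, without replacement.

    `fraction` is a hypothetical subsample size relative to the dataset.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)
    size = max(1, int(n_rows * fraction))
    return [rng.sample(range(n_rows), size) for _ in range(n_models)]


def majority_vote(predictions):
    """Aggregate the per-model class predictions by plurality."""
    return Counter(predictions).most_common(1)[0][0]


if __name__ == "__main__":
    samples = subbag_indices(n_rows=10, fraction=0.5, n_models=3)
    for i, rows in enumerate(samples):
        print(f"model {i} trains on rows {sorted(rows)}")
    print(majority_vote(["yes", "no", "yes"]))
```

Because every tree sees fewer rows than a full bootstrap sample would provide, training cost per tree drops, which is one plausible source of a speed gain of the kind the abstract describes.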
Submitted 6 April, 2020;
originally announced April 2020.
-
A Novel Incremental Clustering Technique with Concept Drift Detection
Authors:
Mitchell D. Woodbright,
Md Anisur Rahman,
Md Zahidul Islam
Abstract:
Data are being collected from various aspects of life. These data can often arrive in chunks/batches. Traditional static clustering algorithms are not suitable for dynamic datasets, i.e., when data arrive in streams of chunks/batches. If we apply a conventional clustering technique over the combined dataset every time a new batch of data comes, the process can be slow and wasteful. Moreover, it can be challenging to store the combined dataset in memory due to its ever-increasing size. As a result, various incremental clustering techniques have been proposed. These techniques need to efficiently update the current clustering result whenever a new batch arrives, to adapt the current clustering result/solution to the latest data. These techniques also need the ability to detect concept drifts when the clustering pattern of a new batch is significantly different from older batches. Sometimes, clustering patterns may drift temporarily in a single batch while the next batches do not exhibit the drift. Therefore, incremental clustering techniques need the ability to detect both temporary and sustained drifts. In this paper, we propose an efficient incremental clustering algorithm called UIClust. It is designed to cluster streams of data chunks, even when there are temporary or sustained concept drifts. We evaluate the performance of UIClust by comparing it with a recently published, high-quality incremental clustering algorithm. We use real and synthetic datasets. We compare the results by using well-known clustering evaluation criteria: entropy, sum of squared errors (SSE), and execution time. Our results show that UIClust outperforms the existing technique in all our experiments.
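Two of the evaluation criteria named above have common textbook definitions, sketched below. These are the usual forms (SSE to cluster centroids, and size-weighted class entropy per cluster with base-2 logarithms); whether the paper uses exactly these variants is an assumption, and the function names are hypothetical.

```python
import math
from collections import Counter


def sum_of_squared_errors(points, labels):
    """SSE: squared Euclidean distance from each point to its cluster centroid."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    sse = 0.0
    for members in clusters.values():
        dim = len(members[0])
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        sse += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return sse


def clustering_entropy(cluster_labels, class_labels):
    """Size-weighted mean of the class entropy (base 2) inside each cluster."""
    n = len(cluster_labels)
    by_cluster = {}
    for cluster, cls in zip(cluster_labels, class_labels):
        by_cluster.setdefault(cluster, []).append(cls)
    total = 0.0
    for members in by_cluster.values():
        size = len(members)
        counts = Counter(members)
        h = -sum((c / size) * math.log2(c / size) for c in counts.values())
        total += (size / n) * h
    return total


if __name__ == "__main__":
    pts = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (10.0, 4.0)]
    clusters = [0, 0, 1, 1]
    classes = ["a", "a", "a", "b"]
    print(sum_of_squared_errors(pts, clusters))
    print(clustering_entropy(clusters, classes))
```

Lower values are better for both measures: a low SSE indicates compact clusters, and a low entropy indicates that each cluster is dominated by a single ground-truth class.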
Submitted 30 March, 2020;
originally announced March 2020.
-
Tree Index: A New Cluster Evaluation Technique
Authors:
A. H. Beg,
Md Zahidul Islam,
Vladimir Estivill-Castro
Abstract:
We introduce a cluster evaluation technique called Tree Index. Our Tree Index algorithm aims at describing the structural information of the clustering rather than the quantitative format of cluster-quality indexes (where the representation power of clustering is some cumulative error similar to vector quantization). Our Tree Index finds margins amongst clusters for easy learning without the complications of Minimum Description Length. Our Tree Index produces a decision tree from the clustered data set, using the cluster identifiers as labels. It combines the entropy of each leaf with its depth. Intuitively, a shorter tree with pure leaves generalizes the data well (the clusters are easy to learn because they are well separated). So, the labels are meaningful clusters. If the clustering algorithm does not separate the data well, trees learned from its results will be large and too detailed. We show that, on the clustering results (obtained by various techniques) on a brain dataset, Tree Index discriminates between reasonable and non-sensible clusters. We confirm the effectiveness of Tree Index through graphical visualizations. Tree Index evaluates the sensible solutions higher than the non-sensible solutions, while existing cluster-quality indexes fail to do so.
Submitted 24 March, 2020;
originally announced March 2020.
-
DataLearner: A Data Mining and Knowledge Discovery Tool for Android Smartphones and Tablets
Authors:
Darren Yates,
Md Zahidul Islam,
Junbin Gao
Abstract:
Smartphones have become the ultimate 'personal' computer, yet despite this, general-purpose data-mining and knowledge discovery tools for mobile devices are surprisingly rare. DataLearner is a new data-mining application designed specifically for Android devices that imports the Weka data-mining engine and augments it with algorithms developed by Charles Sturt University. Moreover, DataLearner can be expanded with additional algorithms. Combined, DataLearner delivers 40 classification, clustering and association rule mining algorithms for model training and evaluation without the need for cloud computing resources or network connectivity. It provides the same classification accuracy as PCs and laptops, while doing so with acceptable processing speed and consuming negligible battery life. With its ability to provide easy-to-use data-mining on a phone-size screen, DataLearner is a new portable, self-contained data-mining tool for remote, personalised and learning applications alike. DataLearner features four elements - this paper, the app available on Google Play, the GPL3-licensed source code on GitHub and a short video on YouTube.
Submitted 9 June, 2019;
originally announced June 2019.