-
Video Forgery Detection for Surveillance Cameras: A Review
Authors:
Noor B. Tayfor,
Tarik A. Rashid,
Shko M. Qader,
Bryar A. Hassan,
Mohammed H. Abdalla,
Jafar Majidpour,
Aram M. Ahmed,
Hussein M. Ali,
Aso M. Aladdin,
Abdulhady A. Abdullah,
Ahmed S. Shamsaldin,
Haval M. Sidqi,
Abdulrahman Salih,
Zaher M. Yaseen,
Azad A. Ameen,
Janmenjoy Nayak,
Mahmood Yashar Hamza
Abstract:
The widespread availability of video recording through smartphones and digital devices has made video-based evidence more accessible than ever. Surveillance footage plays a crucial role in security, law enforcement, and judicial processes. However, with the rise of advanced video editing tools, tampering with digital recordings has become increasingly easy, raising concerns about their authenticity. Ensuring the integrity of surveillance videos is essential, as manipulated footage can lead to misinformation and undermine judicial decisions. This paper provides a comprehensive review of existing forensic techniques used to detect video forgery, focusing on their effectiveness in verifying the authenticity of surveillance recordings. Various methods, including compression-based analysis, frame duplication detection, and machine learning-based approaches, are explored. The findings highlight the growing necessity for more robust forensic techniques to counteract evolving forgery methods. Strengthening video forensic capabilities will ensure that surveillance recordings remain credible and admissible as legal evidence.
Submitted 4 May, 2025;
originally announced May 2025.
-
An Investigation on Machine Learning Predictive Accuracy Improvement and Uncertainty Reduction using VAE-based Data Augmentation
Authors:
Farah Alsafadi,
Mahmoud Yaseen,
Xu Wu
Abstract:
The confluence of ultrafast computers with large memory, rapid progress in Machine Learning (ML) algorithms, and the availability of large datasets places multiple engineering fields at the threshold of dramatic progress. However, a unique challenge in nuclear engineering is data scarcity, because experimentation on nuclear systems is usually more expensive and time-consuming than in most other disciplines. One potential way to resolve the data scarcity issue is deep generative learning, which uses certain ML models to learn the underlying distribution of existing data and generate synthetic samples that resemble the real data. In this way, one can significantly expand the dataset to train more accurate predictive ML models. In this study, our objective is to evaluate the effectiveness of data augmentation using variational autoencoder (VAE)-based deep generative models. We investigated whether the data augmentation leads to improved accuracy in the predictions of a deep neural network (DNN) model trained using the augmented data. Additionally, the DNN prediction uncertainties are quantified using Bayesian Neural Networks (BNN) and conformal prediction (CP) to assess the impact on predictive uncertainty reduction. To test the proposed methodology, we used TRACE simulations of steady-state void fraction data based on the NUPEC Boiling Water Reactor Full-size Fine-mesh Bundle Test (BFBT) benchmark. We found that augmenting the training dataset using VAEs improved the DNN model's predictive accuracy, improved the prediction confidence intervals, and reduced the prediction uncertainties.
Submitted 24 October, 2024;
originally announced October 2024.
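The abstract above names conformal prediction (CP) as one of its uncertainty tools but does not say which variant was used. As a rough sketch only, assuming the common split-conformal recipe for regression with absolute residuals as the nonconformity score, the interval construction could look like the following. The function names `conformal_radius` and `predict_interval` are hypothetical and do not come from the paper.

```python
import math


def conformal_radius(cal_true, cal_pred, alpha=0.1):
    """Half-width of a split-conformal interval (assumed variant, not the paper's).

    cal_true / cal_pred are targets and model predictions on a held-out
    calibration set; alpha is the tolerated miscoverage rate.
    """
    # Nonconformity score: absolute residual on each calibration point.
    scores = sorted(abs(y - p) for y, p in zip(cal_true, cal_pred))
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points to certify this coverage level.
        return math.inf
    return scores[k - 1]


def predict_interval(pred, radius):
    """Symmetric interval around a new point prediction."""
    return pred - radius, pred + radius
```

In this sketch a smaller radius on the same calibration set would indicate tighter predictive uncertainty, which is the kind of comparison the abstract describes between models trained with and without augmented data.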
-
What is YOLOv9: An In-Depth Exploration of the Internal Features of the Next-Generation Object Detector
Authors:
Muhammad Yaseen
Abstract:
This study provides a comprehensive analysis of the YOLOv9 object detection model, focusing on its architectural innovations, training methodologies, and performance improvements over its predecessors. Key advancements, such as the Generalized Efficient Layer Aggregation Network (GELAN) and Programmable Gradient Information (PGI), significantly enhance feature extraction and gradient flow, leading to improved accuracy and efficiency. By incorporating depthwise convolutions and the lightweight C3Ghost architecture, YOLOv9 reduces computational complexity while maintaining high precision. Benchmark tests on Microsoft COCO demonstrate its superior mean Average Precision (mAP) and faster inference times, outperforming YOLOv8 across multiple metrics. The model's versatility is highlighted by its seamless deployment across various hardware platforms, from edge devices to high-performance GPUs, with built-in support for PyTorch and TensorRT integration. This paper provides the first in-depth exploration of YOLOv9's internal features and their real-world applicability, establishing it as a state-of-the-art solution for real-time object detection across industries, from IoT devices to large-scale industrial applications.
Submitted 12 September, 2024;
originally announced September 2024.
-
What is YOLOv8: An In-Depth Exploration of the Internal Features of the Next-Generation Object Detector
Authors:
Muhammad Yaseen
Abstract:
This study presents a detailed analysis of the YOLOv8 object detection model, focusing on its architecture, training techniques, and performance improvements over previous iterations like YOLOv5. Key innovations, including the CSPNet backbone for enhanced feature extraction, the FPN+PAN neck for superior multi-scale object detection, and the transition to an anchor-free approach, are thoroughly examined. The paper reviews YOLOv8's performance across benchmarks like Microsoft COCO and Roboflow 100, highlighting its high accuracy and real-time capabilities across diverse hardware platforms. Additionally, the study explores YOLOv8's developer-friendly enhancements, such as its unified Python package and CLI, which streamline model training and deployment. Overall, this research positions YOLOv8 as a state-of-the-art solution in the evolving object detection field.
Submitted 28 August, 2024;
originally announced August 2024.
-
From Dialogue to Diagram: Task and Relationship Extraction from Natural Language for Accelerated Business Process Prototyping
Authors:
Sara Qayyum,
Muhammad Moiz Asghar,
Muhammad Fouzan Yaseen
Abstract:
The automatic transformation of verbose natural language descriptions into structured process models remains a challenge of significant complexity. This paper introduces a contemporary solution; central to our approach is the use of dependency parsing and Named Entity Recognition (NER) for extracting key elements from textual descriptions. Additionally, we utilize Subject-Verb-Object (SVO) constructs for identifying action relationships and integrate semantic analysis tools, including WordNet, for enriched contextual understanding. A novel aspect of our system is the application of neural coreference resolution, integrated with the spaCy framework, enhancing the precision of entity linkage and anaphoric references. Furthermore, the system handles data transformation and visualization, converting extracted information into BPMN (Business Process Model and Notation) diagrams. This methodology not only streamlines the process of capturing and representing business workflows but also significantly reduces the manual effort and potential for error inherent in traditional modeling approaches.
Submitted 16 December, 2023;
originally announced December 2023.
-
Reduced Order Modeling of a MOOSE-based Advanced Manufacturing Model with Operator Learning
Authors:
Mahmoud Yaseen,
Dewen Yushu,
Peter German,
Xu Wu
Abstract:
Advanced Manufacturing (AM) has gained significant interest in the nuclear community for its potential application on nuclear materials. One challenge is to obtain desired material properties via controlling the manufacturing process during runtime. Intelligent AM based on deep reinforcement learning (DRL) relies on an automated process-level control mechanism to generate optimal design variables and adaptive system settings for improved end-product properties. A high-fidelity thermo-mechanical model for direct energy deposition has recently been developed within the MOOSE framework at the Idaho National Laboratory (INL). The goal of this work is to develop an accurate and fast-running reduced order model (ROM) for this MOOSE-based AM model that can be used in a DRL-based process control and optimization method. Operator learning (OL)-based methods will be employed due to their capability to learn a family of differential equations, in this work, produced by changing process variables in the Gaussian point heat source for the laser. We will develop OL-based ROM using Fourier neural operator, and perform a benchmark comparison of its performance with a conventional deep neural network-based ROM.
Submitted 18 August, 2023;
originally announced August 2023.
-
Fast and Accurate Reduced-Order Modeling of a MOOSE-based Additive Manufacturing Model with Operator Learning
Authors:
Mahmoud Yaseen,
Dewen Yushu,
Peter German,
Xu Wu
Abstract:
One predominant challenge in additive manufacturing (AM) is to achieve specific material properties by manipulating manufacturing process parameters during the runtime. Such manipulation tends to increase the computational load imposed on existing simulation tools employed in AM. The goal of the present work is to construct a fast and accurate reduced-order model (ROM) for an AM model developed within the Multiphysics Object-Oriented Simulation Environment (MOOSE) framework, ultimately reducing the time/cost of AM control and optimization processes. Our adoption of the operator learning (OL) approach enabled us to learn a family of differential equations produced by altering process variables in the laser's Gaussian point heat source. More specifically, we used the Fourier neural operator (FNO) and deep operator network (DeepONet) to develop ROMs for time-dependent responses. Furthermore, we benchmarked the performance of these OL methods against a conventional deep neural network (DNN)-based ROM. Ultimately, we found that OL methods offer comparable performance and, in terms of accuracy and generalizability, even outperform DNN at predicting scalar model responses. The DNN-based ROM afforded the fastest training time. Furthermore, all the ROMs were faster than the original MOOSE model yet still provided accurate predictions. FNO had a smaller mean prediction error than DeepONet, with a larger variance for time-dependent responses. Unlike DNN, both FNO and DeepONet were able to simulate time series data without the need for dimensionality reduction techniques. The present work can help facilitate the AM optimization process by enabling faster execution of simulation tools while still preserving evaluation accuracy.
Submitted 4 August, 2023;
originally announced August 2023.
-
Functional PCA and Deep Neural Networks-based Bayesian Inverse Uncertainty Quantification with Transient Experimental Data
Authors:
Ziyu Xie,
Mahmoud Yaseen,
Xu Wu
Abstract:
Inverse uncertainty quantification (UQ) is the process of inversely quantifying the model input uncertainties based on experimental data. This work focuses on developing an inverse UQ process for time-dependent responses, using dimensionality reduction by functional principal component analysis (PCA) and deep neural network (DNN)-based surrogate models. The demonstration is based on the inverse UQ of TRACE physical model parameters using the FEBA transient experimental data. The measurement data is the time-dependent peak cladding temperature (PCT). Since the quantity-of-interest (QoI) is time-dependent and thus corresponds to infinite-dimensional responses, PCA is used to reduce the QoI dimension while preserving the transient profile of the PCT, in order to make the inverse UQ process more efficient. However, conventional PCA applied directly to the PCT time series profiles can hardly represent the data precisely due to the sudden temperature drop at the time of quenching. As a result, a functional alignment method is used to separate the phase and amplitude information of the transient PCT profiles before dimensionality reduction. DNNs are then trained using PC scores from functional PCA to build surrogate models of TRACE in order to reduce the computational cost in Markov Chain Monte Carlo sampling. Bayesian neural networks are used to estimate the uncertainties of DNN surrogate model predictions. In this study, we compared four different inverse UQ processes with different dimensionality reduction methods and surrogate models. The proposed approach shows an improvement in reducing the dimension of the TRACE transient simulations, and the forward propagation of inverse UQ results has a better agreement with the experimental data.
Submitted 10 July, 2023;
originally announced July 2023.
-
Awareness requirement and performance management for adaptive systems: a survey
Authors:
Tarik A. Rashid,
Bryar A. Hassan,
Abeer Alsadoon,
Shko Qader,
S. Vimal,
Amit Chhabra,
Zaher Mundher Yaseen
Abstract:
Self-adaptive software can assess and modify its behavior when the assessment indicates that the program is not performing as intended or when improved functionality or performance is available. Since the mid-1960s, the subject of system adaptivity has been extensively researched, and during the last decade, many application areas and technologies involving self-adaptation have gained prominence. All of these efforts have in common the introduction of self-adaptability through software. Thus, it is essential to investigate systematic software engineering methods to create self-adaptive systems that may be used across different domains. The primary objective of this research is to summarize current advances in awareness requirements for adaptive strategies based on an examination of state-of-the-art methods described in the literature. This paper presents a review of self-adaptive systems in the context of requirement awareness and summarizes the most common methodologies applied. First, it reviews previous surveys and works on self-adaptive systems. Afterward, it classifies the current self-adaptive systems based on six criteria. Then, it presents and evaluates the most common self-adaptive approaches. Lastly, an evaluation among the self-adaptive models is conducted based on four concepts (requirements description, monitoring, relationship, dependency/impact, and tools).
Submitted 22 January, 2023;
originally announced February 2023.
-
Improved Fitness Dependent Optimizer for Solving Economic Load Dispatch Problem
Authors:
Barzan Hussein Tahir,
Tarik A. Rashid,
Hafiz Tayyab Rauf,
Nebojsa Bacanin,
Amit Chhabra,
S. Vimal,
Zaher Mundher Yaseen
Abstract:
Economic Load Dispatch plays a fundamental role in the operation of power systems, as it decreases the environmental load, minimizes the operating cost, and preserves energy resources. The optimal solution to Economic Load Dispatch problems and various constraints can be obtained by applying several evolutionary and swarm-based algorithms. The major drawback to swarm-based algorithms is premature convergence towards an optimal solution. Fitness Dependent Optimizer (FDO) is a novel optimization algorithm inspired by the decision-making and reproductive process of bee swarming. FDO examines the search spaces based on the searching approach of Particle Swarm Optimization. To calculate the pace, the fitness function is utilized to generate weights that direct the search agents in the phases of exploitation and exploration. In this research, the authors have applied the Fitness Dependent Optimizer to solve the Economic Load Dispatch problem by reducing fuel cost, emission allocation, and transmission loss. Moreover, the authors have proposed an enhanced variant of Fitness Dependent Optimizer, which incorporates novel population initialization techniques and dynamically employed sine maps to select the weight factor for Fitness Dependent Optimizer. The enhanced population initialization approach incorporates a quasi-random Sobol sequence to generate the initial solution in the multi-dimensional search space. A standard 24-unit system is employed for experimental evaluation with different power demands. Empirical results obtained using the enhanced variant of the Fitness Dependent Optimizer demonstrate superior performance in terms of low transmission loss, low fuel cost, and low emission allocation compared to the conventional Fitness Dependent Optimizer. The experimental study obtained 7.94E-12.
Submitted 14 July, 2022;
originally announced September 2022.
-
Quantification of Deep Neural Network Prediction Uncertainties for VVUQ of Machine Learning Models
Authors:
Mahmoud Yaseen,
Xu Wu
Abstract:
Recent performance breakthroughs in Artificial intelligence (AI) and Machine learning (ML), especially advances in Deep learning (DL), the availability of powerful, easy-to-use ML libraries (e.g., scikit-learn, TensorFlow, PyTorch), and increasing computational power have led to unprecedented interest in AI/ML among nuclear engineers. For physics-based computational models, Verification, Validation and Uncertainty Quantification (VVUQ) have been very widely investigated and many methodologies have been developed. However, VVUQ of ML models has been relatively less studied, especially in nuclear engineering. In this work, we focus on UQ of ML models as a preliminary step of ML VVUQ, more specifically, Deep Neural Networks (DNNs) because they are the most widely used supervised ML algorithm for both regression and classification tasks. This work aims at quantifying the prediction, or approximation, uncertainties of DNNs when they are used as surrogate models for expensive physical models. Three techniques for UQ of DNNs are compared, namely Monte Carlo Dropout (MCD), Deep Ensembles (DE) and Bayesian Neural Networks (BNNs). Two nuclear engineering examples are used to benchmark these methods: (1) time-dependent fission gas release data using the Bison code, and (2) void fraction simulation based on the BFBT benchmark using the TRACE code. It was found that the three methods typically require different DNN architectures and hyperparameters to optimize their performance. The UQ results also depend on the amount of training data available and the nature of the data. Overall, all three methods can provide reasonable estimations of the approximation uncertainties. The uncertainties are generally smaller when the mean predictions are close to the test data, while the BNN methods usually produce larger uncertainties than MCD and DE.
Submitted 27 June, 2022;
originally announced June 2022.
-
Cloud based Scalable Object Recognition from Video Streams using Orientation Fusion and Convolutional Neural Networks
Authors:
Muhammad Usman Yaseen,
Ashiq Anjum,
Giancarlo Fortino,
Antonio Liotta,
Amir Hussain
Abstract:
Object recognition from live video streams comes with numerous challenges such as the variation in illumination conditions and poses. Convolutional neural networks (CNNs) have been widely used to perform intelligent visual object recognition. Yet, CNNs still suffer from severe accuracy degradation, particularly on illumination-variant datasets. To address this problem, we propose a new CNN method based on orientation fusion for visual object recognition. The proposed cloud-based video analytics system pioneers the use of bi-dimensional empirical mode decomposition to split a video frame into intrinsic mode functions (IMFs). We further propose that these IMFs undergo the Riesz transform to produce monogenic object components, which are in turn used for the training of CNNs. Past works have demonstrated how the object orientation component may be used to pursue accuracy levels as high as 93%. Herein we demonstrate how a feature-fusion strategy of the orientation components leads to further improving visual recognition accuracy to 97%. We also assess the scalability of our method, looking at both the number and the size of the video streams under scrutiny. We carry out extensive experimentation on the publicly available Yale dataset, as well as a self-generated video dataset, finding significant improvements (both in accuracy and scale) in comparison to AlexNet, LeNet and SE-ResNeXt, which are the three most commonly used deep learning models for visual object recognition and classification.
Submitted 19 June, 2021;
originally announced June 2021.
-
Forgery Detection in a Questioned Hyperspectral Document Image using K-means Clustering
Authors:
Maria Yaseen,
Rammal Aftab Ahmed,
Rimsha Mahrukh
Abstract:
Hyperspectral imaging allows for analysis of images in several hundred spectral bands, depending on the spectral resolution of the imaging sensor. A hyperspectral document image is one that has been captured by a hyperspectral camera so that the document can be observed in the different bands on the basis of their unique spectral signatures. For detecting forgery in a document, various ink mismatch detection techniques based on hyperspectral imaging have shown vast potential in differentiating visually similar inks. Inks of different materials exhibit different spectral signatures even if they have the same color. Hyperspectral analysis of document images allows identification and discrimination of visually similar inks. Based on this analysis, forensic experts can assess the authenticity of the document. In this paper, an extensive ink mismatch detection technique is presented which uses K-means clustering to identify different inks on the basis of their unique spectral response and separates them into different clusters.
Submitted 29 June, 2020;
originally announced June 2020.
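The abstract above describes grouping ink pixels by spectral response with K-means but gives no implementation details. The following is a minimal pure-Python sketch of standard Lloyd-style K-means, assuming each ink pixel is a list of per-band reflectance values. The function name `cluster_inks`, the farthest-point initialization, and the data layout are illustrative assumptions, not the authors' pipeline.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two spectra of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cluster_inks(spectra, k, max_iters=100):
    """Cluster per-pixel spectra into k groups (hypothetical helper).

    Returns (labels, centroids). Initialization is deterministic:
    start at the first spectrum, then repeatedly add the spectrum
    farthest from all centroids chosen so far (an assumed choice).
    """
    centroids = [list(spectra[0])]
    while len(centroids) < k:
        farthest = max(
            spectra, key=lambda s: min(sq_dist(s, c) for c in centroids)
        )
        centroids.append(list(farthest))

    labels = None
    for _ in range(max_iters):
        # Assignment step: nearest centroid for every spectrum.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(s, centroids[j]))
            for s in spectra
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the band-wise mean of its members.
        for j in range(k):
            members = [s for s, lab in zip(spectra, labels) if lab == j]
            if members:
                centroids[j] = [sum(band) / len(members) for band in zip(*members)]
    return labels, centroids
```

Under these assumptions, two visually similar inks with different spectral signatures end up with different labels, and a document whose text pixels fall into more than one cluster would be flagged for closer inspection.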
-
ForecastTB An R Package as a Test-Bench for Time Series Forecasting Application of Wind Speed and Solar Radiation Modeling
Authors:
Neeraj Dhanraj Bokde,
Zaher Mundher Yaseen,
Gorm Bruun Andersen
Abstract:
This paper introduces an R package ForecastTB that can be used to compare the accuracy of different forecasting methods as related to the characteristics of a time series dataset. The ForecastTB is a plug-and-play structured module, and several forecasting methods can be included with simple instructions. The proposed test-bench is not limited to the default forecasting and error metric functions, and users are able to append, remove, or choose the desired methods as per requirements. Besides, several plotting functions and statistical performance metrics are provided to visualize the comparative performance and accuracy of different forecasting methods. Furthermore, this paper presents real application examples with natural time series datasets (i.e., wind speed and solar radiation) to exhibit the features of the ForecastTB package to evaluate forecasting comparison analysis as affected by the characteristics of a dataset. Modeling results indicated the applicability and robustness of the proposed R package ForecastTB for time series forecasting.
Submitted 21 July, 2020; v1 submitted 4 April, 2020;
originally announced April 2020.
-
Preventing Clean Label Poisoning using Gaussian Mixture Loss
Authors:
Muhammad Yaseen,
Muneeb Aadil,
Maria Sargsyan
Abstract:
Since 2014, when Szegedy et al. showed that carefully designed perturbations of the input can lead Deep Neural Networks (DNNs) to misclassify it, there has been ongoing research to make DNNs more robust to such malicious perturbations. In this work, we consider a poisoning attack called the Clean Labeling poisoning attack (CLPA). The goal of CLPA is to inject seemingly benign instances which can drastically change the decision boundary of the DNNs, due to which subsequent queries at test time can be misclassified. We argue that a strong defense against CLPA can be embedded into the model during training by imposing features of the network to follow a Large Margin Gaussian Mixture distribution in the penultimate layer. By having such prior knowledge, we can systematically evaluate how unusual the example is, given the label it is claiming to be. We demonstrate our built-in defense via experiments on MNIST and CIFAR datasets. We train two models on each dataset: one trained via softmax, another via LGM. We show that using LGM can substantially reduce the effectiveness of CLPA while having no additional overhead of data sanitization. The code to reproduce our results is available online.
Submitted 10 February, 2020;
originally announced March 2020.
-
An empirical estimation for time and memory algorithm complexities: newly developed R package
Authors:
Marc Agenis-Nevers,
Neeraj Dhanraj Bokde,
Zaher Mundher Yaseen,
Mayur Shende
Abstract:
This article introduces GuessCompx, an R package that empirically estimates the time and memory complexities of an algorithm or a function. It tests multiple samples of increasing size drawn from the user's data and attempts to fit one of seven complexity functions: O(N), O(N^2), O(log(N)), etc. Based on a best-fit procedure using LOO-MSE (leave-one-out mean squared error), it also predicts the full computation time and memory usage on the whole dataset. Conceptually, it relies on the base R functions system.time and memory.size, the latter being only suitable for Windows users. Together with these results, a plot and a significance test are returned. Complexity is assessed with regard to the user's actual dataset through its size (and no other parameter). This article provides examples covering several cases (e.g., distance function, time series and custom function) and optimal parameter tuning. The subject of empirical computational complexity has been relatively little studied in computer science, and such a package provides a reliable, convenient and simple estimation procedure. Further, the package does not require access to the code of the target function.
Submitted 21 October, 2020; v1 submitted 4 November, 2019;
originally announced November 2019.
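The best-fit idea in the abstract above (fit several complexity shapes to timings at increasing sample sizes, then pick the one with the lowest leave-one-out mean squared error) can be sketched outside R as well. The Python below is an illustration under stated assumptions, not GuessCompx itself: the candidate list is a hypothetical subset of the seven functions, each shape is fitted as `time = a + b * f(N)` by ordinary least squares, and the names `fit_line`, `loo_mse` and `guess_complexity` are invented for this example.

```python
import math

# Hypothetical subset of candidate shapes; the package is said to fit seven.
CANDIDATES = {
    "O(1)": lambda n: 1.0,
    "O(log N)": lambda n: math.log(n),
    "O(N)": lambda n: float(n),
    "O(N log N)": lambda n: n * math.log(n),
    "O(N^2)": lambda n: float(n) ** 2,
}


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        return mean_y, 0.0  # constant regressor: fall back to the mean
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - b * mean_x, b


def loo_mse(sizes, times, shape):
    """Leave-one-out mean squared error of one candidate shape."""
    xs = [shape(n) for n in sizes]
    ys = list(times)
    errors = []
    for i in range(len(xs)):
        # Fit on every point except i, then score the prediction at i.
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        errors.append((ys[i] - (a + b * xs[i])) ** 2)
    return sum(errors) / len(errors)


def guess_complexity(sizes, times):
    """Return (best_label, scores) where scores maps each label to its LOO-MSE."""
    scores = {name: loo_mse(sizes, times, f) for name, f in CANDIDATES.items()}
    return min(scores, key=scores.get), scores
```

Here `sizes` would be the sample sizes and `times` the measured run times at those sizes (in R, the abstract says these come from system.time). Extrapolating the winning fit to the full dataset size gives the kind of whole-dataset prediction the abstract mentions.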