-
Time-Series Forecasting in Smart Manufacturing Systems: An Experimental Evaluation of the State-of-the-art Algorithms
Authors:
Mojtaba A. Farahani,
Fadi El Kalach,
Austin Harper,
M. R. McCormick,
Ramy Harik,
Thorsten Wuest
Abstract:
TSF is growing in various domains including manufacturing. Although numerous TSF algorithms have been developed recently, the validation and evaluation of algorithms hold substantial value for researchers and practitioners and are missing. This study aims to fill this gap by evaluating the SoTA TSF algorithms on thirteen manufacturing datasets, focusing on their applicability in manufacturing. Eac…
▽ More
TSF is growing in various domains including manufacturing. Although numerous TSF algorithms have been developed recently, the validation and evaluation of algorithms hold substantial value for researchers and practitioners and are missing. This study aims to fill this gap by evaluating the SoTA TSF algorithms on thirteen manufacturing datasets, focusing on their applicability in manufacturing. Each algorithm was selected based on its TSF category to ensure a representative set of algorithms. The evaluation includes different scenarios to evaluate the models using two problem categories and two forecasting horizons. To evaluate the performance, the WAPE was calculated, and additional post hoc analyses were conducted to assess the significance of observed differences. Only algorithms with codes from open-source libraries were utilized, and no hyperparameter tuning was done. This allowed us to evaluate the algorithms as "out-of-the-box" solutions that can be easily implemented, ensuring their usability within the manufacturing by practitioners with limited technical knowledge. This aligns to facilitate the adoption of these techniques in smart manufacturing systems. Based on the results, transformer and MLP-based architectures demonstrated the best performance with MLP-based architecture winning the most scenarios. For univariate TSF, PatchTST emerged as the most robust, particularly for long-term horizons, while for multivariate problems, MLP-based architectures like N-HITS and TiDE showed superior results. The study revealed that simpler algorithms like XGBoost could outperform complex algorithms in certain tasks. These findings challenge the assumption that more sophisticated models produce better results. Additionally, the research highlighted the importance of computational resource considerations, showing variations in runtime and memory usage across different algorithms.
△ Less
Submitted 26 November, 2024;
originally announced November 2024.
-
Deciphering the Interplay of Parametric and Non-parametric Memory in Retrieval-augmented Language Models
Authors:
Mehrdad Farahani,
Richard Johansson
Abstract:
Generative language models often struggle with specialized or less-discussed knowledge. A potential solution is found in Retrieval-Augmented Generation (RAG) models which act like retrieving information before generating responses. In this study, we explore how the \textsc{Atlas} approach, a RAG model, decides between what it already knows (parametric) and what it retrieves (non-parametric). We us…
▽ More
Generative language models often struggle with specialized or less-discussed knowledge. A potential solution is found in Retrieval-Augmented Generation (RAG) models which act like retrieving information before generating responses. In this study, we explore how the \textsc{Atlas} approach, a RAG model, decides between what it already knows (parametric) and what it retrieves (non-parametric). We use causal mediation analysis and controlled experiments to examine how internal representations influence information processing. Our findings disentangle the effects of parametric knowledge and the retrieved context. They indicate that in cases where the model can choose between both types of information (parametric and non-parametric), it relies more on the context than the parametric knowledge. Furthermore, the analysis investigates the computations involved in \emph{how} the model uses the information from the context. We find that multiple mechanisms are active within the model and can be detected with mediation analysis: first, the decision of \emph{whether the context is relevant}, and second, how the encoder computes output representations to support copying when relevant.
△ Less
Submitted 7 October, 2024;
originally announced October 2024.
-
Uncovering EDK2 Firmware Flaws: Insights from Code Audit Tools
Authors:
Mahsa Farahani,
Ghazal Shenavar,
Ali Hosseinghorban,
Alireza Ejlali
Abstract:
Firmware serves as a foundational software layer in modern computers, initiating as the first code executed on platform hardware, similar in function to a minimal operating system. Defined as a software interface between an operating system and platform firmware, the Unified Extensible Firmware Interface (UEFI) standardizes system initialization and management. A prominent open-source implementati…
▽ More
Firmware serves as a foundational software layer in modern computers, initiating as the first code executed on platform hardware, similar in function to a minimal operating system. Defined as a software interface between an operating system and platform firmware, the Unified Extensible Firmware Interface (UEFI) standardizes system initialization and management. A prominent open-source implementation of UEFI, the EFI Development Kit II (EDK2), plays a crucial role in shaping firmware architecture. Despite its widespread adoption, the architecture faces challenges such as limited system resources at early stages and a lack of standard security features. Furthermore, the scarcity of open-source tools specifically designed for firmware analysis emphasizes the need for adaptable, innovative solutions.
In this paper, we explore the application of general code audit tools to firmware, with a particular focus on EDK2. Although these tools were not originally designed for firmware analysis, they have proven effective in identifying critical areas for enhancement in firmware security. Our findings, derived from deploying key audit tools on EDK2, categorize these tools based on their methodologies and illustrate their capability to uncover unique firmware attributes, significantly contributing to the understanding and improvement of firmware security.
△ Less
Submitted 22 September, 2024;
originally announced September 2024.
-
Analysing the Behaviour of Tree-Based Neural Networks in Regression Tasks
Authors:
Peter Samoaa,
Mehrdad Farahani,
Antonio Longa,
Philipp Leitner,
Morteza Haghir Chehreghani
Abstract:
The landscape of deep learning has vastly expanded the frontiers of source code analysis, particularly through the utilization of structural representations such as Abstract Syntax Trees (ASTs). While these methodologies have demonstrated effectiveness in classification tasks, their efficacy in regression applications, such as execution time prediction from source code, remains underexplored. This…
▽ More
The landscape of deep learning has vastly expanded the frontiers of source code analysis, particularly through the utilization of structural representations such as Abstract Syntax Trees (ASTs). While these methodologies have demonstrated effectiveness in classification tasks, their efficacy in regression applications, such as execution time prediction from source code, remains underexplored. This paper endeavours to decode the behaviour of tree-based neural network models in the context of such regression challenges. We extend the application of established models--tree-based Convolutional Neural Networks (CNNs), Code2Vec, and Transformer-based methods--to predict the execution time of source code by parsing it to an AST. Our comparative analysis reveals that while these models are benchmarks in code representation, they exhibit limitations when tasked with regression. To address these deficiencies, we propose a novel dual-transformer approach that operates on both source code tokens and AST representations, employing cross-attention mechanisms to enhance interpretability between the two domains. Furthermore, we explore the adaptation of Graph Neural Networks (GNNs) to this tree-based problem, theorizing the inherent compatibility due to the graphical nature of ASTs. Empirical evaluations on real-world datasets showcase that our dual-transformer model outperforms all other tree-based neural networks and the GNN-based models. Moreover, our proposed dual transformer demonstrates remarkable adaptability and robust performance across diverse datasets.
△ Less
Submitted 17 June, 2024;
originally announced June 2024.
-
Time-Series Classification in Smart Manufacturing Systems: An Experimental Evaluation of State-of-the-Art Machine Learning Algorithms
Authors:
Mojtaba A. Farahani,
M. R. McCormick,
Ramy Harik,
Thorsten Wuest
Abstract:
Manufacturing is gathering extensive amounts of diverse data, thanks to the growing number of sensors and rapid advances in sensing technologies. Among the various data types available in SMS settings, time-series data plays a pivotal role. Hence, TSC emerges is crucial in this domain. The objective of this study is to fill this gap by providing a rigorous experimental evaluation of the SoTA ML an…
▽ More
Manufacturing is gathering extensive amounts of diverse data, thanks to the growing number of sensors and rapid advances in sensing technologies. Among the various data types available in SMS settings, time-series data plays a pivotal role. Hence, TSC emerges is crucial in this domain. The objective of this study is to fill this gap by providing a rigorous experimental evaluation of the SoTA ML and DL algorithms for TSC tasks in manufacturing and industrial settings. We first explored and compiled a comprehensive list of more than 92 SoTA algorithms from both TSC and manufacturing literature. Following, we selected the 36 most representative algorithms from this list. To evaluate their performance across various manufacturing classification tasks, we curated a set of 22 manufacturing datasets, representative of different characteristics that cover diverse manufacturing problems. Subsequently, we implemented and evaluated the algorithms on the manufacturing benchmark datasets, and analyzed the results for each dataset. Based on the results, ResNet, DrCIF, InceptionTime, and ARSENAL are the top-performing algorithms, boasting an average accuracy of over 96.6% across all 22 manufacturing TSC datasets. These findings underscore the robustness, efficiency, scalability, and effectiveness of convolutional kernels in capturing temporal features in time-series data, as three out of the top four performing algorithms leverage these kernels for feature extraction. Additionally, LSTM, BiLSTM, and TS-LSTM algorithms deserve recognition for their effectiveness in capturing features within time-series data using RNN-based structures.
△ Less
Submitted 5 August, 2024; v1 submitted 4 October, 2023;
originally announced October 2023.
-
Multivariate time series classification with dual attention network
Authors:
Mojtaba A. Farahani,
Tara Eslaminokandeh
Abstract:
One of the topics in machine learning that is becoming more and more relevant is multivariate time series classification. Current techniques concentrate on identifying the local important sequence segments or establishing the global long-range dependencies. They frequently disregard the merged data from both global and local features, though. Using dual attention, we explore a novel network (DA-Ne…
▽ More
One of the topics in machine learning that is becoming more and more relevant is multivariate time series classification. Current techniques concentrate on identifying the local important sequence segments or establishing the global long-range dependencies. They frequently disregard the merged data from both global and local features, though. Using dual attention, we explore a novel network (DA-Net) in this research to extract local and global features for multivariate time series classification. The two distinct layers that make up DA-Net are the Squeeze-Excitation Window Attention (SEWA) layer and the Sparse Self-Attention within Windows (SSAW) layer. DA- Net can mine essential local sequence fragments that are necessary for establishing global long-range dependencies based on the two expanded layers.
△ Less
Submitted 26 August, 2023;
originally announced August 2023.
-
A Genetic Algorithm Meta-Heuristic for a Generalized Quadratic Assignment Problem
Authors:
Mojtaba A. Farahani,
Alan McKendall
Abstract:
The generalized quadratic assignment problem (GQAP) is one of the hardest problems to solve in the operations research area. The GQAP addressed in this work is defined as the task of minimizing the assignment and transportation costs of assigning a set of facilities to a set of locations. The facilities have different space requirements, and the locations have different space capacities. Multiple…
▽ More
The generalized quadratic assignment problem (GQAP) is one of the hardest problems to solve in the operations research area. The GQAP addressed in this work is defined as the task of minimizing the assignment and transportation costs of assigning a set of facilities to a set of locations. The facilities have different space requirements, and the locations have different space capacities. Multiple facilities can be assigned to each location if the space capacity is not violated. In this work, three instances of GQAP in different situations are presented. Then, a genetic algorithm is developed to solve the GQAP instances. Finally, the local neighborhood search with the steepest descend strategy is constructed and applied to the final solution obtained by the GA, and the final solution is compared with the best solution found by MPL/CPLEX software and reference papers. The results show that the developed GA heuristic is effective for solving the GQAP.
△ Less
Submitted 9 October, 2023; v1 submitted 15 August, 2023;
originally announced August 2023.
-
An Empirical Study of Multitask Learning to Improve Open Domain Dialogue Systems
Authors:
Mehrdad Farahani,
Richard Johansson
Abstract:
Autoregressive models used to generate responses in open-domain dialogue systems often struggle to take long-term context into account and to maintain consistency over a dialogue. Previous research in open-domain dialogue generation has shown that the use of \emph{auxiliary tasks} can introduce inductive biases that encourage the model to improve these qualities. However, most previous research ha…
▽ More
Autoregressive models used to generate responses in open-domain dialogue systems often struggle to take long-term context into account and to maintain consistency over a dialogue. Previous research in open-domain dialogue generation has shown that the use of \emph{auxiliary tasks} can introduce inductive biases that encourage the model to improve these qualities. However, most previous research has focused on encoder-only or encoder/decoder models, while the use of auxiliary tasks in \emph{decoder-only} autoregressive models is under-explored. This paper describes an investigation where four different auxiliary tasks are added to small and medium-sized GPT-2 models fine-tuned on the PersonaChat and DailyDialog datasets. The results show that the introduction of the new auxiliary tasks leads to small but consistent improvement in evaluations of the investigated models.
△ Less
Submitted 17 April, 2023;
originally announced April 2023.
-
Time-Series Pattern Recognition in Smart Manufacturing Systems: A Literature Review and Ontology
Authors:
Mojtaba A. Farahani,
M. R. McCormick,
Robert Gianinny,
Frank Hudacheck,
Ramy Harik,
Zhichao Liu,
Thorsten Wuest
Abstract:
Since the inception of Industry 4.0 in 2012, emerging technologies have enabled the acquisition of vast amounts of data from diverse sources such as machine tools, robust and affordable sensor systems with advanced information models, and other sources within Smart Manufacturing Systems (SMS). As a result, the amount of data that is available in manufacturing settings has exploded, allowing data-h…
▽ More
Since the inception of Industry 4.0 in 2012, emerging technologies have enabled the acquisition of vast amounts of data from diverse sources such as machine tools, robust and affordable sensor systems with advanced information models, and other sources within Smart Manufacturing Systems (SMS). As a result, the amount of data that is available in manufacturing settings has exploded, allowing data-hungry tools such as Artificial Intelligence (AI) and Machine Learning (ML) to be leveraged. Time-series analytics has been successfully applied in a variety of industries, and that success is now being migrated to pattern recognition applications in manufacturing to support higher quality products, zero defect manufacturing, and improved customer satisfaction. However, the diverse landscape of manufacturing presents a challenge for successfully solving problems in industry using time-series pattern recognition. The resulting research gap of understanding and applying the subject matter of time-series pattern recognition in manufacturing is a major limiting factor for adoption in industry. The purpose of this paper is to provide a structured perspective of the current state of time-series pattern recognition in manufacturing with a problem-solving focus. By using an ontology to classify and define concepts, how they are structured, their properties, the relationships between them, and considerations when applying them, this paper aims to provide practical and actionable guidelines for application and recommendations for advancing time-series analytics.
△ Less
Submitted 6 March, 2023; v1 submitted 29 January, 2023;
originally announced January 2023.
-
Leveraging ParsBERT and Pretrained mT5 for Persian Abstractive Text Summarization
Authors:
Mehrdad Farahani,
Mohammad Gharachorloo,
Mohammad Manthouri
Abstract:
Text summarization is one of the most critical Natural Language Processing (NLP) tasks. More and more researches are conducted in this field every day. Pre-trained transformer-based encoder-decoder models have begun to gain popularity for these tasks. This paper proposes two methods to address this task and introduces a novel dataset named pn-summary for Persian abstractive text summarization. The…
▽ More
Text summarization is one of the most critical Natural Language Processing (NLP) tasks. More and more researches are conducted in this field every day. Pre-trained transformer-based encoder-decoder models have begun to gain popularity for these tasks. This paper proposes two methods to address this task and introduces a novel dataset named pn-summary for Persian abstractive text summarization. The models employed in this paper are mT5 and an encoder-decoder version of the ParsBERT model (i.e., a monolingual BERT model for Persian). These models are fine-tuned on the pn-summary dataset. The current work is the first of its kind and, by achieving promising results, can serve as a baseline for any future work.
△ Less
Submitted 21 December, 2020;
originally announced December 2020.
-
ParsBERT: Transformer-based Model for Persian Language Understanding
Authors:
Mehrdad Farahani,
Mohammad Gharachorloo,
Marzieh Farahani,
Mohammad Manthouri
Abstract:
The surge of pre-trained language models has begun a new era in the field of Natural Language Processing (NLP) by allowing us to build powerful language models. Among these models, Transformer-based models such as BERT have become increasingly popular due to their state-of-the-art performance. However, these models are usually focused on English, leaving other languages to multilingual models with…
▽ More
The surge of pre-trained language models has begun a new era in the field of Natural Language Processing (NLP) by allowing us to build powerful language models. Among these models, Transformer-based models such as BERT have become increasingly popular due to their state-of-the-art performance. However, these models are usually focused on English, leaving other languages to multilingual models with limited resources. This paper proposes a monolingual BERT for the Persian language (ParsBERT), which shows its state-of-the-art performance compared to other architectures and multilingual models. Also, since the amount of data available for NLP tasks in Persian is very restricted, a massive dataset for different NLP tasks as well as pre-training the model is composed. ParsBERT obtains higher scores in all datasets, including existing ones as well as composed ones and improves the state-of-the-art performance by outperforming both multilingual BERT and other prior works in Sentiment Analysis, Text Classification and Named Entity Recognition tasks.
△ Less
Submitted 31 May, 2020; v1 submitted 26 May, 2020;
originally announced May 2020.
-
Short-Term Traffic Flow Prediction Using Variational LSTM Networks
Authors:
Mehrdad Farahani,
Marzieh Farahani,
Mohammad Manthouri,
Okyay Kaynak
Abstract:
Traffic flow characteristics are one of the most critical decision-making and traffic policing factors in a region. Awareness of the predicted status of the traffic flow has prime importance in traffic management and traffic information divisions. The purpose of this research is to suggest a forecasting model for traffic flow by using deep learning techniques based on historical data in the Intell…
▽ More
Traffic flow characteristics are one of the most critical decision-making and traffic policing factors in a region. Awareness of the predicted status of the traffic flow has prime importance in traffic management and traffic information divisions. The purpose of this research is to suggest a forecasting model for traffic flow by using deep learning techniques based on historical data in the Intelligent Transportation Systems area. The historical data collected from the Caltrans Performance Measurement Systems (PeMS) for six months in 2019. The proposed prediction model is a Variational Long Short-Term Memory Encoder in brief VLSTM-E try to estimate the flow accurately in contrast to other conventional methods. VLSTM-E can provide more reliable short-term traffic flow by considering the distribution and missing values.
△ Less
Submitted 18 February, 2020;
originally announced February 2020.