-
Integrating Artificial Intelligence Models and Synthetic Image Data for Enhanced Asset Inspection and Defect Identification
Authors:
Reddy Mandati,
Vladyslav Anderson,
Po-chen Chen,
Ankush Agarwal,
Tatjana Dokic,
David Barnard,
Michael Finn,
Jesse Cromer,
Andrew Mccauley,
Clay Tutaj,
Neha Dave,
Bobby Besharati,
Jamie Barnett,
Timothy Krall
Abstract:
In the past utilities relied on in-field inspections to identify asset defects. Recently, utilities have started using drone-based inspections to enhance the field-inspection process. We consider a vast repository of drone images, providing a wealth of information about asset health and potential issues. However, making the collected imagery data useful for automated defect detection requires sign…
▽ More
In the past utilities relied on in-field inspections to identify asset defects. Recently, utilities have started using drone-based inspections to enhance the field-inspection process. We consider a vast repository of drone images, providing a wealth of information about asset health and potential issues. However, making the collected imagery data useful for automated defect detection requires significant manual labeling effort. We propose a novel solution that combines synthetic asset defect images with manually labeled drone images. This solution has several benefits: improves performance of defect detection, reduces the number of hours spent on manual labeling, and enables the capability to generate realistic images of rare defects where not enough real-world data is available. We employ a workflow that combines 3D modeling tools such as Maya and Unreal Engine to create photorealistic 3D models and 2D renderings of defective assets and their surroundings. These synthetic images are then integrated into our training pipeline augmenting the real data. This study implements an end-to-end Artificial Intelligence solution to detect assets and asset defects from the combined imagery repository. The unique contribution of this research lies in the application of advanced computer vision models and the generation of photorealistic 3D renderings of defective assets, aiming to transform the asset inspection process. Our asset detection model has achieved an accuracy of 92 percent, we achieved a performance lift of 67 percent when introducing approximately 2,000 synthetic images of 2k resolution. In our tests, the defect detection model achieved an accuracy of 73 percent across two batches of images. Our analysis demonstrated that synthetic data can be successfully used in place of real-world manually labeled data to train defect detection model.
△ Less
Submitted 15 October, 2024;
originally announced October 2024.
-
Precision, Stability, and Generalization: A Comprehensive Assessment of RNNs learnability capability for Classifying Counter and Dyck Languages
Authors:
Neisarg Dave,
Daniel Kifer,
Lee Giles,
Ankur Mali
Abstract:
This study investigates the learnability of Recurrent Neural Networks (RNNs) in classifying structured formal languages, focusing on counter and Dyck languages. Traditionally, both first-order (LSTM) and second-order (O2RNN) RNNs have been considered effective for such tasks, primarily based on their theoretical expressiveness within the Chomsky hierarchy. However, our research challenges this not…
▽ More
This study investigates the learnability of Recurrent Neural Networks (RNNs) in classifying structured formal languages, focusing on counter and Dyck languages. Traditionally, both first-order (LSTM) and second-order (O2RNN) RNNs have been considered effective for such tasks, primarily based on their theoretical expressiveness within the Chomsky hierarchy. However, our research challenges this notion by demonstrating that RNNs primarily operate as state machines, where their linguistic capabilities are heavily influenced by the precision of their embeddings and the strategies used for sampling negative examples. Our experiments revealed that performance declines significantly as the structural similarity between positive and negative examples increases. Remarkably, even a basic single-layer classifier using RNN embeddings performed better than chance. To evaluate generalization, we trained models on strings up to a length of 40 and tested them on strings from lengths 41 to 500, using 10 unique seeds to ensure statistical robustness. Stability comparisons between LSTM and O2RNN models showed that O2RNNs generally offer greater stability across various scenarios. We further explore the impact of different initialization strategies revealing that our hypothesis is consistent with various RNNs. Overall, this research questions established beliefs about RNNs' computational capabilities, highlighting the importance of data structure and sampling techniques in assessing neural networks' potential for language classification tasks. It emphasizes that stronger constraints on expressivity are crucial for understanding true learnability, as mere expressivity does not capture the essence of learning.
△ Less
Submitted 3 October, 2024;
originally announced October 2024.
-
Investigating Symbolic Capabilities of Large Language Models
Authors:
Neisarg Dave,
Daniel Kifer,
C. Lee Giles,
Ankur Mali
Abstract:
Prompting techniques have significantly enhanced the capabilities of Large Language Models (LLMs) across various complex tasks, including reasoning, planning, and solving math word problems. However, most research has predominantly focused on language-based reasoning and word problems, often overlooking the potential of LLMs in handling symbol-based calculations and reasoning. This study aims to b…
▽ More
Prompting techniques have significantly enhanced the capabilities of Large Language Models (LLMs) across various complex tasks, including reasoning, planning, and solving math word problems. However, most research has predominantly focused on language-based reasoning and word problems, often overlooking the potential of LLMs in handling symbol-based calculations and reasoning. This study aims to bridge this gap by rigorously evaluating LLMs on a series of symbolic tasks, such as addition, multiplication, modulus arithmetic, numerical precision, and symbolic counting. Our analysis encompasses eight LLMs, including four enterprise-grade and four open-source models, of which three have been pre-trained on mathematical tasks. The assessment framework is anchored in Chomsky's Hierarchy, providing a robust measure of the computational abilities of these models. The evaluation employs minimally explained prompts alongside the zero-shot Chain of Thoughts technique, allowing models to navigate the solution process autonomously. The findings reveal a significant decline in LLMs' performance on context-free and context-sensitive symbolic tasks as the complexity, represented by the number of symbols, increases. Notably, even the fine-tuned GPT3.5 exhibits only marginal improvements, mirroring the performance trends observed in other models. Across the board, all models demonstrated a limited generalization ability on these symbol-intensive tasks. This research underscores LLMs' challenges with increasing symbolic complexity and highlights the need for specialized training, memory and architectural adjustments to enhance their proficiency in symbol-based reasoning tasks.
△ Less
Submitted 21 May, 2024;
originally announced May 2024.
-
Stability Analysis of Various Symbolic Rule Extraction Methods from Recurrent Neural Network
Authors:
Neisarg Dave,
Daniel Kifer,
C. Lee Giles,
Ankur Mali
Abstract:
This paper analyzes two competing rule extraction methodologies: quantization and equivalence query. We trained $3600$ RNN models, extracting $18000$ DFA with a quantization approach (k-means and SOM) and $3600$ DFA by equivalence query($L^{*}$) methods across $10$ initialization seeds. We sampled the datasets from $7$ Tomita and $4$ Dyck grammars and trained them on $4$ RNN cells: LSTM, GRU, O2RN…
▽ More
This paper analyzes two competing rule extraction methodologies: quantization and equivalence query. We trained $3600$ RNN models, extracting $18000$ DFA with a quantization approach (k-means and SOM) and $3600$ DFA by equivalence query($L^{*}$) methods across $10$ initialization seeds. We sampled the datasets from $7$ Tomita and $4$ Dyck grammars and trained them on $4$ RNN cells: LSTM, GRU, O2RNN, and MIRNN. The observations from our experiments establish the superior performance of O2RNN and quantization-based rule extraction over others. $L^{*}$, primarily proposed for regular grammars, performs similarly to quantization methods for Tomita languages when neural networks are perfectly trained. However, for partially trained RNNs, $L^{*}$ shows instability in the number of states in DFA, e.g., for Tomita 5 and Tomita 6 languages, $L^{*}$ produced more than $100$ states. In contrast, quantization methods result in rules with number of states very close to ground truth DFA. Among RNN cells, O2RNN produces stable DFA consistently compared to other cells. For Dyck Languages, we observe that although GRU outperforms other RNNs in network performance, the DFA extracted by O2RNN has higher performance and better stability. The stability is computed as the standard deviation of accuracy on test sets on networks trained across $10$ seeds. On Dyck Languages, quantization methods outperformed $L^{*}$ with better stability in accuracy and the number of states. $L^{*}$ often showed instability in accuracy in the order of $16\% - 22\%$ for GRU and MIRNN while deviation for quantization methods varied in $5\% - 15\%$. In many instances with LSTM and GRU, DFA's extracted by $L^{*}$ even failed to beat chance accuracy ($50\%$), while those extracted by quantization method had standard deviation in the $7\%-17\%$ range. For O2RNN, both rule extraction methods had deviation in the $0.5\% - 3\%$ range.
△ Less
Submitted 4 February, 2024;
originally announced February 2024.