-
DETONATE: A Benchmark for Text-to-Image Alignment and Kernelized Direct Preference Optimization
Authors:
Renjith Prasad,
Abhilekh Borah,
Hasnat Md Abdullah,
Chathurangi Shyalika,
Gurpreet Singh,
Ritvik Garimella,
Rajarshi Roy,
Harshul Surana,
Nasrin Imanpour,
Suranjana Trivedy,
Amit Sheth,
Amitava Das
Abstract:
Alignment is crucial for text-to-image (T2I) models to ensure that generated images faithfully capture user intent while maintaining safety and fairness. Direct Preference Optimization (DPO), prominent in large language models (LLMs), is extending its influence to T2I systems. This paper introduces DPO-Kernels for T2I models, a novel extension enhancing alignment across three dimensions: (i) Hybrid Loss, integrating embedding-based objectives with traditional probability-based loss for improved optimization; (ii) Kernelized Representations, employing Radial Basis Function (RBF), Polynomial, and Wavelet kernels for richer feature transformations and better separation between safe and unsafe inputs; and (iii) Divergence Selection, expanding beyond DPO's default Kullback-Leibler (KL) regularizer by incorporating Wasserstein and Rényi divergences for enhanced stability and robustness. We introduce DETONATE, the first large-scale benchmark of its kind, comprising approximately 100K curated image pairs categorized as chosen and rejected. DETONATE encapsulates three axes of social bias and discrimination: Race, Gender, and Disability. Prompts are sourced from hate speech datasets, with images generated by leading T2I models including Stable Diffusion 3.5 Large, Stable Diffusion XL, and Midjourney. Additionally, we propose the Alignment Quality Index (AQI), a novel geometric measure quantifying latent-space separability of safe/unsafe image activations, revealing hidden vulnerabilities. Empirically, we demonstrate that DPO-Kernels maintain strong generalization bounds via Heavy-Tailed Self-Regularization (HT-SR). DETONATE and complete code are publicly released.
Submitted 17 June, 2025;
originally announced June 2025.
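The DETONATE abstract above names RBF and Polynomial kernels among its kernelized representations. As a rough illustration only, the sketch below uses the standard textbook definitions of those two kernels on plain Python vectors. It is not the paper's implementation: the function names and the parameters `gamma`, `degree`, and `coef0` are assumptions chosen for this example, and the Wavelet kernel is omitted because the abstract does not give its form.

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], y: Sequence[float], gamma: float = 1.0) -> float:
    """Textbook RBF kernel: exp(-gamma * ||x - y||^2). `gamma` is an assumed parameter."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def polynomial_kernel(
    x: Sequence[float],
    y: Sequence[float],
    degree: int = 2,
    coef0: float = 1.0,
) -> float:
    """Textbook polynomial kernel: (x . y + coef0) ** degree. Parameters are assumed."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    dot_product = sum(a * b for a, b in zip(x, y))
    return (dot_product + coef0) ** degree
```

With these definitions, identical vectors give an RBF similarity of exactly 1, and the similarity decays toward 0 as the vectors move apart.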
-
SmartPilot: A Multiagent CoPilot for Adaptive and Intelligent Manufacturing
Authors:
Chathurangi Shyalika,
Renjith Prasad,
Alaa Al Ghazo,
Darssan Eswaramoorthi,
Harleen Kaur,
Sara Shree Muthuselvam,
Amit Sheth
Abstract:
In the dynamic landscape of Industry 4.0, achieving efficiency, precision, and adaptability is essential to optimize manufacturing operations. Industries suffer from supply chain disruptions caused by anomalies; current AI models detect these anomalies but leave domain experts uncertain because they offer no deeper insight into them. Additionally, operational inefficiencies persist due to inaccurate production forecasts and the limited effectiveness of traditional AI models for processing complex sensor data. Despite these advancements, existing systems lack the seamless integration of these capabilities needed to create a truly unified solution for enhancing production and decision-making. We propose SmartPilot, a neurosymbolic, multiagent CoPilot designed for advanced reasoning and contextual decision-making to address these challenges. SmartPilot processes multimodal sensor data and is compact enough to deploy on edge devices. It focuses on three key tasks: anomaly prediction, production forecasting, and domain-specific question answering. By bridging the gap between AI capabilities and real-world industrial needs, SmartPilot empowers industries with intelligent decision-making and drives transformative innovation in manufacturing. The demonstration video, datasets, and supplementary materials are available at https://github.com/ChathurangiShyalika/SmartPilot.
Submitted 9 May, 2025;
originally announced May 2025.
-
NSF-MAP: Neurosymbolic Multimodal Fusion for Robust and Interpretable Anomaly Prediction in Assembly Pipelines
Authors:
Chathurangi Shyalika,
Renjith Prasad,
Fadi El Kalach,
Revathy Venkataramanan,
Ramtin Zand,
Ramy Harik,
Amit Sheth
Abstract:
In modern assembly pipelines, identifying anomalies is crucial in ensuring product quality and operational efficiency. Conventional single-modality methods fail to capture the intricate relationships required for precise anomaly prediction in complex predictive environments with abundant data and multiple modalities. This paper proposes a neurosymbolic AI and fusion-based approach for multimodal anomaly prediction in assembly pipelines. We introduce a time series and image-based fusion model that leverages decision-level fusion techniques. Our research builds upon three primary novel approaches in multimodal learning: time series and image-based decision-level fusion modeling, transfer learning for fusion, and knowledge-infused learning. We evaluate the novel method using our derived and publicly available multimodal dataset and conduct comprehensive ablation studies to assess the impact of our preprocessing techniques and fusion model compared to traditional baselines. The results demonstrate that a neurosymbolic AI-based fusion approach that uses transfer learning can effectively harness the complementary strengths of time series and image data, offering a robust and interpretable approach for anomaly prediction in assembly pipelines with enhanced performance. The datasets, code to reproduce the results, supplementary materials, and demo are available at https://github.com/ChathurangiShyalika/NSF-MAP.
Submitted 9 May, 2025;
originally announced May 2025.
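The NSF-MAP abstract above refers to decision-level fusion of a time series model and an image model but does not state the fusion rule. As a generic illustration, the sketch below shows one common form of decision-level fusion: a weighted average of the per-class probabilities produced by two independently trained models. The function name `fuse_decisions` and the `ts_weight` parameter are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def fuse_decisions(
    ts_probs: Sequence[float],
    img_probs: Sequence[float],
    ts_weight: float = 0.5,
) -> Tuple[int, List[float]]:
    """Combine two models' class probabilities by weighted averaging.

    ts_probs and img_probs are per-class probabilities from a time series
    model and an image model (hypothetical inputs). ts_weight is the weight
    given to the time series model; the image model gets 1 - ts_weight.
    Returns the index of the winning class and the fused probabilities.
    """
    if len(ts_probs) != len(img_probs) or len(ts_probs) == 0:
        raise ValueError("both models must score the same non-empty set of classes")
    if not 0.0 <= ts_weight <= 1.0:
        raise ValueError("ts_weight must lie in [0, 1]")
    fused = [
        ts_weight * p + (1.0 - ts_weight) * q
        for p, q in zip(ts_probs, img_probs)
    ]
    # The predicted class is the one with the highest fused probability.
    label = max(range(len(fused)), key=fused.__getitem__)
    return label, fused
```

Because fusion happens on the outputs rather than on the features, each model can be trained and replaced independently, which is the usual motivation for fusing at the decision level.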
-
Time Series Foundational Models: Their Role in Anomaly Detection and Prediction
Authors:
Chathurangi Shyalika,
Harleen Kaur Bagga,
Ahan Bhatt,
Renjith Prasad,
Alaa Al Ghazo,
Amit Sheth
Abstract:
Time series foundational models (TSFMs) have gained prominence in time series forecasting, promising state-of-the-art performance across various applications. However, their application in anomaly detection and prediction remains underexplored, with growing concerns regarding their black-box nature, lack of interpretability and applicability. This paper critically evaluates the efficacy of TSFMs in anomaly detection and prediction tasks. We systematically analyze TSFMs across multiple datasets, including those characterized by the absence of discernible patterns, trends and seasonality. Our analysis shows that while TSFMs can be extended for anomaly detection and prediction, traditional statistical and deep learning models often match or outperform TSFMs in these tasks. Additionally, TSFMs require high computational resources but fail to capture sequential dependencies effectively or improve performance in few-shot or zero-shot scenarios. The preprocessed datasets, code to reproduce the results and supplementary materials are available at https://github.com/smtmnfg/TSFM.
Submitted 26 December, 2024;
originally announced December 2024.
-
AssemAI: Interpretable Image-Based Anomaly Detection for Manufacturing Pipelines
Authors:
Renjith Prasad,
Chathurangi Shyalika,
Ramtin Zand,
Fadi El Kalach,
Revathy Venkataramanan,
Ramy Harik,
Amit Sheth
Abstract:
Anomaly detection in manufacturing pipelines remains a critical challenge, intensified by the complexity and variability of industrial environments. This paper introduces AssemAI, an interpretable image-based anomaly detection system tailored for smart manufacturing pipelines. Utilizing a curated image dataset from an industry-focused rocket assembly pipeline, we address the challenge of imbalanced image data and demonstrate the importance of image-based methods in anomaly detection. Our primary contributions include deriving an image dataset, fine-tuning an object detection model, YOLO-FF, and implementing a custom anomaly detection model for assembly pipelines. The proposed approach leverages domain knowledge in data preparation, model development and reasoning. We implement several anomaly detection models on the derived image dataset, including a Convolutional Neural Network, Vision Transformer (ViT), and pre-trained versions of these models. Additionally, we incorporate explainability techniques at both user and model levels, utilizing ontology for user-level explanations and SCORE-CAM for in-depth feature and model analysis. Finally, the best-performing anomaly detection model and YOLO-FF are deployed in a real-time setting. Our results include ablation studies on the baselines and a comprehensive evaluation of the proposed system. This work highlights the broader impact of advanced image-based anomaly detection in enhancing the reliability and efficiency of smart manufacturing processes. The image dataset, code to reproduce the results and additional experiments are available at https://github.com/renjithk4/AssemAI.
Submitted 16 October, 2024; v1 submitted 4 August, 2024;
originally announced August 2024.
-
Evaluating the Role of Data Enrichment Approaches Towards Rare Event Analysis in Manufacturing
Authors:
Chathurangi Shyalika,
Ruwan Wickramarachchi,
Fadi El Kalach,
Ramy Harik,
Amit Sheth
Abstract:
Rare events are occurrences that take place with a significantly lower frequency than more common regular events. In manufacturing, predicting such events is particularly important, as they lead to unplanned downtime, shortened equipment lifespan, and high energy consumption. An event is considered frequently-rare if it is observed in more than 10% of all instances, moderately-rare if in 5-10%, very-rare if in 1-5%, and extremely-rare if in less than 1%. The rarity of events is inversely correlated with the maturity of a manufacturing industry. Typically, the rarity of events causes the multivariate data generated within a manufacturing process to be highly imbalanced, which leads to bias in predictive models. This paper evaluates the role of data enrichment techniques combined with supervised machine-learning techniques for rare event detection and prediction. To address the data scarcity, we use time series data augmentation and sampling methods to amplify the dataset with more multivariate features and data points while preserving the underlying time series patterns in the combined alterations. Imputation techniques are used to handle null values in datasets. Considering 15 learning models ranging from statistical learning to machine learning to deep learning methods, the best-performing model for the selected datasets is obtained and the efficacy of data enrichment is evaluated. Based on this evaluation, we find that the enrichment procedure improves the F1 measure of supervised prediction models by up to 48% in rare failure event detection and prediction. We also conduct empirical and ablation experiments on the datasets to derive dataset-specific novel insights. Finally, we investigate the interpretability aspect of models for rare event prediction, considering multiple methods.
Submitted 30 June, 2024;
originally announced July 2024.
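The rarity categories defined in the abstract above (more than 10%, 5-10%, 1-5%, less than 1%) can be expressed as a small lookup. The sketch below is illustrative only: the function name `rarity_category` is hypothetical, and because the abstract does not say which category an event rate of exactly 5% or exactly 1% belongs to, the boundary handling here (1% counts as very-rare, 5% and 10% count as moderately-rare) is an assumption.

```python
def rarity_category(event_rate: float) -> str:
    """Map an event rate, given as a fraction in [0, 1], to a rarity category.

    Thresholds follow the abstract; the treatment of exact boundary values
    (0.01, 0.05, 0.10) is an assumption, since the abstract leaves it open.
    """
    if not 0.0 <= event_rate <= 1.0:
        raise ValueError("event_rate must be a fraction between 0 and 1")
    if event_rate < 0.01:
        return "extremely-rare"
    if event_rate < 0.05:
        return "very-rare"
    if event_rate <= 0.10:
        return "moderately-rare"
    return "frequently-rare"
```

For example, a dataset in which 3 out of every 100 instances are failures has an event rate of 0.03 and falls in the very-rare category under these thresholds.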
-
A Comprehensive Survey on Rare Event Prediction
Authors:
Chathurangi Shyalika,
Ruwan Wickramarachchi,
Amit Sheth
Abstract:
Rare event prediction involves identifying and forecasting events with a low probability using machine learning (ML) and data analysis. Due to the imbalanced data distributions, where the frequency of common events vastly outweighs that of rare events, it requires using specialized methods within each step of the ML pipeline, i.e., from data processing to algorithms to evaluation protocols. Predicting the occurrences of rare events is important for real-world applications, such as Industry 4.0, and is an active research area in statistics and ML. This paper comprehensively reviews the current approaches for rare event prediction along four dimensions: rare event data, data processing, algorithmic approaches, and evaluation approaches. Specifically, we consider 73 datasets from different modalities (i.e., numerical, image, text, and audio), four major categories of data processing, five major algorithmic groupings, and two broader evaluation approaches. This paper aims to identify gaps in the current literature and highlight the challenges of predicting rare events. It also suggests potential research directions, which can help guide practitioners and researchers.
Submitted 5 October, 2024; v1 submitted 20 September, 2023;
originally announced September 2023.