FailureSensorIQ: A Multi-Choice QA Dataset for Understanding Sensor Relationships and Failure Modes
Authors:
Christodoulos Constantinides,
Dhaval Patel,
Shuxin Lin,
Claudio Guerrero,
Sunil Dagajirao Patil,
Jayant Kalagnanam
Abstract:
We introduce FailureSensorIQ, a novel Multi-Choice Question-Answering (MCQA) benchmarking system designed to assess the ability of Large Language Models (LLMs) to reason about and understand complex, domain-specific scenarios in Industry 4.0. Unlike traditional QA benchmarks, our system focuses on multiple aspects of reasoning through failure modes, sensor data, and the relationships between them across various industrial assets. Through this work, we envision a paradigm shift where modeling decisions are not only data-driven, using statistical tools like correlation analysis and significance tests, but also domain-driven by specialized LLMs that can reason about the key contributors and useful patterns that can be captured with feature engineering. We evaluate the industrial knowledge of over a dozen LLMs, including GPT-4, Llama, and Mistral, on FailureSensorIQ through several lenses: Perturbation-Uncertainty-Complexity analysis, an Expert Evaluation study, Asset-Specific Knowledge Gap analysis, and a ReAct agent using external knowledge bases. Even though closed-source models with strong reasoning capabilities approach expert-level performance, the comprehensive benchmark reveals that this performance is fragile: it drops significantly under perturbations, distractions, and inherent knowledge gaps in the models. We also provide a real-world case study of how LLMs can drive the modeling decisions on three different failure prediction datasets related to various assets. We release: (a) expert-curated MCQA for various industrial assets, (b) the FailureSensorIQ benchmark and Hugging Face leaderboard based on MCQA built from non-textual data found in ISO documents, and (c) LLMFeatureSelector, an LLM-based feature selection scikit-learn pipeline. The software is available at https://github.com/IBM/FailureSensorIQ.
Submitted 3 June, 2025;
originally announced June 2025.
32-Bit RISC-V CPU Core on Logisim
Authors:
Siddesh D. Patil,
Premraj V. Jadhav,
Siddharth Sankhe
Abstract:
This project focuses on building a RISC-V CPU core using the Logisim software. RISC-V is significant because it allows smaller device manufacturers to build hardware without paying royalties, and allows developers and researchers to design and experiment with a proven and freely available instruction set architecture. RISC-V is suited to a variety of applications, from IoT devices to embedded systems such as disks, CPUs, calculators, and SoCs. RISC-V is an open standard instruction set architecture (ISA) based on established reduced instruction set computer (RISC) principles. Unlike most other ISA designs, the RISC-V ISA is provided under open source licenses that do not require fees to use.
Submitted 3 December, 2023;
originally announced December 2023.
DEMO: Simulation of Realistic Mobility Model and Implementation of 802.11p (DSRC) for Vehicular Networks (VANET)
Authors:
Saurabh D. Patil,
D. V. Thombare,
Vaishali D. Khairnar
Abstract:
A vehicular ad hoc network (VANET) consists of vehicles that exchange information via radio in order to improve road safety and traffic management and to distribute traffic load better in time and space. It also provides Internet access for passengers and other vehicle users. An important requirement when studying VANETs is a mobility model that reflects real vehicular traffic, because the mobility scenario plays an important role in VANET performance. In this paper we demonstrate and describe how to generate a realistic mobility model using tools such as eWorld, OpenStreetMap, SUMO and TraNS. The generated mobility scenario is loaded into NS-2.34 (Network Simulator) to analyze the DSR and AODV routing protocols under 802.11p (DSRC/WAVE) and 802.11a. The results of the analysis show that 802.11p is more suitable than 802.11a for VANETs.
Submitted 18 April, 2013;
originally announced April 2013.