BELL: Benchmarking the Explainability of Large Language Models
Authors:
Syed Quiser Ahmed,
Bharathi Vokkaliga Ganesh,
Jagadish Babu P,
Karthick Selvaraj,
ReddySiva Naga Parvathi Devi,
Sravya Kappala
Abstract:
Large Language Models (LLMs) have demonstrated remarkable capabilities in natural language processing, yet their decision-making processes often lack transparency. This opacity raises significant concerns regarding trust, bias, and model performance. To address these issues, understanding and evaluating the interpretability of LLMs is crucial. This paper introduces a standardised benchmarking technique, Benchmarking the Explainability of Large Language Models (BELL), designed to evaluate the explainability of LLMs.
Submitted 22 April, 2025;
originally announced April 2025.
A Comprehensive Review of Adversarial Attacks on Machine Learning
Authors:
Syed Quiser Ahmed,
Bharathi Vokkaliga Ganesh,
Sathyanarayana Sampath Kumar,
Prakhar Mishra,
Ravi Anand,
Bhanuteja Akurathi
Abstract:
This research provides a comprehensive overview of adversarial attacks on AI and ML models, exploring various attack types, techniques, and their potential harms. We also delve into the business implications, mitigation strategies, and future research directions. To gain practical insights, we employ the Adversarial Robustness Toolbox (ART) [1] library to simulate these attacks on real-world use cases, such as self-driving cars. Our goal is to inform practitioners and researchers about the challenges and opportunities in defending AI systems against adversarial threats. By providing a comprehensive comparison of different attack methods, we aim to contribute to the development of more robust and secure AI systems.
Submitted 15 December, 2024;
originally announced December 2024.