Search | arXiv e-print repository

doi 10.1145/3715275.3732186

XAI-Units: Benchmarking Explainability Methods with Unit Tests

Authors: Jun Rui Lee, Sadegh Emami, Michael David Hollins, Timothy C. H. Wong, Carlos Ignacio Villalobos Sánchez, Francesca Toni, Dekai Zhang, Adam Dejl

Abstract: Feature attribution (FA) methods are widely used in explainable AI (XAI) to help users understand how the inputs of a machine learning model contribute to its outputs. However, different FA models often provide disagreeing importance scores for the same model. In the absence of ground truth or in-depth knowledge about the inner workings of the model, it is often difficult to meaningfully determine which of the different FA methods produce more suitable explanations in different contexts. As a step towards addressing this issue, we introduce the open-source XAI-Units benchmark, specifically designed to evaluate FA methods against diverse types of model behaviours, such as feature interactions, cancellations, and discontinuous outputs. Our benchmark provides a set of paired datasets and models with known internal mechanisms, establishing clear expectations for desirable attribution scores. Accompanied by a suite of built-in evaluation metrics, XAI-Units streamlines systematic experimentation and reveals how FA methods perform against distinct, atomic kinds of model reasoning, similar to unit tests in software engineering. Crucially, by using procedurally generated models tied to synthetic datasets, we pave the way towards an objective and reliable comparison of FA methods.

Submitted 1 June, 2025; originally announced June 2025.

Comments: Accepted at FAccT 2025

arXiv:2302.11327 [pdf, other]

doi 10.1109/OJSP.2023.3279011

A Gradient Boosting Approach for Training Convolutional and Deep Neural Networks

Authors: Seyedsaman Emami, Gonzalo Martínez-Muñoz

Abstract: Deep learning has revolutionized the computer vision and image classification domains. In this context, architectures based on Convolutional Neural Networks (CNNs) are the most widely applied models. In this article, we introduce two procedures for training CNNs and Deep Neural Networks (DNNs) based on Gradient Boosting (GB), namely GB-CNN and GB-DNN. These models are trained to fit the gradient of the loss function, or pseudo-residuals, of previous models. At each iteration, the proposed method adds one dense layer to an exact copy of the previous deep NN model. The weights of the dense layers trained in previous iterations are frozen to prevent over-fitting, permitting the model to fit the new dense layer as well as to fine-tune the convolutional layers (for GB-CNN) while still utilizing the information already learned. Through extensive experimentation on different 2D-image classification and tabular datasets, the presented models show superior performance in terms of classification accuracy with respect to standard CNNs and DNNs with the same architectures.

Submitted 23 February, 2023; v1 submitted 22 February, 2023; originally announced February 2023.

arXiv:2211.14599 [pdf, ps, other]

doi 10.1007/s13042-024-02279-0

Condensed Gradient Boosting

Authors: Seyedsaman Emami, Gonzalo Martínez-Muñoz

Abstract: This paper presents a computationally efficient variant of gradient boosting for multi-class classification and multi-output regression tasks. Standard gradient boosting uses a 1-vs-all strategy for classification tasks with more than two classes. This strategy requires training one tree per class at each iteration. In this work, we propose the use of multi-output regressors as base models to handle the multi-class problem as a single task. In addition, the proposed modification allows the model to learn multi-output regression problems. An extensive comparison with other multi-output based gradient boosting methods is carried out in terms of generalization and computational efficiency. The proposed method showed the best trade-off between generalization ability and training and prediction speeds.

Submitted 14 May, 2024; v1 submitted 26 November, 2022; originally announced November 2022.
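
The contrast drawn in the abstract above between the 1-vs-all strategy and a single multi-output base model can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a softmax log-loss, for which the pseudo-residual of each sample is the one-hot label minus the predicted class probabilities, and the function names (`pseudo_residuals`, `base_models_needed`) are hypothetical. Under 1-vs-all, each column of the residual matrix is fitted by its own tree; a multi-output regressor would fit all columns at once.

```python
import math


def softmax(scores):
    """Numerically stable softmax over one sample's raw class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_residuals(raw_scores, labels, n_classes):
    """Negative gradient of the softmax log-loss w.r.t. the raw scores:
    one-hot(label) - softmax(scores), i.e. one K-vector per sample."""
    residuals = []
    for scores, label in zip(raw_scores, labels):
        probs = softmax(scores)
        residuals.append(
            [(1.0 if k == label else 0.0) - probs[k] for k in range(n_classes)]
        )
    return residuals


def base_models_needed(n_classes, n_iterations, multi_output):
    """Number of base models trained over the whole boosting run (assumed
    bookkeeping: one per class per iteration, or one per iteration)."""
    return n_iterations if multi_output else n_classes * n_iterations


# Two samples, three classes, all raw scores initialised to zero.
raw_scores = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
labels = [0, 2]
R = pseudo_residuals(raw_scores, labels, n_classes=3)

# 1-vs-all fits one tree to each column of R; a multi-output regressor
# takes the full rows of R as its targets.
print(R)
print(base_models_needed(3, 100, multi_output=False))
print(base_models_needed(3, 100, multi_output=True))
```

With three classes and 100 iterations, the bookkeeping function counts 300 base models for 1-vs-all against 100 for the multi-output variant, which is the source of the computational saving the abstract describes.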

arXiv:2209.01380 [pdf, other]

Classification of Breast Tumours Based on Histopathology Images Using Deep Features and Ensemble of Gradient Boosting Methods

Authors: Mohammad Reza Abbasniya, Sayed Ali Sheikholeslamzadeh, Hamid Nasiri, Samaneh Emami

Abstract: Breast cancer is the most common cancer among women worldwide. Early-stage diagnosis of breast cancer can significantly improve the efficiency of treatment. Computer-aided diagnosis (CAD) systems are widely adopted for this task due to their reliability, accuracy and affordability. There are different imaging techniques for breast cancer diagnosis; one of the most accurate is histopathology, which is used in this paper. Deep feature transfer learning is the main idea behind the feature extractor of the proposed CAD system. Although 16 different pre-trained networks have been tested in this study, our main focus is on the classification phase. Inception-ResNet-v2, which combines the benefits of residual and inception networks, has shown the best feature extraction capability for breast cancer histopathology images among all tested CNNs. In the classification phase, the ensemble of CatBoost, XGBoost and LightGBM has provided the best average accuracy. The BreakHis dataset was used to evaluate the proposed method. BreakHis contains 7,909 histopathology images (2,480 benign and 5,429 malignant) at four magnification factors. The accuracy of the proposed method (IRv2-CXL), using 70% of the BreakHis dataset as training data, at 40x, 100x, 200x and 400x magnification is 96.82%, 95.84%, 97.01% and 96.15%, respectively. Most studies on automated breast cancer detection have focused on feature extraction, which motivated our focus on the classification phase. IRv2-CXL has shown better or comparable results at all magnifications due to the soft voting ensemble method, which combines the advantages of CatBoost, XGBoost and LightGBM.

Submitted 3 September, 2022; originally announced September 2022.

Comments: This work has been submitted to the Computers and Electrical Engineering journal (Elsevier) for possible publication
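
The soft voting step mentioned in the abstract above can be illustrated with a short sketch. It assumes the common definition of soft voting (average the per-class probabilities of all models with equal weights, then pick the class with the highest average); the paper's exact weighting is not stated in the listing. The probabilities below are made-up numbers, not outputs of CatBoost, XGBoost or LightGBM, and `soft_vote` and `hard_vote` are hypothetical helper names.

```python
def soft_vote(prob_lists):
    """Average class probabilities across models and return
    (averaged probabilities, index of the winning class)."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[k] for p in prob_lists) / n_models for k in range(n_classes)]
    predicted = max(range(n_classes), key=lambda k: avg[k])
    return avg, predicted


def hard_vote(prob_lists):
    """Majority vote over each model's own most probable class."""
    n_classes = len(prob_lists[0])
    votes = [max(range(n_classes), key=lambda k: p[k]) for p in prob_lists]
    return max(range(n_classes), key=votes.count)


# Hypothetical probabilities for one image: [class 0, class 1].
model_probs = [
    [0.90, 0.10],  # model A is confident in class 0
    [0.45, 0.55],  # model B leans slightly towards class 1
    [0.45, 0.55],  # model C leans slightly towards class 1
]

avg, soft_label = soft_vote(model_probs)
hard_label = hard_vote(model_probs)
print(avg, soft_label, hard_label)
```

In this example the majority vote picks class 1, while soft voting picks class 0 because one model's high confidence outweighs two marginal votes. That sensitivity to confidence is the usual reason to prefer soft voting when the base models produce calibrated probabilities.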

arXiv:1909.12098 [pdf, ps, other]

doi 10.1109/ACCESS.2023.3271515

Sequential Training of Neural Networks with Gradient Boosting

Authors: Seyedsaman Emami, Gonzalo Martínez-Muñoz

Abstract: This paper presents a novel technique based on gradient boosting to train the final layers of a neural network (NN). Gradient boosting is an additive expansion algorithm in which a series of models are trained sequentially to approximate a given function. A neural network can also be seen as an additive expansion where the scalar product of the responses of the last hidden layer and its weights provides the final output of the network. Instead of training the network as a whole, the proposed algorithm trains the network sequentially in $T$ steps. First, the bias term of the network is initialized with a constant approximation that minimizes the average loss of the data. Then, at each step, a portion of the network, composed of $J$ neurons, is trained to approximate the pseudo-residuals on the training data computed from the previous iterations. Finally, the $T$ partial models and bias are integrated as a single NN with $T \times J$ neurons in the hidden layer. Extensive experiments in classification and regression tasks, as well as in combination with deep neural networks, are carried out, showing a competitive generalization performance with respect to neural networks trained with different standard solvers, such as Adam, L-BFGS and SGD, and to deep models. Furthermore, we show that the design of the proposed method allows a number of hidden units (those trained last) to be switched off at test time without a significant reduction of its generalization ability. This permits the adaptation of the model to different classification speed requirements on the fly.

Submitted 20 December, 2022; v1 submitted 26 September, 2019; originally announced September 2019.

Comments: This paper is under consideration at Pattern Recognition Letters
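
The three-stage procedure described in the abstract above (constant initialization, sequential fitting of pseudo-residuals, summation into one additive model) can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' algorithm: it uses squared loss, for which the loss-minimizing constant is the mean and the pseudo-residual is simply the target minus the current prediction, and it swaps the paper's blocks of $J$ neurons for a one-split regression stump so that the example stays dependency-free. The names `fit_stump` and `boost` and the `learning_rate` parameter are hypothetical.

```python
def fit_stump(xs, targets):
    """Fit a one-split regression stump on 1-D inputs by least squares.
    Stand-in for the block of J neurons trained at each step."""
    best = None
    order = sorted(set(xs))
    for lo, hi in zip(order, order[1:]):
        thr = (lo + hi) / 2.0
        left = [t for x, t in zip(xs, targets) if x <= thr]
        right = [t for x, t in zip(xs, targets) if x > thr]
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        sse = sum((t - lmean) ** 2 for t in left)
        sse += sum((t - rmean) ** 2 for t in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lmean, rmean)
    _, thr, lmean, rmean = best
    return lambda x: lmean if x <= thr else rmean


def boost(xs, ys, n_steps, learning_rate=0.5):
    """Sequentially build an additive model in n_steps steps."""
    # Step 1: the constant that minimizes the mean squared loss is the mean.
    bias = sum(ys) / len(ys)
    learners = []
    preds = [bias] * len(xs)
    for _ in range(n_steps):
        # Step 2: pseudo-residuals of squared loss are y - F(x).
        residuals = [y - p for y, p in zip(ys, preds)]
        h = fit_stump(xs, residuals)
        learners.append(h)
        preds = [p + learning_rate * h(x) for p, x in zip(preds, xs)]

    # Step 3: the final model is the bias plus the sum of all partial models.
    def model(x):
        return bias + learning_rate * sum(h(x) for h in learners)

    return model


def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]

constant_only = boost(xs, ys, n_steps=0)
boosted = boost(xs, ys, n_steps=20)
print(mse(constant_only, xs, ys), mse(boosted, xs, ys))
```

Because the final model is a plain sum of partial models, dropping the most recently trained learners from that sum still leaves a valid, coarser model. This is the property the abstract relies on when it mentions switching off the last-trained hidden units at test time.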

arXiv:1307.4733 [pdf, ps, other]

Performance Limits of a Cloud Radio

Authors: Maaz M. Mohiuddin, Varun Maheshwari, Sreejith T. V., Kiran Kuchi, G. V. V. Sharma, Shahriar Emami

Abstract: Cooperation in a cellular network is seen as a key technique for managing other-cell interference to obtain a gain in achievable rate. In this paper, we present the achievable rate regions for a cloud radio network using a sub-optimal zero-forcing equalizer with dirty paper precoding. We show that when complete channel state information is available at the cloud, rates close to those achievable with total interference cancellation can be achieved. With mean capacity gains of up to 2-fold over the conventional cellular network in both uplink and downlink, this precoding scheme shows great promise for implementation in a cloud radio network. To simplify the analysis, we use a stochastic geometric framework based on Poisson point processes instead of the traditional grid-based cellular network model. We also study the impact on the achievable rate of limiting the channel state information and of geographical clustering to limit the cloud size. We have observed that, using this zero-forcing dirty paper coding technique, the adverse effect of inter-cluster interference can be minimized, thereby transforming an interference-limited network into a noise-limited network as experienced by an average user in the network at low operating signal-to-noise ratios. However, at higher signal-to-noise ratios, both the average achievable rate and the cell-edge achievable rate saturate, as observed in the literature. As the implementation of dirty paper coding is not practically feasible, we present a practical design of a cloud radio network using a cloud minimum mean square equalizer for processing the uplink streams and a Tomlinson-Harashima precoder as a sub-optimal substitute for a dirty paper precoder in the downlink.

Submitted 17 July, 2013; originally announced July 2013.

arXiv:0904.4836 [pdf]

FaceBots: Steps Towards Enhanced Long-Term Human-Robot Interaction by Utilizing and Publishing Online Social Information

Authors: Nikolaos Mavridis, Shervin Emami, Chandan Datta, Wajahat Kamzi, Chiraz BenAbdelkader, Panos Toulis, Andry Tanoto, Tamer Rabie

Abstract: Our project aims at supporting the creation of sustainable and meaningful longer-term human-robot relationships through the creation of embodied robots with face recognition and natural language dialogue capabilities, which exploit and publish social information available on the web (Facebook). Our main underlying experimental hypothesis is that such relationships can be significantly enhanced if the human and the robot gradually create a pool of shared episodic memories that they can co-refer to (shared memories), and if they are both embedded in a social web of other humans and robots they both know and encounter (shared friends). In this paper, we present such a robot, which, as we will see, achieves two significant novelties.

Submitted 30 April, 2009; originally announced April 2009.

ACM Class: H.5.2; I.2.9

Showing 1–7 of 7 results for author: Emami, S