Search | arXiv e-print repository

Viewing the process of generating counterfactuals as a source of knowledge: a new approach for explaining classifiers

Authors: Vincent Lemaire, Nathan Le Boudec, Victor Guyomard, Françoise Fessant

Abstract: There are now many explainable AI methods for understanding the decisions of a machine learning model. Among these are those based on counterfactual reasoning, which involve simulating feature changes and observing the impact on the prediction. This article proposes to view this simulation process as a source of creating a certain amount of knowledge that can be stored to be used, later, in different ways. This process is illustrated in the additive model and, more specifically, in the case of the naive Bayes classifier, whose interesting properties for this purpose are shown.

Submitted 12 April, 2024; v1 submitted 8 September, 2023; originally announced September 2023.

Comments: 8 pages

arXiv:2304.12943 [pdf, other]

Generating robust counterfactual explanations

Authors: Victor Guyomard, Françoise Fessant, Thomas Guyet, Tassadit Bouadi, Alexandre Termier

Abstract: Counterfactual explanations have become a mainstay of the XAI field. This particularly intuitive statement allows the user to understand what small but necessary changes would have to be made to a given situation in order to change a model prediction. The quality of a counterfactual depends on several criteria: realism, actionability, validity, robustness, etc. In this paper, we are interested in the notion of robustness of a counterfactual. More precisely, we focus on robustness to counterfactual input changes. This form of robustness is particularly challenging as it involves a trade-off between the robustness of the counterfactual and its proximity to the example to explain. We propose a new framework, CROCO, that generates robust counterfactuals while effectively managing this trade-off, and guarantees the user a minimal robustness. An empirical evaluation on tabular datasets confirms the relevance and effectiveness of our approach.

Submitted 24 April, 2023; originally announced April 2023.

arXiv:2212.10847 [pdf, other]

VCNet: A self-explaining model for realistic counterfactual generation

Authors: Victor Guyomard, Françoise Fessant, Thomas Guyet, Tassadit Bouadi, Alexandre Termier

Abstract: Counterfactual explanation is a common class of methods to make local explanations of machine learning decisions. For a given instance, these methods aim to find the smallest modification of feature values that changes the predicted decision made by a machine learning model. One of the challenges of counterfactual explanation is the efficient generation of realistic counterfactuals. To address this challenge, we propose VCNet (Variational Counter Net), a model architecture that combines a predictor and a counterfactual generator that are jointly trained, for regression or classification tasks. VCNet is able both to generate predictions and to generate counterfactual explanations without having to solve another minimisation problem. Our contribution is the generation of counterfactuals that are close to the distribution of the predicted class. This is done by learning a variational autoencoder conditionally to the output of the predictor in a joint-training fashion. We present an empirical evaluation on tabular datasets and across several interpretability metrics. The results are competitive with the state-of-the-art method.

Submitted 21 December, 2022; originally announced December 2022.

Journal ref: ECML PKDD 2022 - European Conference on Machine Learning and Knowledge Discovery in Databases, Sep 2022, Grenoble, France

arXiv:1903.12211 [pdf, other]

Privacy in trajectory micro-data publishing: a survey

Authors: Marco Fiore, Panagiota Katsikouli, Elli Zavou, Mathieu Cunche, Françoise Fessant, Dominique Le Hello, Ulrich Matchi Aivodji, Baptiste Olivier, Tony Quertier, Razvan Stanica

Abstract: We survey the literature on the privacy of trajectory micro-data, i.e., spatiotemporal information about the mobility of individuals, whose collection is becoming increasingly simple and frequent thanks to emerging information and communication technologies. The focus of our review is on privacy-preserving data publishing (PPDP), i.e., the publication of databases of trajectory micro-data that preserve the privacy of the monitored individuals. We classify and present the literature of attacks against trajectory micro-data, as well as solutions proposed to date for protecting databases from such attacks. This paper serves as an introductory reading on a critical subject in an era of growing awareness about privacy risks connected to digital services, and provides insights into open problems and future directions for research.

Submitted 13 May, 2020; v1 submitted 26 March, 2019; originally announced March 2019.

Comments: Accepted for publication at Transactions for Data Privacy

arXiv:1209.1983 [pdf, other]

Toward a New Protocol to Evaluate Recommender Systems

Authors: Frank Meyer, Françoise Fessant, Fabrice Clérot, Eric Gaussier

Abstract: In this paper, we propose an approach to analyze the performance and the added value of automatic recommender systems in an industrial context. We show that recommender systems are multifaceted and can be organized around 4 structuring functions: help users to decide, help users to compare, help users to discover, help users to explore. A global offline protocol is then proposed to evaluate recommender systems. This protocol is based on the definition of appropriate evaluation measures for each aforementioned function. The evaluation protocol is discussed from the perspective of the usefulness and trust of the recommendation. A new measure called Average Measure of Impact is introduced. This measure evaluates the impact of the personalized recommendation. We experiment with two classical methods, K-Nearest Neighbors (KNN) and Matrix Factorization (MF), using the well-known Netflix dataset. A segmentation of both users and items is proposed to finely analyze where the algorithms perform well or badly. We show that the performance is strongly dependent on the segments and that there is no clear correlation between the RMSE and the quality of the recommendation.

Submitted 10 September, 2012; originally announced September 2012.

Comments: 6 pages. arXiv admin note: text overlap with arXiv:1203.4487

ACM Class: H.3.3; H.3.4

arXiv:1012.0379 [pdf, ps, other]

Quality of Source Location Protection in Globally Attacked Sensor Networks

Authors: Silvija Kokalj-Filipovic, Fabrice Le Fessant, Predrag Spasojevic

Abstract: We propose an efficient scheme for generating fake network traffic to disguise the real event notification in the presence of a global eavesdropper, which is especially relevant for the quality of service in delay-intolerant applications monitoring rare and spatially sparse events, and deployed as large wireless sensor networks with a single data collector. The efficiency of the scheme that provides statistical source anonymity is achieved by partitioning network nodes randomly into several node groups. Members of the same group collectively emulate both temporal and spatial distribution of the event. Under such a dummy-traffic framework of the source anonymity protection, we aim to better model the global eavesdropper, especially her way of using statistical tests to detect the real event, and to present the quality of the location protection as relative to the adversary's strength. In addition, our approach aims to reduce the per-event work spent to generate the fake traffic while, most importantly, providing a guaranteed latency in reporting the event. The latency is controlled by decoupling the routing from the fake traffic schedule. We believe that the proposed source anonymity protection strategy, and the quality evaluation framework, are well justified by the abundance of the applications that monitor a rare event with known temporal statistics, and uniform spatial distribution.

Submitted 1 December, 2010; originally announced December 2010.

Comments: shorter version

arXiv:1012.0378 [pdf, ps, other]

Some Important Aspects of Source Location Protection in Globally Attacked Sensor Networks

Authors: Silvija Kokalj-Filipovic, Fabrice Le Fessant, Predrag Spasojevic

Abstract: In the problem of location anonymity of the events exposed to a global eavesdropper, we highlight and analyze some aspects that are missing in the prior work, which is especially relevant for the quality of secure sensing in delay-intolerant applications monitoring rare and spatially sparse events, and deployed as large wireless sensor networks with a single data collector. We propose an efficient scheme for generating fake network traffic to disguise the real event notification. The efficiency of the scheme that provides statistical source location anonymity is achieved by partitioning network nodes randomly into several dummy source groups. Members of the same group collectively emulate both temporal and spatial distribution of the event. Under such a dummy-traffic framework of the source anonymity protection, we aim to better model the global eavesdropper, especially her way of using statistical tests to detect the real event, and to present the quality of the location protection as relative to the adversary's strength. In addition, our approach aims to reduce the per-event work spent to generate the fake traffic while, most importantly, providing a guaranteed latency in reporting the event. The latency is controlled by decoupling the routing from the fake-traffic schedule. A good dummy source group design also provides a robust protection of event bursts. This is achieved at the expense of significant overhead, as the number of dummy source groups must be increased to the reciprocal value of the false alarm parameter used in the statistical test. We believe that the proposed source anonymity protection strategy, and the evaluation framework, are well justified by the abundance of the applications that monitor a rare event with known temporal statistics, and uniform spatial distribution.

Submitted 1 December, 2010; originally announced December 2010.

Comments: longer version

arXiv:1004.0930 [pdf, ps, other]

Spying the World from your Laptop -- Identifying and Profiling Content Providers and Big Downloaders in BitTorrent

Authors: Stevens Le Blond, Arnaud Legout, Fabrice Le Fessant, Walid Dabbous, Mohamed Ali Kaafar

Abstract: This paper presents a set of exploits an adversary can use to continuously spy on most BitTorrent users of the Internet from a single machine and for a long period of time. Using these exploits for a period of 103 days, we collected 148 million IPs downloading 2 billion copies of contents. We identify the IP address of the content providers for 70% of the BitTorrent contents we spied on. We show that a few content providers inject most contents into BitTorrent and that those content providers are located in foreign data centers. We also show that an adversary can compromise the privacy of any peer in BitTorrent and identify the big downloaders that we define as the peers who subscribe to a large number of contents. This infringement on users' privacy poses a significant impediment to the legal adoption of BitTorrent.

Submitted 6 April, 2010; originally announced April 2010.

Journal ref: 3rd USENIX Workshop on Large-Scale Exploits and Emergent Threats (LEET'10), San Jose, CA, United States (2010)

Showing 1–8 of 8 results for author: Fessant, F