-
IRS-aided Near-field Communication: Prospects and Challenges with Codebook Approach
Authors:
Ryuhei Hibi,
Hiroaki Hashida,
Yuichi Kawamoto,
Nei Kato
Abstract:
Intelligent reflecting surfaces (IRSs) are gaining attention as a low-cost solution to the coverage reduction in high-frequency bands used in next-generation communications. IRSs achieve low costs by controlling only the reflection of radio waves. However, to improve further the propagation environment, larger IRS sizes are required owing to their inability to amplify and retransmit signals. As th…
▽ More
Intelligent reflecting surfaces (IRSs) are gaining attention as a low-cost solution to the coverage reduction in high-frequency bands used in next-generation communications. IRSs achieve low costs by controlling only the reflection of radio waves. However, to improve further the propagation environment, larger IRS sizes are required owing to their inability to amplify and retransmit signals. As the IRS size increases, the near-field region expands, requiring beamfocusing instead of beamforming, which is extensively used in existing research. This results in considerable overhead for IRS control decisions. To address this, constructing a codebook that achieves high communication quality with fewer IRS control patterns is effective. This article presents experimental results demonstrating the effectiveness of beamfocusing, construction policy for nonuniform three-dimensional codebooks, and simulation evaluation results of communication performance when operating IRSs with various codebooks. We believe these insights will foster further value for IRSs in next-generation communications.
△ Less
Submitted 16 October, 2024;
originally announced October 2024.
-
StatWhy: Formal Verification Tool for Statistical Hypothesis Testing Programs
Authors:
Yusuke Kawamoto,
Kentaro Kobayashi,
Kohei Suenaga
Abstract:
Statistical methods have been widely misused and misinterpreted in various scientific fields, raising significant concerns about the integrity of scientific research. To mitigate this problem, we propose a tool-assisted method for formally specifying and automatically verifying the correctness of statistical programs. In this method, programmers are required to annotate the source code of the stat…
▽ More
Statistical methods have been widely misused and misinterpreted in various scientific fields, raising significant concerns about the integrity of scientific research. To mitigate this problem, we propose a tool-assisted method for formally specifying and automatically verifying the correctness of statistical programs. In this method, programmers are required to annotate the source code of the statistical programs with the requirements for these methods. Through this annotation, they are reminded to check the requirements for statistical methods, including those that cannot be formally verified, such as the distribution of the unknown true population. Our software tool StatWhy automatically checks whether programmers have properly specified the requirements for the statistical methods, thereby identifying any missing requirements that need to be addressed. This tool is implemented using the Why3 platform to verify the correctness of OCaml programs that conduct statistical hypothesis testing. We demonstrate how StatWhy can be used to avoid common errors in various statistical hypothesis testing programs.
△ Less
Submitted 1 June, 2025; v1 submitted 25 May, 2024;
originally announced May 2024.
-
Model-Checking in the Loop Model-Based Testing for Automotive Operating Systems
Authors:
Toshiaki Aoki,
Aritoshi Hata,
Kazusato Kanamori,
Satoshi Tanaka,
Yuta Kawamoto,
Yasuhiro Tanase,
Masumi Imai,
Fumiya Shigemitsu,
Masaki Gondo,
Tomoji Kishi
Abstract:
While vehicles have primarily been controlled through mechanical means in years past, an increasing number of embedded control systems are being installed and used, keeping pace with advances in electronic control technology and performance. Automotive systems consist of multiple components developed by a range of vendors. To accelerate developments in embedded control systems, industrial standard…
▽ More
While vehicles have primarily been controlled through mechanical means in years past, an increasing number of embedded control systems are being installed and used, keeping pace with advances in electronic control technology and performance. Automotive systems consist of multiple components developed by a range of vendors. To accelerate developments in embedded control systems, industrial standards such as AUTOSAR are being defined for automotive systems, including the design of operating system and middleware technologies. Crucial to ensuring the safety of automotive systems, the operating system is foundational software on which many automotive applications are executed. In this paper, we propose an integrated model-based method for verifying automotive operating systems; our method is called Model-Checking in the Loop Model-Based Testing (MCIL-MBT). In MCIL-MBT, we create a model that formalizes specifications of automotive operating systems and verifies the specifications via model-checking. Next, we conduct model-based testing with the verified model to ensure that a specific operating system implementation conforms to the model. These verification and testing stages are iterated over until no flaws are detected. Our method has already been introduced to an automotive system supplier and an operating system vendor. Through our approach, we successfully identified flaws that were not detected by conventional review and testing methods.
△ Less
Submitted 2 October, 2023;
originally announced October 2023.
-
A Survey on Multi-AP Coordination Approaches over Emerging WLANs: Future Directions and Open Challenges
Authors:
Shikhar Verma,
Tiago Koketsu Rodrigues,
Yuichi Kawamoto,
Mostafa M. Fouda,
Nei Kato
Abstract:
Recent advancements in wireless local area network (WLAN) technology include IEEE 802.11be and 802.11ay, often known as Wi-Fi 7 and WiGig, respectively. The goal of these developments is to provide Extremely High Throughput (EHT) and low latency to meet the demands of future applications like as 8K videos, augmented and virtual reality, the Internet of Things, telesurgery, and other developing tec…
▽ More
Recent advancements in wireless local area network (WLAN) technology include IEEE 802.11be and 802.11ay, often known as Wi-Fi 7 and WiGig, respectively. The goal of these developments is to provide Extremely High Throughput (EHT) and low latency to meet the demands of future applications like as 8K videos, augmented and virtual reality, the Internet of Things, telesurgery, and other developing technologies. IEEE 802.11be includes new features such as 320 MHz bandwidth, multi-link operation, Multi-user Multi-Input Multi-Output, orthogonal frequency-division multiple access, and Multiple-Access Point (multi-AP) coordination (MAP-Co) to achieve EHT. With the increase in the number of overlapping APs and inter-AP interference, researchers have focused on studying MAP-Co approaches for coordinated transmission in IEEE 802.11be, making MAP-Co a key feature of future WLANs. Moreover, similar issues may arise in EHF bands WLAN, particularly for standards beyond IEEE 802.11ay. This has prompted researchers to investigate the implementation of MAP-Co over future 802.11ay WLANs. Thus, in this article, we provide a comprehensive review of the state-of-the-art MAP-Co features and their shortcomings concerning emerging WLAN. Finally, we discuss several novel future directions and open challenges for MAP-Co.
△ Less
Submitted 19 December, 2023; v1 submitted 7 June, 2023;
originally announced June 2023.
-
Smart Handover with Predicted User Behavior using Convolutional Neural Networks for WiGig Systems
Authors:
Tiago Koketsu Rodrigues,
Shikhar Verma,
Yuichi Kawamoto,
Nei Kato,
Mostafa M. Fouda,
Muhammad Ismail
Abstract:
WiGig networks and 60 GHz frequency communications have a lot of potential for commercial and personal use. They can offer extremely high transmission rates but at the cost of low range and penetration. Due to these issues, WiGig systems are unstable and need to rely on frequent handovers to maintain high-quality connections. However, this solution is problematic as it forces users into bad connec…
▽ More
WiGig networks and 60 GHz frequency communications have a lot of potential for commercial and personal use. They can offer extremely high transmission rates but at the cost of low range and penetration. Due to these issues, WiGig systems are unstable and need to rely on frequent handovers to maintain high-quality connections. However, this solution is problematic as it forces users into bad connections and downtime before they are switched to a better access point. In this work, we use Machine Learning to identify patterns in user behaviors and predict user actions. This prediction is used to do proactive handovers, switching users to access points with better future transmission rates and a more stable environment based on the future state of the user. Results show that not only the proposal is effective at predicting channel data, but the use of such predictions improves system performance and avoids unnecessary handovers.
△ Less
Submitted 28 March, 2023;
originally announced March 2023.
-
Threats, Vulnerabilities, and Controls of Machine Learning Based Systems: A Survey and Taxonomy
Authors:
Yusuke Kawamoto,
Kazumasa Miyake,
Koichi Konishi,
Yutaka Oiwa
Abstract:
In this article, we propose the Artificial Intelligence Security Taxonomy to systematize the knowledge of threats, vulnerabilities, and security controls of machine-learning-based (ML-based) systems. We first classify the damage caused by attacks against ML-based systems, define ML-specific security, and discuss its characteristics. Next, we enumerate all relevant assets and stakeholders and provi…
▽ More
In this article, we propose the Artificial Intelligence Security Taxonomy to systematize the knowledge of threats, vulnerabilities, and security controls of machine-learning-based (ML-based) systems. We first classify the damage caused by attacks against ML-based systems, define ML-specific security, and discuss its characteristics. Next, we enumerate all relevant assets and stakeholders and provide a general taxonomy for ML-specific threats. Then, we collect a wide range of security controls against ML-specific threats through an extensive review of recent literature. Finally, we classify the vulnerabilities and controls of an ML-based system in terms of each vulnerable asset in the system's entire lifecycle.
△ Less
Submitted 18 January, 2023; v1 submitted 18 January, 2023;
originally announced January 2023.
-
Formalizing Statistical Causality via Modal Logic
Authors:
Yusuke Kawamoto,
Tetsuya Sato,
Kohei Suenaga
Abstract:
We propose a formal language for describing and explaining statistical causality. Concretely, we define Statistical Causality Language (StaCL) for expressing causal effects and specifying the requirements for causal inference. StaCL incorporates modal operators for interventions to express causal properties between probability distributions in different possible worlds in a Kripke model. We formal…
▽ More
We propose a formal language for describing and explaining statistical causality. Concretely, we define Statistical Causality Language (StaCL) for expressing causal effects and specifying the requirements for causal inference. StaCL incorporates modal operators for interventions to express causal properties between probability distributions in different possible worlds in a Kripke model. We formalize axioms for probability distributions, interventions, and causal predicates using StaCL formulas. These axioms are expressive enough to derive the rules of Pearl's do-calculus. Finally, we demonstrate by examples that StaCL can be used to specify and explain the correctness of statistical causal inference.
△ Less
Submitted 17 September, 2023; v1 submitted 30 October, 2022;
originally announced October 2022.
-
Sound and Relatively Complete Belief Hoare Logic for Statistical Hypothesis Testing Programs
Authors:
Yusuke Kawamoto,
Tetsuya Sato,
Kohei Suenaga
Abstract:
We propose a new approach to formally describing the requirement for statistical inference and checking whether a program uses the statistical method appropriately. Specifically, we define belief Hoare logic (BHL) for formalizing and reasoning about the statistical beliefs acquired via hypothesis testing. This program logic is sound and relatively complete with respect to a Kripke model for hypoth…
▽ More
We propose a new approach to formally describing the requirement for statistical inference and checking whether a program uses the statistical method appropriately. Specifically, we define belief Hoare logic (BHL) for formalizing and reasoning about the statistical beliefs acquired via hypothesis testing. This program logic is sound and relatively complete with respect to a Kripke model for hypothesis tests. We demonstrate by examples that BHL is useful for reasoning about practical issues in hypothesis testing. In our framework, we clarify the importance of prior beliefs in acquiring statistical beliefs through hypothesis testing, and discuss the whole picture of the justification of statistical inference inside and outside the program logic.
△ Less
Submitted 8 November, 2023; v1 submitted 15 August, 2022;
originally announced August 2022.
-
Theme Aspect Argumentation Model for Handling Fallacies
Authors:
Ryuta Arisaka,
Ryoma Nakai,
Yusuke Kawamoto,
Takayuki Ito
Abstract:
From daily discussions to marketing ads to political statements, information manipulation is rife. It is increasingly more important that we have the right set of tools to defend ourselves from manipulative rhetoric, or fallacies. Suitable techniques to automatically identify fallacies are being investigated in natural language processing research. However, a fallacy in one context may not be a fa…
▽ More
From daily discussions to marketing ads to political statements, information manipulation is rife. It is increasingly more important that we have the right set of tools to defend ourselves from manipulative rhetoric, or fallacies. Suitable techniques to automatically identify fallacies are being investigated in natural language processing research. However, a fallacy in one context may not be a fallacy in another context, so there is also a need to explain how and why it has come to be judged a fallacy. For the explainable fallacy identification, we present a novel approach to characterising fallacies through formal constraints, as a viable alternative to more traditional fallacy classifications by informal criteria. To achieve this objective, we introduce a novel context-aware argumentation model, the theme aspect argumentation model, which can do both: the modelling of a given argumentation as it is expressed (rhetorical modelling); and a deeper semantic analysis of the rhetorical argumentation model. By identifying fallacies with formal constraints, it becomes possible to tell whether a fallacy lurks in the modelled rhetoric with a formal rigour. We present core formal constraints for the theme aspect argumentation model and then more formal constraints that improve its fallacy identification capability. We show and prove the consequences of these formal constraints. We then analyse the computational complexities of deciding the satisfiability of the constraints.
△ Less
Submitted 25 October, 2023; v1 submitted 30 May, 2022;
originally announced May 2022.
-
Information Leakage Games: Exploring Information as a Utility Function
Authors:
Mário S. Alvim,
Konstantinos Chatzikokolakis,
Yusuke Kawamoto,
Catuscia Palamidessi
Abstract:
A common goal in the areas of secure information flow and privacy is to build effective defenses against unwanted leakage of information. To this end, one must be able to reason about potential attacks and their interplay with possible defenses. In this paper, we propose a game-theoretic framework to formalize strategies of attacker and defender in the context of information leakage, and provide a…
▽ More
A common goal in the areas of secure information flow and privacy is to build effective defenses against unwanted leakage of information. To this end, one must be able to reason about potential attacks and their interplay with possible defenses. In this paper, we propose a game-theoretic framework to formalize strategies of attacker and defender in the context of information leakage, and provide a basis for developing optimal defense methods. A novelty of our games is that their utility is given by information leakage, which in some cases may behave in a non-linear way. This causes a significant deviation from classic game theory, in which utility functions are linear with respect to players' strategies. Hence, a key contribution of this paper is the establishment of the foundations of information leakage games. We consider two kinds of games, depending on the notion of leakage considered. The first kind, the QIF-games, is tailored for the theory of quantitative information flow (QIF). The second one, the DP-games, corresponds to differential privacy (DP).
△ Less
Submitted 28 January, 2022; v1 submitted 22 December, 2020;
originally announced December 2020.
-
TransMIA: Membership Inference Attacks Using Transfer Shadow Training
Authors:
Seira Hidano,
Takao Murakami,
Yusuke Kawamoto
Abstract:
Transfer learning has been widely studied and gained increasing popularity to improve the accuracy of machine learning models by transferring some knowledge acquired in different training. However, no prior work has pointed out that transfer learning can strengthen privacy attacks on machine learning models. In this paper, we propose TransMIA (Transfer learning-based Membership Inference Attacks),…
▽ More
Transfer learning has been widely studied and gained increasing popularity to improve the accuracy of machine learning models by transferring some knowledge acquired in different training. However, no prior work has pointed out that transfer learning can strengthen privacy attacks on machine learning models. In this paper, we propose TransMIA (Transfer learning-based Membership Inference Attacks), which use transfer learning to perform membership inference attacks on the source model when the adversary is able to access the parameters of the transferred model. In particular, we propose a transfer shadow training technique, where an adversary employs the parameters of the transferred model to construct shadow models, to significantly improve the performance of membership inference when a limited amount of shadow training data is available to the adversary. We evaluate our attacks using two real datasets, and show that our attacks outperform the state-of-the-art that does not use our transfer shadow training technique. We also compare four combinations of the learning-based/entropy-based approach and the fine-tuning/freezing approach, all of which employ our transfer shadow training technique. Then we examine the performance of these four approaches based on the distributions of confidence values, and discuss possible countermeasures against our attacks.
△ Less
Submitted 23 April, 2021; v1 submitted 30 November, 2020;
originally announced November 2020.
-
Locality Sensitive Hashing with Extended Differential Privacy
Authors:
Natasha Fernandes,
Yusuke Kawamoto,
Takao Murakami
Abstract:
Extended differential privacy, a generalization of standard differential privacy (DP) using a general metric, has been widely studied to provide rigorous privacy guarantees while keeping high utility. However, existing works on extended DP are limited to few metrics, such as the Euclidean metric. Consequently, they have only a small number of applications, such as location-based services and docum…
▽ More
Extended differential privacy, a generalization of standard differential privacy (DP) using a general metric, has been widely studied to provide rigorous privacy guarantees while keeping high utility. However, existing works on extended DP are limited to few metrics, such as the Euclidean metric. Consequently, they have only a small number of applications, such as location-based services and document processing. In this paper, we propose a couple of mechanisms providing extended DP with a different metric: angular distance (or cosine distance). Our mechanisms are based on locality sensitive hashing (LSH), which can be applied to the angular distance and work well for personal data in a high-dimensional space. We theoretically analyze the privacy properties of our mechanisms, and prove extended DP for input data by taking into account that LSH preserves the original metric only approximately. We apply our mechanisms to friend matching based on high-dimensional personal data with angular distance in the local model, and evaluate our mechanisms using two real datasets. We show that LDP requires a very large privacy budget and that RAPPOR does not work in this application. Then we show that our mechanisms enable friend matching with high utility and rigorous privacy guarantees based on extended DP.
△ Less
Submitted 12 August, 2021; v1 submitted 19 October, 2020;
originally announced October 2020.
-
An Epistemic Approach to the Formal Specification of Statistical Machine Learning
Authors:
Yusuke Kawamoto
Abstract:
We propose an epistemic approach to formalizing statistical properties of machine learning. Specifically, we introduce a formal model for supervised learning based on a Kripke model where each possible world corresponds to a possible dataset and modal operators are interpreted as transformation and testing on datasets. Then we formalize various notions of the classification performance, robustness…
▽ More
We propose an epistemic approach to formalizing statistical properties of machine learning. Specifically, we introduce a formal model for supervised learning based on a Kripke model where each possible world corresponds to a possible dataset and modal operators are interpreted as transformation and testing on datasets. Then we formalize various notions of the classification performance, robustness, and fairness of statistical classifiers by using our extension of statistical epistemic logic (StatEL). In this formalization, we show relationships among properties of classifiers, and relevance between classification performance and robustness. As far as we know, this is the first work that uses epistemic models and logical formulas to express statistical properties of machine learning, and would be a starting point to develop theories of formal specification of machine learning.
△ Less
Submitted 20 September, 2020; v1 submitted 27 April, 2020;
originally announced April 2020.
-
Privacy-Preserving Multiple Tensor Factorization for Synthesizing Large-Scale Location Traces with Cluster-Specific Features
Authors:
Takao Murakami,
Koki Hamada,
Yusuke Kawamoto,
Takuma Hatano
Abstract:
With the widespread use of LBSs (Location-based Services), synthesizing location traces plays an increasingly important role in analyzing spatial big data while protecting user privacy. In particular, a synthetic trace that preserves a feature specific to a cluster of users (e.g., those who commute by train, those who go shopping) is important for various geo-data analysis tasks and for providing…
▽ More
With the widespread use of LBSs (Location-based Services), synthesizing location traces plays an increasingly important role in analyzing spatial big data while protecting user privacy. In particular, a synthetic trace that preserves a feature specific to a cluster of users (e.g., those who commute by train, those who go shopping) is important for various geo-data analysis tasks and for providing a synthetic location dataset. Although location synthesizers have been widely studied, existing synthesizers do not provide sufficient utility, privacy, or scalability, hence are not practical for large-scale location traces. To overcome this issue, we propose a novel location synthesizer called PPMTF (Privacy-Preserving Multiple Tensor Factorization). We model various statistical features of the original traces by a transition-count tensor and a visit-count tensor. We factorize these two tensors simultaneously via multiple tensor factorization, and train factor matrices via posterior sampling. Then we synthesize traces from reconstructed tensors, and perform a plausible deniability test for a synthetic trace. We comprehensively evaluate PPMTF using two datasets. Our experimental results show that PPMTF preserves various statistical features including cluster-specific features, protects user privacy, and synthesizes large-scale location traces in practical time. PPMTF also significantly outperforms the state-of-the-art methods in terms of utility and scalability at the same level of privacy.
△ Less
Submitted 18 November, 2020; v1 submitted 11 November, 2019;
originally announced November 2019.
-
Towards Logical Specification of Statistical Machine Learning
Authors:
Yusuke Kawamoto
Abstract:
We introduce a logical approach to formalizing statistical properties of machine learning. Specifically, we propose a formal model for statistical classification based on a Kripke model, and formalize various notions of classification performance, robustness, and fairness of classifiers by using epistemic logic. Then we show some relationships among properties of classifiers and those between clas…
▽ More
We introduce a logical approach to formalizing statistical properties of machine learning. Specifically, we propose a formal model for statistical classification based on a Kripke model, and formalize various notions of classification performance, robustness, and fairness of classifiers by using epistemic logic. Then we show some relationships among properties of classifiers and those between classification performance and robustness, which suggests robustness-related properties that have not been formalized in the literature as far as we know. To formalize fairness properties, we define a notion of counterfactual knowledge and show techniques to formalize conditional indistinguishability by using counterfactual epistemic operators. As far as we know, this is the first work that uses logical formulas to express statistical properties of machine learning, and that provides epistemic (resp. counterfactually epistemic) views on robustness (resp. fairness) of classifiers.
△ Less
Submitted 28 March, 2020; v1 submitted 24 July, 2019;
originally announced July 2019.
-
Statistical Epistemic Logic
Authors:
Yusuke Kawamoto
Abstract:
We introduce a modal logic for describing statistical knowledge, which we call statistical epistemic logic. We propose a Kripke model dealing with probability distributions and stochastic assignments, and show a stochastic semantics for the logic. To our knowledge, this is the first semantics for modal logic that can express the statistical knowledge dependent on non-deterministic inputs and the s…
▽ More
We introduce a modal logic for describing statistical knowledge, which we call statistical epistemic logic. We propose a Kripke model dealing with probability distributions and stochastic assignments, and show a stochastic semantics for the logic. To our knowledge, this is the first semantics for modal logic that can express the statistical knowledge dependent on non-deterministic inputs and the statistical significance of observed results. By using statistical epistemic logic, we express a notion of statistical secrecy with a confidence level. We also show that this logic is useful to formalize statistical hypothesis testing and differential privacy in a simple and abstract manner.
△ Less
Submitted 12 July, 2019;
originally announced July 2019.
-
Local Distribution Obfuscation via Probability Coupling
Authors:
Yusuke Kawamoto,
Takao Murakami
Abstract:
We introduce a general model for the local obfuscation of probability distributions by probabilistic perturbation, e.g., by adding differentially private noise, and investigate its theoretical properties. Specifically, we relax a notion of distribution privacy (DistP) by generalizing it to divergence, and propose local obfuscation mechanisms that provide divergence distribution privacy. To provide…
▽ More
We introduce a general model for the local obfuscation of probability distributions by probabilistic perturbation, e.g., by adding differentially private noise, and investigate its theoretical properties. Specifically, we relax a notion of distribution privacy (DistP) by generalizing it to divergence, and propose local obfuscation mechanisms that provide divergence distribution privacy. To provide f-divergence distribution privacy, we prove that probabilistic perturbation noise should be added proportionally to the Earth mover's distance between the probability distributions that we want to make indistinguishable. Furthermore, we introduce a local obfuscation mechanism, which we call a coupling mechanism, that provides divergence distribution privacy while optimizing the utility of obfuscated data by using exact/approximate auxiliary information on the input distributions we want to protect.
△ Less
Submitted 30 September, 2019; v1 submitted 12 July, 2019;
originally announced July 2019.
-
Local Obfuscation Mechanisms for Hiding Probability Distributions
Authors:
Yusuke Kawamoto,
Takao Murakami
Abstract:
We introduce a formal model for the information leakage of probability distributions and define a notion called distribution privacy as the local differential privacy for probability distributions. Roughly, the distribution privacy of a local obfuscation mechanism means that the attacker cannot significantly gain any information on the distribution of the mechanism's input by observing its output.…
▽ More
We introduce a formal model for the information leakage of probability distributions and define a notion called distribution privacy as the local differential privacy for probability distributions. Roughly, the distribution privacy of a local obfuscation mechanism means that the attacker cannot significantly gain any information on the distribution of the mechanism's input by observing its output. Then we show that existing local mechanisms can hide input distributions in terms of distribution privacy, while deteriorating the utility by adding too much noise. For example, we prove that the Laplace mechanism needs to add a large amount of noise proportionally to the infinite Wasserstein distance between the two distributions we want to make indistinguishable. To improve the tradeoff between distribution privacy and utility, we introduce a local obfuscation mechanism, called a tupling mechanism, that adds random dummy data to the output. Then we apply this mechanism to the protection of user attributes in location based services. By experiments, we demonstrate that the tupling mechanism outperforms popular local mechanisms in terms of attribute obfuscation and service quality.
△ Less
Submitted 6 August, 2019; v1 submitted 3 December, 2018;
originally announced December 2018.
-
Hybrid Statistical Estimation of Mutual Information and its Application to Information Flow
Authors:
Fabrizio Biondi,
Yusuke Kawamoto,
Axel Legay,
Louis-Marie Traonouez
Abstract:
Analysis of a probabilistic system often requires to learn the joint probability distribution of its random variables. The computation of the exact distribution is usually an exhaustive precise analysis on all executions of the system. To avoid the high computational cost of such an exhaustive search, statistical analysis has been studied to efficiently obtain approximate estimates by analyzing on…
▽ More
Analysis of a probabilistic system often requires to learn the joint probability distribution of its random variables. The computation of the exact distribution is usually an exhaustive precise analysis on all executions of the system. To avoid the high computational cost of such an exhaustive search, statistical analysis has been studied to efficiently obtain approximate estimates by analyzing only a small but representative subset of the system's behavior. In this paper we propose a hybrid statistical estimation method that combines precise and statistical analyses to estimate mutual information, Shannon entropy, and conditional entropy, together with their confidence intervals. We show how to combine the analyses on different components of a discrete system with different accuracy to obtain an estimate for the whole system. The new method performs weighted statistical analysis with different sample sizes over different components and dynamically finds their optimal sample sizes. Moreover, it can reduce sample sizes by using prior knowledge about systems and a new abstraction-then-sampling technique based on qualitative analysis. To apply the method to the source code of a system, we show how to decompose the code into components and to determine the analysis method for each component by overviewing the implementation of those techniques in the HyLeak tool. We demonstrate with case studies that the new method outperforms the state of the art in quantifying information leakage.
△ Less
Submitted 8 September, 2018;
originally announced September 2018.
-
Utility-Optimized Local Differential Privacy Mechanisms for Distribution Estimation
Authors:
Takao Murakami,
Yusuke Kawamoto
Abstract:
LDP (Local Differential Privacy) has been widely studied to estimate statistics of personal data (e.g., distribution underlying the data) while protecting users' privacy. Although LDP does not require a trusted third party, it regards all personal data equally sensitive, which causes excessive obfuscation hence the loss of utility. In this paper, we introduce the notion of ULDP (Utility-optimized…
▽ More
LDP (Local Differential Privacy) has been widely studied to estimate statistics of personal data (e.g., distribution underlying the data) while protecting users' privacy. Although LDP does not require a trusted third party, it regards all personal data equally sensitive, which causes excessive obfuscation hence the loss of utility. In this paper, we introduce the notion of ULDP (Utility-optimized LDP), which provides a privacy guarantee equivalent to LDP only for sensitive data. We first consider the setting where all users use the same obfuscation mechanism, and propose two mechanisms providing ULDP: utility-optimized randomized response and utility-optimized RAPPOR. We then consider the setting where the distinction between sensitive and non-sensitive data can be different from user to user. For this setting, we propose a personalized ULDP mechanism with semantic tags to estimate the distribution of personal data with high utility while keeping secret what is sensitive for each user. We show theoretically and experimentally that our mechanisms provide much higher utility than the existing LDP mechanisms when there are a lot of non-sensitive data. We also show that when most of the data are non-sensitive, our mechanisms even provide almost the same utility as non-private mechanisms in the low privacy regime.
△ Less
Submitted 26 May, 2019; v1 submitted 30 July, 2018;
originally announced July 2018.
-
On the Anonymization of Differentially Private Location Obfuscation
Authors:
Yusuke Kawamoto,
Takao Murakami
Abstract:
Obfuscation techniques in location-based services (LBSs) have been shown useful to hide the concrete locations of service users, whereas they do not necessarily provide the anonymity. We quantify the anonymity of the location data obfuscated by the planar Laplacian mechanism and that by the optimal geo-indistinguishable mechanism of Bordenabe et al. We empirically show that the latter provides str…
▽ More
Obfuscation techniques in location-based services (LBSs) have been shown useful to hide the concrete locations of service users, whereas they do not necessarily provide the anonymity. We quantify the anonymity of the location data obfuscated by the planar Laplacian mechanism and that by the optimal geo-indistinguishable mechanism of Bordenabe et al. We empirically show that the latter provides stronger anonymity than the former in the sense that more users in the database satisfy k-anonymity. To formalize and analyze such approximate anonymity we introduce the notion of asymptotic anonymity. Then we show that the location data obfuscated by the optimal geo-indistinguishable mechanism can be anonymized by removing a smaller number of users from the database. Furthermore, we demonstrate that the optimal geo-indistinguishable mechanism has better utility both for users and for data analysts.
△ Less
Submitted 23 July, 2018;
originally announced July 2018.
-
A Game-Theoretic Approach to Information-Flow Control via Protocol Composition
Authors:
Mário S. Alvim,
Konstantinos Chatzikokolakis,
Yusuke Kawamoto,
Catuscia Palamidessi
Abstract:
In the inference attacks studied in Quantitative Information Flow (QIF), the attacker typically tries to interfere with the system in the attempt to increase its leakage of secret information. The defender, on the other hand, typically tries to decrease leakage by introducing some controlled noise. This noise introduction can be modeled as a type of protocol composition, i.e., a probabilistic choi…
▽ More
In the inference attacks studied in Quantitative Information Flow (QIF), the attacker typically tries to interfere with the system in the attempt to increase its leakage of secret information. The defender, on the other hand, typically tries to decrease leakage by introducing some controlled noise. This noise introduction can be modeled as a type of protocol composition, i.e., a probabilistic choice among different protocols, and its effect on the amount of leakage depends heavily on whether or not this choice is visible to the attacker. In this work, we consider operators for modeling visible and hidden choice in protocol composition, and we study their algebraic properties. We then formalize the interplay between defender and attacker in a game-theoretic framework adapted to the specific issues of QIF, where the payoff is information leakage. We consider various kinds of leakage games, depending on whether players act simultaneously or sequentially, and on whether or not the choices of the defender are visible to the attacker. In the case of sequential games, the choice of the second player is generally a function of the choice of the first player, and his/her probabilistic choice can be either over the possible functions (mixed strategy) or it can be on the result of the function (behavioral strategy). We show that when the attacker moves first in a sequential game with a hidden choice, then behavioral strategies are more advantageous for the defender than mixed strategies. This contrasts with the standard game theory, where the two types of strategies are equivalent. Finally, we establish a hierarchy of these games in terms of their information leakage and provide methods for finding optimal strategies (at the points of equilibrium) for both attacker and defender in the various cases.
△ Less
Submitted 21 May, 2018; v1 submitted 27 March, 2018;
originally announced March 2018.
-
Leakage and Protocol Composition in a Game-Theoretic Perspective
Authors:
Mário S. Alvim,
Konstantinos Chatzikokolakis,
Yusuke Kawamoto,
Catuscia Palamidessi
Abstract:
In the inference attacks studied in Quantitative Information Flow (QIF), the adversary typically tries to interfere with the system in the attempt to increase its leakage of secret information. The defender, on the other hand, typically tries to decrease leakage by introducing some controlled noise. This noise introduction can be modeled as a type of protocol composition, i.e., a probabilistic cho…
▽ More
In the inference attacks studied in Quantitative Information Flow (QIF), the adversary typically tries to interfere with the system in the attempt to increase its leakage of secret information. The defender, on the other hand, typically tries to decrease leakage by introducing some controlled noise. This noise introduction can be modeled as a type of protocol composition, i.e., a probabilistic choice among different protocols, and its effect on the amount of leakage depends heavily on whether or not this choice is visible to the adversary. In this work we consider operators for modeling visible and invisible choice in protocol composition, and we study their algebraic properties. We then formalize the interplay between defender and adversary in a game-theoretic framework adapted to the specific issues of QIF, where the payoff is information leakage. We consider various kinds of leakage games, depending on whether players act simultaneously or sequentially, and on whether or not the choices of the defender are visible to the adversary. Finally, we establish a hierarchy of these games in terms of their information leakage, and provide methods for finding optimal strategies (at the points of equilibrium) for both attacker and defender in the various cases. The full version of this paper can be found in arXiv:1803.10042
△ Less
Submitted 30 March, 2018; v1 submitted 27 February, 2018;
originally announced February 2018.
-
Information Leakage Games
Authors:
Mário S. Alvim,
Konstantinos Chatzikokolakis,
Yusuke Kawamoto,
Catuscia Palamidessi
Abstract:
We consider a game-theoretic setting to model the interplay between attacker and defender in the context of information flow, and to reason about their optimal strategies. In contrast with standard game theory, in our games the utility of a mixed strategy is a convex function of the distribution on the defender's pure actions, rather than the expected value of their utilities. Nevertheless, the im…
▽ More
We consider a game-theoretic setting to model the interplay between attacker and defender in the context of information flow, and to reason about their optimal strategies. In contrast with standard game theory, in our games the utility of a mixed strategy is a convex function of the distribution on the defender's pure actions, rather than the expected value of their utilities. Nevertheless, the important properties of game theory, notably the existence of a Nash equilibrium, still hold for our (zero-sum) leakage games, and we provide algorithms to compute the corresponding optimal strategies. As typical in (simultaneous) game theory, the optimal strategy is usually mixed, i.e., probabilistic, for both the attacker and the defender. From the point of view of information flow, this was to be expected in the case of the defender, since it is well known that randomization at the level of the system design may help to reduce information leaks. Regarding the attacker, however, this seems the first work (w.r.t. the literature in information flow) proving formally that in certain cases the optimal attack strategy is necessarily probabilistic.
△ Less
Submitted 21 August, 2017; v1 submitted 14 May, 2017;
originally announced May 2017.
-
On the Compositionality of Quantitative Information Flow
Authors:
Yusuke Kawamoto,
Konstantinos Chatzikokolakis,
Catuscia Palamidessi
Abstract:
Information flow is the branch of security that studies the leakage of information due to correlation between secrets and observables. Since in general such correlation cannot be avoided completely, it is important to quantify the leakage. The most followed approaches to defining appropriate measures are those based on information theory. In particular, one of the most successful approaches is the…
▽ More
Information flow is the branch of security that studies the leakage of information due to correlation between secrets and observables. Since in general such correlation cannot be avoided completely, it is important to quantify the leakage. The most followed approaches to defining appropriate measures are those based on information theory. In particular, one of the most successful approaches is the recently proposed $g$-leakage framework, which encompasses most of the information-theoretic ones. A problem with $g$-leakage, however, is that it is defined in terms of a minimization problem, which, in the case of large systems, can be computationally rather heavy. In this paper we study the case in which the channel associated to the system can be decomposed into simpler channels, which typically happens when the observables consist of multiple components. Our main contribution is the derivation of bounds on the (multiplicative version of) $g$-leakage of the whole system in terms of the $g$-leakages of its components. We also consider the particular cases of min-entropy leakage and of parallel channels, generalizing and systematizing results from the literature. We demonstrate the effectiveness of our method and evaluate the precision of our bounds using examples.
△ Less
Submitted 14 August, 2017; v1 submitted 1 November, 2016;
originally announced November 2016.
-
Quantitative Information Flow for Scheduler-Dependent Systems
Authors:
Yusuke Kawamoto,
Thomas Given-Wilson
Abstract:
Quantitative information flow analyses measure how much information on secrets is leaked by publicly observable outputs. One area of interest is to quantify and estimate the information leakage of composed systems. Prior work has focused on running disjoint component systems in parallel and reasoning about the leakage compositionally, but has not explored how the component systems are run in paral…
▽ More
Quantitative information flow analyses measure how much information on secrets is leaked by publicly observable outputs. One area of interest is to quantify and estimate the information leakage of composed systems. Prior work has focused on running disjoint component systems in parallel and reasoning about the leakage compositionally, but has not explored how the component systems are run in parallel or how the leakage of composed systems can be minimised. In this paper we consider the manner in which parallel systems can be combined or scheduled. This considers the effects of scheduling channels where resources may be shared, or whether the outputs may be incrementally observed. We also generalise the attacker's capability, of observing outputs of the system, to consider attackers who may be imperfect in their observations, e.g. when outputs may be confused with one another, or when assessing the time taken for an output to appear. Our main contribution is to present how scheduling and observation effect information leakage properties. In particular, that scheduling can hide some leaked information from perfect observers, while some scheduling may reveal secret information that is hidden to imperfect observers. In addition we present an algorithm to construct a scheduler that minimises the min-entropy leakage and min-capacity in the presence of any observer.
△ Less
Submitted 28 September, 2015;
originally announced September 2015.