-
COMPL-AI Framework: A Technical Interpretation and LLM Benchmarking Suite for the EU Artificial Intelligence Act
Authors:
Philipp Guldimann,
Alexander Spiridonov,
Robin Staab,
Nikola Jovanović,
Mark Vero,
Velko Vechev,
Anna-Maria Gueorguieva,
Mislav Balunović,
Nikola Konstantinov,
Pavol Bielik,
Petar Tsankov,
Martin Vechev
Abstract:
The EU's Artificial Intelligence Act (AI Act) is a significant step towards responsible AI development, but lacks clear technical interpretation, making it difficult to assess models' compliance. This work presents COMPL-AI, a comprehensive framework consisting of (i) the first technical interpretation of the EU AI Act, translating its broad regulatory requirements into measurable technical requir…
▽ More
The EU's Artificial Intelligence Act (AI Act) is a significant step towards responsible AI development, but lacks clear technical interpretation, making it difficult to assess models' compliance. This work presents COMPL-AI, a comprehensive framework consisting of (i) the first technical interpretation of the EU AI Act, translating its broad regulatory requirements into measurable technical requirements, with the focus on large language models (LLMs), and (ii) an open-source Act-centered benchmarking suite, based on thorough surveying and implementation of state-of-the-art LLM benchmarks. By evaluating 12 prominent LLMs in the context of COMPL-AI, we reveal shortcomings in existing models and benchmarks, particularly in areas like robustness, safety, diversity, and fairness. This work highlights the need for a shift in focus towards these aspects, encouraging balanced development of LLMs and more comprehensive regulation-aligned benchmarks. Simultaneously, COMPL-AI for the first time demonstrates the possibilities and difficulties of bringing the Act's obligations to a more concrete, technical level. As such, our work can serve as a useful first step towards having actionable recommendations for model providers, and contributes to ongoing efforts of the EU to enable application of the Act, such as the drafting of the GPAI Code of Practice.
△ Less
Submitted 3 February, 2025; v1 submitted 10 October, 2024;
originally announced October 2024.
-
Robustness testing of AI systems: A case study for traffic sign recognition
Authors:
Christian Berghoff,
Pavol Bielik,
Matthias Neu,
Petar Tsankov,
Arndt von Twickel
Abstract:
In the last years, AI systems, in particular neural networks, have seen a tremendous increase in performance, and they are now used in a broad range of applications. Unlike classical symbolic AI systems, neural networks are trained using large data sets and their inner structure containing possibly billions of parameters does not lend itself to human interpretation. As a consequence, it is so far…
▽ More
In the last years, AI systems, in particular neural networks, have seen a tremendous increase in performance, and they are now used in a broad range of applications. Unlike classical symbolic AI systems, neural networks are trained using large data sets and their inner structure containing possibly billions of parameters does not lend itself to human interpretation. As a consequence, it is so far not feasible to provide broad guarantees for the correct behaviour of neural networks during operation if they process input data that significantly differ from those seen during training. However, many applications of AI systems are security- or safety-critical, and hence require obtaining statements on the robustness of the systems when facing unexpected events, whether they occur naturally or are induced by an attacker in a targeted way. As a step towards developing robust AI systems for such applications, this paper presents how the robustness of AI systems can be practically examined and which methods and metrics can be used to do so. The robustness testing methodology is described and analysed for the example use case of traffic sign recognition in autonomous driving.
△ Less
Submitted 13 August, 2021;
originally announced August 2021.
-
Automated Discovery of Adaptive Attacks on Adversarial Defenses
Authors:
Chengyuan Yao,
Pavol Bielik,
Petar Tsankov,
Martin Vechev
Abstract:
Reliable evaluation of adversarial defenses is a challenging task, currently limited to an expert who manually crafts attacks that exploit the defense's inner workings or approaches based on an ensemble of fixed attacks, none of which may be effective for the specific defense at hand. Our key observation is that adaptive attacks are composed of reusable building blocks that can be formalized in a…
▽ More
Reliable evaluation of adversarial defenses is a challenging task, currently limited to an expert who manually crafts attacks that exploit the defense's inner workings or approaches based on an ensemble of fixed attacks, none of which may be effective for the specific defense at hand. Our key observation is that adaptive attacks are composed of reusable building blocks that can be formalized in a search space and used to automatically discover attacks for unknown defenses. We evaluated our approach on 24 adversarial defenses and show that it outperforms AutoAttack, the current state-of-the-art tool for reliable evaluation of adversarial defenses: our tool discovered significantly stronger attacks by producing 3.0\%-50.8\% additional adversarial examples for 10 models, while obtaining attacks with slightly stronger or similar strength for the remaining models.
△ Less
Submitted 27 October, 2021; v1 submitted 23 February, 2021;
originally announced February 2021.
-
zkay v0.2: Practical Data Privacy for Smart Contracts
Authors:
Nick Baumann,
Samuel Steffen,
Benjamin Bichsel,
Petar Tsankov,
Martin Vechev
Abstract:
Recent work introduces zkay, a system for specifying and enforcing data privacy in smart contracts. While the original prototype implementation of zkay (v0.1) demonstrates the feasibility of the approach, its proof-of-concept implementation suffers from severe limitations such as insecure encryption and lack of important language features.
In this report, we present zkay v0.2, which addresses it…
▽ More
Recent work introduces zkay, a system for specifying and enforcing data privacy in smart contracts. While the original prototype implementation of zkay (v0.1) demonstrates the feasibility of the approach, its proof-of-concept implementation suffers from severe limitations such as insecure encryption and lack of important language features.
In this report, we present zkay v0.2, which addresses its predecessor's limitations. The new implementation significantly improves security, usability, modularity, and performance of the system. In particular, zkay v0.2 supports state-of-the-art asymmetric and hybrid encryption, introduces many new language features (such as function calls, private control flow, and extended type support), allows for different zk-SNARKs backends, and reduces both compilation time and on-chain costs.
△ Less
Submitted 9 September, 2020; v1 submitted 2 September, 2020;
originally announced September 2020.
-
Securify: Practical Security Analysis of Smart Contracts
Authors:
Petar Tsankov,
Andrei Dan,
Dana Drachsler Cohen,
Arthur Gervais,
Florian Buenzli,
Martin Vechev
Abstract:
Permissionless blockchains allow the execution of arbitrary programs (called smart contracts), enabling mutually untrusted entities to interact without relying on trusted third parties. Despite their potential, repeated security concerns have shaken the trust in handling billions of USD by smart contracts.
To address this problem, we present Securify, a security analyzer for Ethereum smart contr…
▽ More
Permissionless blockchains allow the execution of arbitrary programs (called smart contracts), enabling mutually untrusted entities to interact without relying on trusted third parties. Despite their potential, repeated security concerns have shaken the trust in handling billions of USD by smart contracts.
To address this problem, we present Securify, a security analyzer for Ethereum smart contracts that is scalable, fully automated, and able to prove contract behaviors as safe/unsafe with respect to a given property. Securify's analysis consists of two steps. First, it symbolically analyzes the contract's dependency graph to extract precise semantic information from the code. Then, it checks compliance and violation patterns that capture sufficient conditions for proving if a property holds or not. To enable extensibility, all patterns are specified in a designated domain-specific language.
Securify is publicly released, it has analyzed >18K contracts submitted by its users, and is regularly used to conduct security audits by experts. We present an extensive evaluation of Securify over real-world Ethereum smart contracts and demonstrate that it can effectively prove the correctness of smart contracts and discover critical violations.
△ Less
Submitted 24 August, 2018; v1 submitted 4 June, 2018;
originally announced June 2018.
-
Network-wide Configuration Synthesis
Authors:
Ahmed El-Hassany,
Petar Tsankov,
Laurent Vanbever,
Martin Vechev
Abstract:
Computer networks are hard to manage. Given a set of high-level requirements (e.g., reachability, security), operators have to manually figure out the individual configuration of potentially hundreds of devices running complex distributed protocols so that they, collectively, compute a compatible forwarding state. Not surprisingly, operators often make mistakes which lead to downtimes. To address…
▽ More
Computer networks are hard to manage. Given a set of high-level requirements (e.g., reachability, security), operators have to manually figure out the individual configuration of potentially hundreds of devices running complex distributed protocols so that they, collectively, compute a compatible forwarding state. Not surprisingly, operators often make mistakes which lead to downtimes. To address this problem, we present a novel synthesis approach that automatically computes correct network configurations that comply with the operator's requirements. We capture the behavior of existing routers along with the distributed protocols they run in stratified Datalog. Our key insight is to reduce the problem of finding correct input configurations to the task of synthesizing inputs for a stratified Datalog program. To solve this synthesis task, we introduce a new algorithm that synthesizes inputs for stratified Datalog programs. This algorithm is applicable beyond the domain of networks. We leverage our synthesis algorithm to construct the first network-wide configuration synthesis system, called SyNET, that support multiple interacting routing protocols (OSPF and BGP) and static routes. We show that our system is practical and can infer correct input configurations, in a reasonable amount time, for networks of realistic size (> 50 routers) that forward packets for multiple traffic classes.
△ Less
Submitted 30 May, 2017; v1 submitted 8 November, 2016;
originally announced November 2016.
-
Access Control Synthesis for Physical Spaces
Authors:
Petar Tsankov,
Mohammad Torabi Dashti,
David Basin
Abstract:
Access-control requirements for physical spaces, like office buildings and airports, are best formulated from a global viewpoint in terms of system-wide requirements. For example, "there is an authorized path to exit the building from every room." In contrast, individual access-control components, such as doors and turnstiles, can only enforce local policies, specifying when the component may open…
▽ More
Access-control requirements for physical spaces, like office buildings and airports, are best formulated from a global viewpoint in terms of system-wide requirements. For example, "there is an authorized path to exit the building from every room." In contrast, individual access-control components, such as doors and turnstiles, can only enforce local policies, specifying when the component may open. In practice, the gap between the system-wide, global requirements and the many local policies is bridged manually, which is tedious, error-prone, and scales poorly.
We propose a framework to automatically synthesize local access control policies from a set of global requirements for physical spaces. Our framework consists of an expressive language to specify both global requirements and physical spaces, and an algorithm for synthesizing local, attribute-based policies from the global specification. We empirically demonstrate the framework's effectiveness on three substantial case studies. The studies demonstrate that access control synthesis is practical even for complex physical spaces, such as airports, with many interrelated security requirements.
△ Less
Submitted 5 May, 2016;
originally announced May 2016.