Search | arXiv e-print repository

COMPL-AI Framework: A Technical Interpretation and LLM Benchmarking Suite for the EU Artificial Intelligence Act

Authors: Philipp Guldimann, Alexander Spiridonov, Robin Staab, Nikola Jovanović, Mark Vero, Velko Vechev, Anna-Maria Gueorguieva, Mislav Balunović, Nikola Konstantinov, Pavol Bielik, Petar Tsankov, Martin Vechev

Abstract: The EU's Artificial Intelligence Act (AI Act) is a significant step towards responsible AI development, but lacks clear technical interpretation, making it difficult to assess models' compliance. This work presents COMPL-AI, a comprehensive framework consisting of (i) the first technical interpretation of the EU AI Act, translating its broad regulatory requirements into measurable technical requir… ▽ More The EU's Artificial Intelligence Act (AI Act) is a significant step towards responsible AI development, but lacks clear technical interpretation, making it difficult to assess models' compliance. This work presents COMPL-AI, a comprehensive framework consisting of (i) the first technical interpretation of the EU AI Act, translating its broad regulatory requirements into measurable technical requirements, with the focus on large language models (LLMs), and (ii) an open-source Act-centered benchmarking suite, based on thorough surveying and implementation of state-of-the-art LLM benchmarks. By evaluating 12 prominent LLMs in the context of COMPL-AI, we reveal shortcomings in existing models and benchmarks, particularly in areas like robustness, safety, diversity, and fairness. This work highlights the need for a shift in focus towards these aspects, encouraging balanced development of LLMs and more comprehensive regulation-aligned benchmarks. Simultaneously, COMPL-AI for the first time demonstrates the possibilities and difficulties of bringing the Act's obligations to a more concrete, technical level. As such, our work can serve as a useful first step towards having actionable recommendations for model providers, and contributes to ongoing efforts of the EU to enable application of the Act, such as the drafting of the GPAI Code of Practice. △ Less

Submitted 3 February, 2025; v1 submitted 10 October, 2024; originally announced October 2024.

arXiv:2108.06159 [pdf, other]

doi 10.1007/978-3-030-79150-6_21

Robustness testing of AI systems: A case study for traffic sign recognition

Authors: Christian Berghoff, Pavol Bielik, Matthias Neu, Petar Tsankov, Arndt von Twickel

Abstract: In the last years, AI systems, in particular neural networks, have seen a tremendous increase in performance, and they are now used in a broad range of applications. Unlike classical symbolic AI systems, neural networks are trained using large data sets and their inner structure containing possibly billions of parameters does not lend itself to human interpretation. As a consequence, it is so far… ▽ More In the last years, AI systems, in particular neural networks, have seen a tremendous increase in performance, and they are now used in a broad range of applications. Unlike classical symbolic AI systems, neural networks are trained using large data sets and their inner structure containing possibly billions of parameters does not lend itself to human interpretation. As a consequence, it is so far not feasible to provide broad guarantees for the correct behaviour of neural networks during operation if they process input data that significantly differ from those seen during training. However, many applications of AI systems are security- or safety-critical, and hence require obtaining statements on the robustness of the systems when facing unexpected events, whether they occur naturally or are induced by an attacker in a targeted way. As a step towards developing robust AI systems for such applications, this paper presents how the robustness of AI systems can be practically examined and which methods and metrics can be used to do so. The robustness testing methodology is described and analysed for the example use case of traffic sign recognition in autonomous driving. △ Less

Submitted 13 August, 2021; originally announced August 2021.

Comments: 12 pages, 7 figures. The final publication is available at Springer via https://doi.org/10.1007/978-3-030-79150-6_21

Journal ref: In: Maglogiannis I., Macintyre J., Iliadis L. (eds) Artificial Intelligence Applications and Innovations. AIAI 2021. IFIP Advances in Information and Communication Technology, vol 627. Springer, Cham

arXiv:2102.11860 [pdf, other]

Automated Discovery of Adaptive Attacks on Adversarial Defenses

Authors: Chengyuan Yao, Pavol Bielik, Petar Tsankov, Martin Vechev

Abstract: Reliable evaluation of adversarial defenses is a challenging task, currently limited to an expert who manually crafts attacks that exploit the defense's inner workings or approaches based on an ensemble of fixed attacks, none of which may be effective for the specific defense at hand. Our key observation is that adaptive attacks are composed of reusable building blocks that can be formalized in a… ▽ More Reliable evaluation of adversarial defenses is a challenging task, currently limited to an expert who manually crafts attacks that exploit the defense's inner workings or approaches based on an ensemble of fixed attacks, none of which may be effective for the specific defense at hand. Our key observation is that adaptive attacks are composed of reusable building blocks that can be formalized in a search space and used to automatically discover attacks for unknown defenses. We evaluated our approach on 24 adversarial defenses and show that it outperforms AutoAttack, the current state-of-the-art tool for reliable evaluation of adversarial defenses: our tool discovered significantly stronger attacks by producing 3.0\%-50.8\% additional adversarial examples for 10 models, while obtaining attacks with slightly stronger or similar strength for the remaining models. △ Less

Submitted 27 October, 2021; v1 submitted 23 February, 2021; originally announced February 2021.

Comments: 21 pages, 3 figures, 10 tables. NeurIPS2021

arXiv:2009.01020 [pdf, other]

zkay v0.2: Practical Data Privacy for Smart Contracts

Authors: Nick Baumann, Samuel Steffen, Benjamin Bichsel, Petar Tsankov, Martin Vechev

Abstract: Recent work introduces zkay, a system for specifying and enforcing data privacy in smart contracts. While the original prototype implementation of zkay (v0.1) demonstrates the feasibility of the approach, its proof-of-concept implementation suffers from severe limitations such as insecure encryption and lack of important language features. In this report, we present zkay v0.2, which addresses it… ▽ More Recent work introduces zkay, a system for specifying and enforcing data privacy in smart contracts. While the original prototype implementation of zkay (v0.1) demonstrates the feasibility of the approach, its proof-of-concept implementation suffers from severe limitations such as insecure encryption and lack of important language features. In this report, we present zkay v0.2, which addresses its predecessor's limitations. The new implementation significantly improves security, usability, modularity, and performance of the system. In particular, zkay v0.2 supports state-of-the-art asymmetric and hybrid encryption, introduces many new language features (such as function calls, private control flow, and extended type support), allows for different zk-SNARKs backends, and reduces both compilation time and on-chain costs. △ Less

Submitted 9 September, 2020; v1 submitted 2 September, 2020; originally announced September 2020.

arXiv:1806.01143 [pdf, other]

Securify: Practical Security Analysis of Smart Contracts

Authors: Petar Tsankov, Andrei Dan, Dana Drachsler Cohen, Arthur Gervais, Florian Buenzli, Martin Vechev

Abstract: Permissionless blockchains allow the execution of arbitrary programs (called smart contracts), enabling mutually untrusted entities to interact without relying on trusted third parties. Despite their potential, repeated security concerns have shaken the trust in handling billions of USD by smart contracts. To address this problem, we present Securify, a security analyzer for Ethereum smart contr… ▽ More Permissionless blockchains allow the execution of arbitrary programs (called smart contracts), enabling mutually untrusted entities to interact without relying on trusted third parties. Despite their potential, repeated security concerns have shaken the trust in handling billions of USD by smart contracts. To address this problem, we present Securify, a security analyzer for Ethereum smart contracts that is scalable, fully automated, and able to prove contract behaviors as safe/unsafe with respect to a given property. Securify's analysis consists of two steps. First, it symbolically analyzes the contract's dependency graph to extract precise semantic information from the code. Then, it checks compliance and violation patterns that capture sufficient conditions for proving if a property holds or not. To enable extensibility, all patterns are specified in a designated domain-specific language. Securify is publicly released, it has analyzed >18K contracts submitted by its users, and is regularly used to conduct security audits by experts. We present an extensive evaluation of Securify over real-world Ethereum smart contracts and demonstrate that it can effectively prove the correctness of smart contracts and discover critical violations. △ Less

Submitted 24 August, 2018; v1 submitted 4 June, 2018; originally announced June 2018.

arXiv:1611.02537 [pdf, other]

Network-wide Configuration Synthesis

Authors: Ahmed El-Hassany, Petar Tsankov, Laurent Vanbever, Martin Vechev

Abstract: Computer networks are hard to manage. Given a set of high-level requirements (e.g., reachability, security), operators have to manually figure out the individual configuration of potentially hundreds of devices running complex distributed protocols so that they, collectively, compute a compatible forwarding state. Not surprisingly, operators often make mistakes which lead to downtimes. To address… ▽ More Computer networks are hard to manage. Given a set of high-level requirements (e.g., reachability, security), operators have to manually figure out the individual configuration of potentially hundreds of devices running complex distributed protocols so that they, collectively, compute a compatible forwarding state. Not surprisingly, operators often make mistakes which lead to downtimes. To address this problem, we present a novel synthesis approach that automatically computes correct network configurations that comply with the operator's requirements. We capture the behavior of existing routers along with the distributed protocols they run in stratified Datalog. Our key insight is to reduce the problem of finding correct input configurations to the task of synthesizing inputs for a stratified Datalog program. To solve this synthesis task, we introduce a new algorithm that synthesizes inputs for stratified Datalog programs. This algorithm is applicable beyond the domain of networks. We leverage our synthesis algorithm to construct the first network-wide configuration synthesis system, called SyNET, that support multiple interacting routing protocols (OSPF and BGP) and static routes. We show that our system is practical and can infer correct input configurations, in a reasonable amount time, for networks of realistic size (> 50 routers) that forward packets for multiple traffic classes. △ Less

Submitted 30 May, 2017; v1 submitted 8 November, 2016; originally announced November 2016.

Comments: 24 Pages, short version published in CAV 2017

arXiv:1605.01769 [pdf, other]

Access Control Synthesis for Physical Spaces

Authors: Petar Tsankov, Mohammad Torabi Dashti, David Basin

Abstract: Access-control requirements for physical spaces, like office buildings and airports, are best formulated from a global viewpoint in terms of system-wide requirements. For example, "there is an authorized path to exit the building from every room." In contrast, individual access-control components, such as doors and turnstiles, can only enforce local policies, specifying when the component may open… ▽ More Access-control requirements for physical spaces, like office buildings and airports, are best formulated from a global viewpoint in terms of system-wide requirements. For example, "there is an authorized path to exit the building from every room." In contrast, individual access-control components, such as doors and turnstiles, can only enforce local policies, specifying when the component may open. In practice, the gap between the system-wide, global requirements and the many local policies is bridged manually, which is tedious, error-prone, and scales poorly. We propose a framework to automatically synthesize local access control policies from a set of global requirements for physical spaces. Our framework consists of an expressive language to specify both global requirements and physical spaces, and an algorithm for synthesizing local, attribute-based policies from the global specification. We empirically demonstrate the framework's effectiveness on three substantial case studies. The studies demonstrate that access control synthesis is practical even for complex physical spaces, such as airports, with many interrelated security requirements. △ Less

Submitted 5 May, 2016; originally announced May 2016.

Showing 1–7 of 7 results for author: Tsankov, P