-
In-Situ Hardware Error Detection Using Specification-Derived Petri Net Models and Behavior-Derived State Sequences
Authors:
Tomonari Tanaka,
Takumi Uezono,
Kohei Suenaga,
Masanori Hashimoto
Abstract:
In hardware accelerators used in data centers and safety-critical applications, soft errors and resultant silent data corruption significantly compromise reliability, particularly when upsets occur in control-flow operations, leading to severe failures. To address this, we introduce two methods for monitoring control flows: using specification-derived Petri nets and using behavior-derived state tr…
▽ More
In hardware accelerators used in data centers and safety-critical applications, soft errors and resultant silent data corruption significantly compromise reliability, particularly when upsets occur in control-flow operations, leading to severe failures. To address this, we introduce two methods for monitoring control flows: using specification-derived Petri nets and using behavior-derived state transitions. We validated our method across four designs: convolutional layer operation, Gaussian blur, AES encryption, and a router in Network-on-Chip. Our fault injection campaign targeting the control registers and primary control inputs demonstrated high error detection rates in both datapath and control logic. Synthesis results show that a maximum detection rate is achieved with a few to around 10% area overhead in most cases. The proposed detectors quickly detect 48% to 100% of failures resulting from upsets in internal control registers and perturbations in primary control inputs. The two proposed methods were compared in terms of area overhead and error detection rate. By selectively applying these two methods, a wide range of area constraints can be accommodated, enabling practical implementation and effectively enhancing error detection capabilities.
△ Less
Submitted 8 May, 2025; v1 submitted 6 May, 2025;
originally announced May 2025.
-
SoftMatcha: A Soft and Fast Pattern Matcher for Billion-Scale Corpus Searches
Authors:
Hiroyuki Deguchi,
Go Kamoda,
Yusuke Matsushita,
Chihiro Taguchi,
Kohei Suenaga,
Masaki Waga,
Sho Yokoi
Abstract:
Researchers and practitioners in natural language processing and computational linguistics frequently observe and analyze the real language usage in large-scale corpora. For that purpose, they often employ off-the-shelf pattern-matching tools, such as grep, and keyword-in-context concordancers, which is widely used in corpus linguistics for gathering examples. Nonetheless, these existing technique…
▽ More
Researchers and practitioners in natural language processing and computational linguistics frequently observe and analyze the real language usage in large-scale corpora. For that purpose, they often employ off-the-shelf pattern-matching tools, such as grep, and keyword-in-context concordancers, which is widely used in corpus linguistics for gathering examples. Nonetheless, these existing techniques rely on surface-level string matching, and thus they suffer from the major limitation of not being able to handle orthographic variations and paraphrasing -- notable and common phenomena in any natural language. In addition, existing continuous approaches such as dense vector search tend to be overly coarse, often retrieving texts that are unrelated but share similar topics. Given these challenges, we propose a novel algorithm that achieves \emph{soft} (or semantic) yet efficient pattern matching by relaxing a surface-level matching with word embeddings. Our algorithm is highly scalable with respect to the size of the corpus text utilizing inverted indexes. We have prepared an efficient implementation, and we provide an accessible web tool. Our experiments demonstrate that the proposed method (i) can execute searches on billion-scale corpora in less than a second, which is comparable in speed to surface-level string matching and dense vector search; (ii) can extract harmful instances that semantically match queries from a large set of English and Japanese Wikipedia articles; and (iii) can be effectively applied to corpus-linguistic analyses of Latin, a language with highly diverse inflections.
△ Less
Submitted 5 March, 2025;
originally announced March 2025.
-
Certifying Lyapunov Stability of Black-Box Nonlinear Systems via Counterexample Guided Synthesis (Extended Version)
Authors:
Chiao Hsieh,
Masaki Waga,
Kohei Suenaga
Abstract:
Finding Lyapunov functions to certify the stability of control systems has been an important topic for verifying safety-critical systems. Most existing methods on finding Lyapunov functions require access to the dynamics of the system. Accurately describing the complete dynamics of a control system however remains highly challenging in practice. Latest trend of using learning-enabled control syste…
▽ More
Finding Lyapunov functions to certify the stability of control systems has been an important topic for verifying safety-critical systems. Most existing methods on finding Lyapunov functions require access to the dynamics of the system. Accurately describing the complete dynamics of a control system however remains highly challenging in practice. Latest trend of using learning-enabled control systems further reduces the transparency. Hence, a method for black-box systems would have much wider applications.
Our work stems from the recent idea of sampling and exploiting Lipschitz continuity to approximate the unknown dynamics. Given Lipschitz constants, one can derive a non-statistical upper bounds on approximation errors; hence a strong certification on this approximation can certify the unknown dynamics. We significantly improve this idea by directly approximating the Lie derivative of Lyapunov functions instead of the dynamics. We propose a framework based on the learner-verifier architecture from Counterexample-Guided Inductive Synthesis (CEGIS). Our insight of combining regional verification conditions and counterexample-guided sampling enables a guided search for samples to prove stability region-by-region. Our CEGIS algorithm further ensures termination.
Our numerical experiments suggest that it is possible to prove the stability of 2D and 3D systems with a few thousands of samples. Our visualization also reveals the regions where the stability is difficult to prove. In comparison with the existing black-box approach, our approach at the best case requires less than 0.01% of samples.
△ Less
Submitted 15 May, 2025; v1 submitted 1 March, 2025;
originally announced March 2025.
-
Skip2-LoRA: A Lightweight On-device DNN Fine-tuning Method for Low-cost Edge Devices
Authors:
Hiroki Matsutani,
Masaaki Kondo,
Kazuki Sunaga,
Radu Marculescu
Abstract:
This paper proposes Skip2-LoRA as a lightweight fine-tuning method for deep neural networks to address the gap between pre-trained and deployed models. In our approach, trainable LoRA (low-rank adaptation) adapters are inserted between the last layer and every other layer to enhance the network expressive power while keeping the backward computation cost low. This architecture is well-suited to ca…
▽ More
This paper proposes Skip2-LoRA as a lightweight fine-tuning method for deep neural networks to address the gap between pre-trained and deployed models. In our approach, trainable LoRA (low-rank adaptation) adapters are inserted between the last layer and every other layer to enhance the network expressive power while keeping the backward computation cost low. This architecture is well-suited to cache intermediate computation results of the forward pass and then can skip the forward computation of seen samples as training epochs progress. We implemented the combination of the proposed architecture and cache, denoted as Skip2-LoRA, and tested it on a $15 single board computer. Our results show that Skip2-LoRA reduces the fine-tuning time by 90.0% on average compared to the counterpart that has the same number of trainable parameters while preserving the accuracy, while taking only a few seconds on the microcontroller board.
△ Less
Submitted 28 October, 2024;
originally announced October 2024.
-
StatWhy: Formal Verification Tool for Statistical Hypothesis Testing Programs
Authors:
Yusuke Kawamoto,
Kentaro Kobayashi,
Kohei Suenaga
Abstract:
Statistical methods have been widely misused and misinterpreted in various scientific fields, raising significant concerns about the integrity of scientific research. To mitigate this problem, we propose a tool-assisted method for formally specifying and automatically verifying the correctness of statistical programs. In this method, programmers are required to annotate the source code of the stat…
▽ More
Statistical methods have been widely misused and misinterpreted in various scientific fields, raising significant concerns about the integrity of scientific research. To mitigate this problem, we propose a tool-assisted method for formally specifying and automatically verifying the correctness of statistical programs. In this method, programmers are required to annotate the source code of the statistical programs with the requirements for these methods. Through this annotation, they are reminded to check the requirements for statistical methods, including those that cannot be formally verified, such as the distribution of the unknown true population. Our software tool StatWhy automatically checks whether programmers have properly specified the requirements for the statistical methods, thereby identifying any missing requirements that need to be addressed. This tool is implemented using the Why3 platform to verify the correctness of OCaml programs that conduct statistical hypothesis testing. We demonstrate how StatWhy can be used to avoid common errors in various statistical hypothesis testing programs.
△ Less
Submitted 1 June, 2025; v1 submitted 25 May, 2024;
originally announced May 2024.
-
Oblivious Monitoring for Discrete-Time STL via Fully Homomorphic Encryption
Authors:
Masaki Waga,
Kotaro Matsuoka,
Takashi Suwa,
Naoki Matsumoto,
Ryotaro Banno,
Song Bian,
Kohei Suenaga
Abstract:
When monitoring a cyber-physical system (CPS) from a remote server, keeping the monitored data secret is crucial, particularly when they contain sensitive information, e.g., biological or location data. Recently, Banno et al. (CAV'22) proposed a protocol for online LTL monitoring that keeps data concealed from the server using Fully Homomorphic Encryption (FHE). We build on this protocol to allow…
▽ More
When monitoring a cyber-physical system (CPS) from a remote server, keeping the monitored data secret is crucial, particularly when they contain sensitive information, e.g., biological or location data. Recently, Banno et al. (CAV'22) proposed a protocol for online LTL monitoring that keeps data concealed from the server using Fully Homomorphic Encryption (FHE). We build on this protocol to allow arithmetic operations over encrypted values, e.g., to compute a safety measurement combining distance, velocity, and so forth. Overall, our protocol enables oblivious online monitoring of discrete-time real-valued signals against signal temporal logic (STL) formulas. Our protocol combines two FHE schemes, CKKS and TFHE, leveraging their respective strengths. We employ CKKS to evaluate arithmetic predicates in STL formulas while utilizing TFHE to process them using a DFA derived from the STL formula. We conducted case studies on monitoring blood glucose levels and vehicles' behavior against the Responsibility-Sensitive Safety (RSS) rules. Our results suggest the practical relevance of our protocol.
△ Less
Submitted 18 October, 2024; v1 submitted 26 May, 2024;
originally announced May 2024.
-
An FPGA-Based Accelerator for Graph Embedding using Sequential Training Algorithm
Authors:
Kazuki Sunaga,
Keisuke Sugiura,
Hiroki Matsutani
Abstract:
A graph embedding is an emerging approach that can represent a graph structure with a fixed-length low-dimensional vector. node2vec is a well-known algorithm to obtain such a graph embedding by sampling neighboring nodes on a given graph with a random walk technique. However, the original node2vec algorithm typically relies on a batch training of graph structures; thus, it is not suited for applic…
▽ More
A graph embedding is an emerging approach that can represent a graph structure with a fixed-length low-dimensional vector. node2vec is a well-known algorithm to obtain such a graph embedding by sampling neighboring nodes on a given graph with a random walk technique. However, the original node2vec algorithm typically relies on a batch training of graph structures; thus, it is not suited for applications in which the graph structure changes after the deployment. In this paper, we focus on node2vec applications for IoT (Internet of Things) environments. To handle the changes of graph structures after the IoT devices have been deployed in edge environments, in this paper we propose to combine an online sequential training algorithm with node2vec. The proposed sequentially-trainable model is implemented on an FPGA (Field-Programmable Gate Array) device to demonstrate the benefits of our approach. The proposed FPGA implementation achieves up to 205.25 times speedup compared to the original model on ARM Cortex-A53 CPU. Evaluation results using dynamic graphs show that although the accuracy is decreased in the original model, the proposed sequential model can obtain better graph embedding that achieves a higher accuracy even when the graph structure is changed.
△ Less
Submitted 29 April, 2024; v1 submitted 22 December, 2023;
originally announced December 2023.
-
Probabilistic Black-Box Checking via Active MDP Learning
Authors:
Junya Shijubo,
Masaki Waga,
Kohei Suenaga
Abstract:
We introduce a novel methodology for testing stochastic black-box systems, frequently encountered in embedded systems. Our approach enhances the established black-box checking (BBC) technique to address stochastic behavior. Traditional BBC primarily involves iteratively identifying an input that breaches the system's specifications by executing the following three phases: the learning phase to con…
▽ More
We introduce a novel methodology for testing stochastic black-box systems, frequently encountered in embedded systems. Our approach enhances the established black-box checking (BBC) technique to address stochastic behavior. Traditional BBC primarily involves iteratively identifying an input that breaches the system's specifications by executing the following three phases: the learning phase to construct an automaton approximating the black box's behavior, the synthesis phase to identify a candidate counterexample from the learned automaton, and the validation phase to validate the obtained candidate counterexample and the learned automaton against the original black-box system. Our method, ProbBBC, refines the conventional BBC approach by (1) employing an active Markov Decision Process (MDP) learning method during the learning phase, (2) incorporating probabilistic model checking in the synthesis phase, and (3) applying statistical hypothesis testing in the validation phase. ProbBBC uniquely integrates these techniques rather than merely substituting each method in the traditional BBC; for instance, the statistical hypothesis testing and the MDP learning procedure exchange information regarding the black-box system's observation with one another. The experiment results suggest that ProbBBC outperforms an existing method, especially for systems with limited observation.
△ Less
Submitted 15 July, 2023;
originally announced August 2023.
-
Learning nonlinear hybrid automata from input--output time-series data
Authors:
Amit Gurung,
Masaki Waga,
Kohei Suenaga
Abstract:
Learning an automaton that approximates the behavior of a black-box system is a long-studied problem. Besides its theoretical significance, its application to search-based testing and model understanding is recently recognized. We present an algorithm to learn a nonlinear hybrid automaton (HA) that approximates a black-box hybrid system (HS) from a set of input--output traces generated by the HS.…
▽ More
Learning an automaton that approximates the behavior of a black-box system is a long-studied problem. Besides its theoretical significance, its application to search-based testing and model understanding is recently recognized. We present an algorithm to learn a nonlinear hybrid automaton (HA) that approximates a black-box hybrid system (HS) from a set of input--output traces generated by the HS. Our method is novel in handling (1) both exogenous and endogenous HS and (2) HA with reset associated with each transition. To our knowledge, ours is the first method that achieves both features. We applied our algorithm to various benchmarks and confirmed its effectiveness.
△ Less
Submitted 27 July, 2023; v1 submitted 10 January, 2023;
originally announced January 2023.
-
BOREx: Bayesian-Optimization--Based Refinement of Saliency Map for Image- and Video-Classification Models
Authors:
Atsushi Kikuchi,
Kotaro Uchida,
Masaki Waga,
Kohei Suenaga
Abstract:
Explaining a classification result produced by an image- and video-classification model is one of the important but challenging issues in computer vision. Many methods have been proposed for producing heat-map--based explanations for this purpose, including ones based on the white-box approach that uses the internal information of a model (e.g., LRP, Grad-CAM, and Grad-CAM++) and ones based on the…
▽ More
Explaining a classification result produced by an image- and video-classification model is one of the important but challenging issues in computer vision. Many methods have been proposed for producing heat-map--based explanations for this purpose, including ones based on the white-box approach that uses the internal information of a model (e.g., LRP, Grad-CAM, and Grad-CAM++) and ones based on the black-box approach that does not use any internal information (e.g., LIME, SHAP, and RISE). We propose a new black-box method BOREx (Bayesian Optimization for Refinement of visual model Explanation) to refine a heat map produced by any method. Our observation is that a heat-map--based explanation can be seen as a prior for an explanation method based on Bayesian optimization. Based on this observation, BOREx conducts Gaussian process regression (GPR) to estimate the saliency of each pixel in a given image starting from the one produced by another explanation method. Our experiments statistically demonstrate that the refinement by BOREx improves low-quality heat maps for image- and video-classification results.
△ Less
Submitted 31 October, 2022;
originally announced October 2022.
-
Formalizing Statistical Causality via Modal Logic
Authors:
Yusuke Kawamoto,
Tetsuya Sato,
Kohei Suenaga
Abstract:
We propose a formal language for describing and explaining statistical causality. Concretely, we define Statistical Causality Language (StaCL) for expressing causal effects and specifying the requirements for causal inference. StaCL incorporates modal operators for interventions to express causal properties between probability distributions in different possible worlds in a Kripke model. We formal…
▽ More
We propose a formal language for describing and explaining statistical causality. Concretely, we define Statistical Causality Language (StaCL) for expressing causal effects and specifying the requirements for causal inference. StaCL incorporates modal operators for interventions to express causal properties between probability distributions in different possible worlds in a Kripke model. We formalize axioms for probability distributions, interventions, and causal predicates using StaCL formulas. These axioms are expressive enough to derive the rules of Pearl's do-calculus. Finally, we demonstrate by examples that StaCL can be used to specify and explain the correctness of statistical causal inference.
△ Less
Submitted 17 September, 2023; v1 submitted 30 October, 2022;
originally announced October 2022.
-
Sound and Relatively Complete Belief Hoare Logic for Statistical Hypothesis Testing Programs
Authors:
Yusuke Kawamoto,
Tetsuya Sato,
Kohei Suenaga
Abstract:
We propose a new approach to formally describing the requirement for statistical inference and checking whether a program uses the statistical method appropriately. Specifically, we define belief Hoare logic (BHL) for formalizing and reasoning about the statistical beliefs acquired via hypothesis testing. This program logic is sound and relatively complete with respect to a Kripke model for hypoth…
▽ More
We propose a new approach to formally describing the requirement for statistical inference and checking whether a program uses the statistical method appropriately. Specifically, we define belief Hoare logic (BHL) for formalizing and reasoning about the statistical beliefs acquired via hypothesis testing. This program logic is sound and relatively complete with respect to a Kripke model for hypothesis tests. We demonstrate by examples that BHL is useful for reasoning about practical issues in hypothesis testing. In our framework, we clarify the importance of prior beliefs in acquiring statistical beliefs through hypothesis testing, and discuss the whole picture of the justification of statistical inference inside and outside the program logic.
△ Less
Submitted 8 November, 2023; v1 submitted 15 August, 2022;
originally announced August 2022.
-
Goal-Aware RSS for Complex Scenarios via Program Logic
Authors:
Ichiro Hasuo,
Clovis Eberhart,
James Haydon,
Jérémy Dubut,
Rose Bohrer,
Tsutomu Kobayashi,
Sasinee Pruekprasert,
Xiao-Yi Zhang,
Erik André Pallas,
Akihisa Yamada,
Kohei Suenaga,
Fuyuki Ishikawa,
Kenji Kamijo,
Yoshiyuki Shinya,
Takamasa Suetomi
Abstract:
We introduce a goal-aware extension of responsibility-sensitive safety (RSS), a recent methodology for rule-based safety guarantee for automated driving systems (ADS). Making RSS rules guarantee goal achievement -- in addition to collision avoidance as in the original RSS -- requires complex planning over long sequences of manoeuvres. To deal with the complexity, we introduce a compositional reaso…
▽ More
We introduce a goal-aware extension of responsibility-sensitive safety (RSS), a recent methodology for rule-based safety guarantee for automated driving systems (ADS). Making RSS rules guarantee goal achievement -- in addition to collision avoidance as in the original RSS -- requires complex planning over long sequences of manoeuvres. To deal with the complexity, we introduce a compositional reasoning framework based on program logic, in which one can systematically develop RSS rules for smaller subscenarios and combine them to obtain RSS rules for bigger scenarios. As the basis of the framework, we introduce a program logic dFHL that accommodates continuous dynamics and safety conditions. Our framework presents a dFHL-based workflow for deriving goal-aware RSS rules; we discuss its software support, too. We conducted experimental evaluation using RSS rules in a safety architecture. Its results show that goal-aware RSS is indeed effective in realising both collision avoidance and goal achievement.
△ Less
Submitted 5 July, 2022;
originally announced July 2022.
-
Oblivious Online Monitoring for Safety LTL Specification via Fully Homomorphic Encryption
Authors:
Ryotaro Banno,
Kotaro Matsuoka,
Naoki Matsumoto,
Song Bian,
Masaki Waga,
Kohei Suenaga
Abstract:
In many Internet of Things (IoT) applications, data sensed by an IoT device are continuously sent to the server and monitored against a specification. Since the data often contain sensitive information, and the monitored specification is usually proprietary, both must be kept private from the other end. We propose a protocol to conduct oblivious online monitoring -- online monitoring conducted wit…
▽ More
In many Internet of Things (IoT) applications, data sensed by an IoT device are continuously sent to the server and monitored against a specification. Since the data often contain sensitive information, and the monitored specification is usually proprietary, both must be kept private from the other end. We propose a protocol to conduct oblivious online monitoring -- online monitoring conducted without revealing the private information of each party to the other -- against a safety LTL specification. In our protocol, we first convert a safety LTL formula into a DFA and conduct online monitoring with the DFA. Based on fully homomorphic encryption (FHE), we propose two online algorithms (Reverse and Block) to run a DFA obliviously. We prove the correctness and security of our entire protocol. We also show the scalability of our algorithms theoretically and empirically. Our case study shows that our algorithms are fast enough to monitor blood glucose levels online, demonstrating our protocol's practical relevance.
△ Less
Submitted 3 June, 2022;
originally announced June 2022.
-
The Lattice-Theoretic Essence of Property Directed Reachability Analysis
Authors:
Mayuko Kori,
Natsuki Urabe,
Shin-ya Katsumata,
Kohei Suenaga,
Ichiro Hasuo
Abstract:
We present LT-PDR, a lattice-theoretic generalization of Bradley's property directed reachability analysis (PDR) algorithm. LT-PDR identifies the essence of PDR to be an ingenious combination of verification and refutation attempts based on the Knaster-Tarski and Kleene theorems. We introduce four concrete instances of LT-PDR, derive their implementation from a generic Haskell implementation of LT…
▽ More
We present LT-PDR, a lattice-theoretic generalization of Bradley's property directed reachability analysis (PDR) algorithm. LT-PDR identifies the essence of PDR to be an ingenious combination of verification and refutation attempts based on the Knaster-Tarski and Kleene theorems. We introduce four concrete instances of LT-PDR, derive their implementation from a generic Haskell implementation of LT-PDR, and experimentally evaluate them. We also present a categorical structural theory that derives these instances.
△ Less
Submitted 13 August, 2022; v1 submitted 27 March, 2022;
originally announced March 2022.
-
Addressing Gap between Training Data and Deployed Environment by On-Device Learning
Authors:
Kazuki Sunaga,
Masaaki Kondo,
Hiroki Matsutani
Abstract:
The accuracy of tinyML applications is often affected by various environmental factors, such as noises, location/calibration of sensors, and time-related changes. This article introduces a neural network based on-device learning (ODL) approach to address this issue by retraining in deployed environments. Our approach relies on semi-supervised sequential training of multiple neural networks tailore…
▽ More
The accuracy of tinyML applications is often affected by various environmental factors, such as noises, location/calibration of sensors, and time-related changes. This article introduces a neural network based on-device learning (ODL) approach to address this issue by retraining in deployed environments. Our approach relies on semi-supervised sequential training of multiple neural networks tailored for low-end edge devices. This article introduces its algorithm and implementation on wireless sensor nodes consisting of a Raspberry Pi Pico and low-power wireless module. Experiments using vibration patterns of rotating machines demonstrate that retraining by ODL improves anomaly detection accuracy compared with a prediction-only deep neural network in a noisy environment. The results also show that the ODL approach can save communication cost and energy consumption for battery-powered Internet of Things devices.
△ Less
Submitted 24 December, 2023; v1 submitted 2 March, 2022;
originally announced March 2022.
-
Efficient Black-Box Checking via Model Checking with Strengthened Specifications
Authors:
Junya Shijubo,
Masaki Waga,
Kohei Suenaga
Abstract:
Black-box checking (BBC)} is a testing method for cyber-physical systems (CPSs) as well as software systems. BBC consists of active automata learning and model checking; a Mealy machine is learned from the system under test (SUT), and the learned Mealy machine is verified against a specification using model checking. When the Mealy machine violates the specification, the model checker returns an i…
▽ More
Black-box checking (BBC)} is a testing method for cyber-physical systems (CPSs) as well as software systems. BBC consists of active automata learning and model checking; a Mealy machine is learned from the system under test (SUT), and the learned Mealy machine is verified against a specification using model checking. When the Mealy machine violates the specification, the model checker returns an input witnessing the specification violation of the Mealy machine. We use it to refine the Mealy machine or conclude that the SUT violates the specification. Otherwise, we conduct equivalence testing to find an input witnessing the difference between the Mealy machine and the SUT. In the BBC for CPSs, equivalence testing tends to be time-consuming due to the time for the system execution. In this paper, we enhance the BBC utilizing model checking with strengthened specifications. By model checking with a strengthened specification, we have more chance to obtain an input witnessing the specification violation than model checking with the original specification. The refinement of the Mealy machine with such an input tends to reduce the number of equivalence testing, which improves the efficiency. We conducted experiments with an automotive benchmark. Our experiment results demonstrate the merit of our method.
△ Less
Submitted 10 September, 2021;
originally announced September 2021.
-
HELMHOLTZ: A Verifier for Tezos Smart Contracts Based on Refinement Types
Authors:
Yuki Nishida,
Hiromasa Saito,
Ran Chen,
Akira Kawata,
Jun Furuse,
Kohei Suenaga,
Atsushi Igarashi
Abstract:
A smart contract is a program executed on a blockchain, based on which many cryptocurrencies are implemented, and is being used for automating transactions. Due to the large amount of money that smart contracts deal with, there is a surging demand for a method that can statically and formally verify them.
This article describes our type-based static verification tool HELMHOLTZ for Michelson, whi…
▽ More
A smart contract is a program executed on a blockchain, based on which many cryptocurrencies are implemented, and is being used for automating transactions. Due to the large amount of money that smart contracts deal with, there is a surging demand for a method that can statically and formally verify them.
This article describes our type-based static verification tool HELMHOLTZ for Michelson, which is a statically typed stack-based language for writing smart contracts that are executed on the blockchain platform Tezos. HELMHOLTZ is designed on top of our extension of Michelson's type system with refinement types. HELMHOLTZ takes a Michelson program annotated with a user-defined specification written in the form of a refinement type as input; it then typechecks the program against the specification based on the refinement type system, discharging the generated verification conditions with the SMT solver Z3. We briefly introduce our refinement type system for the core calculus Mini-Michelson of Michelson, which incorporates the characteristic features such as compound datatypes (e.g., lists and pairs), higher-order functions, and invocation of another contract. \HELMHOLTZ{} successfully verifies several practical Michelson programs, including one that transfers money to an account and that checks a digital signature.
△ Less
Submitted 10 September, 2021; v1 submitted 29 August, 2021;
originally announced August 2021.
-
Learning Heuristics for Template-based CEGIS of Loop Invariants with Reinforcement Learning
Authors:
Minchao Wu,
Takeshi Tsukada,
Hiroshi Unno,
Taro Sekiyama,
Kohei Suenaga
Abstract:
Loop-invariant synthesis is the basis of program verification. Due to the undecidability of the problem in general, a tool for invariant synthesis necessarily uses heuristics. Despite the common belief that the design of heuristics is vital for the performance of a synthesizer, heuristics are often engineered by their developers based on experience and intuition, sometimes in an \emph{ad-hoc} mann…
▽ More
Loop-invariant synthesis is the basis of program verification. Due to the undecidability of the problem in general, a tool for invariant synthesis necessarily uses heuristics. Despite the common belief that the design of heuristics is vital for the performance of a synthesizer, heuristics are often engineered by their developers based on experience and intuition, sometimes in an \emph{ad-hoc} manner. In this work, we propose an approach to systematically learning heuristics for template-based CounterExample-Guided Inductive Synthesis (CEGIS) with reinforcement learning. As a concrete example, we implement the approach on top of PCSat, which is an invariant synthesizer based on template-based CEGIS. Experiments show that PCSat guided by the heuristics learned by our framework not only outperforms existing state-of-the-art CEGIS-based solvers such as HoICE and the neural solver Code2Inv, but also has slight advantages over non-CEGIS-based solvers such as Eldarica and Spacer in linear Constrained Horn Clause (CHC) solving.
△ Less
Submitted 15 June, 2022; v1 submitted 16 July, 2021;
originally announced July 2021.
-
Verification of a Merkle Patricia Tree Library Using F*
Authors:
Sota Sato,
Ryotaro Banno,
Jun Furuse,
Kohei Suenaga,
Atsushi Igarashi
Abstract:
A Merkle tree is a data structure for representing a key-value store as a tree. Each node of a Merkle tree is equipped with a hash value computed from those of their descendants. A Merkle tree is often used for representing a state of a blockchain system since it can be used for efficiently auditing the state in a trustless manner. Due to the safety-critical nature of blockchains, ensuring the cor…
▽ More
A Merkle tree is a data structure for representing a key-value store as a tree. Each node of a Merkle tree is equipped with a hash value computed from those of their descendants. A Merkle tree is often used for representing a state of a blockchain system since it can be used for efficiently auditing the state in a trustless manner. Due to the safety-critical nature of blockchains, ensuring the correctness of their implementation is paramount.
We show our formally verified implementation of the core part of Plebeia using F*. Plebeia is a library to manipulate an extension of Merkle trees (called Plebeia trees). It is being implemented as a part of the storage system of the Tezos blockchain system. To this end, we gradually ported Plebeia to F*; the OCaml code extracted from the modules ported to F* is linked with the unverified part of Plebeia. By this gradual porting process, we can obtain a working code from our partially verified implementation of Plebeia; we confirmed that the binary passes all the unit tests of Plebeia.
More specifically, we verified the following properties on the implementation of Plebeia: (1) Each tree-manipulating function preserves the invariants on the data structure of a Plebeia tree and satisfies the functional requirements as a nested key-value store; (2) Each function for serializing/deserializing a Plebeia tree to/from the low-level storage is implemented correctly; and (3) The hash function for a Plebeia tree is relatively collision-resistant with respect to the cryptographic safety of the blake2b hash function. During porting Plebeia to F*, we found a bug in an old version of Plebeia, which was overlooked by the tests bundled with the original implementation. To the best of our knowledge, this is the first work that verifies a production-level implementation of a Merkle-tree library by F*.
△ Less
Submitted 9 June, 2021;
originally announced June 2021.
-
Control-Data Separation and Logical Condition Propagation for Efficient Inference on Probabilistic Programs
Authors:
Ichiro Hasuo,
Yuichiro Oyabu,
Clovis Eberhart,
Kohei Suenaga,
Kenta Cho,
Shin-ya Katsumata
Abstract:
We present a novel sampling framework for probabilistic programs. The framework combines two recent ideas -- \emph{control-data separation} and \emph{logical condition propagation} -- in a nontrivial manner so that the two ideas boost the benefits of each other. We implemented our algorithm on top of Anglican. The experimental results demonstrate our algorithm's efficiency, especially for programs…
▽ More
We present a novel sampling framework for probabilistic programs. The framework combines two recent ideas -- \emph{control-data separation} and \emph{logical condition propagation} -- in a nontrivial manner so that the two ideas boost the benefits of each other. We implemented our algorithm on top of Anglican. The experimental results demonstrate our algorithm's efficiency, especially for programs with while loops and rare observations.
△ Less
Submitted 29 September, 2023; v1 submitted 5 January, 2021;
originally announced January 2021.
-
Visualizing Color-wise Saliency of Black-Box Image Classification Models
Authors:
Yuhki Hatakeyama,
Hiroki Sakuma,
Yoshinori Konishi,
Kohei Suenaga
Abstract:
Image classification based on machine learning is being commonly used. However, a classification result given by an advanced method, including deep learning, is often hard to interpret. This problem of interpretability is one of the major obstacles in deploying a trained model in safety-critical systems. Several techniques have been proposed to address this problem; one of which is RISE, which exp…
▽ More
Image classification based on machine learning is being commonly used. However, a classification result given by an advanced method, including deep learning, is often hard to interpret. This problem of interpretability is one of the major obstacles in deploying a trained model in safety-critical systems. Several techniques have been proposed to address this problem; one of which is RISE, which explains a classification result by a heatmap, called a saliency map, which explains the significance of each pixel. We propose MC-RISE (Multi-Color RISE), which is an enhancement of RISE to take color information into account in an explanation. Our method not only shows the saliency of each pixel in a given image as the original RISE does, but the significance of color components of each pixel; a saliency map with color information is useful especially in the domain where the color information matters (e.g., traffic-sign recognition). We implemented MC-RISE and evaluate them using two datasets (GTSRB and ImageNet) to demonstrate the effectiveness of our methods in comparison with existing techniques for interpreting image classification results.
△ Less
Submitted 6 October, 2020;
originally announced October 2020.
-
ConSORT: Context- and Flow-Sensitive Ownership Refinement Types for Imperative Programs
Authors:
John Toman,
Ren Siqi,
Kohei Suenaga,
Atsushi Igarashi,
Naoki Kobayashi
Abstract:
We present ConSORT, a type system for safety verification in the presence of mutability and aliasing. Mutability requires strong updates to model changing invariants during program execution, but aliasing between pointers makes it difficult to determine which invariants must be updated in response to mutation. Our type system addresses this difficulty with a novel combination of refinement types a…
▽ More
We present ConSORT, a type system for safety verification in the presence of mutability and aliasing. Mutability requires strong updates to model changing invariants during program execution, but aliasing between pointers makes it difficult to determine which invariants must be updated in response to mutation. Our type system addresses this difficulty with a novel combination of refinement types and fractional ownership types. Fractional ownership types provide flow-sensitive and precise aliasing information for reference variables. ConSORT interprets this ownership information to soundly handle strong updates of potentially aliased references. We have proved ConSORT sound and implemented a prototype, fully automated inference tool. We evaluated our tool and found it verifies non-trivial programs including data structure implementations.
△ Less
Submitted 18 February, 2020;
originally announced February 2020.
-
Generalized Property-Directed Reachability for Hybrid Systems
Authors:
Kohei Suenaga,
Takuya Ishizawa
Abstract:
Generalized property-directed reachability (GPDR) belongs to the family of the model-checking techniques called IC3/PDR. It has been successfully applied to software verification; for example, it is the core of Spacer, a state-of-the-art Horn-clause solver bundled with Z3. However, it has yet to be applied to hybrid systems, which involve a continuous evolution of values over time. As the first st…
▽ More
Generalized property-directed reachability (GPDR) belongs to the family of the model-checking techniques called IC3/PDR. It has been successfully applied to software verification; for example, it is the core of Spacer, a state-of-the-art Horn-clause solver bundled with Z3. However, it has yet to be applied to hybrid systems, which involve a continuous evolution of values over time. As the first step towards GPDR- based model checking for hybrid systems, this paper formalizes HGPDR, an adaptation of GPDR to hybrid systems, and proves its soundness. We also implemented a semi-automated proof-of-concept verifier, which allows a user to provide hints to guide verification steps.
△ Less
Submitted 26 November, 2019; v1 submitted 9 October, 2019;
originally announced October 2019.
-
MONAA: A Tool for Timed Pattern Matching with Automata-Based Acceleration
Authors:
Masaki Waga,
Ichiro Hasuo,
Kohei Suenaga
Abstract:
We present monaa, a monitoring tool over a real-time property specified by either a timed automaton or a timed regular expression. It implements a timed pattern matching algorithm that combines 1) features suited for online monitoring, and 2) acceleration by automata-based skipping. Our experiments demonstrate monaa's performance advantage, especially in online usage.
We present monaa, a monitoring tool over a real-time property specified by either a timed automaton or a timed regular expression. It implements a timed pattern matching algorithm that combines 1) features suited for online monitoring, and 2) acceleration by automata-based skipping. Our experiments demonstrate monaa's performance advantage, especially in online usage.
△ Less
Submitted 22 October, 2018;
originally announced October 2018.
-
Automated proof synthesis for propositional logic with deep neural networks
Authors:
Taro Sekiyama,
Kohei Suenaga
Abstract:
This work explores the application of deep learning, a machine learning technique that uses deep neural networks (DNN) in its core, to an automated theorem proving (ATP) problem. To this end, we construct a statistical model which quantifies the likelihood that a proof is indeed a correct one of a given proposition. Based on this model, we give a proof-synthesis procedure that searches for a proof…
▽ More
This work explores the application of deep learning, a machine learning technique that uses deep neural networks (DNN) in its core, to an automated theorem proving (ATP) problem. To this end, we construct a statistical model which quantifies the likelihood that a proof is indeed a correct one of a given proposition. Based on this model, we give a proof-synthesis procedure that searches for a proof in the order of the likelihood. This procedure uses an estimator of the likelihood of an inference rule being applied at each step of a proof. As an implementation of the estimator, we propose a proposition-to-proof architecture, which is a DNN tailored to the automated proof synthesis problem. To empirically demonstrate its usefulness, we apply our model to synthesize proofs of propositional logic. We train the proposition-to-proof model using a training dataset of proposition-proof pairs. The evaluation against a benchmark set shows the very high accuracy and an improvement to the recent work of neural proof synthesis.
△ Less
Submitted 30 May, 2018;
originally announced May 2018.
-
Sharper and Simpler Nonlinear Interpolants for Program Verification
Authors:
Takamasa Okudono,
Yuki Nishida,
Kensuke Kojima,
Kohei Suenaga,
Kengo Kido,
Ichiro Hasuo
Abstract:
Interpolation of jointly infeasible predicates plays important roles in various program verification techniques such as invariant synthesis and CEGAR. Intrigued by the recent result by Dai et al.\ that combines real algebraic geometry and SDP optimization in synthesis of polynomial interpolants, the current paper contributes its enhancement that yields sharper and simpler interpolants. The enhance…
▽ More
Interpolation of jointly infeasible predicates plays important roles in various program verification techniques such as invariant synthesis and CEGAR. Intrigued by the recent result by Dai et al.\ that combines real algebraic geometry and SDP optimization in synthesis of polynomial interpolants, the current paper contributes its enhancement that yields sharper and simpler interpolants. The enhancement is made possible by: theoretical observations in real algebraic geometry; and our continued fraction-based algorithm that rounds off (potentially erroneous) numerical solutions of SDP solvers. Experiment results support our tool's effectiveness; we also demonstrate the benefit of sharp and simple interpolants in program verification examples.
△ Less
Submitted 1 September, 2017;
originally announced September 2017.
-
Efficient Online Timed Pattern Matching by Automata-Based Skipping
Authors:
Masaki Waga,
Ichiro Hasuo,
Kohei Suenaga
Abstract:
The timed pattern matching problem is an actively studied topic because of its relevance in monitoring of real-time systems. There one is given a log $w$ and a specification $\mathcal{A}$ (given by a timed word and a timed automaton in this paper), and one wishes to return the set of intervals for which the log $w$, when restricted to the interval, satisfies the specification $\mathcal{A}$. In our…
▽ More
The timed pattern matching problem is an actively studied topic because of its relevance in monitoring of real-time systems. There one is given a log $w$ and a specification $\mathcal{A}$ (given by a timed word and a timed automaton in this paper), and one wishes to return the set of intervals for which the log $w$, when restricted to the interval, satisfies the specification $\mathcal{A}$. In our previous work we presented an efficient timed pattern matching algorithm: it adopts a skipping mechanism inspired by the classic Boyer--Moore (BM) string matching algorithm. In this work we tackle the problem of online timed pattern matching, towards embedded applications where it is vital to process a vast amount of incoming data in a timely manner. Specifically, we start with the Franek-Jennings-Smyth (FJS) string matching algorithm---a recent variant of the BM algorithm---and extend it to timed pattern matching. Our experiments indicate the efficiency of our FJS-type algorithm in online and offline timed pattern matching.
△ Less
Submitted 28 June, 2017;
originally announced June 2017.
-
Towards Proof Synthesis Guided by Neural Machine Translation for Intuitionistic Propositional Logic
Authors:
Taro Sekiyama,
Akifumi Imanishi,
Kohei Suenaga
Abstract:
Inspired by the recent evolution of deep neural networks (DNNs) in machine learning, we explore their application to PL-related topics. This paper is the first step towards this goal; we propose a proof-synthesis method for the negation-free propositional logic in which we use a DNN to obtain a guide of proof search. The idea is to view the proof-synthesis problem as a translation from a propositi…
▽ More
Inspired by the recent evolution of deep neural networks (DNNs) in machine learning, we explore their application to PL-related topics. This paper is the first step towards this goal; we propose a proof-synthesis method for the negation-free propositional logic in which we use a DNN to obtain a guide of proof search. The idea is to view the proof-synthesis problem as a translation from a proposition to its proof. We train seq2seq, which is a popular network in neural machine translation, so that it generates a proof encoded as a $λ$-term of a given proposition. We implement the whole framework and empirically observe that a generated proof term is close to a correct proof in terms of the tree edit distance of AST. This observation justifies using the output from a trained seq2seq model as a guide for proof search.
△ Less
Submitted 20 June, 2017;
originally announced June 2017.
-
Generalized Homogeneous Polynomials for Efficient Template-Based Nonlinear Invariant Synthesis
Authors:
Kensuke Kojima,
Minoru Kinoshita,
Kohei Suenaga
Abstract:
The template-based method is one of the most successful approaches to algebraic invariant synthesis. In this method, an algorithm designates a template polynomial p over program variables, generates constraints for p=0 to be an invariant, and solves the generated constraints. However, this approach often suffers from an increasing template size if the degree of a template polynomial is too high.…
▽ More
The template-based method is one of the most successful approaches to algebraic invariant synthesis. In this method, an algorithm designates a template polynomial p over program variables, generates constraints for p=0 to be an invariant, and solves the generated constraints. However, this approach often suffers from an increasing template size if the degree of a template polynomial is too high.
We propose a technique to make template-based methods more efficient. Our technique is based on the following finding: If an algebraic invariant exists, then there is a specific algebraic invariant that we call a generalized homogeneous algebraic invariant that is often smaller. This finding justifies using only a smaller template that corresponds to a generalized homogeneous algebraic invariant.
Concretely, we state our finding above formally based on the abstract semantics of an imperative program proposed by Cachera et al. Then, we modify their template-based invariant synthesis so that it generates only generalized homogeneous algebraic invariants. This modification is proved to be sound. Furthermore, we also empirically demonstrate the merit of the restriction to generalized homogeneous algebraic invariants. Our implementation outperforms that of Cachera et al. for programs that require a higher-degree template.
△ Less
Submitted 13 September, 2016; v1 submitted 25 April, 2016;
originally announced April 2016.
-
Resource Usage Analysis for the Pi-Calculus
Authors:
Naoki Kobayashi,
Kohei Suenaga,
Lucian Wischik
Abstract:
We propose a type-based resource usage analysis for the π-calculus extended with resource creation/access primitives. The goal of the resource usage analysis is to statically check that a program accesses resources such as files and memory in a valid manner. Our type system is an extension of previous behavioral type systems for the π-calculus, and can guarantee the safety property tha…
▽ More
We propose a type-based resource usage analysis for the π-calculus extended with resource creation/access primitives. The goal of the resource usage analysis is to statically check that a program accesses resources such as files and memory in a valid manner. Our type system is an extension of previous behavioral type systems for the π-calculus, and can guarantee the safety property that no invalid access is performed, as well as the property that necessary accesses (such as the close operation for a file) are eventually performed unless the program diverges. A sound type inference algorithm for the type system is also developed to free the programmer from the burden of writing complex type annotations. Based on the algorithm, we have implemented a prototype resource usage analyzer for the π-calculus. To the authors' knowledge, ours is the first type-based resource usage analysis that deals with an expressive concurrent language like the pi-calculus.
△ Less
Submitted 13 September, 2006; v1 submitted 7 August, 2006;
originally announced August 2006.