Estimating the Robustness Radius for Randomized Smoothing with 100$\times$ Sample Efficiency

Authors: Emmanouil Seferis, Stefanos Kollias, Chih-Hong Cheng

Abstract: Randomized smoothing (RS) has successfully been used to improve the robustness of predictions for deep neural networks (DNNs) by adding random noise to create multiple variations of an input, followed by deciding the consensus. To understand if an RS-enabled DNN is effective in the sampled input domains, it is mandatory to sample data points within the operational design domain, acquire the point-wise certificate regarding robustness radius, and compare it with pre-defined acceptance criteria. Consequently, ensuring that a point-wise robustness certificate for any given data point is obtained relatively cost-effectively is crucial. This work demonstrates that reducing the number of samples by one or two orders of magnitude can still enable the computation of a slightly smaller robustness radius (commonly ~20% radius reduction) with the same confidence. We provide the mathematical foundation for explaining the phenomenon while experimentally showing promising results on the standard CIFAR-10 and ImageNet datasets.

Submitted 26 April, 2024; originally announced April 2024.
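
To make the abstract's trade-off between sample count and certified radius concrete, here is a minimal sketch of the widely used randomized-smoothing certificate for Gaussian noise: a one-sided Clopper-Pearson lower bound on the top-class probability, followed by radius = sigma * Phi^{-1}(p_lower). This is an illustration under stated assumptions, not the estimator proposed in the paper above; the function names and the bisection-based bound are hypothetical helpers, and alpha is assumed to be small (well below 0.5).

```python
import math
from statistics import NormalDist


def log_binom_tail(n: int, k: int, p: float) -> float:
    """Return log P[X >= k] for X ~ Binomial(n, p), with 0 < p < 1 and 1 <= k <= n."""
    log_p, log_q = math.log(p), math.log1p(-p)
    terms = [
        math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
        + i * log_p + (n - i) * log_q
        for i in range(k, n + 1)
    ]
    m = max(terms)  # log-sum-exp to avoid underflow for large n
    return m + math.log(sum(math.exp(t - m) for t in terms))


def lower_confidence_bound(k: int, n: int, alpha: float) -> float:
    """One-sided Clopper-Pearson lower bound on p, given k successes in n trials."""
    if k == 0:
        return 0.0
    if k == n:
        return alpha ** (1.0 / n)  # closed form: solve p**n = alpha
    lo, hi = 0.0, k / n  # for small alpha the bound lies below the empirical rate
    log_alpha = math.log(alpha)
    for _ in range(60):  # bisection; the tail probability increases with p
        mid = 0.5 * (lo + hi)
        if log_binom_tail(n, k, mid) < log_alpha:
            lo = mid
        else:
            hi = mid
    return lo  # lower end of the bracket, so the bound stays conservative


def certified_radius(k: int, n: int, sigma: float, alpha: float):
    """Return the L2 radius sigma * Phi^{-1}(p_lower), or None to abstain."""
    p_lower = lower_confidence_bound(k, n, alpha)
    if p_lower <= 0.5:
        return None
    return sigma * NormalDist().inv_cdf(p_lower)
```

Calling `certified_radius(k, n, sigma, alpha)` with the same agreement rate `k / n` but a smaller `n` gives a smaller radius at the same confidence level, which is the kind of trade-off the abstract quantifies.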

arXiv:2209.11632

Facilitating Change Implementation for Continuous ML-Safety Assurance

Authors: Chih-Hong Cheng, Nguyen Anh Vu Doan, Balahari Balu, Franziska Schwaiger, Emmanouil Seferis, Simon Burton, Yassine Qamsane, Ankit Shukla, Yinchong Yang, Zhiliang Wu, Andreas Hapfelmeier, Ingo Thon

Abstract: We propose a method for deploying a safety-critical machine-learning component into continuously evolving environments where an increased degree of automation in the engineering process is desired. We associate semantic tags with the safety case argumentation and turn each piece of evidence into a quantitative metric or a logic formula. With proper tool support, the impact can be characterized by a query over the safety argumentation tree to highlight evidence turning invalid. The concept is exemplified using a vision-based emergency braking system of an autonomous guided vehicle for factory automation.

Submitted 23 September, 2022; originally announced September 2022.

arXiv:2205.07736

Prioritizing Corners in OoD Detectors via Symbolic String Manipulation

Authors: Chih-Hong Cheng, Changshun Wu, Emmanouil Seferis, Saddek Bensalem

Abstract: For safety assurance of deep neural networks (DNNs), out-of-distribution (OoD) monitoring techniques are essential as they filter spurious input that is distant from the training dataset. This paper studies the problem of systematically testing OoD monitors to avoid cases where an input data point is tested as in-distribution by the monitor, but the DNN produces spurious output predictions. We consider the definition of "in-distribution" characterized in the feature space by a union of hyperrectangles learned from the training dataset. Thus the testing is reduced to finding corners in hyperrectangles distant from the available training data in the feature space. Concretely, we encode the abstract location of every data point as a finite-length binary string, and the union of all binary strings is stored compactly using binary decision diagrams (BDDs). We demonstrate how to use BDDs to symbolically extract corners distant from all data points within the training set. Apart from test case generation, we explain how to use the proposed corners to fine-tune the DNN to ensure that it does not predict overly confidently. The result is evaluated over examples such as number and traffic sign recognition.

Submitted 16 May, 2022; originally announced May 2022.

arXiv:2203.06974

SMC4PEP: Stochastic Model Checking of Product Engineering Processes

Authors: Hassan Hage, Emmanouil Seferis, Vahid Hashemi, Frank Mantwill

Abstract: Product Engineering Processes (PEPs) are used for describing complex product developments in big enterprises such as automotive and avionics industries. The Business Process Model Notation (BPMN) is a widely used language to encode interactions among several participants in such PEPs. In this paper, we present SMC4PEP as a tool to convert graphical representations of a business process using the BPMN standard to an equivalent discrete-time stochastic control process called Markov Decision Process (MDP). To this aim, we first follow the approach described in an earlier investigation to generate a semantically equivalent business process which is more capable of handling the PEP complexity. In particular, the interaction between different levels of abstraction is realized by events rather than direct message flows. Afterwards, SMC4PEP converts the generated process to an MDP model described by the syntax of the probabilistic model checking tool PRISM. As such, SMC4PEP provides a framework for automatic verification and validation of business processes in particular with respect to requirements from legal standards such as Automotive SPICE. Moreover, our experimental results confirm a faster verification routine due to smaller MDP models generated from the alternative event-based BPMN models.

Submitted 18 February, 2022; originally announced March 2022.

Comments: Paper accepted at the 25th International Conference on Fundamental Approaches to Software Engineering (FASE'22)

arXiv:2202.05123

Unaligned but Safe -- Formally Compensating Performance Limitations for Imprecise 2D Object Detection

Authors: Tobias Schuster, Emmanouil Seferis, Simon Burton, Chih-Hong Cheng

Abstract: In this paper, we consider the imperfection within machine learning-based 2D object detection and its impact on safety. We address a special sub-type of performance limitations: the prediction bounding box cannot be perfectly aligned with the ground truth, but the computed Intersection-over-Union metric is always larger than a given threshold. Under such type of performance limitation, we formally prove the minimum required bounding box enlargement factor to cover the ground truth. We then demonstrate that the factor can be mathematically adjusted to a smaller value, provided that the motion planner takes a fixed-length buffer in making its decisions. Finally, observing the difference between an empirically measured enlargement factor and our formally derived worst-case enlargement factor offers an interesting connection between the quantitative evidence (demonstrated by statistics) and the qualitative evidence (demonstrated by worst-case analysis).

Submitted 10 February, 2022; originally announced February 2022.
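
The quantities named in the abstract above (Intersection-over-Union, and enlarging a predicted box until it covers the ground truth) can be sketched as follows. This is a minimal illustration under assumptions: axis-aligned boxes in corner format, and enlargement by scaling about the box center. It only checks coverage for a given factor and does not reproduce the worst-case factor derived in the paper; `Box`, `iou`, `enlarge`, and `covers` are hypothetical names.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned box with corners (x1, y1) and (x2, y2), where x1 <= x2 and y1 <= y2."""
    x1: float
    y1: float
    x2: float
    y2: float


def area(b: Box) -> float:
    """Area of the box; an empty (inverted) box has area 0."""
    return max(0.0, b.x2 - b.x1) * max(0.0, b.y2 - b.y1)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-Union of two boxes."""
    inter = Box(max(a.x1, b.x1), max(a.y1, b.y1), min(a.x2, b.x2), min(a.y2, b.y2))
    i = area(inter)
    u = area(a) + area(b) - i
    return i / u if u > 0 else 0.0


def enlarge(b: Box, k: float) -> Box:
    """Scale width and height by k about the box center (assumed enlargement scheme)."""
    cx, cy = (b.x1 + b.x2) / 2, (b.y1 + b.y2) / 2
    half_w, half_h = k * (b.x2 - b.x1) / 2, k * (b.y2 - b.y1) / 2
    return Box(cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def covers(outer: Box, inner: Box) -> bool:
    """True if `outer` fully contains `inner`."""
    return (outer.x1 <= inner.x1 and outer.y1 <= inner.y1
            and outer.x2 >= inner.x2 and outer.y2 >= inner.y2)


ground_truth = Box(0, 0, 10, 10)
prediction = Box(1, 1, 11, 11)  # same size, shifted by one unit in x and y
```

For this pair of boxes, `covers(enlarge(prediction, 1.25), ground_truth)` holds while a factor of 1.1 is not enough, which shows how an empirically measured factor for one sample can be compared against a formally derived worst-case bound.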

Showing 1–5 of 5 results for author: Seferis, E