-
Probabilistic Reach-Avoid for Bayesian Neural Networks
Authors:
Matthew Wicker,
Luca Laurenti,
Andrea Patane,
Nicola Paoletti,
Alessandro Abate,
Marta Kwiatkowska
Abstract:
Model-based reinforcement learning seeks to simultaneously learn the dynamics of an unknown stochastic environment and synthesise an optimal policy for acting in it. Ensuring the safety and robustness of sequential decisions made through a policy in such an environment is a key challenge for policies intended for safety-critical scenarios. In this work, we investigate two complementary problems: f…
▽ More
Model-based reinforcement learning seeks to simultaneously learn the dynamics of an unknown stochastic environment and synthesise an optimal policy for acting in it. Ensuring the safety and robustness of sequential decisions made through a policy in such an environment is a key challenge for policies intended for safety-critical scenarios. In this work, we investigate two complementary problems: first, computing reach-avoid probabilities for iterative predictions made with dynamical models, with dynamics described by Bayesian neural network (BNN); second, synthesising control policies that are optimal with respect to a given reach-avoid specification (reaching a "target" state, while avoiding a set of "unsafe" states) and a learned BNN model. Our solution leverages interval propagation and backward recursion techniques to compute lower bounds for the probability that a policy's sequence of actions leads to satisfying the reach-avoid specification. Such computed lower bounds provide safety certification for the given policy and BNN model. We then introduce control synthesis algorithms to derive policies maximizing said lower bounds on the safety probability. We demonstrate the effectiveness of our method on a series of control benchmarks characterized by learned BNN dynamics models. On our most challenging benchmark, compared to purely data-driven policies the optimal synthesis algorithm is able to provide more than a four-fold increase in the number of certifiable states and more than a three-fold increase in the average guaranteed reach-avoid probability.
△ Less
Submitted 3 October, 2023;
originally announced October 2023.
-
Adversarial Robustness Certification for Bayesian Neural Networks
Authors:
Matthew Wicker,
Andrea Patane,
Luca Laurenti,
Marta Kwiatkowska
Abstract:
We study the problem of certifying the robustness of Bayesian neural networks (BNNs) to adversarial input perturbations. Given a compact set of input points $T \subseteq \mathbb{R}^m$ and a set of output points $S \subseteq \mathbb{R}^n$, we define two notions of robustness for BNNs in an adversarial setting: probabilistic robustness and decision robustness. Probabilistic robustness is the probabi…
▽ More
We study the problem of certifying the robustness of Bayesian neural networks (BNNs) to adversarial input perturbations. Given a compact set of input points $T \subseteq \mathbb{R}^m$ and a set of output points $S \subseteq \mathbb{R}^n$, we define two notions of robustness for BNNs in an adversarial setting: probabilistic robustness and decision robustness. Probabilistic robustness is the probability that for all points in $T$ the output of a BNN sampled from the posterior is in $S$. On the other hand, decision robustness considers the optimal decision of a BNN and checks if for all points in $T$ the optimal decision of the BNN for a given loss function lies within the output set $S$. Although exact computation of these robustness properties is challenging due to the probabilistic and non-convex nature of BNNs, we present a unified computational framework for efficiently and formally bounding them. Our approach is based on weight interval sampling, integration, and bound propagation techniques, and can be applied to BNNs with a large number of parameters, and independently of the (approximate) inference method employed to train the BNN. We evaluate the effectiveness of our methods on various regression and classification tasks, including an industrial regression benchmark, MNIST, traffic sign recognition, and airborne collision avoidance, and demonstrate that our approach enables certification of robustness and uncertainty of BNN predictions.
△ Less
Submitted 23 June, 2023;
originally announced June 2023.
-
BNN-DP: Robustness Certification of Bayesian Neural Networks via Dynamic Programming
Authors:
Steven Adams,
Andrea Patane,
Morteza Lahijanian,
Luca Laurenti
Abstract:
In this paper, we introduce BNN-DP, an efficient algorithmic framework for analysis of adversarial robustness of Bayesian Neural Networks (BNNs). Given a compact set of input points $T\subset \mathbb{R}^n$, BNN-DP computes lower and upper bounds on the BNN's predictions for all the points in $T$. The framework is based on an interpretation of BNNs as stochastic dynamical systems, which enables the…
▽ More
In this paper, we introduce BNN-DP, an efficient algorithmic framework for analysis of adversarial robustness of Bayesian Neural Networks (BNNs). Given a compact set of input points $T\subset \mathbb{R}^n$, BNN-DP computes lower and upper bounds on the BNN's predictions for all the points in $T$. The framework is based on an interpretation of BNNs as stochastic dynamical systems, which enables the use of Dynamic Programming (DP) algorithms to bound the prediction range along the layers of the network. Specifically, the method uses bound propagation techniques and convex relaxations to derive a backward recursion procedure to over-approximate the prediction range of the BNN with piecewise affine functions. The algorithm is general and can handle both regression and classification tasks. On a set of experiments on various regression and classification tasks and BNN architectures, we show that BNN-DP outperforms state-of-the-art methods by up to four orders of magnitude in both tightness of the bounds and computational efficiency.
△ Less
Submitted 19 June, 2023;
originally announced June 2023.
-
Individual Fairness in Bayesian Neural Networks
Authors:
Alice Doherty,
Matthew Wicker,
Luca Laurenti,
Andrea Patane
Abstract:
We study Individual Fairness (IF) for Bayesian neural networks (BNNs). Specifically, we consider the $ε$-$δ$-individual fairness notion, which requires that, for any pair of input points that are $ε$-similar according to a given similarity metrics, the output of the BNN is within a given tolerance $δ>0.$ We leverage bounds on statistical sampling over the input space and the relationship between a…
▽ More
We study Individual Fairness (IF) for Bayesian neural networks (BNNs). Specifically, we consider the $ε$-$δ$-individual fairness notion, which requires that, for any pair of input points that are $ε$-similar according to a given similarity metrics, the output of the BNN is within a given tolerance $δ>0.$ We leverage bounds on statistical sampling over the input space and the relationship between adversarial robustness and individual fairness to derive a framework for the systematic estimation of $ε$-$δ$-IF, designing Fair-FGSM and Fair-PGD as global,fairness-aware extensions to gradient-based attacks for BNNs. We empirically study IF of a variety of approximately inferred BNNs with different architectures on fairness benchmarks, and compare against deterministic models learnt using frequentist techniques. Interestingly, we find that BNNs trained by means of approximate Bayesian inference consistently tend to be markedly more individually fair than their deterministic counterparts.
△ Less
Submitted 21 April, 2023;
originally announced April 2023.
-
On the Robustness of Bayesian Neural Networks to Adversarial Attacks
Authors:
Luca Bortolussi,
Ginevra Carbone,
Luca Laurenti,
Andrea Patane,
Guido Sanguinetti,
Matthew Wicker
Abstract:
Vulnerability to adversarial attacks is one of the principal hurdles to the adoption of deep learning in safety-critical applications. Despite significant efforts, both practical and theoretical, training deep learning models robust to adversarial attacks is still an open problem. In this paper, we analyse the geometry of adversarial attacks in the large-data, overparameterized limit for Bayesian…
▽ More
Vulnerability to adversarial attacks is one of the principal hurdles to the adoption of deep learning in safety-critical applications. Despite significant efforts, both practical and theoretical, training deep learning models robust to adversarial attacks is still an open problem. In this paper, we analyse the geometry of adversarial attacks in the large-data, overparameterized limit for Bayesian Neural Networks (BNNs). We show that, in the limit, vulnerability to gradient-based attacks arises as a result of degeneracy in the data distribution, i.e., when the data lies on a lower-dimensional submanifold of the ambient space. As a direct consequence, we demonstrate that in this limit BNN posteriors are robust to gradient-based adversarial attacks. Crucially, we prove that the expected gradient of the loss with respect to the BNN posterior distribution is vanishing, even when each neural network sampled from the posterior is vulnerable to gradient-based attacks. Experimental results on the MNIST, Fashion MNIST, and half moons datasets, representing the finite data regime, with BNNs trained with Hamiltonian Monte Carlo and Variational Inference, support this line of arguments, showing that BNNs can display both high accuracy on clean data and robustness to both gradient-based and gradient-free based adversarial attacks.
△ Less
Submitted 28 February, 2024; v1 submitted 13 July, 2022;
originally announced July 2022.
-
Individual Fairness Guarantees for Neural Networks
Authors:
Elias Benussi,
Andrea Patane,
Matthew Wicker,
Luca Laurenti,
Marta Kwiatkowska
Abstract:
We consider the problem of certifying the individual fairness (IF) of feed-forward neural networks (NNs). In particular, we work with the $ε$-$δ$-IF formulation, which, given a NN and a similarity metric learnt from data, requires that the output difference between any pair of $ε$-similar individuals is bounded by a maximum decision tolerance $δ\geq 0$. Working with a range of metrics, including t…
▽ More
We consider the problem of certifying the individual fairness (IF) of feed-forward neural networks (NNs). In particular, we work with the $ε$-$δ$-IF formulation, which, given a NN and a similarity metric learnt from data, requires that the output difference between any pair of $ε$-similar individuals is bounded by a maximum decision tolerance $δ\geq 0$. Working with a range of metrics, including the Mahalanobis distance, we propose a method to overapproximate the resulting optimisation problem using piecewise-linear functions to lower and upper bound the NN's non-linearities globally over the input space. We encode this computation as the solution of a Mixed-Integer Linear Programming problem and demonstrate that it can be used to compute IF guarantees on four datasets widely used for fairness benchmarking. We show how this formulation can be used to encourage models' fairness at training time by modifying the NN loss, and empirically confirm our approach yields NNs that are orders of magnitude fairer than state-of-the-art methods.
△ Less
Submitted 11 May, 2022;
originally announced May 2022.
-
Certification of Iterative Predictions in Bayesian Neural Networks
Authors:
Matthew Wicker,
Luca Laurenti,
Andrea Patane,
Nicola Paoletti,
Alessandro Abate,
Marta Kwiatkowska
Abstract:
We consider the problem of computing reach-avoid probabilities for iterative predictions made with Bayesian neural network (BNN) models. Specifically, we leverage bound propagation techniques and backward recursion to compute lower bounds for the probability that trajectories of the BNN model reach a given set of states while avoiding a set of unsafe states. We use the lower bounds in the context…
▽ More
We consider the problem of computing reach-avoid probabilities for iterative predictions made with Bayesian neural network (BNN) models. Specifically, we leverage bound propagation techniques and backward recursion to compute lower bounds for the probability that trajectories of the BNN model reach a given set of states while avoiding a set of unsafe states. We use the lower bounds in the context of control and reinforcement learning to provide safety certification for given control policies, as well as to synthesize control policies that improve the certification bounds. On a set of benchmarks, we demonstrate that our framework can be employed to certify policies over BNNs predictions for problems of more than $10$ dimensions, and to effectively synthesize policies that significantly increase the lower bound on the satisfaction probability.
△ Less
Submitted 19 June, 2021; v1 submitted 21 May, 2021;
originally announced May 2021.
-
Adversarial Robustness Guarantees for Gaussian Processes
Authors:
Andrea Patane,
Arno Blaas,
Luca Laurenti,
Luca Cardelli,
Stephen Roberts,
Marta Kwiatkowska
Abstract:
Gaussian processes (GPs) enable principled computation of model uncertainty, making them attractive for safety-critical applications. Such scenarios demand that GP decisions are not only accurate, but also robust to perturbations. In this paper we present a framework to analyse adversarial robustness of GPs, defined as invariance of the model's decision to bounded perturbations. Given a compact su…
▽ More
Gaussian processes (GPs) enable principled computation of model uncertainty, making them attractive for safety-critical applications. Such scenarios demand that GP decisions are not only accurate, but also robust to perturbations. In this paper we present a framework to analyse adversarial robustness of GPs, defined as invariance of the model's decision to bounded perturbations. Given a compact subset of the input space $T\subseteq \mathbb{R}^d$, a point $x^*$ and a GP, we provide provable guarantees of adversarial robustness of the GP by computing lower and upper bounds on its prediction range in $T$. We develop a branch-and-bound scheme to refine the bounds and show, for any $ε> 0$, that our algorithm is guaranteed to converge to values $ε$-close to the actual values in finitely many iterations. The algorithm is anytime and can handle both regression and classification tasks, with analytical formulation for most kernels used in practice. We evaluate our methods on a collection of synthetic and standard benchmark datasets, including SPAM, MNIST and FashionMNIST. We study the effect of approximate inference techniques on robustness and demonstrate how our method can be used for interpretability. Our empirical results suggest that the adversarial robustness of GPs increases with accurate posterior estimation.
△ Less
Submitted 7 April, 2021;
originally announced April 2021.
-
Bayesian Inference with Certifiable Adversarial Robustness
Authors:
Matthew Wicker,
Luca Laurenti,
Andrea Patane,
Zhoutong Chen,
Zheng Zhang,
Marta Kwiatkowska
Abstract:
We consider adversarial training of deep neural networks through the lens of Bayesian learning, and present a principled framework for adversarial training of Bayesian Neural Networks (BNNs) with certifiable guarantees. We rely on techniques from constraint relaxation of non-convex optimisation problems and modify the standard cross-entropy error model to enforce posterior robustness to worst-case…
▽ More
We consider adversarial training of deep neural networks through the lens of Bayesian learning, and present a principled framework for adversarial training of Bayesian Neural Networks (BNNs) with certifiable guarantees. We rely on techniques from constraint relaxation of non-convex optimisation problems and modify the standard cross-entropy error model to enforce posterior robustness to worst-case perturbations in $ε$-balls around input points. We illustrate how the resulting framework can be combined with methods commonly employed for approximate inference of BNNs. In an empirical investigation, we demonstrate that the presented approach enables training of certifiably robust models on MNIST, FashionMNIST and CIFAR-10 and can also be beneficial for uncertainty calibration. Our method is the first to directly train certifiable BNNs, thus facilitating their deployment in safety-critical applications.
△ Less
Submitted 22 February, 2021; v1 submitted 10 February, 2021;
originally announced February 2021.
-
Probabilistic Safety for Bayesian Neural Networks
Authors:
Matthew Wicker,
Luca Laurenti,
Andrea Patane,
Marta Kwiatkowska
Abstract:
We study probabilistic safety for Bayesian Neural Networks (BNNs) under adversarial input perturbations. Given a compact set of input points, $T \subseteq \mathbb{R}^m$, we study the probability w.r.t. the BNN posterior that all the points in $T$ are mapped to the same region $S$ in the output space. In particular, this can be used to evaluate the probability that a network sampled from the BNN is…
▽ More
We study probabilistic safety for Bayesian Neural Networks (BNNs) under adversarial input perturbations. Given a compact set of input points, $T \subseteq \mathbb{R}^m$, we study the probability w.r.t. the BNN posterior that all the points in $T$ are mapped to the same region $S$ in the output space. In particular, this can be used to evaluate the probability that a network sampled from the BNN is vulnerable to adversarial attacks. We rely on relaxation techniques from non-convex optimization to develop a method for computing a lower bound on probabilistic safety for BNNs, deriving explicit procedures for the case of interval and linear function propagation techniques. We apply our methods to BNNs trained on a regression task, airborne collision avoidance, and MNIST, empirically showing that our approach allows one to certify probabilistic safety of BNNs with millions of parameters.
△ Less
Submitted 18 June, 2020; v1 submitted 21 April, 2020;
originally announced April 2020.
-
Robustness of Bayesian Neural Networks to Gradient-Based Attacks
Authors:
Ginevra Carbone,
Matthew Wicker,
Luca Laurenti,
Andrea Patane,
Luca Bortolussi,
Guido Sanguinetti
Abstract:
Vulnerability to adversarial attacks is one of the principal hurdles to the adoption of deep learning in safety-critical applications. Despite significant efforts, both practical and theoretical, the problem remains open. In this paper, we analyse the geometry of adversarial attacks in the large-data, overparametrized limit for Bayesian Neural Networks (BNNs). We show that, in the limit, vulnerabi…
▽ More
Vulnerability to adversarial attacks is one of the principal hurdles to the adoption of deep learning in safety-critical applications. Despite significant efforts, both practical and theoretical, the problem remains open. In this paper, we analyse the geometry of adversarial attacks in the large-data, overparametrized limit for Bayesian Neural Networks (BNNs). We show that, in the limit, vulnerability to gradient-based attacks arises as a result of degeneracy in the data distribution, i.e., when the data lies on a lower-dimensional submanifold of the ambient space. As a direct consequence, we demonstrate that in the limit BNN posteriors are robust to gradient-based adversarial attacks. Experimental results on the MNIST and Fashion MNIST datasets with BNNs trained with Hamiltonian Monte Carlo and Variational Inference support this line of argument, showing that BNNs can display both high accuracy and robustness to gradient based adversarial attacks.
△ Less
Submitted 24 June, 2020; v1 submitted 11 February, 2020;
originally announced February 2020.
-
Safety Guarantees for Planning Based on Iterative Gaussian Processes
Authors:
Kyriakos Polymenakos,
Luca Laurenti,
Andrea Patane,
Jan-Peter Calliess,
Luca Cardelli,
Marta Kwiatkowska,
Alessandro Abate,
Stephen Roberts
Abstract:
Gaussian Processes (GPs) are widely employed in control and learning because of their principled treatment of uncertainty. However, tracking uncertainty for iterative, multi-step predictions in general leads to an analytically intractable problem. While approximation methods exist, they do not come with guarantees, making it difficult to estimate their reliability and to trust their predictions. I…
▽ More
Gaussian Processes (GPs) are widely employed in control and learning because of their principled treatment of uncertainty. However, tracking uncertainty for iterative, multi-step predictions in general leads to an analytically intractable problem. While approximation methods exist, they do not come with guarantees, making it difficult to estimate their reliability and to trust their predictions. In this work, we derive formal probability error bounds for iterative prediction and planning with GPs. Building on GP properties, we bound the probability that random trajectories lie in specific regions around the predicted values. Namely, given a tolerance $ε> 0 $, we compute regions around the predicted trajectory values, such that GP trajectories are guaranteed to lie inside them with probability at least $1-ε$. We verify experimentally that our method tracks the predictive uncertainty correctly, even when current approximation techniques fail. Furthermore, we show how the proposed bounds can be employed within a safe reinforcement learning framework to verify the safety of candidate control policies, guiding the synthesis of provably safe controllers.
△ Less
Submitted 7 September, 2020; v1 submitted 29 November, 2019;
originally announced December 2019.
-
Adversarial Robustness Guarantees for Classification with Gaussian Processes
Authors:
Arno Blaas,
Andrea Patane,
Luca Laurenti,
Luca Cardelli,
Marta Kwiatkowska,
Stephen Roberts
Abstract:
We investigate adversarial robustness of Gaussian Process Classification (GPC) models. Given a compact subset of the input space $T\subseteq \mathbb{R}^d$ enclosing a test point $x^*$ and a GPC trained on a dataset $\mathcal{D}$, we aim to compute the minimum and the maximum classification probability for the GPC over all the points in $T$. In order to do so, we show how functions lower- and upper…
▽ More
We investigate adversarial robustness of Gaussian Process Classification (GPC) models. Given a compact subset of the input space $T\subseteq \mathbb{R}^d$ enclosing a test point $x^*$ and a GPC trained on a dataset $\mathcal{D}$, we aim to compute the minimum and the maximum classification probability for the GPC over all the points in $T$. In order to do so, we show how functions lower- and upper-bounding the GPC output in $T$ can be derived, and implement those in a branch and bound optimisation algorithm. For any error threshold $ε> 0$ selected a priori, we show that our algorithm is guaranteed to reach values $ε$-close to the actual values in finitely many iterations. We apply our method to investigate the robustness of GPC models on a 2D synthetic dataset, the SPAM dataset and a subset of the MNIST dataset, providing comparisons of different GPC training techniques, and show how our method can be used for interpretability analysis. Our empirical analysis suggests that GPC robustness increases with more accurate posterior estimation.
△ Less
Submitted 11 March, 2020; v1 submitted 28 May, 2019;
originally announced May 2019.
-
Statistical Guarantees for the Robustness of Bayesian Neural Networks
Authors:
Luca Cardelli,
Marta Kwiatkowska,
Luca Laurenti,
Nicola Paoletti,
Andrea Patane,
Matthew Wicker
Abstract:
We introduce a probabilistic robustness measure for Bayesian Neural Networks (BNNs), defined as the probability that, given a test point, there exists a point within a bounded set such that the BNN prediction differs between the two. Such a measure can be used, for instance, to quantify the probability of the existence of adversarial examples. Building on statistical verification techniques for pr…
▽ More
We introduce a probabilistic robustness measure for Bayesian Neural Networks (BNNs), defined as the probability that, given a test point, there exists a point within a bounded set such that the BNN prediction differs between the two. Such a measure can be used, for instance, to quantify the probability of the existence of adversarial examples. Building on statistical verification techniques for probabilistic models, we develop a framework that allows us to estimate probabilistic robustness for a BNN with statistical guarantees, i.e., with a priori error and confidence bounds. We provide experimental comparison for several approximate BNN inference techniques on image classification tasks associated to MNIST and a two-class subset of the GTSRB dataset. Our results enable quantification of uncertainty of BNN predictions in adversarial settings.
△ Less
Submitted 5 March, 2019;
originally announced March 2019.
-
Robustness Guarantees for Bayesian Inference with Gaussian Processes
Authors:
Luca Cardelli,
Marta Kwiatkowska,
Luca Laurenti,
Andrea Patane
Abstract:
Bayesian inference and Gaussian processes are widely used in applications ranging from robotics and control to biological systems. Many of these applications are safety-critical and require a characterization of the uncertainty associated with the learning model and formal guarantees on its predictions. In this paper we define a robustness measure for Bayesian inference against input perturbations…
▽ More
Bayesian inference and Gaussian processes are widely used in applications ranging from robotics and control to biological systems. Many of these applications are safety-critical and require a characterization of the uncertainty associated with the learning model and formal guarantees on its predictions. In this paper we define a robustness measure for Bayesian inference against input perturbations, given by the probability that, for a test point and a compact set in the input space containing the test point, the prediction of the learning model will remain $δ-$close for all the points in the set, for $δ>0.$ Such measures can be used to provide formal guarantees for the absence of adversarial examples. By employing the theory of Gaussian processes, we derive tight upper bounds on the resulting robustness by utilising the Borell-TIS inequality, and propose algorithms for their computation. We evaluate our techniques on two examples, a GP regression problem and a fully-connected deep neural network, where we rely on weak convergence to GPs to study adversarial examples on the MNIST dataset.
△ Less
Submitted 24 October, 2018; v1 submitted 17 September, 2018;
originally announced September 2018.