Adversarial Robustness of Bayesian Neural Networks

[Wic22] Matthew Wicker. Adversarial Robustness of Bayesian Neural Networks. Ph.D. thesis, Department of Computer Science, University of Oxford. 2022.

Abstract. This thesis puts forward methods for computing local robustness of probabilistic neural networks, specifically those resulting from Bayesian inference. In theory, applying Bayesian inference to the learning of neural network parameters carries the promise of solving many practically vexing problems that arise under the frequentist learning paradigm. In particular, Bayesian learning allows for principled architecture comparison and selection, the encoding of prior knowledge, and calibration of predictive uncertainties. Recent studies have shown that Bayesian learning can lead to more adversarially robust predictors. Although this holds in theory and has been shown empirically in particular instances, anecdotal evidence of heightened robustness does not provide sufficient assurance for those who wish to deploy Bayesian deep learning in a safety-critical context. While methods exist for arriving at guarantees of robustness for deterministic neural networks, the probabilistic nature of Bayesian neural network weights renders these methods inoperable.

In this thesis, we investigate concepts of robustness for Bayesian neural networks that allow for robustness guarantees accounting for both the stochasticity of the model and the model’s decision. We provide methodologies that compute these quantities for a given Bayesian neural network with either a priori statistical guarantees on the precision of our estimates, or probabilistic upper and lower bounds that are provably sound. Finally, we consider robustness as a primary desideratum in the Bayesian inference of neural network parameters and demonstrate how to modify the likelihood in order to infer a posterior distribution with favorable robustness properties. This modification of the likelihood makes our method transparent to the approximate inference technique used for Bayesian neural networks.

We assess the practical applicability of our proposed methodology using Bayesian neural networks trained on several real-world datasets, including airborne collision avoidance and traffic sign recognition. Additionally, we assess the robustness of Bayesian posterior distributions approximately inferred using five different approximate inference methods. We find that our methodology provides the first provable robustness guarantees for Bayesian neural networks, thus enabling their deployment in safety-critical scenarios. Further, our proposed methodology for robust Bayesian inference of neural network parameters enables us to infer posterior distributions which have greatly heightened provable robustness even on full-color images.
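
Illustrative sketch (not taken from the thesis). The idea of estimating a robustness probability with an a priori statistical guarantee can be shown on a toy problem: fix the number of posterior samples in advance from a concentration bound, then count how many sampled models are robust at a test point. Everything here is an assumption made for illustration: the Hoeffding bound is used as the concentration inequality (the thesis may use a different one), the model is a linear classifier with an independent Gaussian posterior over its weights, and the names `hoeffding_sample_count`, `linear_is_robust`, and `estimate_probabilistic_robustness` are hypothetical.

```python
import math
import random


def hoeffding_sample_count(precision, failure_prob):
    """Number of samples n such that the empirical mean of n i.i.d. values in
    [0, 1] is within `precision` of the true mean with probability at least
    1 - failure_prob (two-sided Hoeffding bound)."""
    return math.ceil(math.log(2.0 / failure_prob) / (2.0 * precision ** 2))


def linear_is_robust(w, b, x, radius):
    """Exact robustness check for the toy classifier sign(w.x + b): the sign is
    constant on the l-infinity ball of the given radius around x exactly when
    the margin |w.x + b| exceeds radius * ||w||_1."""
    margin = abs(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return margin > radius * sum(abs(wi) for wi in w)


def estimate_probabilistic_robustness(mean, std, b, x, radius,
                                      precision=0.05, failure_prob=0.01, seed=0):
    """Monte Carlo estimate of the posterior probability that a sampled
    classifier is robust at x. Assumes an independent Gaussian posterior
    N(mean_i, std^2) over each weight and a fixed bias b."""
    rng = random.Random(seed)
    n = hoeffding_sample_count(precision, failure_prob)
    robust = 0
    for _ in range(n):
        w = [rng.gauss(m, std) for m in mean]  # one posterior weight sample
        robust += linear_is_robust(w, b, x, radius)
    return robust / n, n


if __name__ == "__main__":
    estimate, n = estimate_probabilistic_robustness(
        mean=[1.0, -0.5], std=0.3, b=0.1, x=[0.8, 0.2], radius=0.1)
    print(f"{n} samples, estimated robustness probability {estimate:.3f}")
```

The sample count depends only on the requested precision and failure probability, not on the model, which is what makes the guarantee a priori. For a real Bayesian neural network, the exact per-sample check used here would have to be replaced by a verifier for deterministic networks, and the resulting estimate remains statistical, in contrast to the provably sound bounds also described in the abstract.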