Robustness Evaluation of Deep Neural Networks with Provable Guarantees

[Wu20] Min Wu. Robustness Evaluation of Deep Neural Networks with Provable Guarantees. Ph.D. thesis, Department of Computer Science, University of Oxford. May 2020.

Abstract. This thesis presents methodologies to guarantee the robustness of deep neural networks, thus facilitating the deployment of deep learning techniques in safety-critical real-world systems. We study the maximum safe radius of a network with respect to an input: all points within the radius are guaranteed to be safe, whereas any larger radius must contain an adversarial example. We extend the maximum safe radius to two variants: the expected maximum safe radius, which evaluates the global robustness of a network over a dataset, and the maximum safe radius with respect to optical flow, for video inputs. We also study the feature robustness problem, which quantifies the robustness of features extracted from an input to adversarial perturbations. Specifically, we develop tensor-based parallelisation algorithms to compute the (expected) maximum safe radius of networks on (a set of) pixel-level images. For features of an image, we propose a game-based framework to compute the maximum safe radius and the feature robustness properties, where Player I selects features and Player II determines which pixels within the selected feature to manipulate. Subsequently, we extend this game framework to video inputs to compute the maximum safe radius with respect to optical flow, with Player I choosing flows and Player II imposing modifications within the chosen flow. As our work applies to large neural networks and high-dimensional inputs, we compute upper and lower bounds that approximate the maximum safe radius and the feature robustness and, by utilising Lipschitz continuity, guarantee that they converge to the optimal value. We implement the algorithms in three tools, DeepTRE, DeepGame, and DeepVideo, and demonstrate their effectiveness on benchmark image and video datasets.
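To make the idea of bounding the maximum safe radius concrete, the following is a minimal sketch on a toy problem. It is not the algorithm implemented in DeepTRE, DeepGame, or DeepVideo. Everything in it is an assumption made for illustration: the two-dimensional linear classifier `score`, its Lipschitz constant `LIPSCHITZ`, the Euclidean distance, and the helper names `lower_bound` and `upper_bound`. The lower bound follows from Lipschitz continuity (the score cannot change sign within distance |score(x)| / L), and the upper bound is the distance to the first adversarial example found by a sampled search.

```python
import math

# Hypothetical toy binary classifier: a positive score means class 0.
W = (3.0, 4.0)
B = -5.0
LIPSCHITZ = math.hypot(*W)  # the L2 Lipschitz constant of a linear score is ||W||_2


def score(x):
    return W[0] * x[0] + W[1] * x[1] + B


def lower_bound(x):
    """Radius within which the label provably cannot change."""
    return abs(score(x)) / LIPSCHITZ


def upper_bound(x, step, max_radius=10.0, directions=360):
    """Distance to the first sampled point whose label differs from that of x."""
    label = score(x) > 0
    for i in range(1, int(max_radius / step) + 1):
        r = i * step
        for k in range(directions):
            theta = 2.0 * math.pi * k / directions
            p = (x[0] + r * math.cos(theta), x[1] + r * math.sin(theta))
            if (score(p) > 0) != label:
                return r  # an adversarial example exists at distance r
    return math.inf  # no adversarial example found within max_radius


x = (3.0, 4.0)
for step in (0.5, 0.25, 0.125):
    print(step, lower_bound(x), upper_bound(x, step))
```

In this sketch the lower bound is already exact because the toy classifier is linear, and the upper bound approaches it as `step` shrinks. For a real network the lower bound would also have to be refined iteratively, which is where convergence guarantees of the kind described in the abstract become important.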