arxiv: v1 [cs.lg] 6 Dec 2018

Size: px
Start display at page:

Download "arxiv: v1 [cs.lg] 6 Dec 2018"

Transcription

1 Max-Margin Adversarial (MMA) Training: Direct Input Space Margin Maximization through Adversarial Training Gavin Weiguang Ding, Yash Sharma, Kry Yik Chau Lui, and Ruitong Huang Borealis AI arxiv: v1 [cs.lg] 6 Dec 2018 Abstract We propose Max-Margin Adversarial (MMA) training for directly maximizing the input space margin. This margin maximization is direct, in the sense that the margin s gradient w.r.t. model parameters can be shown to be parallel with the loss gradient at the minimal length perturbation, thus gradient ascent on margins can be performed by gradient descent on losses. We further propose a specific formulation of MMA training to maximize the average margin of training examples in order to train models that are robust to adversarial perturbations. It is implemented by performing adversarial training on a novel adaptive norm projected gradient descent (AN-PGD) attack. Preliminary experimental results demonstrate that our method outperforms the existing state of the art methods. In particular, testing against both whitebox and transfer projected gradient descent attacks on MNIST, our trained model improves the SOTA l ɛ = 0.3 robust accuracy by 2%, while maintaining the SOTA clean accuracy. Furthermore, the same model provides, to the best of our knowledge, the first model that is robust at l ɛ = 0.4, with a robust accuracy of 86.51%. 1 Introduction Despite their impressive performance on various learning tasks, neural networks have been shown to be vulnerable. An otherwise highly accurate network can be completely fooled by an artificially constructed perturbation imperceptible to human perception, known as the adversarial attack (Szegedy et al., 2013; Biggio et al., 2013). Not surprisingly, numerous algorithms in defending adversarial attacks have already been proposed in the literature which, arguably, can be interpreted as different ways in increasing the margins, i.e. the smallest distance from the sample point to the decision boundary induced by the network. Obviously, adversarial robustness is equivalent to large margins. One type of the algorithms is to use regularization in the learning to enforce the Lipschitz constant of the network (Cisse et al., 2017; Ross and Doshi-Velez, 2017; Hein and Andriushchenko, 2017; Tsuzuku et al., 2018), thus a small loss sample point would have a large margin since the loss cannot increase too fast. If the Lipschitz constant is regularized on data points, it is usually too local and not accurate in a neighborhood; if it is controlled globally, the constraint on the model is often too strong that it harms accuracy. So far, such methods seem not able to achieve very robust models. There are also efforts using first-order approximation to estimate and maximize input space margin (Elsayed et al., 2018; Sokolic et al., 2017; Matyasko and Chau, 2017). Similarly to local Lipschitz regularization, the reliance on local information might not provide accurate margin estimation and efficient maximization. Indeed, many defending methods have been broken Preprint. Work in progress. 1

2 Figure 1: Illustration of decision boundary, minimal length perturbation, margin and etc. after their proposal (Carlini and Wagner, 2016, 2017a; Athalye et al., 2018). Adversarial training (Szegedy et al., 2013; Huang et al., 2015; Madry et al., 2017) is one of the few defense methods that has stood the test of time, and in fact, represents the current SOTA results on several datasets including MNIST. However, as illustrated in Section 1.2, the update in adversarial training does not necessarily increase the margins, which in return diminishes the effectiveness of adversarial training. In this paper, we propose to directly maximize the decision boundary on each training sample, called Max-Margin Adversarial (MMA) training, to achieve the greatest possible robustness. Interestingly, we show that our method, MMA training, has a close relationship with adversarial training. More specifically, we prove in this paper that the margin s gradient w.r.t. model parameter is parallel with the loss gradient at the minimal length perturbation. Such property makes SGD possible for the MMA objective, despite the model parameter is involved in the constraints. We further show that MMA training is an improvement over adversarial training, in the sense that it picks the right perturbation magnitude for adversarial training. We test our algorithms on MNIST and CIFAR10, demonstrating that our method outperforms adversarial training. Moreover, MMA training achieves 86.51% robust accuracy on MNIST against the l ɛ = 0.4 PGD attack. To our best knowledge, this is the first robust model in the literature that can still maintain good robustness against such a strong attack. 1.1 Notations We focus on K-class classification problem in this paper, and assume that the output of the model is a score function f θ (x) = (f 1 (x),..., f K (x)) which assign score f i (x) to the ith class. The predicted label of x is then decided by arg max i f i (x). We use the logit margin loss 1 L θ (x) = max j y f j (x) f y (x). Note that when L θ (x) < 0, the classification is correct, and when L θ (x) > 0, the classification is wrong. Given a model f θ and a data pair z = (x, y), one can compute the minimal length perturbation, δ by δ = arg min δ s.t. L θ (x + δ, y) 0, (1) δ as illustrated in Figure 1. Denote the margin of example z by d θ (z) = δ, which is the (minimal) distance from the input x to the decision boundary L θ (z) = 0 in the input space. Note that Eq. (1) is also compatible with misclassified samples, as for a sample z that is misclassified by f, the optimal δ would be 0, hence d θ (z) = 0. Throughout this paper, we use the single word margin to denote input space margin. For other notions of margin, we will use specific phrases, e.g. output space 1 Since the scores (f 1(x),..., f K(x)) output by f are also called logits in neural network literature. 2

3 Figure 2: Fixed-norm adversarial training does not necessarily improve robustness. margin or logit space margin. All the proofs are postponed to Appendix A. 1.2 Adversarial Training Does NOT Necessarily Increase Margin We use an imaginary example in this section to illustrate that one update step of adversarial training does not necessarily increase margin. Adversarial training (Huang et al., 2015) is based on a minmax formulation as min max L θ(x + δ, y) (2) θ δ ɛ where the worst-case loss is minimized under a fixed perturbation magnitude ɛ. Consider a onedimensional example in Figure 2, where the input example x is a scalar. We perturb x in the positive direction with perturbation δ. We assume L(δ) = L θ (x + δ, y) is monotonically on increasing on δ, which means that larger perturbation is always stronger. Let L 0 ( ) denote the original function before an update step, and δ0 denote the corresponding margin (minimal length perturbation). Next, an update is made to the parameter of L 0 ( ), such that L 0 (ɛ) is reduced. After this update, we imagine two different possibilities of the updated loss functions, L α ( ) in blue dotted-dash line, and L β ( ) in red dash line. L α (ɛ) and L β (ɛ) have the same decreased value at ɛ. However, we can see that L α ( ) has an increased margin δα > δ0, and L β( ) has a decreased margin δβ < δ 0. The second case is undesired since it is less robust as the margin decreases. 2 Max-Margin Adversarial Training This section is devoted to the presentation of our method, MMA training. Before presenting it, we still need some preparation. We calculated the gradient of the margin in Section 2.1. Our method MMA training is presented in Section 2.2. Lastly in Section 2.3, we discussed the relationship between MMA training and adversarial training. 3

4 2.1 Gradients of the Margins In this section, we show how to calculate gradients of the margins, d θ (z), on individual examples, as a preparation for presenting MMA training. Computing such gradients for a linear model is relatively easy due to the existence of its closed-form solution, e.g. SVM. But it is not clear for general functions such as neural networks. Recall that d θ (z) = δ = min L(δ,θ) 0 δ. It is easy to see that for a wrongly classified example z, δ is achieved at 0 and thus θ d θ (z) = 0. Therefore in this section, we focus on correctly classified data examples. Denote the Lagrangian as L(δ, λ), where L(δ, λ) = δ + λl(δ, θ). Also for a fixed θ, denote the optimizers of δ and λ by δ and λ. The following theorem provides an efficient way to compute θ d θ (z). The exact implementation of our algorithm is presented in Section 2.2. Theorem 2.1. Assume that ɛ(δ) = δ and L(δ, θ) are C 2 functions almost everywhere 2. Also assume that the matrix M is full rank almost everywhere, where Then, M = ( 2 ɛ(δ ) δ 2 + λ 2 L(δ,θ) L(δ,θ) δ δ 2 L(δ,θ) δ θ d θ (z) L(δ, θ). θ Remark 2.1. The condition on the matrix M is serving as a technical condition to guarantee that the implicit function theorem is applicable and thus the gradient can be computed. 0 ). 2.2 Max-Margin Adversarial Training Motivated by the observation in Section 1.2, MMA training directly uses margins in its objective: max d θ (z i ) J θ (z j ) θ, (3) i S + θ where J θ ( ) is a regular classification loss function, e.g. cross-entropy loss, which can be different from L θ ( ). S + θ = {i : L θ (z i ) < 0} is the set of all correctly classified examples, and S θ = {i : L θ (z i ) 0} is the set of wrongly classified examples. Intuitively, MMA training tries to minimize classification loss on wrongly classified examples in S θ, until they are correctly classified. For already correctly classified examples, MMA training tries to maximize the margin d θ (z i ). Note that we do not maximize margins on wrongly classified examples, since the gradient of their margins are 0, as we discussed in Section 2.1. Given the Eq. (3) and Theorem 2.1, we can implement MMA training using SGD. It is straightforward to calculate θ J θ (z j ). Based on Theorem 2.1, we can calculate the θ d θ (z i ) as the gradient of the loss at the minimal length perturbation δ. Finding the exact δ is intractable in general settings 3. Here we propose an algorithm to give an approximate solution of δ, the Adaptive Norm Projective Gradient Descent Attack (AN-PGD). In AN-PGD, we apply PGD (Madry et al., 2017) twice to find an approximation. In particular, if we know the true margin ɛ = δ of a data point x in advance, we can use PGD with length ɛ, j S θ 2 All the l p norm with p 0, including p =, satisfies this condition. 3 Exceptions exist, for example linear SVM. 4

5 Algorithm 1 Adaptive Norm Projected Gradient Descent Attack. Inputs: (x, y) is the data example. ɛ init is the initial norm constraint used in the first PGD attack. Outputs: δ is the approximate minimal length perturbation. Parameters: ɛ inc is the increment of perturbation length. ɛ max is the maximum perturbation length. PGD(x, y, ɛ) represents PGD (Madry et al., 2017) perturbation δ with maximum perturbation length ɛ. 1: Adversarial example δ 1 = PGD(x, y, ɛ init ) 2: Unit perturbation δ unit = δ 1 δ 1 3: if prediction on x + δ 1 is correct then 4: ɛ new = arg min η L(x + ηδ unit, y), η [ɛ init, min{ɛ init + ɛ inc, ɛ max }] 5: else 6: ɛ new = arg min η L(x + ηδ unit, y), η [0, ɛ init ) 7: end if 8: δ = PGD(x, y, ɛ new ) PGD(x, y, ɛ ), to find an approximation of δ. Since we do not know ɛ, we need to approximate it first. Therefore, we first perform a PGD attack on x with a fixed perturbation length ɛ 0 to get δ 1, then we search along the direction of δ 1 to find a scaled perturbation that gives L = 0, whose length is used to approximate ɛ. Algorithm 1 describes the details of the AN-PGD. 4 With the proposed AN-PGD attack, we can implement MMA training easily as described in Algorithm 2. Note that AN-PGD here only serves as an algorithm to give an approximate solution of δ, and it can be decoupled from the rest parts of MMA training. Other attacks that can serve a similar purpose can also fit into our MMA training framework, for example, the Carlini-Wagner attack (Carlini and Wagner, 2017b) 5, and the Decoupled Direction and Norm (DDN) attack (Rony et al., 2018) which is concurrent with our work. 2.3 Relation to Adversarial Training As suggested by Theorem 2.1, MMA training is closely related to adversarial training. In this section, we prove that MMA training is actually an improvement on adversarial training, in the sense that it picks the right perturbation magnitude ɛ for adversarial training. Given a sample z = (x, y) and a fixed length ɛ, recall that adversarial training learns a model f by minimizing the adversarial loss, min J θ (ɛ) = min max L θ(x + δ, y) θ θ δ ɛ For simplicity, we denote L θ (x + δ, y) by L θ (δ) in this section. The next theorem shows that the optimal f that maximizes the margin around the decision boundary also minimizes the adversarial loss given the perturbation magnitude being the right margin. Theorem 2.2. Assume that l L θ (0) Range(L θ ), let θ MMA = arg max θ min Lθ (δ) l δ. Also let ɛ = min LθMMA (δ) l δ, and θ adv = arg min θ max δ ɛ L θ (δ), then θ MMA also minimizes the adversarial loss, i.e. max L δ ɛ θ MMA (δ) = max L δ ɛ θ adv (δ). 4 The zero-crossing binary search is denoted as arg min η L(η) for brevity. 5 Carlini-Wagner attack specifically might suffer from computational issues due to the very high number of iterations it requires. 5

6 Algorithm 2 Max-Margin Adversarial Training. Inputs: The training set {(x i, y i )}. Outputs: the trained neural network model f θ ( ). Parameters: ɛ contains perturbation lengths of training data. ɛ inc is the increment of perturbation length. ɛ min is the minimum perturbation length. ɛ max is the maximum perturbation length. A(x, y, ɛ init ) represents the approximate minimal length perturbation returned by the algorithm A on the data example (x, y) and at the initial norm ɛ init. 1: Randomly initialize the parameter θ of model f, and initialize every element of ɛ as ɛ min 2: repeat 3: Read minibatch B = {(x 1, y 1 ),..., (x m, y m )} from the training set 4: Make predictions on B and into two: wrongly predicted B 0 and correctly predicted B 1 5: for each (x i, y i ) in B 1 do 6: Retrieve perturbation length ɛ i from ɛ 7: δ i = A(x i, y i, ɛ i ) 8: Update the ɛ i in ɛ as δ i, and put (x i + δ i, y i) into B adv 1 9: end for 10: Calculate the gradient of margin on B adv 1 and the gradient of classification loss on B 0 11: Perform one step gradient step update on θ 12: until training converged or maximum number of steps reached Note that an analogy of Theorem 2.2, that the optimal f that minimizes the adversarial loss also maximizes the margin given the corresponding threshold l, doesn t hold due to the reason that picking a perturbation magnitude ɛ may be overshooting the margin. The next theorem shows that if ɛ is picked tightly, then the above claim is true. Therefore, MMA training can also be interpreted as a way of improving adversarial training by finding the proper ɛ for each training sample, which is exactly the flavor of Algorithm 2. Theorem 2.3. Given ɛ 0, let θ adv = arg min θ max δ ɛ L θ (δ). Also let l = max δ ɛ L θadv (δ), and θ MMA = arg max θ min Lθ (δ) l δ. Further assume that min δ = ɛ, L θadv (δ) l then θ adv also maximizes the margin distance, i.e. (No overshooting assumption) min δ = min δ. L θadv (δ) l L θmma (δ) l Remark 2.2. The above theorems only focus on the case of one single sample, and they cannot be directly extended to the whole training set. In practice, the behaviour of the algorithm could be much more complicated. 3 Preliminary Results on Adversarial Robustness We test the max-margin adversarial (MMA) training on the MNIST and CIFAR10 datasets under l and l 2 perturbations. We compare our models with PGD trained models (Madry et al., 2017) which represent the SOTA robust models in the literature 6. MMA trained algorithm outperforms PGD trained models, and particularly, MMA training is the first method in the literature, to our best knowledge, that can defend l PGD attacks with length 0.4 on MNIST. 6 All the models are trained and tested under the same type of norm. 6

7 (a) MNIST l (b) MNIST l 2 (c) CIFAR10 l (d) CIFAR10 l 2 Figure 3: Robust accuracy vs perturbation length for MNIST and CIFAR10 under l and l 2 perturbations MNIST under l perturbation We follow a similar test protocol to Madry et al. (2017) using the l PGD attack. Our baseline is the PGD trained model with ɛ = Figure 3a shows the robust accuracies under white-box PGD l perturbations with different norm constraints, all with the number of iterations equals 100. We can see that the MMA trained model is better than the PGD trained model at all ɛ s. When ɛ > 0.3, the MMA trained model is significantly better than PGD trained model and is able to achieve good robustness for very large ɛ s, e.g To our best knowledge, this is so far the only algorithm that can defend l PGD attacks with length 0.4 in the literature. We further perform extensive additional experiments to attack the MMA trained model. We test our models with 50 random restarts under both white-box PGD attacks and transfer attacks generated by attacking a PGD model and a standardly trained model. Note that we always pick the strongest attack among random restarts. We also test using 40 iterations under the transfer attack case to avoid the situation where the attacks overfit to the source model. Table 1 shows all the results. Under these considered attacks, when ɛ = 0.3, MMA model yields a 2% robust accuracy improvement over the PGD model (92.86% vs 90.65%). When ɛ = 0.4, MMA training still achieves robust accuracy at 86.51%, while the PGD model is completely cracked. Notice that the strongest attack on MMA trained model is the transfer attack generated from PGD model. This could indicate that the white-box robustness of the MMA model is partially achieved due to a strong degree of flatness in the loss surface about the input data point. Despite this, however, the MMA model still manages to defend against the various transfer attacks tested. We are continuing to benchmark MMA models in ongoing work. 7 We have also attempted to train PGD models with ɛ = 0.4, but training didn t converge. 7

8 Another notable phenomenon is that MMA trained model is even robust to white-box l PGD attack with ɛ = 0.5, when the PGD is performed on the MMA trained model itself. It has the robust accuracy 89.76% under a single attack, and 70.30% under 50 random starts. This is a surprising phenomenon in the sense that a well-crafted perturbation at ɛ = 0.5 should decrease the accuracy to approximately 10%, where a gray image with constant pixel value 0.5 can already achieve. However, as shown in Table 1, when being attacked by transfer attacks from the standardly trained model, the MMA trained model s robust accuracy can be reduced to 10.96%. These observations combined could suggest that, when ɛ is large, PGD attack gets stuck in poor local minima on the MMA trained models, and we are further investigating this interesting phenomenon. MNIST under l 2 perturbation We compare our MMA trained model with models trained on fixed length PGD perturbations at ɛ = 1, 3 and 5, as shown in Figure 3a. We benchmark using the boundary attack (Brendel et al., 2017), as we observe that it is stronger than the PGD l 2 attack, which is consistent with the observation in Anonymous (2019). We can see that MMA model is not always better than the best PGD model at a given perturbation length, but it provides similar performance compared with the best PGD model at all ɛ s. Also, the MMA model outperforms PGD models on large perturbations, e.g. ɛ = 5, even when the PGD model is specifically trained on perturbations with that distortion level. Having robustness against the gradient-free boundary attack, to some degree, also indicates that, the l 2 robustness of the MMA trained model is likely not purely due to flatness in the loss surface about the input data point. CIFAR10 For l perturbation, we compare the MMA model with models trained with PGD ɛ = 8/255 and ɛ = 16/255, 8 as shown in Figure 3c. For l 2 perturbation, we compare the MMA trained model with models trained with PGD ɛ = 0.5 and ɛ = 1.5, 9 as shown in Figure 3d. For both the l and l 2 cases, when MMA models have similar clean accuracy to PGD models (l 2 : PGD ɛ = 0.5 vs MMA, l : PGD ɛ = 8/255 vs MMA), they have similar robust accuracy for small perturbation lengths, and MMA training outperforms PGD training when perturbation length becomes larger. It is possible that models trained with larger PGD attack are more robust at a certain band of ɛ, but this is at the cost of losing clean accuracy. 4 Other Related Works Several other related works also exist in the literature, as reviewed in this section. First-order Large Margin Previous works (Elsayed et al., 2018; Sokolic et al., 2017; Matyasko and Chau, 2017) have attempted to use first-order approximation to estimate the input space margin. For first-order methods, the margin will be accurately estimated when the classification function is linear. MMA s margin estimation is exact when the minimal length perturbation δ can be solved, which is not only satisfied by linear models, but also by a broader range of models, e.g. models that are convex w.r.t. input x. This relaxed condition could potentially enable more accurate margin estimation which improves MMA training s performance. 8 Training with l ɛ = 32/255 did not converge. 9 Training with l 2 ɛ = 4.5 did not converge. 8

9 (Cross-)Lipschitz Regularization On the other spectrum, Tsuzuku et al. (2018) enlarges their margin by controlling the global Lipschitz constant, which in return places a strong constraint on the model and harms its learning capabilities. Instead, our method, alike adversarial training, uses adversarial attacks to estimate the margin to the decision boundary. With a strong method, our estimate is much more precise in the neighborhood around the data point, while being much more flexible due to not relying on a global Lipschitz constraint. Hard-Margin SVM (Vapnik, 2013) in the separable case Assuming that all the training examples are correctly classified and using our notations on general classifiers, the hard-margin SVM objective can be written as: { } arg max min d θ (z i ) s.t. L θ (z i ) < 0, i. (4) θ i On the other hand, under the same separable and correct assumptions, MMA training formulation in Eq. (3) can be written as { } arg max θ d θ (z i ) s.t. L θ (z i ) < 0, i, (5) i which is maximizing the average margin rather than the minimum margin in SVM. Note that the theorem on gradient calculation of the margin in Section 2.1 also applies to the SVM formulation of differentiable functions. Because of this, we can also use SGD to solve the following SVM-style formulation, which we delay to future work: max θ min i S + θ d θ (z i ) j S θ J θ (z j ). (6) 5 Conclusion In this paper, we propose max-margin adversarial (MMA) training. We show that the input space margin s gradient w.r.t. the model parameter can be calculated with the loss gradient on the minimal length perturbation. This enables gradient descent on objectives that involves input space margin. We propose a specific formulation of max-margin adversarial training to maximize the average margin motivated by improving adversarial robustness. Our preliminary results on MNIST and CIFAR10 show that MMA training is very promising for training robust neural networks, and notably, the MMA training model achieves decent robustness (86.51%) against l PGD attack with ɛ = 0.4 which, to our best knowledge, is first time shown in the literature. References Anonymous (2019). Towards the first adversarially robust neural network model on mnist. In Submitted to International Conference on Learning Representations. under review. 9

10 Athalye, A., Carlini, N., and Wagner, D. (2018). Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. arxiv preprint arxiv: Biggio, B., Corona, I., Maiorca, D., Nelson, B., Šrndić, N., Laskov, P., Giacinto, G., and Roli, F. (2013). Evasion attacks against machine learning at test time. In Joint European conference on machine learning and knowledge discovery in databases, pages Springer. Brendel, W., Rauber, J., and Bethge, M. (2017). Decision-based adversarial attacks: Reliable attacks against black-box machine learning models. arxiv preprint arxiv: Carlini, N. and Wagner, D. (2016). Defensive distillation is not robust to adversarial examples. arxiv preprint arxiv: Carlini, N. and Wagner, D. (2017a). Adversarial examples are not easily detected: Bypassing ten detection methods. arxiv preprint arxiv: Carlini, N. and Wagner, D. (2017b). Towards evaluating the robustness of neural networks. In Security and Privacy (SP), 2017 IEEE Symposium on, pages IEEE. Cisse, M., Bojanowski, P., Grave, E., Dauphin, Y., and Usunier, N. (2017). Parseval networks: Improving robustness to adversarial examples. In International Conference on Machine Learning, pages Elsayed, G. F., Krishnan, D., Mobahi, H., Regan, K., and Bengio, S. (2018). Large margin deep networks for classification. arxiv preprint arxiv: Hein, M. and Andriushchenko, M. (2017). Formal guarantees on the robustness of a classifier against adversarial manipulation. arxiv preprint arxiv: Huang, R., Xu, B., Schuurmans, D., and Szepesvári, C. (2015). Learning with a strong adversary. arxiv preprint arxiv: Madry, A., Makelov, A., Schmidt, L., Tsipras, D., and Vladu, A. (2017). Towards deep learning models resistant to adversarial attacks. arxiv preprint arxiv: Matyasko, A. and Chau, L.-P. (2017). Margin maximization for robust classification using deep learning. In Neural Networks (IJCNN), 2017 International Joint Conference on, pages IEEE. Rony, J., Hafemann, L. G., Oliveira, L. S., Ayed, I. B., Sabourin, R., and Granger, E. (2018). Decoupling direction and norm for efficient gradient-based l2 adversarial attacks and defenses. arxiv preprint arxiv: Ross, A. S. and Doshi-Velez, F. (2017). Improving the adversarial robustness and interpretability of deep neural networks by regularizing their input gradients. arxiv preprint arxiv: Sokolic, J., Giryes, R., Sapiro, G., and Rodrigues, M. R. (2017). Robust large margin deep neural networks. IEEE Transactions on Signal Processing. Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., and Fergus, R. (2013). Intriguing properties of neural networks. arxiv preprint arxiv: Tsuzuku, Y., Sato, I., and Sugiyama, M. (2018). Lipschitz-margin training: Scalable certification of perturbation invariance for deep neural networks. arxiv preprint arxiv: Vapnik, V. (2013). The nature of statistical learning theory. Springer science & business media. 10

11 Table 1: MNIST: Performance of the adversarially trained models against different l adversaries for ɛ = 0.3, 0.4, 0.5. For each ɛ, we show the most successful attack with bold. The source networks used for the attack are: the MMA trained model (MMA) (white-box attack for MMA, transfer attack for PGD), the PGD trained model (PGD) (white-box attack for PGD, transfer attack for MMA), the standardly trained model (STD) (transfer attack for both MMA and PGD). Attack Method ɛ Steps Restarts Source Model Accuracy MMA Model Accuracy PGD Model Natural % 98.63% PGD MMA 96.64% 97.99% PGD MMA 93.64% 96.71% FGSM PGD 95.5% 94.62% PGD PGD 94.65% 92.40% PGD PGD 92.86% 90.65% PGD PGD 94.65% 92.37% PGD PGD 92.87% 90.64% FGSM STD 95.46% 93.45% PGD STD 95.86% 93.97% PGD STD 94.38% 92.53% PGD STD 95.86% 93.97% PGD STD 94.41% 92.55% PGD MMA 95.33% 96.14% PGD MMA 89.99% 89.74% FGSM PGD 94.16% 88.18% PGD PGD 91.18% 9.89% PGD PGD 86.69% 2.83% PGD PGD 91.16% 8.72% PGD PGD 86.51% 2.42% FGSM STD 91.51% 38.23% PGD STD 93.21% 15.38% PGD STD 89.12% 5.23% PGD STD 93.18% 15.17% PGD STD 89.14% 5.07% PGD MMA 89.76% 60.36% PGD MMA 70.30% 26.02% FGSM PGD 88.39% 51.67% PGD PGD 48.05% 0.00% PGD PGD 15.50% 0.00% PGD PGD 45.75% 0.00% PGD PGD 14.20% 0.00% FGSM STD 32.96% 4.45% PGD STD 34.05% 0.00% PGD STD 10.96% 0.00% PGD STD 33.66% 0.00% PGD STD 11.01% 0.00% 11

12 A Proofs A.1 Proof of theorem 2.1 Proof. Recall that z = (x, y) and ɛ(δ) = δ. Here we compute the gradient for d θ (z) in its general form. Consider the following optimization problem: d θ (z) = min δ (θ) ɛ(δ), where (θ) = {δ : L θ (x + δ, y) = 0}, ɛ and g are both C 2 functions 10. Denotes its Lagrangian by L(δ, λ), where L(δ, λ) = ɛ(δ) + λl θ (x + δ, y) For a fixed θ, the optimizer δ and λ must satisfy the first-order conditions (FOC) ɛ(δ) + λ L θ(x + δ, y) δ δ = 0, (7) δ=δ,λ=λ Put the FOC equations in vector form, G((δ, λ), θ) = L θ (x + δ, y) δ=δ = 0. ( ɛ(δ) δ + λ L θ(x+δ,y) δ L θ (x + δ, y) ) δ=δ,λ=λ = 0. Note that G is C 1 continuously differentiable since ɛ and g are C 2 functions. Furthermore, the Jacobian matrix of G w.r.t (δ, λ) is ( ) 2 ɛ(δ ) (δ,λ) G((δ, λ + λ 2 g(θ,δ ) g(θ,δ ) ), θ) = δ 2 δ 2 δ 0 g(θ,δ ) δ which by assumption is full rank. Therefore, by the implicit function theorem, δ and λ can be expressed as a function of θ, denoted by δ (θ) and λ (θ). To further compute θ d θ (z), note that d θ (z) = ɛ(δ (θ)). Thus, θ d θ (z) = ɛ(δ ) δ (θ) δ θ = λ g(θ, δ ) δ δ (θ), (8) θ where the second equality is by Eq. (7). The implicit function theorem also provides a way of computing δ (θ) θ which is complicated involving taking inverse of the matrix (δ,λ) G((δ, λ ), θ). Here we present a relatively simple way to compute this gradient. Note that Differentiate with w.r.t. θ on both sides: Combining Eq. (8) and Eq. (9), g(θ, δ ) θ g(θ, δ (θ)) = 0. + g(θ, δ ) δ (θ) = 0. (9) δ θ θ d θ (z) = λ (θ) g(θ, δ ). (10) θ 10 Note that a simple application of Danskin s theorem would not be valid as the constraint set (θ) depends on the parameter θ. 12

13 A.2 A lemma for later proofs The following lemma helps relate the objective of adversarial training with that of our MMA training. Here, we denote L θ (x + δ, y) as L θ (δ) for simplicity. Lemma A.1. Given (x, y) and θ, assume that L θ (δ) is continuous in δ, then for ɛ 0, and l L θ (0) Range(L θ ), it holds that min δ = ɛ = max L θ(δ) = l ; (11) L θ (δ) l δ ɛ max L θ(δ) = l = δ ɛ min δ ɛ. (12) L θ (δ) l Proof. Eq. (11). We prove this by contradiction. Suppose max δ ɛ L θ (δ) > l. When ɛ = 0, this violates our asssumption l L θ (0) in the theorem. So assume ɛ > 0. Since L θ (δ) is a continuous function defined on a compact set, the maximum is attained by δ such that δ ɛ and L θ ( δ) > l. Note that L θ is continuous and l L θ (0), then there exists δ 0, δ i.e. the line segment connecting 0 and δ, such that δ < ɛ and L θ ( δ) = l. This follows from the intermediate value theorem by restricting L θ (δ) onto 0, δ. This contradicts min Lθ (δ) l δ = ɛ. If max δ ɛ L θ (δ) < l, then {δ : δ ɛ} {δ : L θ (δ) < l}. Every point p {δ : δ ɛ} is in the open set {δ : L θ (δ) < l}, there exists an open ball with some radius r p centered at p such that B rp {δ : L θ (δ) < l}. This forms an open cover for {δ : δ ɛ}. Since {δ : δ ɛ} is compact, there is an open finite subcover U ɛ such that: {δ : δ ɛ} U ɛ {δ : L θ (δ) < l}. Sicne U ɛ is finite, there exists h > 0 such that {δ : δ ɛ + h} {δ : L θ (δ) < l}. Thus {δ : L θ (δ) l} {δ : δ > ɛ + h}, contradicting min Lθ (δ) l δ = ɛ again. Eq. (12). Assume that min Lθ (δ) l δ > ɛ, then {δ : L θ (δ) l} {δ : δ > ɛ}. Taking complementary set of both sides, {δ : δ ɛ} {δ : L θ (δ) < l}. Therefore, by the compactness of {δ : δ ɛ}, max δ ɛ L θ (δ) < l, contradiction. A.3 Proof of theorem 2.2 Proof. By the definition of θ adv, We further prove it is not possible that max L δ ɛ θ MMA (δ) max L δ ɛ θ adv (δ). max L δ ɛ θ MMA (δ) > max L δ ɛ θ adv (δ). If so, by the first implication in Lemma A.1 and the definition of ɛ, we have l = max δ ɛ L θ MMA (δ) > max δ ɛ L θ adv (δ), and thus with similar arguments to the proof of Lemma A.1, contradicting the definition of θ MMA. Therefore, min δ > L θadv (δ) l ɛ = δ (θ MMA ), max δ ɛ L θ MMA (δ) = max δ ɛ L θ adv (δ). 13

14 A.4 Proof of Theorem 2.3 Proof. By the definition of θ MMA, We further prove it is not possible that If so, by the no overshooting assumption, min δ min δ. L θmma (δ) l L θadv (δ) l min δ > min δ. L θmma (δ) l L θadv (δ) l min δ > ɛ. L θmma (δ) l and thus with similar arguments to the proof of the second implication in Lemma A.1, contradicting the definition of θ adv. Therefore, max δ ɛ L θ MMA (δ) < l, max L δ ɛ θ MMA (δ) = max L δ ɛ θ adv (δ). 14

Max-Margin Adversarial (MMA) Training: Direct Input Space Margin Maximization through Adversarial Training

Max-Margin Adversarial (MMA) Training: Direct Input Space Margin Maximization through Adversarial Training Max-Margin Adversarial (MMA) Training: Direct Input Space Margin Maximization through Adversarial Training Gavin Weiguang Ding, Yash Sharma, Kry Yik Chau Lui, and Ruitong Huang Borealis AI arxiv:1812.02637v2

More information

Adversarially Robust Optimization and Generalization

Adversarially Robust Optimization and Generalization Adversarially Robust Optimization and Generalization Ludwig Schmidt MIT UC Berkeley Based on joint works with Logan ngstrom (MIT), Aleksander Madry (MIT), Aleksandar Makelov (MIT), Dimitris Tsipras (MIT),

More information

ON THE SENSITIVITY OF ADVERSARIAL ROBUSTNESS

ON THE SENSITIVITY OF ADVERSARIAL ROBUSTNESS ON THE SENSITIVITY OF ADVERSARIAL ROBUSTNESS TO INPUT DATA DISTRIBUTIONS Anonymous authors Paper under double-blind review ABSTRACT Neural networks are vulnerable to small adversarial perturbations. Existing

More information

Towards ML You Can Rely On. Aleksander Mądry

Towards ML You Can Rely On. Aleksander Mądry Towards ML You Can Rely On Aleksander Mądry @aleks_madry madry-lab.ml Machine Learning: The Success Story? Image classification Reinforcement Learning Machine translation Machine Learning: The Success

More information

Measuring the Robustness of Neural Networks via Minimal Adversarial Examples

Measuring the Robustness of Neural Networks via Minimal Adversarial Examples Measuring the Robustness of Neural Networks via Minimal Adversarial Examples Sumanth Dathathri sdathath@caltech.edu Stephan Zheng stephan@caltech.edu Sicun Gao sicung@ucsd.edu Richard M. Murray murray@cds.caltech.edu

More information

arxiv: v1 [cs.lg] 30 Oct 2018

arxiv: v1 [cs.lg] 30 Oct 2018 On the Effectiveness of Interval Bound Propagation for Training Verifiably Robust Models arxiv:1810.12715v1 [cs.lg] 30 Oct 2018 Sven Gowal sgowal@google.com Rudy Bunel University of Oxford rudy@robots.ox.ac.uk

More information

Reinforcing Adversarial Robustness using Model Confidence Induced by Adversarial Training

Reinforcing Adversarial Robustness using Model Confidence Induced by Adversarial Training Reinforcing Adversarial Robustness using Model Confidence Induced by Adversarial Training Xi Wu * 1 Uyeong Jang * 2 Jiefeng Chen 2 Lingjiao Chen 2 Somesh Jha 2 Abstract In this paper we study leveraging

More information

arxiv: v1 [stat.ml] 15 Mar 2018

arxiv: v1 [stat.ml] 15 Mar 2018 Large Margin Deep Networks for Classification arxiv:1803.05598v1 [stat.ml] 15 Mar 2018 Gamaleldin F. Elsayed Dilip Krishnan Hossein Mobahi Kevin Regan Samy Bengio Google Research {gamaleldin,dilipkay,hmobahi,kevinregan,bengio}@google.com

More information

arxiv: v1 [cs.lg] 24 Jan 2019

arxiv: v1 [cs.lg] 24 Jan 2019 Cross-Entropy Loss and Low-Rank Features Have Responsibility for Adversarial Examples Kamil Nar Orhan Ocal S. Shankar Sastry Kannan Ramchandran arxiv:90.08360v [cs.lg] 24 Jan 209 Abstract State-of-the-art

More information

Notes on Adversarial Examples

Notes on Adversarial Examples Notes on Adversarial Examples David Meyer dmm@{1-4-5.net,uoregon.edu,...} March 14, 2017 1 Introduction The surprising discovery of adversarial examples by Szegedy et al. [6] has led to new ways of thinking

More information

TOWARDS DEEP LEARNING MODELS RESISTANT TO ADVERSARIAL ATTACKS

TOWARDS DEEP LEARNING MODELS RESISTANT TO ADVERSARIAL ATTACKS Published as a conference paper at ICLR 8 TOWARDS DEEP LEARNING MODELS RESISTANT TO ADVERSARIAL ATTACKS Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, Adrian Vladu Department of

More information

Limitations of the Lipschitz constant as a defense against adversarial examples

Limitations of the Lipschitz constant as a defense against adversarial examples Limitations of the Lipschitz constant as a defense against adversarial examples Todd Huster, Cho-Yu Jason Chiang, and Ritu Chadha Perspecta Labs, Basking Ridge, NJ 07920, USA. thuster@perspectalabs.com

More information

arxiv: v1 [cs.lg] 15 Nov 2017 ABSTRACT

arxiv: v1 [cs.lg] 15 Nov 2017 ABSTRACT THE BEST DEFENSE IS A GOOD OFFENSE: COUNTERING BLACK BOX ATTACKS BY PREDICTING SLIGHTLY WRONG LABELS Yannic Kilcher Department of Computer Science ETH Zurich yannic.kilcher@inf.ethz.ch Thomas Hofmann Department

More information

Reinforcing Adversarial Robustness using Model Confidence Induced by Adversarial Training

Reinforcing Adversarial Robustness using Model Confidence Induced by Adversarial Training Reinforcing Adversarial Robustness using Model Confidence Induced by Adversarial Training Xi Wu * 1 Uyeong Jang * 2 Jiefeng Chen 2 Lingjiao Chen 2 Somesh Jha 2 Abstract In this paper we study leveraging

More information

arxiv: v3 [cs.lg] 22 Mar 2018

arxiv: v3 [cs.lg] 22 Mar 2018 arxiv:1710.06081v3 [cs.lg] 22 Mar 2018 Boosting Adversarial Attacks with Momentum Yinpeng Dong1, Fangzhou Liao1, Tianyu Pang1, Hang Su1, Jun Zhu1, Xiaolin Hu1, Jianguo Li2 1 Department of Computer Science

More information

arxiv: v1 [stat.ml] 27 Nov 2018

arxiv: v1 [stat.ml] 27 Nov 2018 Robust Classification of Financial Risk arxiv:1811.11079v1 [stat.ml] 27 Nov 2018 Suproteem K. Sarkar suproteemsarkar@g.harvard.edu Daniel Giebisch * danielgiebisch@college.harvard.edu Abstract Kojin Oshiba

More information

arxiv: v3 [cs.lg] 8 Jun 2018

arxiv: v3 [cs.lg] 8 Jun 2018 Provable Defenses against Adversarial Examples via the Convex Outer Adversarial Polytope Eric Wong 1 J. Zico Kolter 2 arxiv:1711.00851v3 [cs.lg] 8 Jun 2018 Abstract We propose a method to learn deep ReLU-based

More information

ADVERSARIAL SPHERES ABSTRACT 1 INTRODUCTION. Workshop track - ICLR 2018

ADVERSARIAL SPHERES ABSTRACT 1 INTRODUCTION. Workshop track - ICLR 2018 ADVERSARIAL SPHERES Justin Gilmer, Luke Metz, Fartash Faghri, Samuel S. Schoenholz, Maithra Raghu, Martin Wattenberg, & Ian Goodfellow Google Brain {gilmer,lmetz,schsam,maithra,wattenberg,goodfellow}@google.com

More information

Characterizing and evaluating adversarial examples for Offline Handwritten Signature Verification

Characterizing and evaluating adversarial examples for Offline Handwritten Signature Verification 1 Characterizing and evaluating adversarial examples for Offline Handwritten Signature Verification Luiz G. Hafemann, Robert Sabourin, Member, IEEE, and Luiz S. Oliveira. arxiv:1901.03398v1 [cs.cv] 10

More information

EAD: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples

EAD: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples EAD: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples Pin-Yu Chen1, Yash Sharma2, Huan Zhang3, Jinfeng Yi4, Cho-Jui Hsieh3 1 AI Foundations Lab, IBM T. J. Watson Research Center, Yorktown

More information

ECE 595: Machine Learning I Adversarial Attack 1

ECE 595: Machine Learning I Adversarial Attack 1 ECE 595: Machine Learning I Adversarial Attack 1 Spring 2019 Stanley Chan School of Electrical and Computer Engineering Purdue University 1 / 32 Outline Examples of Adversarial Attack Basic Terminology

More information

arxiv: v2 [stat.ml] 20 Nov 2017

arxiv: v2 [stat.ml] 20 Nov 2017 : Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples Pin-Yu Chen1, Yash Sharma2, Huan Zhang3, Jinfeng Yi4, Cho-Jui Hsieh3 1 arxiv:1709.04114v2 [stat.ml] 20 Nov 2017 AI Foundations Lab,

More information

ECE 595: Machine Learning I Adversarial Attack 1

ECE 595: Machine Learning I Adversarial Attack 1 ECE 595: Machine Learning I Adversarial Attack 1 Spring 2019 Stanley Chan School of Electrical and Computer Engineering Purdue University 1 / 32 Outline Examples of Adversarial Attack Basic Terminology

More information

Is Robustness the Cost of Accuracy? A Comprehensive Study on the Robustness of 18 Deep Image Classification Models

Is Robustness the Cost of Accuracy? A Comprehensive Study on the Robustness of 18 Deep Image Classification Models Is Robustness the Cost of Accuracy? A Comprehensive Study on the Robustness of 18 Deep Image Classification Models Dong Su 1*, Huan Zhang 2*, Hongge Chen 3, Jinfeng Yi 4, Pin-Yu Chen 1, and Yupeng Gao

More information

9 Classification. 9.1 Linear Classifiers

9 Classification. 9.1 Linear Classifiers 9 Classification This topic returns to prediction. Unlike linear regression where we were predicting a numeric value, in this case we are predicting a class: winner or loser, yes or no, rich or poor, positive

More information

Linear models: the perceptron and closest centroid algorithms. D = {(x i,y i )} n i=1. x i 2 R d 9/3/13. Preliminaries. Chapter 1, 7.

Linear models: the perceptron and closest centroid algorithms. D = {(x i,y i )} n i=1. x i 2 R d 9/3/13. Preliminaries. Chapter 1, 7. Preliminaries Linear models: the perceptron and closest centroid algorithms Chapter 1, 7 Definition: The Euclidean dot product beteen to vectors is the expression d T x = i x i The dot product is also

More information

SSCNets A Selective Sobel Convolution-based Technique to Enhance the Robustness of Deep Neural Networks against Security Attacks

SSCNets A Selective Sobel Convolution-based Technique to Enhance the Robustness of Deep Neural Networks against Security Attacks A Selective Sobel Convolution-based Technique to Enhance the Robustness of Deep Neural Networks against Security Attacks Hammad Tariq*, Hassan Ali*, Muhammad Abdullah Hanif, Faiq Khalid, Semeen Rehman,

More information

arxiv: v1 [cs.lg] 30 Jan 2019

arxiv: v1 [cs.lg] 30 Jan 2019 A Simple Explanation for the Existence of Adversarial Examples with Small Hamming Distance Adi Shamir 1, Itay Safran 1, Eyal Ronen 2, and Orr Dunkelman 3 arxiv:1901.10861v1 [cs.lg] 30 Jan 2019 1 Computer

More information

Lecture 11. Linear Soft Margin Support Vector Machines

Lecture 11. Linear Soft Margin Support Vector Machines CS142: Machine Learning Spring 2017 Lecture 11 Instructor: Pedro Felzenszwalb Scribes: Dan Xiang, Tyler Dae Devlin Linear Soft Margin Support Vector Machines We continue our discussion of linear soft margin

More information

Large Margin Deep Networks for Classification

Large Margin Deep Networks for Classification Large Margin Deep Networks for Classification Gamaleldin F. Elsayed Google Research Dilip Krishnan Google Research Hossein Mobahi Google Research Kevin Regan Google Research Samy Bengio Google Research

More information

arxiv: v2 [cs.cr] 11 Sep 2017

arxiv: v2 [cs.cr] 11 Sep 2017 MagNet: a Two-Pronged Defense against Adversarial Examples arxiv:1705.09064v2 [cs.cr] 11 Sep 2017 ABSTRACT Dongyu Meng ShanghaiTech University mengdy@shanghaitech.edu.cn Deep learning has shown impressive

More information

arxiv: v1 [cs.cv] 27 Nov 2018

arxiv: v1 [cs.cv] 27 Nov 2018 Universal Adversarial Training arxiv:1811.11304v1 [cs.cv] 27 Nov 2018 Ali Shafahi ashafahi@cs.umd.edu Abstract Mahyar Najibi najibi@cs.umd.edu Larry S. Davis lsd@umiacs.umd.edu Standard adversarial attacks

More information

Machine Learning. Support Vector Machines. Fabio Vandin November 20, 2017

Machine Learning. Support Vector Machines. Fabio Vandin November 20, 2017 Machine Learning Support Vector Machines Fabio Vandin November 20, 2017 1 Classification and Margin Consider a classification problem with two classes: instance set X = R d label set Y = { 1, 1}. Training

More information

Adversarial Examples

Adversarial Examples Adversarial Examples presentation by Ian Goodfellow Deep Learning Summer School Montreal August 9, 2015 In this presentation. - Intriguing Properties of Neural Networks. Szegedy et al., ICLR 2014. - Explaining

More information

arxiv: v1 [math.oc] 7 Dec 2018

arxiv: v1 [math.oc] 7 Dec 2018 arxiv:1812.02878v1 [math.oc] 7 Dec 2018 Solving Non-Convex Non-Concave Min-Max Games Under Polyak- Lojasiewicz Condition Maziar Sanjabi, Meisam Razaviyayn, Jason D. Lee University of Southern California

More information

Negative Momentum for Improved Game Dynamics

Negative Momentum for Improved Game Dynamics Negative Momentum for Improved Game Dynamics Gauthier Gidel Reyhane Askari Hemmat Mohammad Pezeshki Gabriel Huang Rémi Lepriol Simon Lacoste-Julien Ioannis Mitliagkas Mila & DIRO, Université de Montréal

More information

arxiv: v2 [stat.ml] 4 Dec 2018

arxiv: v2 [stat.ml] 4 Dec 2018 Large Margin Deep Networks for Classification Gamaleldin F. Elsayed Google Research Dilip Krishnan Google Research Hossein Mobahi Google Research Kevin Regan Google Research arxiv:1803.05598v2 [stat.ml]

More information

Linear & nonlinear classifiers

Linear & nonlinear classifiers Linear & nonlinear classifiers Machine Learning Hamid Beigy Sharif University of Technology Fall 1394 Hamid Beigy (Sharif University of Technology) Linear & nonlinear classifiers Fall 1394 1 / 34 Table

More information

The Limitations of Model Uncertainty in Adversarial Settings

The Limitations of Model Uncertainty in Adversarial Settings The Limitations of Model Uncertainty in Adversarial Settings Kathrin Grosse 1, David Pfaff 1, Michael T. Smith 2 and Michael Backes 3 1 CISPA, Saarland Informatics Campus 2 University of Sheffield 3 CISPA

More information

CSE 417T: Introduction to Machine Learning. Lecture 11: Review. Henry Chai 10/02/18

CSE 417T: Introduction to Machine Learning. Lecture 11: Review. Henry Chai 10/02/18 CSE 417T: Introduction to Machine Learning Lecture 11: Review Henry Chai 10/02/18 Unknown Target Function!: # % Training data Formal Setup & = ( ), + ),, ( -, + - Learning Algorithm 2 Hypothesis Set H

More information

arxiv: v4 [cs.cr] 27 Jan 2018

arxiv: v4 [cs.cr] 27 Jan 2018 A Learning and Masking Approach to Secure Learning Linh Nguyen, Sky Wang, Arunesh Sinha University of Michigan, Ann Arbor {lvnguyen,skywang,arunesh}@umich.edu arxiv:1709.04447v4 [cs.cr] 27 Jan 2018 ABSTRACT

More information

arxiv: v1 [cs.lg] 4 Mar 2019

arxiv: v1 [cs.lg] 4 Mar 2019 A Fundamental Performance Limitation for Adversarial Classification Abed AlRahman Al Makdah, Vaibhav Katewa, and Fabio Pasqualetti arxiv:1903.01032v1 [cs.lg] 4 Mar 2019 Abstract Despite the widespread

More information

Adversarial Examples Generation and Defense Based on Generative Adversarial Network

Adversarial Examples Generation and Defense Based on Generative Adversarial Network Adversarial Examples Generation and Defense Based on Generative Adversarial Network Fei Xia (06082760), Ruishan Liu (06119690) December 15, 2016 1 Abstract We propose a novel generative adversarial network

More information

Machine Learning for NLP

Machine Learning for NLP Machine Learning for NLP Linear Models Joakim Nivre Uppsala University Department of Linguistics and Philology Slides adapted from Ryan McDonald, Google Research Machine Learning for NLP 1(26) Outline

More information

Preliminaries. Definition: The Euclidean dot product between two vectors is the expression. i=1

Preliminaries. Definition: The Euclidean dot product between two vectors is the expression. i=1 90 8 80 7 70 6 60 0 8/7/ Preliminaries Preliminaries Linear models and the perceptron algorithm Chapters, T x + b < 0 T x + b > 0 Definition: The Euclidean dot product beteen to vectors is the expression

More information

Towards stability and optimality in stochastic gradient descent

Towards stability and optimality in stochastic gradient descent Towards stability and optimality in stochastic gradient descent Panos Toulis, Dustin Tran and Edoardo M. Airoldi August 26, 2016 Discussion by Ikenna Odinaka Duke University Outline Introduction 1 Introduction

More information

Universal Adversarial Networks

Universal Adversarial Networks 1 Universal Adversarial Networks Jamie Hayes University College London j.hayes@cs.ucl.ac.uk Abstract Neural networks are known to be vulnerable to adversarial examples, inputs that have been intentionally

More information

Online Passive-Aggressive Algorithms

Online Passive-Aggressive Algorithms Online Passive-Aggressive Algorithms Koby Crammer Ofer Dekel Shai Shalev-Shwartz Yoram Singer School of Computer Science & Engineering The Hebrew University, Jerusalem 91904, Israel {kobics,oferd,shais,singer}@cs.huji.ac.il

More information

Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks

Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks Accepted to the 37th IEEE Symposium on Security & Privacy, IEEE 2016. San Jose, CA. Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks 1 Nicolas Papernot, Patrick McDaniel,

More information

Lecture 9: Large Margin Classifiers. Linear Support Vector Machines

Lecture 9: Large Margin Classifiers. Linear Support Vector Machines Lecture 9: Large Margin Classifiers. Linear Support Vector Machines Perceptrons Definition Perceptron learning rule Convergence Margin & max margin classifiers (Linear) support vector machines Formulation

More information

Importance Reweighting Using Adversarial-Collaborative Training

Importance Reweighting Using Adversarial-Collaborative Training Importance Reweighting Using Adversarial-Collaborative Training Yifan Wu yw4@andrew.cmu.edu Tianshu Ren tren@andrew.cmu.edu Lidan Mu lmu@andrew.cmu.edu Abstract We consider the problem of reweighting a

More information

Neural Networks. Nicholas Ruozzi University of Texas at Dallas

Neural Networks. Nicholas Ruozzi University of Texas at Dallas Neural Networks Nicholas Ruozzi University of Texas at Dallas Handwritten Digit Recognition Given a collection of handwritten digits and their corresponding labels, we d like to be able to correctly classify

More information

arxiv: v1 [cs.lg] 30 Nov 2018

arxiv: v1 [cs.lg] 30 Nov 2018 Adversarial Examples as an Input-Fault Tolerance Problem Angus Galloway 1,2, Anna Golubeva 3,4, and Graham W. Taylor 1,2 arxiv:1811.12601v1 [cs.lg] Nov 2018 1 School of Engineering, University of Guelph

More information

Formal Guarantees on the Robustness of a Classifier against Adversarial Manipulation

Formal Guarantees on the Robustness of a Classifier against Adversarial Manipulation Formal Guarantees on the Robustness of a Classifier against Adversarial Manipulation Matthias Hein and Maksym Andriushchenko Department of Mathematics and Computer Science Saarland University, Saarbrücken

More information

CSCI-567: Machine Learning (Spring 2019)

CSCI-567: Machine Learning (Spring 2019) CSCI-567: Machine Learning (Spring 2019) Prof. Victor Adamchik U of Southern California Mar. 19, 2019 March 19, 2019 1 / 43 Administration March 19, 2019 2 / 43 Administration TA3 is due this week March

More information

QuSecNets: Quantization-based Defense Mechanism for Securing Deep Neural Network against Adversarial Attacks

QuSecNets: Quantization-based Defense Mechanism for Securing Deep Neural Network against Adversarial Attacks QuSecNets: Quantization-based Defense Mechanism for Securing Deep Neural Network against Hassan Ali *, Hammad Tariq *, Muhammad Abdullah Hanif, Faiq Khalid, Semeen Rehman, Rehan Ahmed * and Muhammad Shafique

More information

Lecture 8. Instructor: Haipeng Luo

Lecture 8. Instructor: Haipeng Luo Lecture 8 Instructor: Haipeng Luo Boosting and AdaBoost In this lecture we discuss the connection between boosting and online learning. Boosting is not only one of the most fundamental theories in machine

More information

Adversarial Image Perturbation for Privacy Protection A Game Theory Perspective Supplementary Materials

Adversarial Image Perturbation for Privacy Protection A Game Theory Perspective Supplementary Materials Adversarial Image Perturbation for Privacy Protection A Game Theory Perspective Supplementary Materials 1. Contents The supplementary materials contain auxiliary experiments for the empirical analyses

More information

Robustness of classifiers: from adversarial to random noise

Robustness of classifiers: from adversarial to random noise Robustness of classifiers: from adversarial to random noise Alhussein Fawzi, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard École Polytechnique Fédérale de Lausanne Lausanne, Switzerland {alhussein.fawzi,

More information

Differentiable Abstract Interpretation for Provably Robust Neural Networks

Differentiable Abstract Interpretation for Provably Robust Neural Networks Matthew Mirman 1 Timon Gehr 1 Martin Vechev 1 Abstract We introduce a scalable method for training robust neural networks based on abstract interpretation. We present several abstract transformers which

More information

Mini-Course 1: SGD Escapes Saddle Points

Mini-Course 1: SGD Escapes Saddle Points Mini-Course 1: SGD Escapes Saddle Points Yang Yuan Computer Science Department Cornell University Gradient Descent (GD) Task: min x f (x) GD does iterative updates x t+1 = x t η t f (x t ) Gradient Descent

More information

Linear & nonlinear classifiers

Linear & nonlinear classifiers Linear & nonlinear classifiers Machine Learning Hamid Beigy Sharif University of Technology Fall 1396 Hamid Beigy (Sharif University of Technology) Linear & nonlinear classifiers Fall 1396 1 / 44 Table

More information

Distirbutional robustness, regularizing variance, and adversaries

Distirbutional robustness, regularizing variance, and adversaries Distirbutional robustness, regularizing variance, and adversaries John Duchi Based on joint work with Hongseok Namkoong and Aman Sinha Stanford University November 2017 Motivation We do not want machine-learned

More information

arxiv: v3 [cs.cv] 10 Sep 2018

arxiv: v3 [cs.cv] 10 Sep 2018 The Relationship Between High-Dimensional Geometry and Adversarial Examples arxiv:1801.02774v3 [cs.cv] 10 Sep 2018 Justin Gilmer, Luke Metz, Fartash Faghri, Samuel S. Schoenholz, Maithra Raghu, Martin

More information

PROVEN: Certifying Robustness of Neural Networks with a Probabilistic Approach

PROVEN: Certifying Robustness of Neural Networks with a Probabilistic Approach PROVEN: Certifying Robustness of Neural Networks with a Probabilistic Approach Tsui-Wei Weng 1, Pin-Yu Chen 2, Lam M. Nguyen 2, Mark S. Squillante 2, Ivan Oseledets 3, Luca Daniel 1 1 MIT, 2 IBM Research,

More information

Machine Learning Basics

Machine Learning Basics Security and Fairness of Deep Learning Machine Learning Basics Anupam Datta CMU Spring 2019 Image Classification Image Classification Image classification pipeline Input: A training set of N images, each

More information

UNIVERSALITY, ROBUSTNESS, AND DETECTABILITY

UNIVERSALITY, ROBUSTNESS, AND DETECTABILITY UNIVERSALITY, ROBUSTNESS, AND DETECTABILITY OF ADVERSARIAL PERTURBATIONS UNDER ADVER- SARIAL TRAINING Anonymous authors Paper under double-blind review ABSTRACT Classifiers such as deep neural networks

More information

what can deep learning learn from linear regression? Benjamin Recht University of California, Berkeley

what can deep learning learn from linear regression? Benjamin Recht University of California, Berkeley what can deep learning learn from linear regression? Benjamin Recht University of California, Berkeley Collaborators Joint work with Samy Bengio, Moritz Hardt, Michael Jordan, Jason Lee, Max Simchowitz,

More information

ENSEMBLE METHODS AS A DEFENSE TO ADVERSAR-

ENSEMBLE METHODS AS A DEFENSE TO ADVERSAR- ENSEMBLE METHODS AS A DEFENSE TO ADVERSAR- IAL PERTURBATIONS AGAINST DEEP NEURAL NET- WORKS Anonymous authors Paper under double-blind review ABSTRACT Deep learning has become the state of the art approach

More information

Lecture Support Vector Machine (SVM) Classifiers

Lecture Support Vector Machine (SVM) Classifiers Introduction to Machine Learning Lecturer: Amir Globerson Lecture 6 Fall Semester Scribe: Yishay Mansour 6.1 Support Vector Machine (SVM) Classifiers Classification is one of the most important tasks in

More information

arxiv: v4 [math.oc] 5 Jan 2016

arxiv: v4 [math.oc] 5 Jan 2016 Restarted SGD: Beating SGD without Smoothness and/or Strong Convexity arxiv:151.03107v4 [math.oc] 5 Jan 016 Tianbao Yang, Qihang Lin Department of Computer Science Department of Management Sciences The

More information

Comments. Assignment 3 code released. Thought questions 3 due this week. Mini-project: hopefully you have started. implement classification algorithms

Comments. Assignment 3 code released. Thought questions 3 due this week. Mini-project: hopefully you have started. implement classification algorithms Neural networks Comments Assignment 3 code released implement classification algorithms use kernels for census dataset Thought questions 3 due this week Mini-project: hopefully you have started 2 Example:

More information

Linear smoother. ŷ = S y. where s ij = s ij (x) e.g. s ij = diag(l i (x))

Linear smoother. ŷ = S y. where s ij = s ij (x) e.g. s ij = diag(l i (x)) Linear smoother ŷ = S y where s ij = s ij (x) e.g. s ij = diag(l i (x)) 2 Online Learning: LMS and Perceptrons Partially adapted from slides by Ryan Gabbard and Mitch Marcus (and lots original slides by

More information

Discriminative Direction for Kernel Classifiers

Discriminative Direction for Kernel Classifiers Discriminative Direction for Kernel Classifiers Polina Golland Artificial Intelligence Lab Massachusetts Institute of Technology Cambridge, MA 02139 polina@ai.mit.edu Abstract In many scientific and engineering

More information

Optimization and Gradient Descent

Optimization and Gradient Descent Optimization and Gradient Descent INFO-4604, Applied Machine Learning University of Colorado Boulder September 12, 2017 Prof. Michael Paul Prediction Functions Remember: a prediction function is the function

More information

Based on the original slides of Hung-yi Lee

Based on the original slides of Hung-yi Lee Based on the original slides of Hung-yi Lee Google Trends Deep learning obtains many exciting results. Can contribute to new Smart Services in the Context of the Internet of Things (IoT). IoT Services

More information

Constrained Optimization and Lagrangian Duality

Constrained Optimization and Lagrangian Duality CIS 520: Machine Learning Oct 02, 2017 Constrained Optimization and Lagrangian Duality Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the lecture. They may or may

More information

From Binary to Multiclass Classification. CS 6961: Structured Prediction Spring 2018

From Binary to Multiclass Classification. CS 6961: Structured Prediction Spring 2018 From Binary to Multiclass Classification CS 6961: Structured Prediction Spring 2018 1 So far: Binary Classification We have seen linear models Learning algorithms Perceptron SVM Logistic Regression Prediction

More information

arxiv: v4 [cs.lg] 28 Mar 2016

arxiv: v4 [cs.lg] 28 Mar 2016 Analysis of classifiers robustness to adversarial perturbations Alhussein Fawzi Omar Fawzi Pascal Frossard arxiv:0.090v [cs.lg] 8 Mar 06 Abstract The goal of this paper is to analyze an intriguing phenomenon

More information

Predictive analysis on Multivariate, Time Series datasets using Shapelets

Predictive analysis on Multivariate, Time Series datasets using Shapelets 1 Predictive analysis on Multivariate, Time Series datasets using Shapelets Hemal Thakkar Department of Computer Science, Stanford University hemal@stanford.edu hemal.tt@gmail.com Abstract Multivariate,

More information

Need for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels

Need for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels Need for Deep Networks Perceptron Can only model linear functions Kernel Machines Non-linearity provided by kernels Need to design appropriate kernels (possibly selecting from a set, i.e. kernel learning)

More information

Multiclass Classification-1

Multiclass Classification-1 CS 446 Machine Learning Fall 2016 Oct 27, 2016 Multiclass Classification Professor: Dan Roth Scribe: C. Cheng Overview Binary to multiclass Multiclass SVM Constraint classification 1 Introduction Multiclass

More information

Deep Neural Networks: From Flat Minima to Numerically Nonvacuous Generalization Bounds via PAC-Bayes

Deep Neural Networks: From Flat Minima to Numerically Nonvacuous Generalization Bounds via PAC-Bayes Deep Neural Networks: From Flat Minima to Numerically Nonvacuous Generalization Bounds via PAC-Bayes Daniel M. Roy University of Toronto; Vector Institute Joint work with Gintarė K. Džiugaitė University

More information

Linear classifiers Lecture 3

Linear classifiers Lecture 3 Linear classifiers Lecture 3 David Sontag New York University Slides adapted from Luke Zettlemoyer, Vibhav Gogate, and Carlos Guestrin ML Methodology Data: labeled instances, e.g. emails marked spam/ham

More information

Discriminative Learning can Succeed where Generative Learning Fails

Discriminative Learning can Succeed where Generative Learning Fails Discriminative Learning can Succeed where Generative Learning Fails Philip M. Long, a Rocco A. Servedio, b,,1 Hans Ulrich Simon c a Google, Mountain View, CA, USA b Columbia University, New York, New York,

More information

Characterization of Gradient Dominance and Regularity Conditions for Neural Networks

Characterization of Gradient Dominance and Regularity Conditions for Neural Networks Characterization of Gradient Dominance and Regularity Conditions for Neural Networks Yi Zhou Ohio State University Yingbin Liang Ohio State University Abstract zhou.1172@osu.edu liang.889@osu.edu The past

More information

Certified Robustness to Adversarial Examples with Differential Privacy

Certified Robustness to Adversarial Examples with Differential Privacy Certified Robustness to Adversarial Examples with Differential Privacy Mathias Lecuyer, Vaggelis Atlidakis, Roxana Geambasu, Daniel Hsu, and Suman Jana Columbia University Abstract Adversarial examples

More information

Adversarial Attacks, Regression, and Numerical Stability Regularization

Adversarial Attacks, Regression, and Numerical Stability Regularization Adversarial Attacks, Regression, and Numerical Stability Regularization Andre T. Nguyen and Edward Raff {Nguyen_Andre, Raff_Edward}@bah.com Booz Allen Hamilton Abstract Adversarial attacks against neural

More information

Adversarial Noise Attacks of Deep Learning Architectures Stability Analysis via Sparse Modeled Signals

Adversarial Noise Attacks of Deep Learning Architectures Stability Analysis via Sparse Modeled Signals Noname manuscript No. (will be inserted by the editor) Adversarial Noise Attacks of Deep Learning Architectures Stability Analysis via Sparse Modeled Signals Yaniv Romano Aviad Aberdam Jeremias Sulam Michael

More information

Variations of Logistic Regression with Stochastic Gradient Descent

Variations of Logistic Regression with Stochastic Gradient Descent Variations of Logistic Regression with Stochastic Gradient Descent Panqu Wang(pawang@ucsd.edu) Phuc Xuan Nguyen(pxn002@ucsd.edu) January 26, 2012 Abstract In this paper, we extend the traditional logistic

More information

Recurrent Neural Networks with Flexible Gates using Kernel Activation Functions

Recurrent Neural Networks with Flexible Gates using Kernel Activation Functions 2018 IEEE International Workshop on Machine Learning for Signal Processing (MLSP 18) Recurrent Neural Networks with Flexible Gates using Kernel Activation Functions Authors: S. Scardapane, S. Van Vaerenbergh,

More information

Geometric View of Machine Learning Nearest Neighbor Classification. Slides adapted from Prof. Carpuat

Geometric View of Machine Learning Nearest Neighbor Classification. Slides adapted from Prof. Carpuat Geometric View of Machine Learning Nearest Neighbor Classification Slides adapted from Prof. Carpuat What we know so far Decision Trees What is a decision tree, and how to induce it from data Fundamental

More information

Classification of Hand-Written Digits Using Scattering Convolutional Network

Classification of Hand-Written Digits Using Scattering Convolutional Network Mid-year Progress Report Classification of Hand-Written Digits Using Scattering Convolutional Network Dongmian Zou Advisor: Professor Radu Balan Co-Advisor: Dr. Maneesh Singh (SRI) Background Overview

More information

Comparison of Modern Stochastic Optimization Algorithms

Comparison of Modern Stochastic Optimization Algorithms Comparison of Modern Stochastic Optimization Algorithms George Papamakarios December 214 Abstract Gradient-based optimization methods are popular in machine learning applications. In large-scale problems,

More information

Cutting Plane Training of Structural SVM

Cutting Plane Training of Structural SVM Cutting Plane Training of Structural SVM Seth Neel University of Pennsylvania sethneel@wharton.upenn.edu September 28, 2017 Seth Neel (Penn) Short title September 28, 2017 1 / 33 Overview Structural SVMs

More information

LEARNING WITH A STRONG ADVERSARY

LEARNING WITH A STRONG ADVERSARY LEARNING WITH A STRONG ADVERSARY Ruitong Huang, Bing Xu, Dale Schuurmans and Csaba Szepesvári Department of Computer Science University of Alberta Edmonton, AB T6G 2E8, Canada {ruitong,bx3,daes,szepesva}@ualberta.ca

More information

Learning Kernel Parameters by using Class Separability Measure

Learning Kernel Parameters by using Class Separability Measure Learning Kernel Parameters by using Class Separability Measure Lei Wang, Kap Luk Chan School of Electrical and Electronic Engineering Nanyang Technological University Singapore, 3979 E-mail: P 3733@ntu.edu.sg,eklchan@ntu.edu.sg

More information

DEEP neural networks (DNNs) [2] have been widely

DEEP neural networks (DNNs) [2] have been widely IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, MANUSCRIPT ID 1 Detecting Adversarial Image Examples in Deep Neural Networks with Adaptive Noise Reduction Bin Liang, Hongcheng Li, Miaoqiang Su, Xirong

More information

Mark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a brief explanation.

Mark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a brief explanation. CS 189 Spring 2015 Introduction to Machine Learning Midterm You have 80 minutes for the exam. The exam is closed book, closed notes except your one-page crib sheet. No calculators or electronic items.

More information

Large Scale Semi-supervised Linear SVM with Stochastic Gradient Descent

Large Scale Semi-supervised Linear SVM with Stochastic Gradient Descent Journal of Computational Information Systems 9: 15 (2013) 6251 6258 Available at http://www.jofcis.com Large Scale Semi-supervised Linear SVM with Stochastic Gradient Descent Xin ZHOU, Conghui ZHU, Sheng

More information