arxiv: v1 [cs.lg] 6 Dec 2018
|
|
- Howard Charles
- 5 years ago
- Views:
Transcription
1 Max-Margin Adversarial (MMA) Training: Direct Input Space Margin Maximization through Adversarial Training Gavin Weiguang Ding, Yash Sharma, Kry Yik Chau Lui, and Ruitong Huang Borealis AI arxiv: v1 [cs.lg] 6 Dec 2018 Abstract We propose Max-Margin Adversarial (MMA) training for directly maximizing the input space margin. This margin maximization is direct, in the sense that the margin s gradient w.r.t. model parameters can be shown to be parallel with the loss gradient at the minimal length perturbation, thus gradient ascent on margins can be performed by gradient descent on losses. We further propose a specific formulation of MMA training to maximize the average margin of training examples in order to train models that are robust to adversarial perturbations. It is implemented by performing adversarial training on a novel adaptive norm projected gradient descent (AN-PGD) attack. Preliminary experimental results demonstrate that our method outperforms the existing state of the art methods. In particular, testing against both whitebox and transfer projected gradient descent attacks on MNIST, our trained model improves the SOTA l ɛ = 0.3 robust accuracy by 2%, while maintaining the SOTA clean accuracy. Furthermore, the same model provides, to the best of our knowledge, the first model that is robust at l ɛ = 0.4, with a robust accuracy of 86.51%. 1 Introduction Despite their impressive performance on various learning tasks, neural networks have been shown to be vulnerable. An otherwise highly accurate network can be completely fooled by an artificially constructed perturbation imperceptible to human perception, known as the adversarial attack (Szegedy et al., 2013; Biggio et al., 2013). Not surprisingly, numerous algorithms in defending adversarial attacks have already been proposed in the literature which, arguably, can be interpreted as different ways in increasing the margins, i.e. the smallest distance from the sample point to the decision boundary induced by the network. Obviously, adversarial robustness is equivalent to large margins. One type of the algorithms is to use regularization in the learning to enforce the Lipschitz constant of the network (Cisse et al., 2017; Ross and Doshi-Velez, 2017; Hein and Andriushchenko, 2017; Tsuzuku et al., 2018), thus a small loss sample point would have a large margin since the loss cannot increase too fast. If the Lipschitz constant is regularized on data points, it is usually too local and not accurate in a neighborhood; if it is controlled globally, the constraint on the model is often too strong that it harms accuracy. So far, such methods seem not able to achieve very robust models. There are also efforts using first-order approximation to estimate and maximize input space margin (Elsayed et al., 2018; Sokolic et al., 2017; Matyasko and Chau, 2017). Similarly to local Lipschitz regularization, the reliance on local information might not provide accurate margin estimation and efficient maximization. Indeed, many defending methods have been broken Preprint. Work in progress. 1
2 Figure 1: Illustration of decision boundary, minimal length perturbation, margin and etc. after their proposal (Carlini and Wagner, 2016, 2017a; Athalye et al., 2018). Adversarial training (Szegedy et al., 2013; Huang et al., 2015; Madry et al., 2017) is one of the few defense methods that has stood the test of time, and in fact, represents the current SOTA results on several datasets including MNIST. However, as illustrated in Section 1.2, the update in adversarial training does not necessarily increase the margins, which in return diminishes the effectiveness of adversarial training. In this paper, we propose to directly maximize the decision boundary on each training sample, called Max-Margin Adversarial (MMA) training, to achieve the greatest possible robustness. Interestingly, we show that our method, MMA training, has a close relationship with adversarial training. More specifically, we prove in this paper that the margin s gradient w.r.t. model parameter is parallel with the loss gradient at the minimal length perturbation. Such property makes SGD possible for the MMA objective, despite the model parameter is involved in the constraints. We further show that MMA training is an improvement over adversarial training, in the sense that it picks the right perturbation magnitude for adversarial training. We test our algorithms on MNIST and CIFAR10, demonstrating that our method outperforms adversarial training. Moreover, MMA training achieves 86.51% robust accuracy on MNIST against the l ɛ = 0.4 PGD attack. To our best knowledge, this is the first robust model in the literature that can still maintain good robustness against such a strong attack. 1.1 Notations We focus on K-class classification problem in this paper, and assume that the output of the model is a score function f θ (x) = (f 1 (x),..., f K (x)) which assign score f i (x) to the ith class. The predicted label of x is then decided by arg max i f i (x). We use the logit margin loss 1 L θ (x) = max j y f j (x) f y (x). Note that when L θ (x) < 0, the classification is correct, and when L θ (x) > 0, the classification is wrong. Given a model f θ and a data pair z = (x, y), one can compute the minimal length perturbation, δ by δ = arg min δ s.t. L θ (x + δ, y) 0, (1) δ as illustrated in Figure 1. Denote the margin of example z by d θ (z) = δ, which is the (minimal) distance from the input x to the decision boundary L θ (z) = 0 in the input space. Note that Eq. (1) is also compatible with misclassified samples, as for a sample z that is misclassified by f, the optimal δ would be 0, hence d θ (z) = 0. Throughout this paper, we use the single word margin to denote input space margin. For other notions of margin, we will use specific phrases, e.g. output space 1 Since the scores (f 1(x),..., f K(x)) output by f are also called logits in neural network literature. 2
3 Figure 2: Fixed-norm adversarial training does not necessarily improve robustness. margin or logit space margin. All the proofs are postponed to Appendix A. 1.2 Adversarial Training Does NOT Necessarily Increase Margin We use an imaginary example in this section to illustrate that one update step of adversarial training does not necessarily increase margin. Adversarial training (Huang et al., 2015) is based on a minmax formulation as min max L θ(x + δ, y) (2) θ δ ɛ where the worst-case loss is minimized under a fixed perturbation magnitude ɛ. Consider a onedimensional example in Figure 2, where the input example x is a scalar. We perturb x in the positive direction with perturbation δ. We assume L(δ) = L θ (x + δ, y) is monotonically on increasing on δ, which means that larger perturbation is always stronger. Let L 0 ( ) denote the original function before an update step, and δ0 denote the corresponding margin (minimal length perturbation). Next, an update is made to the parameter of L 0 ( ), such that L 0 (ɛ) is reduced. After this update, we imagine two different possibilities of the updated loss functions, L α ( ) in blue dotted-dash line, and L β ( ) in red dash line. L α (ɛ) and L β (ɛ) have the same decreased value at ɛ. However, we can see that L α ( ) has an increased margin δα > δ0, and L β( ) has a decreased margin δβ < δ 0. The second case is undesired since it is less robust as the margin decreases. 2 Max-Margin Adversarial Training This section is devoted to the presentation of our method, MMA training. Before presenting it, we still need some preparation. We calculated the gradient of the margin in Section 2.1. Our method MMA training is presented in Section 2.2. Lastly in Section 2.3, we discussed the relationship between MMA training and adversarial training. 3
4 2.1 Gradients of the Margins In this section, we show how to calculate gradients of the margins, d θ (z), on individual examples, as a preparation for presenting MMA training. Computing such gradients for a linear model is relatively easy due to the existence of its closed-form solution, e.g. SVM. But it is not clear for general functions such as neural networks. Recall that d θ (z) = δ = min L(δ,θ) 0 δ. It is easy to see that for a wrongly classified example z, δ is achieved at 0 and thus θ d θ (z) = 0. Therefore in this section, we focus on correctly classified data examples. Denote the Lagrangian as L(δ, λ), where L(δ, λ) = δ + λl(δ, θ). Also for a fixed θ, denote the optimizers of δ and λ by δ and λ. The following theorem provides an efficient way to compute θ d θ (z). The exact implementation of our algorithm is presented in Section 2.2. Theorem 2.1. Assume that ɛ(δ) = δ and L(δ, θ) are C 2 functions almost everywhere 2. Also assume that the matrix M is full rank almost everywhere, where Then, M = ( 2 ɛ(δ ) δ 2 + λ 2 L(δ,θ) L(δ,θ) δ δ 2 L(δ,θ) δ θ d θ (z) L(δ, θ). θ Remark 2.1. The condition on the matrix M is serving as a technical condition to guarantee that the implicit function theorem is applicable and thus the gradient can be computed. 0 ). 2.2 Max-Margin Adversarial Training Motivated by the observation in Section 1.2, MMA training directly uses margins in its objective: max d θ (z i ) J θ (z j ) θ, (3) i S + θ where J θ ( ) is a regular classification loss function, e.g. cross-entropy loss, which can be different from L θ ( ). S + θ = {i : L θ (z i ) < 0} is the set of all correctly classified examples, and S θ = {i : L θ (z i ) 0} is the set of wrongly classified examples. Intuitively, MMA training tries to minimize classification loss on wrongly classified examples in S θ, until they are correctly classified. For already correctly classified examples, MMA training tries to maximize the margin d θ (z i ). Note that we do not maximize margins on wrongly classified examples, since the gradient of their margins are 0, as we discussed in Section 2.1. Given the Eq. (3) and Theorem 2.1, we can implement MMA training using SGD. It is straightforward to calculate θ J θ (z j ). Based on Theorem 2.1, we can calculate the θ d θ (z i ) as the gradient of the loss at the minimal length perturbation δ. Finding the exact δ is intractable in general settings 3. Here we propose an algorithm to give an approximate solution of δ, the Adaptive Norm Projective Gradient Descent Attack (AN-PGD). In AN-PGD, we apply PGD (Madry et al., 2017) twice to find an approximation. In particular, if we know the true margin ɛ = δ of a data point x in advance, we can use PGD with length ɛ, j S θ 2 All the l p norm with p 0, including p =, satisfies this condition. 3 Exceptions exist, for example linear SVM. 4
5 Algorithm 1 Adaptive Norm Projected Gradient Descent Attack. Inputs: (x, y) is the data example. ɛ init is the initial norm constraint used in the first PGD attack. Outputs: δ is the approximate minimal length perturbation. Parameters: ɛ inc is the increment of perturbation length. ɛ max is the maximum perturbation length. PGD(x, y, ɛ) represents PGD (Madry et al., 2017) perturbation δ with maximum perturbation length ɛ. 1: Adversarial example δ 1 = PGD(x, y, ɛ init ) 2: Unit perturbation δ unit = δ 1 δ 1 3: if prediction on x + δ 1 is correct then 4: ɛ new = arg min η L(x + ηδ unit, y), η [ɛ init, min{ɛ init + ɛ inc, ɛ max }] 5: else 6: ɛ new = arg min η L(x + ηδ unit, y), η [0, ɛ init ) 7: end if 8: δ = PGD(x, y, ɛ new ) PGD(x, y, ɛ ), to find an approximation of δ. Since we do not know ɛ, we need to approximate it first. Therefore, we first perform a PGD attack on x with a fixed perturbation length ɛ 0 to get δ 1, then we search along the direction of δ 1 to find a scaled perturbation that gives L = 0, whose length is used to approximate ɛ. Algorithm 1 describes the details of the AN-PGD. 4 With the proposed AN-PGD attack, we can implement MMA training easily as described in Algorithm 2. Note that AN-PGD here only serves as an algorithm to give an approximate solution of δ, and it can be decoupled from the rest parts of MMA training. Other attacks that can serve a similar purpose can also fit into our MMA training framework, for example, the Carlini-Wagner attack (Carlini and Wagner, 2017b) 5, and the Decoupled Direction and Norm (DDN) attack (Rony et al., 2018) which is concurrent with our work. 2.3 Relation to Adversarial Training As suggested by Theorem 2.1, MMA training is closely related to adversarial training. In this section, we prove that MMA training is actually an improvement on adversarial training, in the sense that it picks the right perturbation magnitude ɛ for adversarial training. Given a sample z = (x, y) and a fixed length ɛ, recall that adversarial training learns a model f by minimizing the adversarial loss, min J θ (ɛ) = min max L θ(x + δ, y) θ θ δ ɛ For simplicity, we denote L θ (x + δ, y) by L θ (δ) in this section. The next theorem shows that the optimal f that maximizes the margin around the decision boundary also minimizes the adversarial loss given the perturbation magnitude being the right margin. Theorem 2.2. Assume that l L θ (0) Range(L θ ), let θ MMA = arg max θ min Lθ (δ) l δ. Also let ɛ = min LθMMA (δ) l δ, and θ adv = arg min θ max δ ɛ L θ (δ), then θ MMA also minimizes the adversarial loss, i.e. max L δ ɛ θ MMA (δ) = max L δ ɛ θ adv (δ). 4 The zero-crossing binary search is denoted as arg min η L(η) for brevity. 5 Carlini-Wagner attack specifically might suffer from computational issues due to the very high number of iterations it requires. 5
6 Algorithm 2 Max-Margin Adversarial Training. Inputs: The training set {(x i, y i )}. Outputs: the trained neural network model f θ ( ). Parameters: ɛ contains perturbation lengths of training data. ɛ inc is the increment of perturbation length. ɛ min is the minimum perturbation length. ɛ max is the maximum perturbation length. A(x, y, ɛ init ) represents the approximate minimal length perturbation returned by the algorithm A on the data example (x, y) and at the initial norm ɛ init. 1: Randomly initialize the parameter θ of model f, and initialize every element of ɛ as ɛ min 2: repeat 3: Read minibatch B = {(x 1, y 1 ),..., (x m, y m )} from the training set 4: Make predictions on B and into two: wrongly predicted B 0 and correctly predicted B 1 5: for each (x i, y i ) in B 1 do 6: Retrieve perturbation length ɛ i from ɛ 7: δ i = A(x i, y i, ɛ i ) 8: Update the ɛ i in ɛ as δ i, and put (x i + δ i, y i) into B adv 1 9: end for 10: Calculate the gradient of margin on B adv 1 and the gradient of classification loss on B 0 11: Perform one step gradient step update on θ 12: until training converged or maximum number of steps reached Note that an analogy of Theorem 2.2, that the optimal f that minimizes the adversarial loss also maximizes the margin given the corresponding threshold l, doesn t hold due to the reason that picking a perturbation magnitude ɛ may be overshooting the margin. The next theorem shows that if ɛ is picked tightly, then the above claim is true. Therefore, MMA training can also be interpreted as a way of improving adversarial training by finding the proper ɛ for each training sample, which is exactly the flavor of Algorithm 2. Theorem 2.3. Given ɛ 0, let θ adv = arg min θ max δ ɛ L θ (δ). Also let l = max δ ɛ L θadv (δ), and θ MMA = arg max θ min Lθ (δ) l δ. Further assume that min δ = ɛ, L θadv (δ) l then θ adv also maximizes the margin distance, i.e. (No overshooting assumption) min δ = min δ. L θadv (δ) l L θmma (δ) l Remark 2.2. The above theorems only focus on the case of one single sample, and they cannot be directly extended to the whole training set. In practice, the behaviour of the algorithm could be much more complicated. 3 Preliminary Results on Adversarial Robustness We test the max-margin adversarial (MMA) training on the MNIST and CIFAR10 datasets under l and l 2 perturbations. We compare our models with PGD trained models (Madry et al., 2017) which represent the SOTA robust models in the literature 6. MMA trained algorithm outperforms PGD trained models, and particularly, MMA training is the first method in the literature, to our best knowledge, that can defend l PGD attacks with length 0.4 on MNIST. 6 All the models are trained and tested under the same type of norm. 6
7 (a) MNIST l (b) MNIST l 2 (c) CIFAR10 l (d) CIFAR10 l 2 Figure 3: Robust accuracy vs perturbation length for MNIST and CIFAR10 under l and l 2 perturbations MNIST under l perturbation We follow a similar test protocol to Madry et al. (2017) using the l PGD attack. Our baseline is the PGD trained model with ɛ = Figure 3a shows the robust accuracies under white-box PGD l perturbations with different norm constraints, all with the number of iterations equals 100. We can see that the MMA trained model is better than the PGD trained model at all ɛ s. When ɛ > 0.3, the MMA trained model is significantly better than PGD trained model and is able to achieve good robustness for very large ɛ s, e.g To our best knowledge, this is so far the only algorithm that can defend l PGD attacks with length 0.4 in the literature. We further perform extensive additional experiments to attack the MMA trained model. We test our models with 50 random restarts under both white-box PGD attacks and transfer attacks generated by attacking a PGD model and a standardly trained model. Note that we always pick the strongest attack among random restarts. We also test using 40 iterations under the transfer attack case to avoid the situation where the attacks overfit to the source model. Table 1 shows all the results. Under these considered attacks, when ɛ = 0.3, MMA model yields a 2% robust accuracy improvement over the PGD model (92.86% vs 90.65%). When ɛ = 0.4, MMA training still achieves robust accuracy at 86.51%, while the PGD model is completely cracked. Notice that the strongest attack on MMA trained model is the transfer attack generated from PGD model. This could indicate that the white-box robustness of the MMA model is partially achieved due to a strong degree of flatness in the loss surface about the input data point. Despite this, however, the MMA model still manages to defend against the various transfer attacks tested. We are continuing to benchmark MMA models in ongoing work. 7 We have also attempted to train PGD models with ɛ = 0.4, but training didn t converge. 7
8 Another notable phenomenon is that MMA trained model is even robust to white-box l PGD attack with ɛ = 0.5, when the PGD is performed on the MMA trained model itself. It has the robust accuracy 89.76% under a single attack, and 70.30% under 50 random starts. This is a surprising phenomenon in the sense that a well-crafted perturbation at ɛ = 0.5 should decrease the accuracy to approximately 10%, where a gray image with constant pixel value 0.5 can already achieve. However, as shown in Table 1, when being attacked by transfer attacks from the standardly trained model, the MMA trained model s robust accuracy can be reduced to 10.96%. These observations combined could suggest that, when ɛ is large, PGD attack gets stuck in poor local minima on the MMA trained models, and we are further investigating this interesting phenomenon. MNIST under l 2 perturbation We compare our MMA trained model with models trained on fixed length PGD perturbations at ɛ = 1, 3 and 5, as shown in Figure 3a. We benchmark using the boundary attack (Brendel et al., 2017), as we observe that it is stronger than the PGD l 2 attack, which is consistent with the observation in Anonymous (2019). We can see that MMA model is not always better than the best PGD model at a given perturbation length, but it provides similar performance compared with the best PGD model at all ɛ s. Also, the MMA model outperforms PGD models on large perturbations, e.g. ɛ = 5, even when the PGD model is specifically trained on perturbations with that distortion level. Having robustness against the gradient-free boundary attack, to some degree, also indicates that, the l 2 robustness of the MMA trained model is likely not purely due to flatness in the loss surface about the input data point. CIFAR10 For l perturbation, we compare the MMA model with models trained with PGD ɛ = 8/255 and ɛ = 16/255, 8 as shown in Figure 3c. For l 2 perturbation, we compare the MMA trained model with models trained with PGD ɛ = 0.5 and ɛ = 1.5, 9 as shown in Figure 3d. For both the l and l 2 cases, when MMA models have similar clean accuracy to PGD models (l 2 : PGD ɛ = 0.5 vs MMA, l : PGD ɛ = 8/255 vs MMA), they have similar robust accuracy for small perturbation lengths, and MMA training outperforms PGD training when perturbation length becomes larger. It is possible that models trained with larger PGD attack are more robust at a certain band of ɛ, but this is at the cost of losing clean accuracy. 4 Other Related Works Several other related works also exist in the literature, as reviewed in this section. First-order Large Margin Previous works (Elsayed et al., 2018; Sokolic et al., 2017; Matyasko and Chau, 2017) have attempted to use first-order approximation to estimate the input space margin. For first-order methods, the margin will be accurately estimated when the classification function is linear. MMA s margin estimation is exact when the minimal length perturbation δ can be solved, which is not only satisfied by linear models, but also by a broader range of models, e.g. models that are convex w.r.t. input x. This relaxed condition could potentially enable more accurate margin estimation which improves MMA training s performance. 8 Training with l ɛ = 32/255 did not converge. 9 Training with l 2 ɛ = 4.5 did not converge. 8
9 (Cross-)Lipschitz Regularization On the other spectrum, Tsuzuku et al. (2018) enlarges their margin by controlling the global Lipschitz constant, which in return places a strong constraint on the model and harms its learning capabilities. Instead, our method, alike adversarial training, uses adversarial attacks to estimate the margin to the decision boundary. With a strong method, our estimate is much more precise in the neighborhood around the data point, while being much more flexible due to not relying on a global Lipschitz constraint. Hard-Margin SVM (Vapnik, 2013) in the separable case Assuming that all the training examples are correctly classified and using our notations on general classifiers, the hard-margin SVM objective can be written as: { } arg max min d θ (z i ) s.t. L θ (z i ) < 0, i. (4) θ i On the other hand, under the same separable and correct assumptions, MMA training formulation in Eq. (3) can be written as { } arg max θ d θ (z i ) s.t. L θ (z i ) < 0, i, (5) i which is maximizing the average margin rather than the minimum margin in SVM. Note that the theorem on gradient calculation of the margin in Section 2.1 also applies to the SVM formulation of differentiable functions. Because of this, we can also use SGD to solve the following SVM-style formulation, which we delay to future work: max θ min i S + θ d θ (z i ) j S θ J θ (z j ). (6) 5 Conclusion In this paper, we propose max-margin adversarial (MMA) training. We show that the input space margin s gradient w.r.t. the model parameter can be calculated with the loss gradient on the minimal length perturbation. This enables gradient descent on objectives that involves input space margin. We propose a specific formulation of max-margin adversarial training to maximize the average margin motivated by improving adversarial robustness. Our preliminary results on MNIST and CIFAR10 show that MMA training is very promising for training robust neural networks, and notably, the MMA training model achieves decent robustness (86.51%) against l PGD attack with ɛ = 0.4 which, to our best knowledge, is first time shown in the literature. References Anonymous (2019). Towards the first adversarially robust neural network model on mnist. In Submitted to International Conference on Learning Representations. under review. 9
10 Athalye, A., Carlini, N., and Wagner, D. (2018). Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. arxiv preprint arxiv: Biggio, B., Corona, I., Maiorca, D., Nelson, B., Šrndić, N., Laskov, P., Giacinto, G., and Roli, F. (2013). Evasion attacks against machine learning at test time. In Joint European conference on machine learning and knowledge discovery in databases, pages Springer. Brendel, W., Rauber, J., and Bethge, M. (2017). Decision-based adversarial attacks: Reliable attacks against black-box machine learning models. arxiv preprint arxiv: Carlini, N. and Wagner, D. (2016). Defensive distillation is not robust to adversarial examples. arxiv preprint arxiv: Carlini, N. and Wagner, D. (2017a). Adversarial examples are not easily detected: Bypassing ten detection methods. arxiv preprint arxiv: Carlini, N. and Wagner, D. (2017b). Towards evaluating the robustness of neural networks. In Security and Privacy (SP), 2017 IEEE Symposium on, pages IEEE. Cisse, M., Bojanowski, P., Grave, E., Dauphin, Y., and Usunier, N. (2017). Parseval networks: Improving robustness to adversarial examples. In International Conference on Machine Learning, pages Elsayed, G. F., Krishnan, D., Mobahi, H., Regan, K., and Bengio, S. (2018). Large margin deep networks for classification. arxiv preprint arxiv: Hein, M. and Andriushchenko, M. (2017). Formal guarantees on the robustness of a classifier against adversarial manipulation. arxiv preprint arxiv: Huang, R., Xu, B., Schuurmans, D., and Szepesvári, C. (2015). Learning with a strong adversary. arxiv preprint arxiv: Madry, A., Makelov, A., Schmidt, L., Tsipras, D., and Vladu, A. (2017). Towards deep learning models resistant to adversarial attacks. arxiv preprint arxiv: Matyasko, A. and Chau, L.-P. (2017). Margin maximization for robust classification using deep learning. In Neural Networks (IJCNN), 2017 International Joint Conference on, pages IEEE. Rony, J., Hafemann, L. G., Oliveira, L. S., Ayed, I. B., Sabourin, R., and Granger, E. (2018). Decoupling direction and norm for efficient gradient-based l2 adversarial attacks and defenses. arxiv preprint arxiv: Ross, A. S. and Doshi-Velez, F. (2017). Improving the adversarial robustness and interpretability of deep neural networks by regularizing their input gradients. arxiv preprint arxiv: Sokolic, J., Giryes, R., Sapiro, G., and Rodrigues, M. R. (2017). Robust large margin deep neural networks. IEEE Transactions on Signal Processing. Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., and Fergus, R. (2013). Intriguing properties of neural networks. arxiv preprint arxiv: Tsuzuku, Y., Sato, I., and Sugiyama, M. (2018). Lipschitz-margin training: Scalable certification of perturbation invariance for deep neural networks. arxiv preprint arxiv: Vapnik, V. (2013). The nature of statistical learning theory. Springer science & business media. 10
11 Table 1: MNIST: Performance of the adversarially trained models against different l adversaries for ɛ = 0.3, 0.4, 0.5. For each ɛ, we show the most successful attack with bold. The source networks used for the attack are: the MMA trained model (MMA) (white-box attack for MMA, transfer attack for PGD), the PGD trained model (PGD) (white-box attack for PGD, transfer attack for MMA), the standardly trained model (STD) (transfer attack for both MMA and PGD). Attack Method ɛ Steps Restarts Source Model Accuracy MMA Model Accuracy PGD Model Natural % 98.63% PGD MMA 96.64% 97.99% PGD MMA 93.64% 96.71% FGSM PGD 95.5% 94.62% PGD PGD 94.65% 92.40% PGD PGD 92.86% 90.65% PGD PGD 94.65% 92.37% PGD PGD 92.87% 90.64% FGSM STD 95.46% 93.45% PGD STD 95.86% 93.97% PGD STD 94.38% 92.53% PGD STD 95.86% 93.97% PGD STD 94.41% 92.55% PGD MMA 95.33% 96.14% PGD MMA 89.99% 89.74% FGSM PGD 94.16% 88.18% PGD PGD 91.18% 9.89% PGD PGD 86.69% 2.83% PGD PGD 91.16% 8.72% PGD PGD 86.51% 2.42% FGSM STD 91.51% 38.23% PGD STD 93.21% 15.38% PGD STD 89.12% 5.23% PGD STD 93.18% 15.17% PGD STD 89.14% 5.07% PGD MMA 89.76% 60.36% PGD MMA 70.30% 26.02% FGSM PGD 88.39% 51.67% PGD PGD 48.05% 0.00% PGD PGD 15.50% 0.00% PGD PGD 45.75% 0.00% PGD PGD 14.20% 0.00% FGSM STD 32.96% 4.45% PGD STD 34.05% 0.00% PGD STD 10.96% 0.00% PGD STD 33.66% 0.00% PGD STD 11.01% 0.00% 11
12 A Proofs A.1 Proof of theorem 2.1 Proof. Recall that z = (x, y) and ɛ(δ) = δ. Here we compute the gradient for d θ (z) in its general form. Consider the following optimization problem: d θ (z) = min δ (θ) ɛ(δ), where (θ) = {δ : L θ (x + δ, y) = 0}, ɛ and g are both C 2 functions 10. Denotes its Lagrangian by L(δ, λ), where L(δ, λ) = ɛ(δ) + λl θ (x + δ, y) For a fixed θ, the optimizer δ and λ must satisfy the first-order conditions (FOC) ɛ(δ) + λ L θ(x + δ, y) δ δ = 0, (7) δ=δ,λ=λ Put the FOC equations in vector form, G((δ, λ), θ) = L θ (x + δ, y) δ=δ = 0. ( ɛ(δ) δ + λ L θ(x+δ,y) δ L θ (x + δ, y) ) δ=δ,λ=λ = 0. Note that G is C 1 continuously differentiable since ɛ and g are C 2 functions. Furthermore, the Jacobian matrix of G w.r.t (δ, λ) is ( ) 2 ɛ(δ ) (δ,λ) G((δ, λ + λ 2 g(θ,δ ) g(θ,δ ) ), θ) = δ 2 δ 2 δ 0 g(θ,δ ) δ which by assumption is full rank. Therefore, by the implicit function theorem, δ and λ can be expressed as a function of θ, denoted by δ (θ) and λ (θ). To further compute θ d θ (z), note that d θ (z) = ɛ(δ (θ)). Thus, θ d θ (z) = ɛ(δ ) δ (θ) δ θ = λ g(θ, δ ) δ δ (θ), (8) θ where the second equality is by Eq. (7). The implicit function theorem also provides a way of computing δ (θ) θ which is complicated involving taking inverse of the matrix (δ,λ) G((δ, λ ), θ). Here we present a relatively simple way to compute this gradient. Note that Differentiate with w.r.t. θ on both sides: Combining Eq. (8) and Eq. (9), g(θ, δ ) θ g(θ, δ (θ)) = 0. + g(θ, δ ) δ (θ) = 0. (9) δ θ θ d θ (z) = λ (θ) g(θ, δ ). (10) θ 10 Note that a simple application of Danskin s theorem would not be valid as the constraint set (θ) depends on the parameter θ. 12
13 A.2 A lemma for later proofs The following lemma helps relate the objective of adversarial training with that of our MMA training. Here, we denote L θ (x + δ, y) as L θ (δ) for simplicity. Lemma A.1. Given (x, y) and θ, assume that L θ (δ) is continuous in δ, then for ɛ 0, and l L θ (0) Range(L θ ), it holds that min δ = ɛ = max L θ(δ) = l ; (11) L θ (δ) l δ ɛ max L θ(δ) = l = δ ɛ min δ ɛ. (12) L θ (δ) l Proof. Eq. (11). We prove this by contradiction. Suppose max δ ɛ L θ (δ) > l. When ɛ = 0, this violates our asssumption l L θ (0) in the theorem. So assume ɛ > 0. Since L θ (δ) is a continuous function defined on a compact set, the maximum is attained by δ such that δ ɛ and L θ ( δ) > l. Note that L θ is continuous and l L θ (0), then there exists δ 0, δ i.e. the line segment connecting 0 and δ, such that δ < ɛ and L θ ( δ) = l. This follows from the intermediate value theorem by restricting L θ (δ) onto 0, δ. This contradicts min Lθ (δ) l δ = ɛ. If max δ ɛ L θ (δ) < l, then {δ : δ ɛ} {δ : L θ (δ) < l}. Every point p {δ : δ ɛ} is in the open set {δ : L θ (δ) < l}, there exists an open ball with some radius r p centered at p such that B rp {δ : L θ (δ) < l}. This forms an open cover for {δ : δ ɛ}. Since {δ : δ ɛ} is compact, there is an open finite subcover U ɛ such that: {δ : δ ɛ} U ɛ {δ : L θ (δ) < l}. Sicne U ɛ is finite, there exists h > 0 such that {δ : δ ɛ + h} {δ : L θ (δ) < l}. Thus {δ : L θ (δ) l} {δ : δ > ɛ + h}, contradicting min Lθ (δ) l δ = ɛ again. Eq. (12). Assume that min Lθ (δ) l δ > ɛ, then {δ : L θ (δ) l} {δ : δ > ɛ}. Taking complementary set of both sides, {δ : δ ɛ} {δ : L θ (δ) < l}. Therefore, by the compactness of {δ : δ ɛ}, max δ ɛ L θ (δ) < l, contradiction. A.3 Proof of theorem 2.2 Proof. By the definition of θ adv, We further prove it is not possible that max L δ ɛ θ MMA (δ) max L δ ɛ θ adv (δ). max L δ ɛ θ MMA (δ) > max L δ ɛ θ adv (δ). If so, by the first implication in Lemma A.1 and the definition of ɛ, we have l = max δ ɛ L θ MMA (δ) > max δ ɛ L θ adv (δ), and thus with similar arguments to the proof of Lemma A.1, contradicting the definition of θ MMA. Therefore, min δ > L θadv (δ) l ɛ = δ (θ MMA ), max δ ɛ L θ MMA (δ) = max δ ɛ L θ adv (δ). 13
14 A.4 Proof of Theorem 2.3 Proof. By the definition of θ MMA, We further prove it is not possible that If so, by the no overshooting assumption, min δ min δ. L θmma (δ) l L θadv (δ) l min δ > min δ. L θmma (δ) l L θadv (δ) l min δ > ɛ. L θmma (δ) l and thus with similar arguments to the proof of the second implication in Lemma A.1, contradicting the definition of θ adv. Therefore, max δ ɛ L θ MMA (δ) < l, max L δ ɛ θ MMA (δ) = max L δ ɛ θ adv (δ). 14
Max-Margin Adversarial (MMA) Training: Direct Input Space Margin Maximization through Adversarial Training
Max-Margin Adversarial (MMA) Training: Direct Input Space Margin Maximization through Adversarial Training Gavin Weiguang Ding, Yash Sharma, Kry Yik Chau Lui, and Ruitong Huang Borealis AI arxiv:1812.02637v2
More informationAdversarially Robust Optimization and Generalization
Adversarially Robust Optimization and Generalization Ludwig Schmidt MIT UC Berkeley Based on joint works with Logan ngstrom (MIT), Aleksander Madry (MIT), Aleksandar Makelov (MIT), Dimitris Tsipras (MIT),
More informationON THE SENSITIVITY OF ADVERSARIAL ROBUSTNESS
ON THE SENSITIVITY OF ADVERSARIAL ROBUSTNESS TO INPUT DATA DISTRIBUTIONS Anonymous authors Paper under double-blind review ABSTRACT Neural networks are vulnerable to small adversarial perturbations. Existing
More informationTowards ML You Can Rely On. Aleksander Mądry
Towards ML You Can Rely On Aleksander Mądry @aleks_madry madry-lab.ml Machine Learning: The Success Story? Image classification Reinforcement Learning Machine translation Machine Learning: The Success
More informationMeasuring the Robustness of Neural Networks via Minimal Adversarial Examples
Measuring the Robustness of Neural Networks via Minimal Adversarial Examples Sumanth Dathathri sdathath@caltech.edu Stephan Zheng stephan@caltech.edu Sicun Gao sicung@ucsd.edu Richard M. Murray murray@cds.caltech.edu
More informationarxiv: v1 [cs.lg] 30 Oct 2018
On the Effectiveness of Interval Bound Propagation for Training Verifiably Robust Models arxiv:1810.12715v1 [cs.lg] 30 Oct 2018 Sven Gowal sgowal@google.com Rudy Bunel University of Oxford rudy@robots.ox.ac.uk
More informationReinforcing Adversarial Robustness using Model Confidence Induced by Adversarial Training
Reinforcing Adversarial Robustness using Model Confidence Induced by Adversarial Training Xi Wu * 1 Uyeong Jang * 2 Jiefeng Chen 2 Lingjiao Chen 2 Somesh Jha 2 Abstract In this paper we study leveraging
More informationarxiv: v1 [stat.ml] 15 Mar 2018
Large Margin Deep Networks for Classification arxiv:1803.05598v1 [stat.ml] 15 Mar 2018 Gamaleldin F. Elsayed Dilip Krishnan Hossein Mobahi Kevin Regan Samy Bengio Google Research {gamaleldin,dilipkay,hmobahi,kevinregan,bengio}@google.com
More informationarxiv: v1 [cs.lg] 24 Jan 2019
Cross-Entropy Loss and Low-Rank Features Have Responsibility for Adversarial Examples Kamil Nar Orhan Ocal S. Shankar Sastry Kannan Ramchandran arxiv:90.08360v [cs.lg] 24 Jan 209 Abstract State-of-the-art
More informationNotes on Adversarial Examples
Notes on Adversarial Examples David Meyer dmm@{1-4-5.net,uoregon.edu,...} March 14, 2017 1 Introduction The surprising discovery of adversarial examples by Szegedy et al. [6] has led to new ways of thinking
More informationTOWARDS DEEP LEARNING MODELS RESISTANT TO ADVERSARIAL ATTACKS
Published as a conference paper at ICLR 8 TOWARDS DEEP LEARNING MODELS RESISTANT TO ADVERSARIAL ATTACKS Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, Adrian Vladu Department of
More informationLimitations of the Lipschitz constant as a defense against adversarial examples
Limitations of the Lipschitz constant as a defense against adversarial examples Todd Huster, Cho-Yu Jason Chiang, and Ritu Chadha Perspecta Labs, Basking Ridge, NJ 07920, USA. thuster@perspectalabs.com
More informationarxiv: v1 [cs.lg] 15 Nov 2017 ABSTRACT
THE BEST DEFENSE IS A GOOD OFFENSE: COUNTERING BLACK BOX ATTACKS BY PREDICTING SLIGHTLY WRONG LABELS Yannic Kilcher Department of Computer Science ETH Zurich yannic.kilcher@inf.ethz.ch Thomas Hofmann Department
More informationReinforcing Adversarial Robustness using Model Confidence Induced by Adversarial Training
Reinforcing Adversarial Robustness using Model Confidence Induced by Adversarial Training Xi Wu * 1 Uyeong Jang * 2 Jiefeng Chen 2 Lingjiao Chen 2 Somesh Jha 2 Abstract In this paper we study leveraging
More informationarxiv: v3 [cs.lg] 22 Mar 2018
arxiv:1710.06081v3 [cs.lg] 22 Mar 2018 Boosting Adversarial Attacks with Momentum Yinpeng Dong1, Fangzhou Liao1, Tianyu Pang1, Hang Su1, Jun Zhu1, Xiaolin Hu1, Jianguo Li2 1 Department of Computer Science
More informationarxiv: v1 [stat.ml] 27 Nov 2018
Robust Classification of Financial Risk arxiv:1811.11079v1 [stat.ml] 27 Nov 2018 Suproteem K. Sarkar suproteemsarkar@g.harvard.edu Daniel Giebisch * danielgiebisch@college.harvard.edu Abstract Kojin Oshiba
More informationarxiv: v3 [cs.lg] 8 Jun 2018
Provable Defenses against Adversarial Examples via the Convex Outer Adversarial Polytope Eric Wong 1 J. Zico Kolter 2 arxiv:1711.00851v3 [cs.lg] 8 Jun 2018 Abstract We propose a method to learn deep ReLU-based
More informationADVERSARIAL SPHERES ABSTRACT 1 INTRODUCTION. Workshop track - ICLR 2018
ADVERSARIAL SPHERES Justin Gilmer, Luke Metz, Fartash Faghri, Samuel S. Schoenholz, Maithra Raghu, Martin Wattenberg, & Ian Goodfellow Google Brain {gilmer,lmetz,schsam,maithra,wattenberg,goodfellow}@google.com
More informationCharacterizing and evaluating adversarial examples for Offline Handwritten Signature Verification
1 Characterizing and evaluating adversarial examples for Offline Handwritten Signature Verification Luiz G. Hafemann, Robert Sabourin, Member, IEEE, and Luiz S. Oliveira. arxiv:1901.03398v1 [cs.cv] 10
More informationEAD: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples
EAD: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples Pin-Yu Chen1, Yash Sharma2, Huan Zhang3, Jinfeng Yi4, Cho-Jui Hsieh3 1 AI Foundations Lab, IBM T. J. Watson Research Center, Yorktown
More informationECE 595: Machine Learning I Adversarial Attack 1
ECE 595: Machine Learning I Adversarial Attack 1 Spring 2019 Stanley Chan School of Electrical and Computer Engineering Purdue University 1 / 32 Outline Examples of Adversarial Attack Basic Terminology
More informationarxiv: v2 [stat.ml] 20 Nov 2017
: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples Pin-Yu Chen1, Yash Sharma2, Huan Zhang3, Jinfeng Yi4, Cho-Jui Hsieh3 1 arxiv:1709.04114v2 [stat.ml] 20 Nov 2017 AI Foundations Lab,
More informationECE 595: Machine Learning I Adversarial Attack 1
ECE 595: Machine Learning I Adversarial Attack 1 Spring 2019 Stanley Chan School of Electrical and Computer Engineering Purdue University 1 / 32 Outline Examples of Adversarial Attack Basic Terminology
More informationIs Robustness the Cost of Accuracy? A Comprehensive Study on the Robustness of 18 Deep Image Classification Models
Is Robustness the Cost of Accuracy? A Comprehensive Study on the Robustness of 18 Deep Image Classification Models Dong Su 1*, Huan Zhang 2*, Hongge Chen 3, Jinfeng Yi 4, Pin-Yu Chen 1, and Yupeng Gao
More information9 Classification. 9.1 Linear Classifiers
9 Classification This topic returns to prediction. Unlike linear regression where we were predicting a numeric value, in this case we are predicting a class: winner or loser, yes or no, rich or poor, positive
More informationLinear models: the perceptron and closest centroid algorithms. D = {(x i,y i )} n i=1. x i 2 R d 9/3/13. Preliminaries. Chapter 1, 7.
Preliminaries Linear models: the perceptron and closest centroid algorithms Chapter 1, 7 Definition: The Euclidean dot product beteen to vectors is the expression d T x = i x i The dot product is also
More informationSSCNets A Selective Sobel Convolution-based Technique to Enhance the Robustness of Deep Neural Networks against Security Attacks
A Selective Sobel Convolution-based Technique to Enhance the Robustness of Deep Neural Networks against Security Attacks Hammad Tariq*, Hassan Ali*, Muhammad Abdullah Hanif, Faiq Khalid, Semeen Rehman,
More informationarxiv: v1 [cs.lg] 30 Jan 2019
A Simple Explanation for the Existence of Adversarial Examples with Small Hamming Distance Adi Shamir 1, Itay Safran 1, Eyal Ronen 2, and Orr Dunkelman 3 arxiv:1901.10861v1 [cs.lg] 30 Jan 2019 1 Computer
More informationLecture 11. Linear Soft Margin Support Vector Machines
CS142: Machine Learning Spring 2017 Lecture 11 Instructor: Pedro Felzenszwalb Scribes: Dan Xiang, Tyler Dae Devlin Linear Soft Margin Support Vector Machines We continue our discussion of linear soft margin
More informationLarge Margin Deep Networks for Classification
Large Margin Deep Networks for Classification Gamaleldin F. Elsayed Google Research Dilip Krishnan Google Research Hossein Mobahi Google Research Kevin Regan Google Research Samy Bengio Google Research
More informationarxiv: v2 [cs.cr] 11 Sep 2017
MagNet: a Two-Pronged Defense against Adversarial Examples arxiv:1705.09064v2 [cs.cr] 11 Sep 2017 ABSTRACT Dongyu Meng ShanghaiTech University mengdy@shanghaitech.edu.cn Deep learning has shown impressive
More informationarxiv: v1 [cs.cv] 27 Nov 2018
Universal Adversarial Training arxiv:1811.11304v1 [cs.cv] 27 Nov 2018 Ali Shafahi ashafahi@cs.umd.edu Abstract Mahyar Najibi najibi@cs.umd.edu Larry S. Davis lsd@umiacs.umd.edu Standard adversarial attacks
More informationMachine Learning. Support Vector Machines. Fabio Vandin November 20, 2017
Machine Learning Support Vector Machines Fabio Vandin November 20, 2017 1 Classification and Margin Consider a classification problem with two classes: instance set X = R d label set Y = { 1, 1}. Training
More informationAdversarial Examples
Adversarial Examples presentation by Ian Goodfellow Deep Learning Summer School Montreal August 9, 2015 In this presentation. - Intriguing Properties of Neural Networks. Szegedy et al., ICLR 2014. - Explaining
More informationarxiv: v1 [math.oc] 7 Dec 2018
arxiv:1812.02878v1 [math.oc] 7 Dec 2018 Solving Non-Convex Non-Concave Min-Max Games Under Polyak- Lojasiewicz Condition Maziar Sanjabi, Meisam Razaviyayn, Jason D. Lee University of Southern California
More informationNegative Momentum for Improved Game Dynamics
Negative Momentum for Improved Game Dynamics Gauthier Gidel Reyhane Askari Hemmat Mohammad Pezeshki Gabriel Huang Rémi Lepriol Simon Lacoste-Julien Ioannis Mitliagkas Mila & DIRO, Université de Montréal
More informationarxiv: v2 [stat.ml] 4 Dec 2018
Large Margin Deep Networks for Classification Gamaleldin F. Elsayed Google Research Dilip Krishnan Google Research Hossein Mobahi Google Research Kevin Regan Google Research arxiv:1803.05598v2 [stat.ml]
More informationLinear & nonlinear classifiers
Linear & nonlinear classifiers Machine Learning Hamid Beigy Sharif University of Technology Fall 1394 Hamid Beigy (Sharif University of Technology) Linear & nonlinear classifiers Fall 1394 1 / 34 Table
More informationThe Limitations of Model Uncertainty in Adversarial Settings
The Limitations of Model Uncertainty in Adversarial Settings Kathrin Grosse 1, David Pfaff 1, Michael T. Smith 2 and Michael Backes 3 1 CISPA, Saarland Informatics Campus 2 University of Sheffield 3 CISPA
More informationCSE 417T: Introduction to Machine Learning. Lecture 11: Review. Henry Chai 10/02/18
CSE 417T: Introduction to Machine Learning Lecture 11: Review Henry Chai 10/02/18 Unknown Target Function!: # % Training data Formal Setup & = ( ), + ),, ( -, + - Learning Algorithm 2 Hypothesis Set H
More informationarxiv: v4 [cs.cr] 27 Jan 2018
A Learning and Masking Approach to Secure Learning Linh Nguyen, Sky Wang, Arunesh Sinha University of Michigan, Ann Arbor {lvnguyen,skywang,arunesh}@umich.edu arxiv:1709.04447v4 [cs.cr] 27 Jan 2018 ABSTRACT
More informationarxiv: v1 [cs.lg] 4 Mar 2019
A Fundamental Performance Limitation for Adversarial Classification Abed AlRahman Al Makdah, Vaibhav Katewa, and Fabio Pasqualetti arxiv:1903.01032v1 [cs.lg] 4 Mar 2019 Abstract Despite the widespread
More informationAdversarial Examples Generation and Defense Based on Generative Adversarial Network
Adversarial Examples Generation and Defense Based on Generative Adversarial Network Fei Xia (06082760), Ruishan Liu (06119690) December 15, 2016 1 Abstract We propose a novel generative adversarial network
More informationMachine Learning for NLP
Machine Learning for NLP Linear Models Joakim Nivre Uppsala University Department of Linguistics and Philology Slides adapted from Ryan McDonald, Google Research Machine Learning for NLP 1(26) Outline
More informationPreliminaries. Definition: The Euclidean dot product between two vectors is the expression. i=1
90 8 80 7 70 6 60 0 8/7/ Preliminaries Preliminaries Linear models and the perceptron algorithm Chapters, T x + b < 0 T x + b > 0 Definition: The Euclidean dot product beteen to vectors is the expression
More informationTowards stability and optimality in stochastic gradient descent
Towards stability and optimality in stochastic gradient descent Panos Toulis, Dustin Tran and Edoardo M. Airoldi August 26, 2016 Discussion by Ikenna Odinaka Duke University Outline Introduction 1 Introduction
More informationUniversal Adversarial Networks
1 Universal Adversarial Networks Jamie Hayes University College London j.hayes@cs.ucl.ac.uk Abstract Neural networks are known to be vulnerable to adversarial examples, inputs that have been intentionally
More informationOnline Passive-Aggressive Algorithms
Online Passive-Aggressive Algorithms Koby Crammer Ofer Dekel Shai Shalev-Shwartz Yoram Singer School of Computer Science & Engineering The Hebrew University, Jerusalem 91904, Israel {kobics,oferd,shais,singer}@cs.huji.ac.il
More informationDistillation as a Defense to Adversarial Perturbations against Deep Neural Networks
Accepted to the 37th IEEE Symposium on Security & Privacy, IEEE 2016. San Jose, CA. Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks 1 Nicolas Papernot, Patrick McDaniel,
More informationLecture 9: Large Margin Classifiers. Linear Support Vector Machines
Lecture 9: Large Margin Classifiers. Linear Support Vector Machines Perceptrons Definition Perceptron learning rule Convergence Margin & max margin classifiers (Linear) support vector machines Formulation
More informationImportance Reweighting Using Adversarial-Collaborative Training
Importance Reweighting Using Adversarial-Collaborative Training Yifan Wu yw4@andrew.cmu.edu Tianshu Ren tren@andrew.cmu.edu Lidan Mu lmu@andrew.cmu.edu Abstract We consider the problem of reweighting a
More informationNeural Networks. Nicholas Ruozzi University of Texas at Dallas
Neural Networks Nicholas Ruozzi University of Texas at Dallas Handwritten Digit Recognition Given a collection of handwritten digits and their corresponding labels, we d like to be able to correctly classify
More informationarxiv: v1 [cs.lg] 30 Nov 2018
Adversarial Examples as an Input-Fault Tolerance Problem Angus Galloway 1,2, Anna Golubeva 3,4, and Graham W. Taylor 1,2 arxiv:1811.12601v1 [cs.lg] Nov 2018 1 School of Engineering, University of Guelph
More informationFormal Guarantees on the Robustness of a Classifier against Adversarial Manipulation
Formal Guarantees on the Robustness of a Classifier against Adversarial Manipulation Matthias Hein and Maksym Andriushchenko Department of Mathematics and Computer Science Saarland University, Saarbrücken
More informationCSCI-567: Machine Learning (Spring 2019)
CSCI-567: Machine Learning (Spring 2019) Prof. Victor Adamchik U of Southern California Mar. 19, 2019 March 19, 2019 1 / 43 Administration March 19, 2019 2 / 43 Administration TA3 is due this week March
More informationQuSecNets: Quantization-based Defense Mechanism for Securing Deep Neural Network against Adversarial Attacks
QuSecNets: Quantization-based Defense Mechanism for Securing Deep Neural Network against Hassan Ali *, Hammad Tariq *, Muhammad Abdullah Hanif, Faiq Khalid, Semeen Rehman, Rehan Ahmed * and Muhammad Shafique
More informationLecture 8. Instructor: Haipeng Luo
Lecture 8 Instructor: Haipeng Luo Boosting and AdaBoost In this lecture we discuss the connection between boosting and online learning. Boosting is not only one of the most fundamental theories in machine
More informationAdversarial Image Perturbation for Privacy Protection A Game Theory Perspective Supplementary Materials
Adversarial Image Perturbation for Privacy Protection A Game Theory Perspective Supplementary Materials 1. Contents The supplementary materials contain auxiliary experiments for the empirical analyses
More informationRobustness of classifiers: from adversarial to random noise
Robustness of classifiers: from adversarial to random noise Alhussein Fawzi, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard École Polytechnique Fédérale de Lausanne Lausanne, Switzerland {alhussein.fawzi,
More informationDifferentiable Abstract Interpretation for Provably Robust Neural Networks
Matthew Mirman 1 Timon Gehr 1 Martin Vechev 1 Abstract We introduce a scalable method for training robust neural networks based on abstract interpretation. We present several abstract transformers which
More informationMini-Course 1: SGD Escapes Saddle Points
Mini-Course 1: SGD Escapes Saddle Points Yang Yuan Computer Science Department Cornell University Gradient Descent (GD) Task: min x f (x) GD does iterative updates x t+1 = x t η t f (x t ) Gradient Descent
More informationLinear & nonlinear classifiers
Linear & nonlinear classifiers Machine Learning Hamid Beigy Sharif University of Technology Fall 1396 Hamid Beigy (Sharif University of Technology) Linear & nonlinear classifiers Fall 1396 1 / 44 Table
More informationDistirbutional robustness, regularizing variance, and adversaries
Distirbutional robustness, regularizing variance, and adversaries John Duchi Based on joint work with Hongseok Namkoong and Aman Sinha Stanford University November 2017 Motivation We do not want machine-learned
More informationarxiv: v3 [cs.cv] 10 Sep 2018
The Relationship Between High-Dimensional Geometry and Adversarial Examples arxiv:1801.02774v3 [cs.cv] 10 Sep 2018 Justin Gilmer, Luke Metz, Fartash Faghri, Samuel S. Schoenholz, Maithra Raghu, Martin
More informationPROVEN: Certifying Robustness of Neural Networks with a Probabilistic Approach
PROVEN: Certifying Robustness of Neural Networks with a Probabilistic Approach Tsui-Wei Weng 1, Pin-Yu Chen 2, Lam M. Nguyen 2, Mark S. Squillante 2, Ivan Oseledets 3, Luca Daniel 1 1 MIT, 2 IBM Research,
More informationMachine Learning Basics
Security and Fairness of Deep Learning Machine Learning Basics Anupam Datta CMU Spring 2019 Image Classification Image Classification Image classification pipeline Input: A training set of N images, each
More informationUNIVERSALITY, ROBUSTNESS, AND DETECTABILITY
UNIVERSALITY, ROBUSTNESS, AND DETECTABILITY OF ADVERSARIAL PERTURBATIONS UNDER ADVER- SARIAL TRAINING Anonymous authors Paper under double-blind review ABSTRACT Classifiers such as deep neural networks
More informationwhat can deep learning learn from linear regression? Benjamin Recht University of California, Berkeley
what can deep learning learn from linear regression? Benjamin Recht University of California, Berkeley Collaborators Joint work with Samy Bengio, Moritz Hardt, Michael Jordan, Jason Lee, Max Simchowitz,
More informationENSEMBLE METHODS AS A DEFENSE TO ADVERSAR-
ENSEMBLE METHODS AS A DEFENSE TO ADVERSAR- IAL PERTURBATIONS AGAINST DEEP NEURAL NET- WORKS Anonymous authors Paper under double-blind review ABSTRACT Deep learning has become the state of the art approach
More informationLecture Support Vector Machine (SVM) Classifiers
Introduction to Machine Learning Lecturer: Amir Globerson Lecture 6 Fall Semester Scribe: Yishay Mansour 6.1 Support Vector Machine (SVM) Classifiers Classification is one of the most important tasks in
More informationarxiv: v4 [math.oc] 5 Jan 2016
Restarted SGD: Beating SGD without Smoothness and/or Strong Convexity arxiv:151.03107v4 [math.oc] 5 Jan 016 Tianbao Yang, Qihang Lin Department of Computer Science Department of Management Sciences The
More informationComments. Assignment 3 code released. Thought questions 3 due this week. Mini-project: hopefully you have started. implement classification algorithms
Neural networks Comments Assignment 3 code released implement classification algorithms use kernels for census dataset Thought questions 3 due this week Mini-project: hopefully you have started 2 Example:
More informationLinear smoother. ŷ = S y. where s ij = s ij (x) e.g. s ij = diag(l i (x))
Linear smoother ŷ = S y where s ij = s ij (x) e.g. s ij = diag(l i (x)) 2 Online Learning: LMS and Perceptrons Partially adapted from slides by Ryan Gabbard and Mitch Marcus (and lots original slides by
More informationDiscriminative Direction for Kernel Classifiers
Discriminative Direction for Kernel Classifiers Polina Golland Artificial Intelligence Lab Massachusetts Institute of Technology Cambridge, MA 02139 polina@ai.mit.edu Abstract In many scientific and engineering
More informationOptimization and Gradient Descent
Optimization and Gradient Descent INFO-4604, Applied Machine Learning University of Colorado Boulder September 12, 2017 Prof. Michael Paul Prediction Functions Remember: a prediction function is the function
More informationBased on the original slides of Hung-yi Lee
Based on the original slides of Hung-yi Lee Google Trends Deep learning obtains many exciting results. Can contribute to new Smart Services in the Context of the Internet of Things (IoT). IoT Services
More informationConstrained Optimization and Lagrangian Duality
CIS 520: Machine Learning Oct 02, 2017 Constrained Optimization and Lagrangian Duality Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the lecture. They may or may
More informationFrom Binary to Multiclass Classification. CS 6961: Structured Prediction Spring 2018
From Binary to Multiclass Classification CS 6961: Structured Prediction Spring 2018 1 So far: Binary Classification We have seen linear models Learning algorithms Perceptron SVM Logistic Regression Prediction
More informationarxiv: v4 [cs.lg] 28 Mar 2016
Analysis of classifiers robustness to adversarial perturbations Alhussein Fawzi Omar Fawzi Pascal Frossard arxiv:0.090v [cs.lg] 8 Mar 06 Abstract The goal of this paper is to analyze an intriguing phenomenon
More informationPredictive analysis on Multivariate, Time Series datasets using Shapelets
1 Predictive analysis on Multivariate, Time Series datasets using Shapelets Hemal Thakkar Department of Computer Science, Stanford University hemal@stanford.edu hemal.tt@gmail.com Abstract Multivariate,
More informationNeed for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels
Need for Deep Networks Perceptron Can only model linear functions Kernel Machines Non-linearity provided by kernels Need to design appropriate kernels (possibly selecting from a set, i.e. kernel learning)
More informationMulticlass Classification-1
CS 446 Machine Learning Fall 2016 Oct 27, 2016 Multiclass Classification Professor: Dan Roth Scribe: C. Cheng Overview Binary to multiclass Multiclass SVM Constraint classification 1 Introduction Multiclass
More informationDeep Neural Networks: From Flat Minima to Numerically Nonvacuous Generalization Bounds via PAC-Bayes
Deep Neural Networks: From Flat Minima to Numerically Nonvacuous Generalization Bounds via PAC-Bayes Daniel M. Roy University of Toronto; Vector Institute Joint work with Gintarė K. Džiugaitė University
More informationLinear classifiers Lecture 3
Linear classifiers Lecture 3 David Sontag New York University Slides adapted from Luke Zettlemoyer, Vibhav Gogate, and Carlos Guestrin ML Methodology Data: labeled instances, e.g. emails marked spam/ham
More informationDiscriminative Learning can Succeed where Generative Learning Fails
Discriminative Learning can Succeed where Generative Learning Fails Philip M. Long, a Rocco A. Servedio, b,,1 Hans Ulrich Simon c a Google, Mountain View, CA, USA b Columbia University, New York, New York,
More informationCharacterization of Gradient Dominance and Regularity Conditions for Neural Networks
Characterization of Gradient Dominance and Regularity Conditions for Neural Networks Yi Zhou Ohio State University Yingbin Liang Ohio State University Abstract zhou.1172@osu.edu liang.889@osu.edu The past
More informationCertified Robustness to Adversarial Examples with Differential Privacy
Certified Robustness to Adversarial Examples with Differential Privacy Mathias Lecuyer, Vaggelis Atlidakis, Roxana Geambasu, Daniel Hsu, and Suman Jana Columbia University Abstract Adversarial examples
More informationAdversarial Attacks, Regression, and Numerical Stability Regularization
Adversarial Attacks, Regression, and Numerical Stability Regularization Andre T. Nguyen and Edward Raff {Nguyen_Andre, Raff_Edward}@bah.com Booz Allen Hamilton Abstract Adversarial attacks against neural
More informationAdversarial Noise Attacks of Deep Learning Architectures Stability Analysis via Sparse Modeled Signals
Noname manuscript No. (will be inserted by the editor) Adversarial Noise Attacks of Deep Learning Architectures Stability Analysis via Sparse Modeled Signals Yaniv Romano Aviad Aberdam Jeremias Sulam Michael
More informationVariations of Logistic Regression with Stochastic Gradient Descent
Variations of Logistic Regression with Stochastic Gradient Descent Panqu Wang(pawang@ucsd.edu) Phuc Xuan Nguyen(pxn002@ucsd.edu) January 26, 2012 Abstract In this paper, we extend the traditional logistic
More informationRecurrent Neural Networks with Flexible Gates using Kernel Activation Functions
2018 IEEE International Workshop on Machine Learning for Signal Processing (MLSP 18) Recurrent Neural Networks with Flexible Gates using Kernel Activation Functions Authors: S. Scardapane, S. Van Vaerenbergh,
More informationGeometric View of Machine Learning Nearest Neighbor Classification. Slides adapted from Prof. Carpuat
Geometric View of Machine Learning Nearest Neighbor Classification Slides adapted from Prof. Carpuat What we know so far Decision Trees What is a decision tree, and how to induce it from data Fundamental
More informationClassification of Hand-Written Digits Using Scattering Convolutional Network
Mid-year Progress Report Classification of Hand-Written Digits Using Scattering Convolutional Network Dongmian Zou Advisor: Professor Radu Balan Co-Advisor: Dr. Maneesh Singh (SRI) Background Overview
More informationComparison of Modern Stochastic Optimization Algorithms
Comparison of Modern Stochastic Optimization Algorithms George Papamakarios December 214 Abstract Gradient-based optimization methods are popular in machine learning applications. In large-scale problems,
More informationCutting Plane Training of Structural SVM
Cutting Plane Training of Structural SVM Seth Neel University of Pennsylvania sethneel@wharton.upenn.edu September 28, 2017 Seth Neel (Penn) Short title September 28, 2017 1 / 33 Overview Structural SVMs
More informationLEARNING WITH A STRONG ADVERSARY
LEARNING WITH A STRONG ADVERSARY Ruitong Huang, Bing Xu, Dale Schuurmans and Csaba Szepesvári Department of Computer Science University of Alberta Edmonton, AB T6G 2E8, Canada {ruitong,bx3,daes,szepesva}@ualberta.ca
More informationLearning Kernel Parameters by using Class Separability Measure
Learning Kernel Parameters by using Class Separability Measure Lei Wang, Kap Luk Chan School of Electrical and Electronic Engineering Nanyang Technological University Singapore, 3979 E-mail: P 3733@ntu.edu.sg,eklchan@ntu.edu.sg
More informationDEEP neural networks (DNNs) [2] have been widely
IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, MANUSCRIPT ID 1 Detecting Adversarial Image Examples in Deep Neural Networks with Adaptive Noise Reduction Bin Liang, Hongcheng Li, Miaoqiang Su, Xirong
More informationMark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a brief explanation.
CS 189 Spring 2015 Introduction to Machine Learning Midterm You have 80 minutes for the exam. The exam is closed book, closed notes except your one-page crib sheet. No calculators or electronic items.
More informationLarge Scale Semi-supervised Linear SVM with Stochastic Gradient Descent
Journal of Computational Information Systems 9: 15 (2013) 6251 6258 Available at http://www.jofcis.com Large Scale Semi-supervised Linear SVM with Stochastic Gradient Descent Xin ZHOU, Conghui ZHU, Sheng
More information