ECE 595: Machine Learning I Adversarial Attack 1
1 ECE 595: Machine Learning I Adversarial Attack 1 Spring 2019 Stanley Chan School of Electrical and Computer Engineering Purdue University 1 / 32
2 Outline Examples of Adversarial Attack Basic Terminology Multi-Class Classification Geometry of Objective Geometry of Constraint Targeted vs Untargeted White Box vs Black Box Minimum-Norm Attack / DeepFool Maximum-Allowable Attack / FGSM, IFGSM, PGD Regularized Attack / CW Random Noise Attack 2 / 32
3 Today's Deep Neural Networks 3 / 32
4 Adversarial Attack: Example 1 It is not difficult to fool a classifier. The perturbation can be perceptually unnoticeable. Goodfellow et al., Explaining and Harnessing Adversarial Examples. 4 / 32
5 Adversarial Attack: Example 2 This paper actually appeared one year before Goodfellow's 2014 paper. Szegedy et al., Intriguing Properties of Neural Networks. 5 / 32
6 Adversarial Attack: Example 3 Targeted Attack. Adversarial Examples Detection in Deep Networks with Convolutional Filter Statistics. 6 / 32
7 Adversarial Attack: Example 4 One-pixel Attack One pixel attack for fooling deep neural networks 7 / 32
8 Adversarial Attack: Example 5 Adding a patch. LaVAN: Localized and Visible Adversarial Noise. 8 / 32
9 Adversarial Attack: Example 6 The Michigan / Berkeley Stop Sign Robust Physical-World Attacks on Deep Learning Models 9 / 32
10 Adversarial Attack: Example 7 The MIT 3D Turtle Synthesizing Robust Adversarial Examples / 32
11 Adversarial Attack: Example 8 Google Toaster Adversarial Patch / 32
12 Adversarial Attack: Example 9 CMU Glass Accessorize to a Crime: Real and Stealthy Attacks on State-of-the-Art Face Recognition / 32
13 Adversarial Attack: A Survey Adversarial Examples: Attacks and Defenses for Deep Learning 13 / 32
16 Definition Definition (Adversarial Attack). Let x_0 ∈ R^d be a data point belonging to class C_i. Define a target class C_t. An adversarial attack is a mapping A : R^d → R^d such that the perturbed data x = A(x_0) is misclassified as C_t. 14 / 32
19 Definition Definition (Additive Adversarial Attack). Let x_0 ∈ R^d be a data point belonging to class C_i. Define a target class C_t. An additive adversarial attack is an addition of a perturbation r ∈ R^d such that the perturbed data x = x_0 + r is misclassified as C_t. 15 / 32
32 The Multi-Class Problem Approach 1: One-on-One. Class i vs Class j. Give me a point, check which class has more votes. There is an undetermined region. 17 / 32
48 The Multi-Class Problem Approach 2: One-on-All. Class i vs not Class i. Give me a point, check which class has no conflict. There are undetermined regions. 19 / 32
54 The Multi-Class Problem Approach 3: Linear Machine. Every point in the space gets assigned a class. You give me x, I compute g_1(x), g_2(x), ..., g_K(x). If g_i(x) ≥ g_j(x) for all j ≠ i, then x belongs to class i. 20 / 32
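The linear-machine rule above can be sketched in a few lines of numpy; the discriminant functions here are hypothetical, chosen only to illustrate the argmax decision:

```python
import numpy as np

# Hypothetical linear discriminants g_k(x) = w_k^T x + w_k0 for K = 3 classes.
W = np.array([[ 1.0,  0.0],
              [ 0.0,  1.0],
              [-1.0, -1.0]])      # rows are the weight vectors w_k
w0 = np.array([0.0, 0.0, 0.5])    # offsets w_k0

def classify(x):
    """Linear machine: assign x to the class with the largest g_k(x)."""
    g = W @ x + w0
    return int(np.argmax(g))

print(classify(np.array([2.0, 0.0])))   # -> 0: g = [2, 0, -1.5]
```

Note that the argmax rule assigns every point to some class, so there are no undetermined regions, unlike the one-on-one and one-on-all constructions.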
61 Correct Classification The statement: If g_i(x) ≥ g_j(x) for all j ≠ i, then x belongs to class i, is equivalent to

g_1(x) − g_i(x) ≤ 0
g_2(x) − g_i(x) ≤ 0
...
g_K(x) − g_i(x) ≤ 0,

and is also equivalent to

max_{j ≠ i} {g_j(x)} − g_i(x) ≤ 0.

In adversarial attack, I want to move you to class t:

max_{j ≠ t} {g_j(x)} − g_t(x) ≤ 0. 21 / 32
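The targeted-attack condition max_{j ≠ t} {g_j(x)} − g_t(x) ≤ 0 is easy to check numerically; a small sketch with hypothetical linear discriminants:

```python
import numpy as np

# Hypothetical discriminants g_k(x) = w_k^T x (zero offsets for simplicity).
W = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
w0 = np.zeros(3)

def attack_succeeds(x, t):
    """True if x is classified as the target class t, i.e.
    max_{j != t} g_j(x) - g_t(x) <= 0."""
    g = W @ x + w0
    others = np.delete(g, t)
    return float(np.max(others) - g[t]) <= 0

x = np.array([2.0, 0.0])           # g(x) = [2, 0, -2]
print(attack_succeeds(x, 0))       # -> True: x already sits in class 0
print(attack_succeeds(x, 1))       # -> False: an attack targeting t = 1 must move x
```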
68 Minimum Norm Attack Definition (Minimum Norm Attack). The minimum norm attack finds a perturbed data x by solving the optimization

minimize_x ‖x − x_0‖ subject to max_{j ≠ t} {g_j(x)} − g_t(x) ≤ 0,   (1)

where ‖·‖ can be any norm specified by the user. I want to move you to class C_t, so the constraint needs to be satisfied. But I also want to minimize the attack strength; this gives the objective. 22 / 32
74 Maximum Allowable Attack Definition (Maximum Allowable Attack). The maximum allowable attack finds a perturbed data x by solving the optimization

minimize_x max_{j ≠ t} {g_j(x)} − g_t(x) subject to ‖x − x_0‖ ≤ η,   (2)

where ‖·‖ can be any norm specified by the user, and η > 0 denotes the magnitude of the attack. I want to bound my attack: ‖x − x_0‖ ≤ η. I want to make g_t(x) as big as possible, so I minimize max_{j ≠ t} {g_j(x)} − g_t(x). 23 / 32
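For a linear classifier and the ℓ_∞ norm, the maximum allowable attack has a closed form: over the box ‖x − x_0‖_∞ ≤ η, a linear objective w^T x + b is minimized coordinate-wise by moving each coordinate against the sign of w, the same one-step form as FGSM. A minimal sketch with hypothetical weights:

```python
import numpy as np

# Two-class linear case: the objective (g_j - g_t)(x) = w^T x + b, w = w_j - w_t.
w = np.array([1.0, -2.0, 0.5])    # hypothetical weight difference
b = 0.3
x0 = np.array([0.2, 0.1, -0.4])
eta = 0.1                          # attack budget in the l_inf norm

# Each coordinate moves by eta against the sign of w; the objective drops
# by exactly eta * ||w||_1, the largest decrease any feasible x can achieve.
x = x0 - eta * np.sign(w)

obj0 = w @ x0 + b                  # before the attack:  0.1
obj  = w @ x + b                   # after the attack:  -0.25
print(obj < obj0)                  # -> True
```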
79 Regularization-based Attack Definition (Regularization-based Attack). The regularization-based attack finds a perturbed data x by solving the optimization

minimize_x ‖x − x_0‖ + λ (max_{j ≠ t} {g_j(x)} − g_t(x)),   (3)

where ‖·‖ can be any norm specified by the user, and λ > 0 is a regularization parameter. This combines the two parts via regularization. By adjusting (ε, η, λ), all three formulations will give the same optimal value. 24 / 32
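Because the regularized objective is unconstrained, it can be minimized by (sub)gradient descent. The sketch below does this for hypothetical linear discriminants with a squared ℓ_2 penalty; it is only an illustration of equation (3), not the CW attack itself, which uses a smoother surrogate loss and box constraints:

```python
import numpy as np

# Minimize ||x - x0||_2^2 + lam * (max_{j != t} g_j(x) - g_t(x)) by subgradient descent.
W = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])   # hypothetical w_k rows
w0 = np.zeros(3)
x0 = np.array([2.0, 0.0])          # initially classified as class 0
t, lam, step = 1, 5.0, 0.05        # target class and hyperparameters

x = x0.copy()
for _ in range(200):
    g = W @ x + w0
    others = [j for j in range(3) if j != t]
    jstar = others[int(np.argmax(g[others]))]        # active term of the max
    grad = 2 * (x - x0) + lam * (W[jstar] - W[t])    # subgradient of the objective
    x = x - step * grad

g = W @ x + w0
print(int(np.argmax(g)))           # -> 1: the perturbed point reaches the target class
```

Larger λ pushes harder toward the target class at the price of a larger perturbation, mirroring the trade-off between the two terms in (3).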
84 Understanding the Geometry: Objective Function

ℓ_0-norm: φ(x) = ‖x − x_0‖_0, which gives the sparsest solution. Useful when we want to limit the number of attacked pixels.
ℓ_1-norm: φ(x) = ‖x − x_0‖_1, which is a convex surrogate of the ℓ_0-norm.
ℓ_∞-norm: φ(x) = ‖x − x_0‖_∞, which minimizes the maximum element of the perturbation. 25 / 32
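The three objectives measure the same perturbation r = x − x_0 very differently; a quick sketch with a made-up perturbation:

```python
import numpy as np

x0 = np.array([0.5, 0.5, 0.5, 0.5])
x  = np.array([0.5, 0.9, 0.5, 0.2])    # a hypothetical perturbed point
r  = x - x0                             # perturbation [0, 0.4, 0, -0.3]

print(int(np.count_nonzero(r)))         # l0 "norm": number of attacked pixels (2)
print(float(np.sum(np.abs(r))))         # l1 norm: total perturbation mass (~0.7)
print(float(np.max(np.abs(r))))         # l_inf norm: largest single change (~0.4)
```

An ℓ_0 attacker would try to drive the first number down (few pixels, possibly large changes), while an ℓ_∞ attacker drives the last one down (every pixel may move, but only slightly).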
88 Understanding the Geometry: Constraint The constraint set is

Ω = {x : max_{j ≠ t} {g_j(x)} − g_t(x) ≤ 0}.

We can write Ω as

Ω = {x : g_j(x) − g_t(x) ≤ 0 for all j = 1, ..., K, j ≠ t}.

Remark: If you want to replace the max by a single index i*, then i* is a function of x: Ω = {x : g_{i*(x)}(x) − g_t(x) ≤ 0}. 26 / 32
89 Understanding the Geometry: Constraint 27 / 32
90 Linear Classifier 28 / 32
94 Linear Classifier Let us take a closer look at the linear case. Each discriminant function takes the form g_i(x) = w_i^T x + w_{i,0}. The decision boundary between the i-th class and the t-th class is therefore g(x) = (w_i − w_t)^T x + w_{i,0} − w_{t,0} = 0. The constraint set Ω collects the K − 1 inequalities

(w_j − w_t)^T x + w_{j,0} − w_{t,0} ≤ 0,   j ≠ t,

which can be written compactly as A^T x ≤ b, where the columns of A are w_j − w_t and the entries of b are w_{t,0} − w_{j,0}. 28 / 32
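Building A and b from the weights is a one-liner stack; the classifier below is hypothetical, chosen so that membership in Ω is easy to verify by hand:

```python
import numpy as np

# Hypothetical linear classifier: K = 3 discriminants g_k(x) = w_k^T x + w_k0.
W  = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])   # rows w_k
w0 = np.array([0.0, 0.0, 0.0])
t  = 1                                                    # target class

# Stack the K-1 constraints (w_j - w_t)^T x <= w_t0 - w_j0 as A^T x <= b.
others = [j for j in range(W.shape[0]) if j != t]
A = (W[others] - W[t]).T         # columns of A are w_j - w_t
b = w0[t] - w0[others]

x = np.array([0.0, 3.0])         # deep inside class t
print(bool(np.all(A.T @ x <= b)))    # -> True: x lies in Omega
```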
98 Linear Classifier You can show that Ω = {x : A^T x ≤ b} is convex. But the complement Ω^c, where at least one of the inequalities is violated, is not convex. So a targeted attack is easier to analyze than an untargeted attack. 29 / 32
103 Attack: The Simplest Example The optimization is

minimize_x ‖x − x_0‖ subject to max_{j ≠ t} {g_j(x)} − g_t(x) ≤ 0.

If you do an ℓ_2-norm attack on linear classifiers, then the attack is given by

minimize_x ‖x − x_0‖_2 subject to A^T x ≤ b.

This is a quadratic programming problem. We will discuss how to solve this problem analytically. 30 / 32
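In the two-class case there is only one constraint a^T x ≤ b, and the QP reduces to projecting x_0 onto a halfspace, which has a closed form. A sketch with hypothetical numbers:

```python
import numpy as np

# Minimum l2-norm attack against a single linear constraint a^T x <= b
# (two-class case; with several constraints this becomes a general QP).
a  = np.array([2.0, 1.0])
b  = -1.0
x0 = np.array([1.0, 1.0])          # a^T x0 = 3 > b: not yet in the target class

# Projection of x0 onto the halfspace {x : a^T x <= b}:
viol = a @ x0 - b                   # amount of constraint violation (= 4)
x = x0 - (viol / (a @ a)) * a       # closed-form minimum-norm solution

print(np.isclose(a @ x, b))         # -> True: x lands exactly on the boundary
```

The optimal perturbation is orthogonal to the decision boundary with length |a^T x_0 − b| / ‖a‖_2, which is exactly the geometry DeepFool exploits iteratively for nonlinear classifiers.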
108 Attack: Another Simple Example You can change the ℓ_2 to ℓ_1. Then you have

minimize_x ‖x − x_0‖_1 subject to A^T x ≤ b.

This problem is the same as

minimize_r ‖r‖_1 subject to A^T r ≤ b̃, where b̃ = b − A^T x_0.

We can turn this into a linear programming problem:

minimize_{r^+, r^−} 1^T (r^+ + r^−) subject to [A^T  −A^T] [r^+; r^−] ≤ b̃, r^+ ≥ 0, and r^− ≥ 0. 31 / 32
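The split r = r^+ − r^− can be fed to any LP solver. A sketch using scipy's linprog with hypothetical problem data (one constraint in two dimensions, so the ℓ_1-optimal answer concentrates on the coordinate with the largest constraint coefficient):

```python
import numpy as np
from scipy.optimize import linprog

# l1-norm attack recast as the LP above. Hypothetical data: A^T r <= b_tilde.
A_T = np.array([[2.0, 1.0]])          # one constraint, d = 2
b_t = np.array([-4.0])                # b_tilde = b - A^T x0

# Decision variables z = [r+, r-], both >= 0; objective 1^T r+ + 1^T r-.
d = A_T.shape[1]
c = np.ones(2 * d)
A_ub = np.hstack([A_T, -A_T])         # encodes A^T (r+ - r-) <= b_tilde
res = linprog(c, A_ub=A_ub, b_ub=b_t, bounds=[(0, None)] * (2 * d))

r = res.x[:d] - res.x[d:]             # recover the signed perturbation
print(res.status, float(np.abs(r).sum()))   # optimal l1 norm is 2 (r = [-2, 0])
```

At the optimum, at most one of r_i^+ and r_i^− is nonzero in each coordinate, so 1^T(r^+ + r^−) really equals ‖r‖_1.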
109 Any Question? 32 / 32
More informationSupport Vector Machine. Industrial AI Lab. Prof. Seungchul Lee
Support Vector Machine Industrial AI Lab. Prof. Seungchul Lee Classification (Linear) Autonomously figure out which category (or class) an unknown item should be categorized into Number of categories /
More informationENSEMBLE METHODS AS A DEFENSE TO ADVERSAR-
ENSEMBLE METHODS AS A DEFENSE TO ADVERSAR- IAL PERTURBATIONS AGAINST DEEP NEURAL NET- WORKS Anonymous authors Paper under double-blind review ABSTRACT Deep learning has become the state of the art approach
More informationIntroduction to Support Vector Machines
Introduction to Support Vector Machines Hsuan-Tien Lin Learning Systems Group, California Institute of Technology Talk in NTU EE/CS Speech Lab, November 16, 2005 H.-T. Lin (Learning Systems Group) Introduction
More informationLecture 12. Talk Announcement. Neural Networks. This Lecture: Advanced Machine Learning. Recap: Generalized Linear Discriminants
Advanced Machine Learning Lecture 2 Neural Networks 24..206 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de/ leibe@vision.rwth-aachen.de Talk Announcement Yann LeCun (NYU & FaceBook AI) 28..
More informationSupport Vector Machines. Introduction to Data Mining, 2 nd Edition by Tan, Steinbach, Karpatne, Kumar
Data Mining Support Vector Machines Introduction to Data Mining, 2 nd Edition by Tan, Steinbach, Karpatne, Kumar 02/03/2018 Introduction to Data Mining 1 Support Vector Machines Find a linear hyperplane
More informationECS171: Machine Learning
ECS171: Machine Learning Lecture 4: Optimization (LFD 3.3, SGD) Cho-Jui Hsieh UC Davis Jan 22, 2018 Gradient descent Optimization Goal: find the minimizer of a function min f (w) w For now we assume f
More informationSparse Kernel Machines - SVM
Sparse Kernel Machines - SVM Henrik I. Christensen Robotics & Intelligent Machines @ GT Georgia Institute of Technology, Atlanta, GA 30332-0280 hic@cc.gatech.edu Henrik I. Christensen (RIM@GT) Support
More informationMachine Learning Support Vector Machines. Prof. Matteo Matteucci
Machine Learning Support Vector Machines Prof. Matteo Matteucci Discriminative vs. Generative Approaches 2 o Generative approach: we derived the classifier from some generative hypothesis about the way
More informationMachine Learning for Computer Vision 8. Neural Networks and Deep Learning. Vladimir Golkov Technical University of Munich Computer Vision Group
Machine Learning for Computer Vision 8. Neural Networks and Deep Learning Vladimir Golkov Technical University of Munich Computer Vision Group INTRODUCTION Nonlinear Coordinate Transformation http://cs.stanford.edu/people/karpathy/convnetjs/
More informationRobustness of classifiers: from adversarial to random noise
Robustness of classifiers: from adversarial to random noise Alhussein Fawzi, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard École Polytechnique Fédérale de Lausanne Lausanne, Switzerland {alhussein.fawzi,
More informationWhy Should I Trust You? Speaker: Philip-W. Grassal Date: 05/03/18
Why Should I Trust You? Speaker: Philip-W. Grassal Date: 05/03/18 Why should I trust you? Why Should I Trust You? - Philip Grassal 2 Based on... Title: Why Should I Trust You? Explaining the Predictions
More informationPrimal-Dual Interior-Point Methods
Primal-Dual Interior-Point Methods Lecturer: Aarti Singh Co-instructor: Pradeep Ravikumar Convex Optimization 10-725/36-725 Outline Today: Primal-dual interior-point method Special case: linear programming
More informationNONLINEAR CLASSIFICATION AND REGRESSION. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition
NONLINEAR CLASSIFICATION AND REGRESSION Nonlinear Classification and Regression: Outline 2 Multi-Layer Perceptrons The Back-Propagation Learning Algorithm Generalized Linear Models Radial Basis Function
More informationIPAM Summer School Optimization methods for machine learning. Jorge Nocedal
IPAM Summer School 2012 Tutorial on Optimization methods for machine learning Jorge Nocedal Northwestern University Overview 1. We discuss some characteristics of optimization problems arising in deep
More informationarxiv: v2 [stat.ml] 23 May 2017 Abstract
The Space of Transferable Adversarial Examples Florian Tramèr 1, Nicolas Papernot 2, Ian Goodfellow 3, Dan Boneh 1, and Patrick McDaniel 2 1 Stanford University, 2 Pennsylvania State University, 3 Google
More informationCS5670: Computer Vision
CS5670: Computer Vision Noah Snavely Lecture 5: Feature descriptors and matching Szeliski: 4.1 Reading Announcements Project 1 Artifacts due tomorrow, Friday 2/17, at 11:59pm Project 2 will be released
More informationClassification and Pattern Recognition
Classification and Pattern Recognition Léon Bottou NEC Labs America COS 424 2/23/2010 The machine learning mix and match Goals Representation Capacity Control Operational Considerations Computational Considerations
More informationConstrained Optimization and Lagrangian Duality
CIS 520: Machine Learning Oct 02, 2017 Constrained Optimization and Lagrangian Duality Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the lecture. They may or may
More informationSupport Vector Machines
Support Vector Machines Le Song Machine Learning I CSE 6740, Fall 2013 Naïve Bayes classifier Still use Bayes decision rule for classification P y x = P x y P y P x But assume p x y = 1 is fully factorized
More informationUVA CS / Introduc8on to Machine Learning and Data Mining. Lecture 9: Classifica8on with Support Vector Machine (cont.
UVA CS 4501-001 / 6501 007 Introduc8on to Machine Learning and Data Mining Lecture 9: Classifica8on with Support Vector Machine (cont.) Yanjun Qi / Jane University of Virginia Department of Computer Science
More informationDistirbutional robustness, regularizing variance, and adversaries
Distirbutional robustness, regularizing variance, and adversaries John Duchi Based on joint work with Hongseok Namkoong and Aman Sinha Stanford University November 2017 Motivation We do not want machine-learned
More informationIntroduction to Signal Detection and Classification. Phani Chavali
Introduction to Signal Detection and Classification Phani Chavali Outline Detection Problem Performance Measures Receiver Operating Characteristics (ROC) F-Test - Test Linear Discriminant Analysis (LDA)
More informationSupport Vector Machines. CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington
Support Vector Machines CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington 1 A Linearly Separable Problem Consider the binary classification
More informationOPTIMIZATION METHODS IN DEEP LEARNING
Tutorial outline OPTIMIZATION METHODS IN DEEP LEARNING Based on Deep Learning, chapter 8 by Ian Goodfellow, Yoshua Bengio and Aaron Courville Presented By Nadav Bhonker Optimization vs Learning Surrogate
More informationLarge-Scale Feature Learning with Spike-and-Slab Sparse Coding
Large-Scale Feature Learning with Spike-and-Slab Sparse Coding Ian J. Goodfellow, Aaron Courville, Yoshua Bengio ICML 2012 Presented by Xin Yuan January 17, 2013 1 Outline Contributions Spike-and-Slab
More informationNeural Networks and Deep Learning
Neural Networks and Deep Learning Professor Ameet Talwalkar November 12, 2015 Professor Ameet Talwalkar Neural Networks and Deep Learning November 12, 2015 1 / 16 Outline 1 Review of last lecture AdaBoost
More informationPattern Recognition and Machine Learning. Perceptrons and Support Vector machines
Pattern Recognition and Machine Learning James L. Crowley ENSIMAG 3 - MMIS Fall Semester 2016 Lessons 6 10 Jan 2017 Outline Perceptrons and Support Vector machines Notation... 2 Perceptrons... 3 History...3
More informationConvex Optimization in Classification Problems
New Trends in Optimization and Computational Algorithms December 9 13, 2001 Convex Optimization in Classification Problems Laurent El Ghaoui Department of EECS, UC Berkeley elghaoui@eecs.berkeley.edu 1
More informationRobust Kernel-Based Regression
Robust Kernel-Based Regression Budi Santosa Department of Industrial Engineering Sepuluh Nopember Institute of Technology Kampus ITS Surabaya Surabaya 60111,Indonesia Theodore B. Trafalis School of Industrial
More informationLearning to Attack: Adversarial Transformation Networks
Learning to Attack: Adversarial Transformation Networks Shumeet Baluja and Ian Fischer Google Research Google, Inc. Abstract With the rapidly increasing popularity of deep neural networks for image recognition
More informationLinear Models for Classification
Linear Models for Classification Oliver Schulte - CMPT 726 Bishop PRML Ch. 4 Classification: Hand-written Digit Recognition CHINE INTELLIGENCE, VOL. 24, NO. 24, APRIL 2002 x i = t i = (0, 0, 0, 1, 0, 0,
More informationAdversarial Examples Generation and Defense Based on Generative Adversarial Network
Adversarial Examples Generation and Defense Based on Generative Adversarial Network Fei Xia (06082760), Ruishan Liu (06119690) December 15, 2016 1 Abstract We propose a novel generative adversarial network
More informationThe Perceptron. Volker Tresp Summer 2016
The Perceptron Volker Tresp Summer 2016 1 Elements in Learning Tasks Collection, cleaning and preprocessing of training data Definition of a class of learning models. Often defined by the free model parameters
More informationConsistency of Nearest Neighbor Methods
E0 370 Statistical Learning Theory Lecture 16 Oct 25, 2011 Consistency of Nearest Neighbor Methods Lecturer: Shivani Agarwal Scribe: Arun Rajkumar 1 Introduction In this lecture we return to the study
More informationMathematical introduction to Compressed Sensing
Mathematical introduction to Compressed Sensing Lesson 1 : measurements and sparsity Guillaume Lecué ENSAE Mardi 31 janvier 2016 Guillaume Lecué (ENSAE) Compressed Sensing Mardi 31 janvier 2016 1 / 31
More informationMachine Learning Basics
Security and Fairness of Deep Learning Machine Learning Basics Anupam Datta CMU Spring 2019 Image Classification Image Classification Image classification pipeline Input: A training set of N images, each
More information