ECE 595: Machine Learning I Adversarial Attack 1
1 ECE 595: Machine Learning I Adversarial Attack 1 Spring 2019 Stanley Chan School of Electrical and Computer Engineering Purdue University 1 / 32
2 Outline Examples of Adversarial Attack Basic Terminology Multi-Class Classification Geometry of Objective Geometry of Constraint Targeted vs Untargeted White Box vs Black Box Minimum-Norm Attack / DeepFool Maximum-Allowable Attack / FGSM, IFGSM, PGD Regularized Attack / CW Random Noise Attack 2 / 32
3 Today's Deep Neural Networks 3 / 32
4 Adversarial Attack: Example 1 It is not difficult to fool a classifier. The perturbation can be perceptually unnoticeable. Goodfellow et al., Explaining and Harnessing Adversarial Examples. 4 / 32
5 Adversarial Attack: Example 2 This paper actually appeared one year before Goodfellow's 2014 paper. Szegedy et al., Intriguing Properties of Neural Networks. 5 / 32
6 Adversarial Attack: Example 3 Targeted Attack. Adversarial Examples Detection in Deep Networks with Convolutional Filter Statistics. 6 / 32
7 Adversarial Attack: Example 4 One-pixel Attack One pixel attack for fooling deep neural networks 7 / 32
8 Adversarial Attack: Example 5 Adding a patch. LaVAN: Localized and Visible Adversarial Noise. 8 / 32
9 Adversarial Attack: Example 6 The Michigan / Berkeley Stop Sign Robust Physical-World Attacks on Deep Learning Models 9 / 32
10 Adversarial Attack: Example 7 The MIT 3D Turtle Synthesizing Robust Adversarial Examples / 32
11 Adversarial Attack: Example 8 Google Toaster Adversarial Patch / 32
12 Adversarial Attack: Example 9 CMU Glass Accessorize to a Crime: Real and Stealthy Attacks on State-of-the-Art Face Recognition / 32
13 Adversarial Attack: A Survey Adversarial Examples: Attacks and Defenses for Deep Learning 13 / 32
16 Definition Definition (Adversarial Attack). Let x_0 ∈ R^d be a data point belonging to class C_i. Define a target class C_t. An adversarial attack is a mapping A : R^d → R^d such that the perturbed data x = A(x_0) is misclassified as C_t. 14 / 32
19 Definition Definition (Additive Adversarial Attack). Let x_0 ∈ R^d be a data point belonging to class C_i. Define a target class C_t. An additive adversarial attack is an addition of a perturbation r ∈ R^d such that the perturbed data x = x_0 + r is misclassified as C_t. 15 / 32
32 The Multi-Class Problem Approach 1: One-on-One. Class i vs Class j. Give me a point, check which class has more votes. There is an undetermined region. 17 / 32
48 The Multi-Class Problem Approach 2: One-on-All. Class i vs not Class i. Give me a point, check which class has no conflict. There are undetermined regions. 19 / 32
54 The Multi-Class Problem Approach 3: Linear Machine. Every point in the space gets assigned a class. You give me x, I compute g_1(x), g_2(x), ..., g_K(x). If g_i(x) ≥ g_j(x) for all j ≠ i, then x belongs to class i. 20 / 32
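The linear-machine rule above can be sketched in a few lines of numpy; the discriminant functions here are hypothetical, chosen only to illustrate the argmax decision:

```python
import numpy as np

# Hypothetical linear discriminants g_k(x) = w_k^T x + w_k0 for K = 3 classes.
W = np.array([[ 1.0,  0.0],
              [ 0.0,  1.0],
              [-1.0, -1.0]])      # rows are the weight vectors w_k
w0 = np.array([0.0, 0.0, 0.5])    # offsets w_k0

def classify(x):
    """Linear machine: assign x to the class with the largest g_k(x)."""
    g = W @ x + w0
    return int(np.argmax(g))

print(classify(np.array([2.0, 0.0])))   # -> 0: g = [2, 0, -1.5]
```

Note that the argmax rule assigns every point to some class, so there are no undetermined regions, unlike the one-on-one and one-on-all constructions.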
61 Correct Classification The statement: If g_i(x) ≥ g_j(x) for all j ≠ i, then x belongs to class i, is equivalent to

g_1(x) − g_i(x) ≤ 0
g_2(x) − g_i(x) ≤ 0
...
g_K(x) − g_i(x) ≤ 0,

and is also equivalent to

max_{j ≠ i} {g_j(x)} − g_i(x) ≤ 0.

In adversarial attack, I want to move you to class t:

max_{j ≠ t} {g_j(x)} − g_t(x) ≤ 0. 21 / 32
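The targeted-attack condition max_{j ≠ t} {g_j(x)} − g_t(x) ≤ 0 is easy to check numerically; a small sketch with hypothetical linear discriminants:

```python
import numpy as np

# Hypothetical discriminants g_k(x) = w_k^T x (zero offsets for simplicity).
W = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
w0 = np.zeros(3)

def attack_succeeds(x, t):
    """True if x is classified as the target class t, i.e.
    max_{j != t} g_j(x) - g_t(x) <= 0."""
    g = W @ x + w0
    others = np.delete(g, t)
    return float(np.max(others) - g[t]) <= 0

x = np.array([2.0, 0.0])           # g(x) = [2, 0, -2]
print(attack_succeeds(x, 0))       # -> True: x already sits in class 0
print(attack_succeeds(x, 1))       # -> False: an attack targeting t = 1 must move x
```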
68 Minimum Norm Attack Definition (Minimum Norm Attack). The minimum norm attack finds a perturbed data x by solving the optimization

minimize_x ‖x − x_0‖ subject to max_{j ≠ t} {g_j(x)} − g_t(x) ≤ 0,   (1)

where ‖·‖ can be any norm specified by the user. I want to move you to class C_t, so the constraint needs to be satisfied. But I also want to minimize the attack strength; this gives the objective. 22 / 32
74 Maximum Allowable Attack Definition (Maximum Allowable Attack). The maximum allowable attack finds a perturbed data x by solving the optimization

minimize_x max_{j ≠ t} {g_j(x)} − g_t(x) subject to ‖x − x_0‖ ≤ η,   (2)

where ‖·‖ can be any norm specified by the user, and η > 0 denotes the magnitude of the attack. I want to bound my attack: ‖x − x_0‖ ≤ η. I want to make g_t(x) as big as possible, so I minimize max_{j ≠ t} {g_j(x)} − g_t(x). 23 / 32
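For a linear classifier and the ℓ_∞ norm, the maximum allowable attack has a closed form: over the box ‖x − x_0‖_∞ ≤ η, a linear objective w^T x + b is minimized coordinate-wise by moving each coordinate against the sign of w, the same one-step form as FGSM. A minimal sketch with hypothetical weights:

```python
import numpy as np

# Two-class linear case: the objective (g_j - g_t)(x) = w^T x + b, w = w_j - w_t.
w = np.array([1.0, -2.0, 0.5])    # hypothetical weight difference
b = 0.3
x0 = np.array([0.2, 0.1, -0.4])
eta = 0.1                          # attack budget in the l_inf norm

# Each coordinate moves by eta against the sign of w; the objective drops
# by exactly eta * ||w||_1, the largest decrease any feasible x can achieve.
x = x0 - eta * np.sign(w)

obj0 = w @ x0 + b                  # before the attack:  0.1
obj  = w @ x + b                   # after the attack:  -0.25
print(obj < obj0)                  # -> True
```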
79 Regularization-based Attack Definition (Regularization-based Attack). The regularization-based attack finds a perturbed data x by solving the optimization

minimize_x ‖x − x_0‖ + λ (max_{j ≠ t} {g_j(x)} − g_t(x)),   (3)

where ‖·‖ can be any norm specified by the user, and λ > 0 is a regularization parameter. This combines the two parts via regularization. By adjusting (ε, η, λ), all three formulations will give the same optimal value. 24 / 32
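Because the regularized objective is unconstrained, it can be minimized by (sub)gradient descent. The sketch below does this for hypothetical linear discriminants with a squared ℓ_2 penalty; it is only an illustration of equation (3), not the CW attack itself, which uses a smoother surrogate loss and box constraints:

```python
import numpy as np

# Minimize ||x - x0||_2^2 + lam * (max_{j != t} g_j(x) - g_t(x)) by subgradient descent.
W = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])   # hypothetical w_k rows
w0 = np.zeros(3)
x0 = np.array([2.0, 0.0])          # initially classified as class 0
t, lam, step = 1, 5.0, 0.05        # target class and hyperparameters

x = x0.copy()
for _ in range(200):
    g = W @ x + w0
    others = [j for j in range(3) if j != t]
    jstar = others[int(np.argmax(g[others]))]        # active term of the max
    grad = 2 * (x - x0) + lam * (W[jstar] - W[t])    # subgradient of the objective
    x = x - step * grad

g = W @ x + w0
print(int(np.argmax(g)))           # -> 1: the perturbed point reaches the target class
```

Larger λ pushes harder toward the target class at the price of a larger perturbation, mirroring the trade-off between the two terms in (3).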
84 Understanding the Geometry: Objective Function

ℓ_0-norm: φ(x) = ‖x − x_0‖_0, which gives the sparsest solution. Useful when we want to limit the number of attacked pixels.
ℓ_1-norm: φ(x) = ‖x − x_0‖_1, which is a convex surrogate of the ℓ_0-norm.
ℓ_∞-norm: φ(x) = ‖x − x_0‖_∞, which minimizes the maximum element of the perturbation. 25 / 32
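The three objectives measure the same perturbation r = x − x_0 very differently; a quick sketch with a made-up perturbation:

```python
import numpy as np

x0 = np.array([0.5, 0.5, 0.5, 0.5])
x  = np.array([0.5, 0.9, 0.5, 0.2])    # a hypothetical perturbed point
r  = x - x0                             # perturbation [0, 0.4, 0, -0.3]

print(int(np.count_nonzero(r)))         # l0 "norm": number of attacked pixels (2)
print(float(np.sum(np.abs(r))))         # l1 norm: total perturbation mass (~0.7)
print(float(np.max(np.abs(r))))         # l_inf norm: largest single change (~0.4)
```

An ℓ_0 attacker would try to drive the first number down (few pixels, possibly large changes), while an ℓ_∞ attacker drives the last one down (every pixel may move, but only slightly).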
88 Understanding the Geometry: Constraint The constraint set is

Ω = {x : max_{j ≠ t} {g_j(x)} − g_t(x) ≤ 0}.

We can write Ω as

Ω = {x : g_j(x) − g_t(x) ≤ 0 for all j = 1, ..., K, j ≠ t}.

Remark: If you want to replace the max by a single index i*, then i* is a function of x: Ω = {x : g_{i*(x)}(x) − g_t(x) ≤ 0}. 26 / 32
89 Understanding the Geometry: Constraint 27 / 32
90 Linear Classifier 28 / 32
94 Linear Classifier Let us take a closer look at the linear case. Each discriminant function takes the form g_i(x) = w_i^T x + w_{i,0}. The decision boundary between the i-th class and the t-th class is therefore g(x) = (w_i − w_t)^T x + w_{i,0} − w_{t,0} = 0. The constraint set Ω collects the K − 1 inequalities

(w_j − w_t)^T x + w_{j,0} − w_{t,0} ≤ 0,   j ≠ t,

which can be written compactly as A^T x ≤ b, where the columns of A are w_j − w_t and the entries of b are w_{t,0} − w_{j,0}. 28 / 32
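Building A and b from the weights is a one-liner stack; the classifier below is hypothetical, chosen so that membership in Ω is easy to verify by hand:

```python
import numpy as np

# Hypothetical linear classifier: K = 3 discriminants g_k(x) = w_k^T x + w_k0.
W  = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])   # rows w_k
w0 = np.array([0.0, 0.0, 0.0])
t  = 1                                                    # target class

# Stack the K-1 constraints (w_j - w_t)^T x <= w_t0 - w_j0 as A^T x <= b.
others = [j for j in range(W.shape[0]) if j != t]
A = (W[others] - W[t]).T         # columns of A are w_j - w_t
b = w0[t] - w0[others]

x = np.array([0.0, 3.0])         # deep inside class t
print(bool(np.all(A.T @ x <= b)))    # -> True: x lies in Omega
```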
98 Linear Classifier You can show that Ω = {x : A^T x ≤ b} is convex. But the complement Ω^c, where at least one of the inequalities is violated, is not convex. So a targeted attack is easier to analyze than an untargeted attack. 29 / 32
103 Attack: The Simplest Example The optimization is

minimize_x ‖x − x_0‖ subject to max_{j ≠ t} {g_j(x)} − g_t(x) ≤ 0.

If you do an ℓ_2-norm attack on linear classifiers, then the attack is given by

minimize_x ‖x − x_0‖_2 subject to A^T x ≤ b.

This is a quadratic programming problem. We will discuss how to solve this problem analytically. 30 / 32
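In the two-class case there is only one constraint a^T x ≤ b, and the QP reduces to projecting x_0 onto a halfspace, which has a closed form. A sketch with hypothetical numbers:

```python
import numpy as np

# Minimum l2-norm attack against a single linear constraint a^T x <= b
# (two-class case; with several constraints this becomes a general QP).
a  = np.array([2.0, 1.0])
b  = -1.0
x0 = np.array([1.0, 1.0])          # a^T x0 = 3 > b: not yet in the target class

# Projection of x0 onto the halfspace {x : a^T x <= b}:
viol = a @ x0 - b                   # amount of constraint violation (= 4)
x = x0 - (viol / (a @ a)) * a       # closed-form minimum-norm solution

print(np.isclose(a @ x, b))         # -> True: x lands exactly on the boundary
```

The optimal perturbation is orthogonal to the decision boundary with length |a^T x_0 − b| / ‖a‖_2, which is exactly the geometry DeepFool exploits iteratively for nonlinear classifiers.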
108 Attack: Another Simple Example You can change the ℓ_2 to ℓ_1. Then you have

minimize_x ‖x − x_0‖_1 subject to A^T x ≤ b.

This problem is the same as

minimize_r ‖r‖_1 subject to A^T r ≤ b̃, where b̃ = b − A^T x_0.

We can turn this into a linear programming problem:

minimize_{r^+, r^−} 1^T (r^+ + r^−) subject to [A^T  −A^T] [r^+; r^−] ≤ b̃, r^+ ≥ 0, and r^− ≥ 0. 31 / 32
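The split r = r^+ − r^− can be fed to any LP solver. A sketch using scipy's linprog with hypothetical problem data (one constraint in two dimensions, so the ℓ_1-optimal answer concentrates on the coordinate with the largest constraint coefficient):

```python
import numpy as np
from scipy.optimize import linprog

# l1-norm attack recast as the LP above. Hypothetical data: A^T r <= b_tilde.
A_T = np.array([[2.0, 1.0]])          # one constraint, d = 2
b_t = np.array([-4.0])                # b_tilde = b - A^T x0

# Decision variables z = [r+, r-], both >= 0; objective 1^T r+ + 1^T r-.
d = A_T.shape[1]
c = np.ones(2 * d)
A_ub = np.hstack([A_T, -A_T])         # encodes A^T (r+ - r-) <= b_tilde
res = linprog(c, A_ub=A_ub, b_ub=b_t, bounds=[(0, None)] * (2 * d))

r = res.x[:d] - res.x[d:]             # recover the signed perturbation
print(res.status, float(np.abs(r).sum()))   # optimal l1 norm is 2 (r = [-2, 0])
```

At the optimum, at most one of r_i^+ and r_i^− is nonzero in each coordinate, so 1^T(r^+ + r^−) really equals ‖r‖_1.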
109 Any Question? 32 / 32
More informationSupport Vector Machine. Industrial AI Lab. Prof. Seungchul Lee
Support Vector Machine Industrial AI Lab. Prof. Seungchul Lee Classification (Linear) Autonomously figure out which category (or class) an unknown item should be categorized into Number of categories /
More informationENSEMBLE METHODS AS A DEFENSE TO ADVERSAR-
ENSEMBLE METHODS AS A DEFENSE TO ADVERSAR- IAL PERTURBATIONS AGAINST DEEP NEURAL NET- WORKS Anonymous authors Paper under double-blind review ABSTRACT Deep learning has become the state of the art approach
More informationIntroduction to Support Vector Machines
Introduction to Support Vector Machines Hsuan-Tien Lin Learning Systems Group, California Institute of Technology Talk in NTU EE/CS Speech Lab, November 16, 2005 H.-T. Lin (Learning Systems Group) Introduction
More informationLecture 12. Talk Announcement. Neural Networks. This Lecture: Advanced Machine Learning. Recap: Generalized Linear Discriminants
Advanced Machine Learning Lecture 2 Neural Networks 24..206 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de/ leibe@vision.rwth-aachen.de Talk Announcement Yann LeCun (NYU & FaceBook AI) 28..
More informationSupport Vector Machines. Introduction to Data Mining, 2 nd Edition by Tan, Steinbach, Karpatne, Kumar
Data Mining Support Vector Machines Introduction to Data Mining, 2 nd Edition by Tan, Steinbach, Karpatne, Kumar 02/03/2018 Introduction to Data Mining 1 Support Vector Machines Find a linear hyperplane
More informationECS171: Machine Learning
ECS171: Machine Learning Lecture 4: Optimization (LFD 3.3, SGD) Cho-Jui Hsieh UC Davis Jan 22, 2018 Gradient descent Optimization Goal: find the minimizer of a function min f (w) w For now we assume f
More informationSparse Kernel Machines - SVM
Sparse Kernel Machines - SVM Henrik I. Christensen Robotics & Intelligent Machines @ GT Georgia Institute of Technology, Atlanta, GA 30332-0280 hic@cc.gatech.edu Henrik I. Christensen (RIM@GT) Support
More informationMachine Learning Support Vector Machines. Prof. Matteo Matteucci
Machine Learning Support Vector Machines Prof. Matteo Matteucci Discriminative vs. Generative Approaches 2 o Generative approach: we derived the classifier from some generative hypothesis about the way
More informationMachine Learning for Computer Vision 8. Neural Networks and Deep Learning. Vladimir Golkov Technical University of Munich Computer Vision Group
Machine Learning for Computer Vision 8. Neural Networks and Deep Learning Vladimir Golkov Technical University of Munich Computer Vision Group INTRODUCTION Nonlinear Coordinate Transformation http://cs.stanford.edu/people/karpathy/convnetjs/
More informationRobustness of classifiers: from adversarial to random noise
Robustness of classifiers: from adversarial to random noise Alhussein Fawzi, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard École Polytechnique Fédérale de Lausanne Lausanne, Switzerland {alhussein.fawzi,
More informationWhy Should I Trust You? Speaker: Philip-W. Grassal Date: 05/03/18
Why Should I Trust You? Speaker: Philip-W. Grassal Date: 05/03/18 Why should I trust you? Why Should I Trust You? - Philip Grassal 2 Based on... Title: Why Should I Trust You? Explaining the Predictions
More informationPrimal-Dual Interior-Point Methods
Primal-Dual Interior-Point Methods Lecturer: Aarti Singh Co-instructor: Pradeep Ravikumar Convex Optimization 10-725/36-725 Outline Today: Primal-dual interior-point method Special case: linear programming
More informationNONLINEAR CLASSIFICATION AND REGRESSION. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition
NONLINEAR CLASSIFICATION AND REGRESSION Nonlinear Classification and Regression: Outline 2 Multi-Layer Perceptrons The Back-Propagation Learning Algorithm Generalized Linear Models Radial Basis Function
More informationIPAM Summer School Optimization methods for machine learning. Jorge Nocedal
IPAM Summer School 2012 Tutorial on Optimization methods for machine learning Jorge Nocedal Northwestern University Overview 1. We discuss some characteristics of optimization problems arising in deep
More informationarxiv: v2 [stat.ml] 23 May 2017 Abstract
The Space of Transferable Adversarial Examples Florian Tramèr 1, Nicolas Papernot 2, Ian Goodfellow 3, Dan Boneh 1, and Patrick McDaniel 2 1 Stanford University, 2 Pennsylvania State University, 3 Google
More informationCS5670: Computer Vision
CS5670: Computer Vision Noah Snavely Lecture 5: Feature descriptors and matching Szeliski: 4.1 Reading Announcements Project 1 Artifacts due tomorrow, Friday 2/17, at 11:59pm Project 2 will be released
More informationClassification and Pattern Recognition
Classification and Pattern Recognition Léon Bottou NEC Labs America COS 424 2/23/2010 The machine learning mix and match Goals Representation Capacity Control Operational Considerations Computational Considerations
More informationConstrained Optimization and Lagrangian Duality
CIS 520: Machine Learning Oct 02, 2017 Constrained Optimization and Lagrangian Duality Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the lecture. They may or may
More informationSupport Vector Machines
Support Vector Machines Le Song Machine Learning I CSE 6740, Fall 2013 Naïve Bayes classifier Still use Bayes decision rule for classification P y x = P x y P y P x But assume p x y = 1 is fully factorized
More informationUVA CS / Introduc8on to Machine Learning and Data Mining. Lecture 9: Classifica8on with Support Vector Machine (cont.
UVA CS 4501-001 / 6501 007 Introduc8on to Machine Learning and Data Mining Lecture 9: Classifica8on with Support Vector Machine (cont.) Yanjun Qi / Jane University of Virginia Department of Computer Science
More informationDistirbutional robustness, regularizing variance, and adversaries
Distirbutional robustness, regularizing variance, and adversaries John Duchi Based on joint work with Hongseok Namkoong and Aman Sinha Stanford University November 2017 Motivation We do not want machine-learned
More informationIntroduction to Signal Detection and Classification. Phani Chavali
Introduction to Signal Detection and Classification Phani Chavali Outline Detection Problem Performance Measures Receiver Operating Characteristics (ROC) F-Test - Test Linear Discriminant Analysis (LDA)
More informationSupport Vector Machines. CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington
Support Vector Machines CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington 1 A Linearly Separable Problem Consider the binary classification
More informationOPTIMIZATION METHODS IN DEEP LEARNING
Tutorial outline OPTIMIZATION METHODS IN DEEP LEARNING Based on Deep Learning, chapter 8 by Ian Goodfellow, Yoshua Bengio and Aaron Courville Presented By Nadav Bhonker Optimization vs Learning Surrogate
More informationLarge-Scale Feature Learning with Spike-and-Slab Sparse Coding
Large-Scale Feature Learning with Spike-and-Slab Sparse Coding Ian J. Goodfellow, Aaron Courville, Yoshua Bengio ICML 2012 Presented by Xin Yuan January 17, 2013 1 Outline Contributions Spike-and-Slab
More informationNeural Networks and Deep Learning
Neural Networks and Deep Learning Professor Ameet Talwalkar November 12, 2015 Professor Ameet Talwalkar Neural Networks and Deep Learning November 12, 2015 1 / 16 Outline 1 Review of last lecture AdaBoost
More informationPattern Recognition and Machine Learning. Perceptrons and Support Vector machines
Pattern Recognition and Machine Learning James L. Crowley ENSIMAG 3 - MMIS Fall Semester 2016 Lessons 6 10 Jan 2017 Outline Perceptrons and Support Vector machines Notation... 2 Perceptrons... 3 History...3
More informationConvex Optimization in Classification Problems
New Trends in Optimization and Computational Algorithms December 9 13, 2001 Convex Optimization in Classification Problems Laurent El Ghaoui Department of EECS, UC Berkeley elghaoui@eecs.berkeley.edu 1
More informationRobust Kernel-Based Regression
Robust Kernel-Based Regression Budi Santosa Department of Industrial Engineering Sepuluh Nopember Institute of Technology Kampus ITS Surabaya Surabaya 60111,Indonesia Theodore B. Trafalis School of Industrial
More informationLearning to Attack: Adversarial Transformation Networks
Learning to Attack: Adversarial Transformation Networks Shumeet Baluja and Ian Fischer Google Research Google, Inc. Abstract With the rapidly increasing popularity of deep neural networks for image recognition
More informationLinear Models for Classification
Linear Models for Classification Oliver Schulte - CMPT 726 Bishop PRML Ch. 4 Classification: Hand-written Digit Recognition CHINE INTELLIGENCE, VOL. 24, NO. 24, APRIL 2002 x i = t i = (0, 0, 0, 1, 0, 0,
More informationAdversarial Examples Generation and Defense Based on Generative Adversarial Network
Adversarial Examples Generation and Defense Based on Generative Adversarial Network Fei Xia (06082760), Ruishan Liu (06119690) December 15, 2016 1 Abstract We propose a novel generative adversarial network
More informationThe Perceptron. Volker Tresp Summer 2016
The Perceptron Volker Tresp Summer 2016 1 Elements in Learning Tasks Collection, cleaning and preprocessing of training data Definition of a class of learning models. Often defined by the free model parameters
More informationConsistency of Nearest Neighbor Methods
E0 370 Statistical Learning Theory Lecture 16 Oct 25, 2011 Consistency of Nearest Neighbor Methods Lecturer: Shivani Agarwal Scribe: Arun Rajkumar 1 Introduction In this lecture we return to the study
More informationMathematical introduction to Compressed Sensing
Mathematical introduction to Compressed Sensing Lesson 1 : measurements and sparsity Guillaume Lecué ENSAE Mardi 31 janvier 2016 Guillaume Lecué (ENSAE) Compressed Sensing Mardi 31 janvier 2016 1 / 31
More informationMachine Learning Basics
Security and Fairness of Deep Learning Machine Learning Basics Anupam Datta CMU Spring 2019 Image Classification Image Classification Image classification pipeline Input: A training set of N images, each
More information