Toward Adversarial Learning as Control

1 Toward Adversarial Learning as Control
Jerry Zhu, University of Wisconsin-Madison
The 2nd ARO/IARPA Workshop on Adversarial Machine Learning, May 9, 2018

2 Test time attacks
Given classifier $f : \mathcal{X} \to \mathcal{Y}$, $x \in \mathcal{X}$
Attacker finds $x' \in \mathcal{X}$:
$\min_{x'} \|x' - x\|$
s.t. $f(x') \neq f(x)$.
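As a minimal sketch of the attack above, assume a linear classifier $f(x) = \mathrm{sign}(w^\top x + b)$ and the Euclidean norm. Both are assumptions made for illustration; the slide fixes neither. Under them, the closest point with a different label lies just across the decision hyperplane. The names `classify` and `minimal_attack` and the overshoot `delta` are hypothetical.

```python
import math

def classify(w, b, x):
    """Linear classifier: +1 if w.x + b >= 0, else -1 (assumed form of f)."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

def minimal_attack(w, b, x, delta=1e-6):
    """Move x along w to the hyperplane, then delta past it, so the label flips."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    norm = math.sqrt(sum(wi * wi for wi in w))
    label = classify(w, b, x)
    # Signed distance to travel along the unit normal w / ||w||.
    shift = -score / norm - label * delta
    return [xi + shift * wi / norm for xi, wi in zip(x, w)]

w, b, x = [1.0, 0.0], 0.0, [2.0, 3.0]
x_adv = minimal_attack(w, b, x)
dist = math.sqrt(sum((a - c) ** 2 for a, c in zip(x_adv, x)))
```

Here `dist` is the margin of `x` plus `delta`, which is why a large-margin classifier makes this attack more expensive.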

3 Large margin defense against test time attacks
Defender finds $f' \in \mathcal{F}$:
$\min_{f'} \|f' - f\|$
s.t. $f'(x') = f(x)$, $\forall$ training $x$, $\forall x' \in \mathrm{Ball}(x, \epsilon)$.

4 Heuristic implementation of large margin defense
Repeat:
  $(x, x') \leftarrow \mathrm{OracleAttacker}(f)$
  Add $(x', f(x))$ to $(X, Y)$
  $f \leftarrow A(X, Y)$
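The loop above can be sketched on a toy problem. Everything concrete here is an assumption for illustration: the 1-D data, the deliberately small-margin `threshold_learner`, and an oracle attacker that only probes the two endpoints of each ball around the original training points. It is not the procedure used in the talk.

```python
def threshold_learner(data):
    """Hypothetical learner A: threshold just above the largest negative point."""
    t = max(x for x, y in data if y == -1) + 1e-9
    return lambda x: 1 if x > t else -1

def make_oracle_attacker(eps):
    """Return an attacker that looks for x' at distance eps with f(x') != f(x)."""
    def attack(f, train):
        for x, _ in train:
            for x_adv in (x - eps, x + eps):
                if f(x_adv) != f(x):
                    return x, x_adv
        return None  # no successful attack found
    return attack

def adversarial_training(train, learner, attacker, max_rounds=100):
    data = list(train)
    f = learner(data)
    for _ in range(max_rounds):
        found = attacker(f, train)
        if found is None:
            break
        x, x_adv = found
        data.append((x_adv, f(x)))  # add (x', f(x)) to (X, Y)
        f = learner(data)           # f <- A(X, Y)
    return f, data

train = [(0.0, -1), (1.0, -1), (3.0, 1), (4.0, 1)]
f, data = adversarial_training(train, threshold_learner, make_oracle_attacker(0.5))
```

On this toy data a single round suffices: the attacker finds 1.5, it is added with label -1, and the retrained threshold moves past it.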

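6 Training set poisoning attacks
Given learner $A : (\mathcal{X} \times \mathcal{Y})^* \to \mathcal{F}$, data $(X, Y)$, goal $\Phi : \mathcal{F} \to \mathrm{bool}$
Attacker finds poisoned data $(X', Y')$:
$\min_{(X', Y'), f} \|(X', Y') - (X, Y)\|$
s.t. $f = A(X', Y')$, $\Phi(f) = \mathrm{true}$.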
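A brute-force sketch of this formulation, under strong simplifying assumptions: the attacker may only flip labels (so the distance is the number of flipped labels), the learner is a hypothetical 1-D nearest-centroid classifier, and the goal $\Phi$ is that a chosen test point is labeled +1. All names below are illustrative.

```python
from itertools import combinations

def centroid_learner(xs, ys):
    """Hypothetical learner A: 1-D nearest-centroid classifier (None if a class is empty)."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == -1]
    if not pos or not neg:
        return None
    mu_pos, mu_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    return lambda x: 1 if abs(x - mu_pos) < abs(x - mu_neg) else -1

def min_label_flip_attack(xs, ys, goal):
    """Try flip sets in order of increasing size; return the first that meets the goal."""
    for k in range(len(ys) + 1):
        for idx in combinations(range(len(ys)), k):
            poisoned = [-y if i in idx else y for i, y in enumerate(ys)]
            f = centroid_learner(xs, poisoned)  # f = A(X, Y')
            if f is not None and goal(f):       # Phi(f) = true
                return idx, poisoned
    return None

xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [-1, -1, -1, 1, 1, 1]
goal = lambda f: f(2.2) == 1  # attacker wants the point 2.2 labeled +1
flips, poisoned_ys = min_label_flip_attack(xs, ys, goal)
```

Exhaustive search is only feasible for tiny data sets; it is used here because it makes the "minimize the change subject to the goal" structure explicit.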

8 Defense = poisoning = machine teaching = control
[An Overview of Machine Teaching. arXiv, 2018]

10 Attacking a sequential learner $A$ = SGD
Learner $A$ (plant):
  starts at $w_0 \in \mathbb{R}^d$
  $w_t \leftarrow w_{t-1} - \eta \nabla \ell(w_{t-1}, x_t, y_t)$
Attacker:
  designs $(x_1, y_1) \ldots (x_T, y_T)$ (control signal)
  wants to drive $w_T$ to some $w^*$
  optionally minimizes $T$

12 Nonlinear discrete-time optimal control
...even for simple linear regression:
$\ell(w, x, y) = \frac{1}{2} (x^\top w - y)^2$
$w_t \leftarrow w_{t-1} - \eta (x_t^\top w_{t-1} - y_t) x_t$
Continuous version:
$\dot{w}(t) = (y(t) - w(t)^\top x(t)) x(t)$
$\|x(t)\| \le 1$, $|y(t)| \le 1$, $\forall t$
Attack goal is to drive $w(t)$ from $w_0$ to $w^*$ in minimum time.
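To make the "plant" concrete, here is a small sketch of the squared-loss SGD update and of rolling it forward under an attacker-chosen sequence. The function names `sgd_step` and `rollout` are hypothetical, and the numbers in the usage lines are made up for illustration.

```python
def sgd_step(w, x, y, eta):
    """One SGD step on the loss 0.5 * (x.w - y)^2."""
    residual = sum(xi * wi for xi, wi in zip(x, w)) - y
    return [wi - eta * residual * xi for wi, xi in zip(w, x)]

def rollout(w0, controls, eta):
    """Apply the control signal (x_1, y_1), ..., (x_T, y_T) and return w_T."""
    w = list(w0)
    for x, y in controls:
        w = sgd_step(w, x, y, eta)
    return w

# One step with eta = 1 moves w from the origin to (1, 0).
w_one = rollout([0.0, 0.0], [([1.0, 0.0], 1.0)], eta=1.0)
# Two identical steps with eta = 0.5 only get to (0.75, 0).
w_two = rollout([0.0, 0.0], [([1.0, 0.0], 1.0)] * 2, eta=0.5)
```

The update is bilinear in the state $w_{t-1}$ and the control $x_t$, which is what makes the control problem nonlinear even though the learner is linear regression.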

13 Greedy heuristic
$\min_{x_t, y_t, w_t} \|w_t - w^*\|$
s.t. $\|x_t\| \le 1$, $|y_t| \le 1$
$w_t = w_{t-1} - \eta (x_t^\top w_{t-1} - y_t) x_t$
... or further constrain $x_t$ in the direction $w^* - w_{t-1}$
[Liu, Dai, Humayun, Tay, Yu, Smith, Rehg, Song. ICML 17]
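The direction-constrained variant can be sketched as follows. Writing $x_t = a u$ with $u$ the unit vector along $w^* - w_{t-1}$ and $0 < a \le 1$, the update moves $w$ along $u$ by $\eta a (y_t - a\, u^\top w_{t-1})$, so each step reduces to a search over the scalars $a$ and $y_t$. The grid search over $a$ and the function name `greedy_teach` are assumptions of this sketch, not the cited authors' implementation, and its step counts need not match those reported in the talk.

```python
import math

def greedy_teach(w0, w_star, eta, max_steps=1000, tol=1e-9, grid=200):
    """Greedy control with x_t constrained to the direction w* - w_{t-1}."""
    w, controls = list(w0), []
    for _ in range(max_steps):
        diff = [ws - wi for ws, wi in zip(w_star, w)]
        dist = math.sqrt(sum(d * d for d in diff))
        if dist <= tol:
            break
        u = [d / dist for d in diff]
        c = sum(ui * wi for ui, wi in zip(u, w))  # u . w_{t-1}
        best = None
        for k in range(1, grid + 1):
            a = k / grid  # ||x_t|| = a <= 1
            # y that would land exactly on w*, clipped to |y| <= 1.
            y = max(-1.0, min(1.0, dist / (eta * a) + a * c))
            step = eta * a * (y - a * c)  # signed movement along u
            err = abs(dist - step)        # remaining distance after the step
            if best is None or err < best[0]:
                best = (err, a, y, step)
        _, a, y, step = best
        w = [wi + step * ui for wi, ui in zip(w, u)]
        controls.append(([a * ui for ui in u], y))
    return w, controls

w_final, controls = greedy_teach([0.0, 5.0], [1.0, 0.0], eta=0.05)
```

Because each step is chosen myopically and restricted to one direction, the sequence it finds can be longer than the optimal-control solution.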

14 Discrete-time optimal control
$\min_{x_{1:T}, y_{1:T}, w_{1:T}} T$
s.t. $\|x_t\| \le 1$, $|y_t| \le 1$, $t = 1 \ldots T$
$w_t = w_{t-1} - \eta (x_t^\top w_{t-1} - y_t) x_t$, $t = 1 \ldots T$
$w_T = w^*$.

15 Controlling SGD squared loss
$T = 2$ (DTOC) vs. $T = 3$ (greedy)
$w_0 = (0, 1, 0)$, $w^* = (1, 0, 0)$, $\|x\| \le 1$, $|y| \le 1$, $\eta = 0.55$

16 Controlling SGD squared loss (2)
$T = 37$ (DTOC) vs. $T = 55$ (greedy)
$w_0 = (0, 5)$, $w^* = (1, 0)$, $\|x\| \le 1$, $|y| \le 1$, $\eta = 0.05$

17 Controlling SGD logistic loss
$T = 2$ (DTOC) vs. $T = 3$ (greedy)
$w_0 = (0, 1)$, $w^* = (1, 0)$, $\|x\| \le 1$, $|y| \le 1$, $\eta = 1.25$

18 Controlling SGD hinge loss
$T = 2$ (DTOC) vs. $T = 16$ (greedy)
$w_0 = (0, 1)$, $w^* = (1, 0)$, $\|x\| \le 100$, $|y| \le 1$, $\eta = 0.01$
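A quick sanity check on the first squared-loss example, assuming the constraints read $\|x_t\| \le 1$ and $|y_t| \le 1$ in the Euclidean norm: a single SGD step cannot move $w$ far enough to reach $w^*$, so no schedule can do better than $T = 2$. This is a simple bound worked out here for illustration, not the method used to compute the reported values.

```latex
% One-step movement bound for the squared-loss update
\|w_t - w_{t-1}\|
  = \eta \, |x_t^\top w_{t-1} - y_t| \, \|x_t\|
  \le \eta \, (\|x_t\| \, \|w_{t-1}\| + |y_t|) \, \|x_t\|
  \le \eta \, (\|w_{t-1}\| + 1).

% With w_0 = (0, 1, 0), w^* = (1, 0, 0), \eta = 0.55:
\|w_1 - w_0\| \le 0.55 \cdot (1 + 1) = 1.1
  < \sqrt{2} = \|w^* - w_0\|
  \quad\Longrightarrow\quad T \ge 2.
```

The reported $T = 2$ for DTOC therefore meets this lower bound.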

19 Detoxifying a poisoned training set
Given poisoned $(X, Y)$, a small trusted $(\tilde{X}, \tilde{Y})$
Estimate detox $(X', Y')$:
$\min_{(X', Y'), f} \|(X', Y') - (X, Y)\|$
s.t. $f = A(X', Y')$, $f(\tilde{X}) = \tilde{Y}$, $f(X') = Y'$.

20 Detoxifying a poisoned training set [Zhang, Zhu, Wright. AAAI 2018]

21 Training set camouflage: Attack on perceived intention
[Figure: Alice, Eve, Bob]
Too obvious.

22 Training set camouflage: Attack on perceived intention
[Figure: $f = A(\cdot)$; Alice, $f$, Eve, Bob]
Too suspicious.

23 Training set camouflage: Attack on perceived intention
[Figure: Alice, Eve, Bob]
Less suspicious to Eve.
Bob learns $f' = A(\cdot)$; $f'$ is good at man vs. woman! $f' \approx f$.

25 Alice's camouflage problem
Given:
  sensitive data $S$ (e.g. man vs. woman)
  public data $P$ (e.g. the whole MNIST 1's and 7's)
  Eve's detection function $\Phi$ (e.g. two-sample test)
  Bob's learning algorithm $A$ and loss $\ell$
Find $D$:
$\min_{D \subseteq P} \sum_{(x, y) \in S} \ell(A(D), x, y)$
s.t. $\Phi$ thinks $D$, $P$ are from the same distribution.

27 Camouflage examples
Sample of Sensitive Set (Class: Article)
  Christianity: ...Christ that often causes critical of themselves... I've heard it said... of Christs life and ministry...
  Atheism: ...This article attempts to introduction to atheism... ...science is wonderful... to question scientific...
Sample of Camouflaged Training Set (Class: Article)
  Baseball: ...The Angels won their Brewers today before 33, interested in finding out to get two tickets...
  Hockey: ... user and not necessarily the game summary for... ...tuesday, and the isles/caps what does ESPN do...

28 Attack on stochastic multi-armed bandit
$K$-armed bandit: ad placement, news recommendation, medical treatment...
Suboptimal arm pulled $o(T)$ times
Attack goal: make the bandit algorithm almost always pull a suboptimal arm (say arm $K$)

29 Shaping attack
Input: bandit algorithm $A$, target arm $K$
for $t = 1, 2, \ldots$ do
  Bandit algorithm $A$ chooses arm $I_t$ to pull.
  World produces pre-attack reward $r_t^0$.
  Attacker decides the attacking cost $\alpha_t$.
  Attacker gives $r_t = r_t^0 - \alpha_t$ to the bandit algorithm $A$.
end for
$\alpha_t$ is chosen to make $\hat{\mu}_{I_t}$ look sufficiently small compared to $\hat{\mu}_K$.
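The protocol above can be simulated in a few lines. This sketch assumes an $\epsilon$-greedy bandit with $\epsilon_t = 1/t$, Gaussian rewards, and a deliberately simple attack rule: whenever a non-target arm is pulled, choose $\alpha_t$ so that the arm's post-attack empirical mean sits a fixed `margin` below the target arm's. The rule, the margin, and all names are illustrative assumptions, not the attack analyzed in the talk.

```python
import random

def shaping_attack_demo(means, sigma, target, horizon, margin=0.5, seed=0):
    """Simulate an epsilon-greedy bandit whose rewards are shaped by an attacker."""
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k  # post-attack reward sums, as seen by the bandit
    total_cost = 0.0
    for t in range(1, horizon + 1):
        # Bandit chooses arm I_t: one initial pull per arm, then epsilon-greedy.
        if t <= k:
            arm = t - 1
        elif rng.random() < 1.0 / t:
            arm = rng.randrange(k)
        else:
            arm = max(range(k), key=lambda i: sums[i] / counts[i])
        r0 = rng.gauss(means[arm], sigma)  # pre-attack reward
        alpha = 0.0
        if arm != target and counts[target] > 0:
            mu_target = sums[target] / counts[target]
            # Smallest alpha >= 0 that puts this arm's mean at mu_target - margin.
            alpha = max(0.0, sums[arm] + r0 - (counts[arm] + 1) * (mu_target - margin))
        sums[arm] += r0 - alpha  # bandit only sees r_t = r0 - alpha
        counts[arm] += 1
        total_cost += alpha
    return counts, total_cost

counts, cost = shaping_attack_demo([1.0, 0.5, 0.0], sigma=0.1, target=2, horizon=1000)
```

In this setup the target arm has the lowest true mean, yet it ends up with the large majority of pulls, and the attacker only pays when a non-target arm is actually pulled.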

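30 Shaping attack
For the $\epsilon$-greedy algorithm:
Target arm $K$ is pulled at least
$T - \left(\sum_{t=1}^{T} \epsilon_t\right) - \sqrt{3 \log\left(\frac{K}{\delta}\right) \left(\sum_{t=1}^{T} \epsilon_t\right)}$
times;
Cumulative attack cost is
$\sum_{t=1}^{T} \alpha_t = \hat{O}\left(\left(\sum_{i=1}^{K} \Delta_i\right) \log T + \sigma \sqrt{\log T}\right)$.
Similar theorem for UCB1.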

32 Acknowledgments Collaborators: Scott Alfeld, Ross Boczar, Yuxin Chen, Kwang-Sung Jun, Laurent Lessard, Lihong Li, Po-Ling Loh, Yuzhe Ma, Rob Nowak, Ayon Sen, Adish Singla, Ara Vartanian, Stephen Wright, Xiaomin Zhang, Xuezhou Zhang Funding: NSF, AFOSR, University of Wisconsin
