Linear Classification

Size: px

Start display at page:

Download "Linear Classification"

Anis Carter
6 years ago
Views:

1 Linear Classification by Prof. Seungchul Lee isystems Design Lab UNIS able of Contents I.. Supervised Learning II.. Classification III. 3. Perceptron I. 3.. Linear Classifier II. 3.. Perceptron Algorithm III Iterations of Perceptron IV Perceptron loss function V he best hyperplane separator? VI Python Example

2 . Supervised Learning. Classification where y is a discrete value develop the classification algorithm to determine which class a new input should fall into start with binary class problems Later look at multiclass classification problem, although this is just an extension of binary classification We could use linear regression hen, threshold the classifier output (i.e. anything over some value is yes, else no) linear regression with thresholding seems to work We will learn perceptron support vector machine logistic regression

3 3. Perceptron x For input x = x d ω weights ω = ω d 'attributes of a customer' Approve credit if Deny credit if d ω i x i i= d ω i x i i= > threshold, < threshold. d h(x) = sign (( ) threshold) = sign (( ) + ) ω i x i i= = Introduce an artificial coordinate x 0 : In vector form, the perceptron implements d h(x) = sign ( ) ω i x i i=0 h(x) = sign ( ω x) d ω i x i ω 0 i=

4 Hyperplane Separates a D-dimensional space into two half-spaces Defined by an outward pointing normal vector ω ω is orthogonal to any vector lying on the hyperplane assume the hyperplane passes through origin, ω x = 0 with x 0 =

5 3.. Linear Classifier represent the decision boundary by a hyperplane ω he linear classifier is a way of combining expert opinion. In this case, each opinion is made by a binary "expert" Goal: to learn the hyperplane ω using the training data 3.. Perceptron Algorithm he perceptron implements h(x) = sign ( ω x) Given the training set (, ), (, ),, (, ) where {, } x y x y x N y N y i ) pick a misclassified point sign ( ω x n ) y n ) and update the weight vector ω ω + y n x n

6 Why perceptron updates work? = + Let's look at a misclassified positive example ( ) perceptron (wrongly) thinks ω old x n < 0 y n updates would be ω new ω new x n = + = + ω old y n x n ω old x n = ( + = + ω old x n ) x n ω old x n x n x n hus ω new x n is less negative than ω old x n

3.3. Iterations of Perceptron. Randomly assign ω. One iteration of the PLA (perceptron learning algorithm) where (x, y) is a misclassified training point t =,, 3,, ω ω + yx 3.

7 3.3. Iterations of Perceptron. Randomly assign ω. One iteration of the PLA (perceptron learning algorithm) where (x, y) is a misclassified training point t =,, 3,, ω ω + yx 3. At iteration pick a misclassified point from x y x y x N y N (, ), (, ),, (, ) 4. and run a PLA iteration on it 5. hat's it! 3.4. Perceptron loss function m L(ω) = max {0, ( )} n= Loss = 0 on examples where perceptron is correct, i.e., Loss > 0 on examples where perceptron misclassified, i.e., y n y n ω x n ( ) > 0 y n ω x n ( ) < 0 ω x n note: sign ( ω x n ) y n is equivalent to y n ( ω x n ) < 0

8 3.5. he best hyperplane separator? Perceptron finds one of the many possible hyperplanes separating the data if one exists Of the many possible choices, which one is the best? Utilize distance information as well Intuitively we want the hyperplane having the maximum margin Large margin leads to good generalization on the test data we will see this formally when we cover Support Vector Machine 3.6. Python Example ω = ω ω ω 3 ( x () ) x y ( x () ) = = = ( x (3) ) ( x (m) ) y () y () y (3) y (m) x () x () x (3) x (m) x () x () x (3) x (m) In []: import numpy as np import matplotlib.pyplot as plt % matplotlib inline In []: #training data gerneration m = 00 x = 8*np.random.rand(m, ) x = 7*np.random.rand(m, ) - 4 g0 = 0.8*x + x - 3 g = g0 - g = g0 +

9 In [3]: C = np.where(g >= 0) C = np.where(g < 0) print(c) (array([ 0,, 5, 8, 6, 8, 9,, 3, 6, 30, 3, 33, 4, 46, 55, 5 9, 6, 64, 67, 7, 77, 83, 99]), array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])) In [4]: C = np.where(g >= 0)[0] C = np.where(g < 0)[0] print(c.shape) print(c.shape) (4,) (45,) In [6]: plt.figure(figsize=(0, 6)) plt.plot(x[c], x[c], 'ro', label='c') plt.plot(x[c], x[c], 'bo', label='c') plt.title('linearly seperable classes', fontsize=5) plt.legend(loc='upper left', fontsize=5) plt.xlabel(r'$x_$', fontsize=0) plt.ylabel(r'$x_$', fontsize=0) plt.show()

10 x y ( x () ) ( x () ) = = = ( x (3) ) ( x (m) ) y () y () y (3) y (m) x () x () x (3) x (m) x () x () x (3) x (m) In [6]: X = np.hstack([np.ones([c.shape[0],]), x[c], x[c]]) X = np.hstack([np.ones([c.shape[0],]), x[c], x[c]]) X = np.vstack([x, X]) y = np.vstack([np.ones([c.shape[0],]), -np.ones([c.shape[0],])]) X = np.asmatrix(x) y = np.asmatrix(y) where (x, y) is a misclassified training point ω = ω ω ω 3 ω ω + yx In [7]: w = np.ones([3,]) w = np.asmatrix(w) n_iter = y.shape[0] for k in range(n_iter): for i in range(n_iter): if y[i,0]!= np.sign(x[i,:]*w)[0,0]: w += y[i,0]*x[i,:]. print(w) [[-9. ] [.97353] [ ]] g(x) = ω x + ω 0 = ω x + ω x + ω 0 = 0 x = ω ω 0 x ω ω

11 In [8]: xp = np.linspace(0,8,00).reshape(-,) xp = - w[,0]/w[,0]*xp - w[0,0]/w[,0] plt.figure(figsize=(0, 6)) plt.scatter(x[c], x[c], c='r', s=50, label='c') plt.scatter(x[c], x[c], c='b', s=50, label='c') plt.plot(xp, xp, c='k', label='perceptron') plt.xlim([0,8]) plt.xlabel('$x_$', fontsize = 0) plt.ylabel('$x_$', fontsize = 0) plt.legend(loc = 4, fontsize = 5) plt.show()

12 In [9]: # animation import matplotlib.animation as animation % matplotlib qt5 fig = plt.figure(figsize=(0, 6)) ax = fig.add_subplot(,, ) plot_c, = ax.plot(x[c], x[c], 'go', label='c') plot_c, = ax.plot(x[c], x[c], 'bo', label='c') plot_perceptron, = ax.plot([], [], 'k', label='perceptron') ax.set_xlim(0, 8) ax.set_ylim(-3.5, 4.5) ax.set_xlabel(r'$x_$', fontsize=0) ax.set_ylabel(r'$x_$', fontsize=0) ax.legend(fontsize=5, loc='upper left') n_iter = y.shape[0] def init(): plot_perceptron.set_data(xp, xp) return plot_perceptron, def animate(i): global w idx = i%n_iter if y[idx,0]!= np.sign(x[idx,:]*w)[0,0]: w += y[idx,0]*x[idx,:]. xp = - w[,0]/w[,0]*xp - w[0,0]/w[,0] plot_perceptron.set_data(xp, xp) return plot_perceptron, w = np.ones([3,]) xp = np.linspace(0,8,00).reshape(-,) xp = - w[,0]/w[,0]*xp - w[0,0]/w[,0] ani = animation.funcanimation(fig, animate, np.arange(0, n_iter**), init_func=init, interval=0, repeat=false) plt.show() In [0]: %%javascript $.getscript(' js')

Perceptron. by Prof. Seungchul Lee Industrial AI Lab POSTECH. Table of Contents

Perceptron. by Prof. Seungchul Lee Industrial AI Lab POSTECH. Table of Contents Perceptron by Prof. Seungchul Lee Industrial AI Lab http://isystems.unist.ac.kr/ POSTECH Table of Contents I.. Supervised Learning II.. Classification III. 3. Perceptron I. 3.. Linear Classifier II. 3..