T-61.6040 Assignment-08/2011
Santosh Tirunagari (245577)
santosh.tirunagari@aalto.fi
December 18, 2011

AIM

Implement the group Lasso MKL algorithm discussed in the lectures on the Iris data, using p = 1 and p = 2 as parameters, with the Gaussian kernel

    k_GAU(x_i, x_j) = exp(-||x_i - x_j||_2^2 / s^2),   C = 1, s = 1.    (1)

Datasets

The given datasets are the Iris training and test data with 4 features each; the training set contains 90 samples and the test set 60 samples.

Implementation

Load and z-score normalize both the training and the test data using the MATLAB zscore command. The normalization can also be done with the method described in the lecture; since the training and test data follow the same distribution, the statistics of the training set and the test set are the same.

X   = zscore(dlmread('iris_train_X.txt'));
X_t = zscore(dlmread('iris_test_X.txt'));
Y   = dlmread('iris_train_y.txt');
Y_t = dlmread('iris_test_y.txt');
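As a side note, the manual normalization mentioned above could look roughly as follows. This is only a sketch: the variable names Xraw and Xtraw for the raw data matrices are illustrative, and it assumes the training-set statistics are reused for the test set.

% Sketch only: z-score normalization using training-set statistics
% (Xraw / Xtraw are hypothetical names for the raw train / test matrices).
mu    = mean(Xraw, 1);        % per-feature mean of the training data
sigma = std(Xraw, 0, 1);      % per-feature standard deviation
X     = bsxfun(@rdivide, bsxfun(@minus, Xraw,  mu), sigma);
X_t   = bsxfun(@rdivide, bsxfun(@minus, Xtraw, mu), sigma);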

Create label lists for the one-vs-rest approach: the samples of the class under consideration are labelled -1 and all remaining samples +1. In the code fragments below, i denotes the index of the current one-vs-rest problem (i = 1, 2, 3).

y1  = ones(90,1);
y2  = ones(90,1);
y3  = ones(90,1);
y1t = ones(60,1);
y2t = ones(60,1);
y3t = ones(60,1);
y1(1:30,1)   = -1;
y2(31:60,1)  = -1;
y3(61:90,1)  = -1;
y1t(1:20,1)  = -1;
y2t(21:40,1) = -1;
y3t(41:60,1) = -1;

Calculate the kernel matrix for each feature.

for i = 1:1:4
    KF{i} = kro(X(:,i), X(:,i));
end

Calculate the Keta value, i.e. the eta-weighted combination of the feature kernels.

P = 1;
eta = ones(1,4)./4;
ceta = [];
x = 1/(P+1);
y = P/(P+1);
C = 1;
Keta = zeros(90,90);
for k = 1:4
    Keta = Keta + eta(k).*KF{k};
end
yv  = {y1 y2 y3};
yvt = {y1t y2t y3t};

Solve the SVM optimization problem in standard QP form using the quadprog function in MATLAB and find the alpha values.

H = (yv{i}*yv{i}').*Keta;
A = [];
Aeq = yv{i}';
l = zeros(90,1);
c = -1*ones(90,1);
b = [];
beq = 0;
u = C*ones(90,1);
warning off;
options = optimset('Algorithm', 'interior-point-convex', 'MaxIter', 1500);
% solve quadprog to find alpha values
alpha = quadprog(H, c, A, b, Aeq, beq, l, u, [], options);
% move near-boundary coefficients to the boundaries
alpha(alpha < C*0.001) = 0;
alpha(alpha > C*0.999) = C;
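The helper kro used above is not listed in this report; a minimal sketch of a per-feature Gaussian kernel consistent with Eq. (1) (s = 1) is given below. The orientation of the returned matrix (rows indexing the first argument) is an assumption, and the attached scripts may use the transposed convention.

function K = kro(a, b)
% Sketch: Gaussian kernel matrix between column vectors a (n x 1) and b (m x 1),
% K(i,j) = exp(-(a(i) - b(j))^2 / s^2) with s = 1, cf. Eq. (1).
s = 1;
D = bsxfun(@minus, a, b').^2;   % n x m matrix of squared differences
K = exp(-D ./ s^2);
end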

Calculate the new weight norm and update the eta values until the change in the objective falls below the threshold or a maximum of 100 iterations is reached.

iter = 0;
obj = 1000;
old = 500;
while abs(old - obj) > 0.001 && iter < 100
    old = obj;
    % --> recompute the alpha values using the quadprog code above <--
    cw = zeros(4,1);
    for it = 1:1:4
        cw(it,1) = eta(it).^2 * ((alpha.*yv{i})' * KF{it} * (alpha.*yv{i}));
    end
    eta = (cw.^x) ./ ((sum(cw.^y)).^(1/P));
    Keta = zeros(90,90);
    for k = 1:4
        Keta = Keta + eta(k).*KF{k};
    end
    iter = iter + 1;
    obj = objective(Keta, KF, yv{i}, alpha);
end
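Written out, the weight update computed in the loop above is

\[
cw_k = \eta_k^{2}\,(\alpha \circ y)^{\top} K_k\,(\alpha \circ y),
\qquad
\eta_k \leftarrow \frac{cw_k^{\,1/(P+1)}}{\left(\sum_{l=1}^{4} cw_l^{\,P/(P+1)}\right)^{1/P}},
\]

where the exponents 1/(P+1) and P/(P+1) are the variables x and y defined earlier, and alpha ∘ y denotes the element-wise product of the dual variables with the labels yv{i}.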

Calculate the Keta value with the updated eta and solve the SVM problem once more for the final alpha values.

Q = yv{i}*yv{i}';
Keta = zeros(90,90);
for k = 1:4
    Keta = Keta + eta(k).*KF{k};
end
H = Q.*Keta;
A = [];
Aeq = yv{i}';
l = zeros(90,1);
c = -1*ones(90,1);
b = [];
beq = 0;
u = C*ones(90,1);
options = optimset('Algorithm', 'interior-point-convex', 'MaxIter', 1500);
% solve quadprog to find alpha values
alpha = quadprog(H, c, A, b, Aeq, beq, l, u, [], options);

Move the near-boundary alpha values to the boundaries.

alpha(alpha < C*0.001) = 0;
alpha(alpha > C*0.999) = C;

Compute the value of the bias parameter b.

sv = find(alpha > 0 & alpha < C);
sv_one = zeros(90,1);
sv_one(sv,1) = 1;
b = sv_one' * (yv{i} - ((alpha.*yv{i})'*Keta)') / sum(sv_one);

Compute the decision function on the training set and calculate the confusion matrix.

temp = bsxfun(@plus, Keta(sv,:)' * (alpha(sv,:).*yv{i}(sv,:)), b);
res = temp;
res(res >= 0) = 1;
res(res < 0)  = -1;
conf{i} = confusionmat(yv{i}, res);

Similarly, compute the Keta value for the test data.

Ketat1 = zeros(length(sv), 60);
for k = 1:4
    KFt1{k} = kro(X_t(:,k), X(sv,k));
    Ketat1 = Ketat1 + eta(k).*KFt1{k};
end

Compute the decision function on the test set and calculate the confusion matrix.

temp = bsxfun(@plus, Ketat1' * (alpha(sv,:).*yv{i}(sv,:)), b);
res_t = temp;
% thresholding
res_t(res_t >= 0) = 1;
res_t(res_t < 0)  = -1;
conf_t{i} = confusionmat(yvt{i}, res_t);

Threshold the decisions to get proper +1 and -1 values.

res(res >= 0) = 1;
res(res < 0)  = -1;

Plot the bar graphs showing the eta values.

bar(eta);
xlabel('Kernel');
ylabel('Weight');
axis([1 4 0.0 1.0])
title([num2str(i), ' versus others (p=1)']);
% print figure
fnout = ['p1', num2str(i), '.jpg'];
print('-djpeg', '-r150', fnout);

Function for calculating the objective value.

function val = objective(Keta, KF, yv, alpha)
val1 = zeros(4,1);
for x = 1:1:4
    val1(x) = (alpha.*yv)' * KF{x} * (alpha.*yv);
end
val1 = max(val1);
val2 = (alpha.*yv)' * Keta * (alpha.*yv);
val = abs(val1 - val2);
end
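As a quick sanity check (not part of the submitted scripts), the per-class training and test accuracies can be read off the stored confusion matrices, for example:

% Sketch: classification accuracy from the confusion matrices collected above;
% i indexes the one-vs-rest problem and the diagonal holds the correct decisions.
for i = 1:3
    acc_train(i) = trace(conf{i})   / sum(conf{i}(:));
    acc_test(i)  = trace(conf_t{i}) / sum(conf_t{i}(:));
end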

The same steps are repeated for the parameter P = 2. The implementation is run in MATLAB with the scripts ex8_p1 and ex8_p2, respectively.

Results

The tables below show the computed confusion matrices.

Table 1: Confusion matrix for 1 vs rest for the training set, p = 1
30   0
 0  60

Table 2: Confusion matrix for 2 vs rest for the training set, p = 1

Table 3: Confusion matrix for 3 vs rest for the training set, p = 1

Table 4: Confusion matrix for 1 vs rest for the test set, p = 1

Table 5: Confusion matrix for 2 vs rest for the test set, p = 1
2 38

Table 6: Confusion matrix for 3 vs rest for the test set, p = 1
18 2

Figure 1 shows the bar plot of the eta values generated for p = 1, and Figure 2 shows the bar plot of the eta values generated for p = 2.

Table 7: Confusion matrix for 1 vs rest for the training set, p = 1
30   0
 0  60

Table 8: Confusion matrix for 2 vs rest for the training set, p = 1

Table 9: Confusion matrix for 3 vs rest for the training set, p = 1

Table 10: Confusion matrix for 1 vs rest for the test set, p = 1

Table 11: Confusion matrix for 2 vs rest for the test set, p = 1
2 38

Table 12: Confusion matrix for 3 vs rest for the test set, p = 1
18 2

Figure 1: Bar plot of the eta values used in the combined kernel for p = 1.

Table 13: Confusion matrix for 1 vs rest for the training set, p = 2
30   0
 0  60

Table 14: Confusion matrix for 2 vs rest for the training set, p = 2

Figure 2: Bar plot of the eta values used in the combined kernel for p = 2.

Table 15: Confusion matrix for 3 vs rest for the training set, p = 2

Table 16: Confusion matrix for 1 vs rest for the test set, p = 2

Table 17: Confusion matrix for 2 vs rest for the test set, p = 2
2 38

Table 18: Confusion matrix for 3 vs rest for the test set, p = 2
18 2

Attachments

The MATLAB code (ex8_p1.m, ex8_p2.m) and the datasets have been archived along with this report.