Machine Learning: 10701 and 15781, 2003 Assignment 4

1. VC Dimension (30 points)

Consider the space of instances X corresponding to all points in the 2D (x1, x2) plane. Give the VC dimension of the following hypothesis spaces. No explanation required.

(a) H_r = the set of all axis-parallel rectangles in the (x1, x2) plane. That is, H_r = { (a < x1 < b) AND (c < x2 < d) : a, b, c, d ∈ R }. Points inside the rectangle are classified as positive.

Answer: 4 (see the brute-force check after part (b) below)
[diagram of four example points omitted]

(b) H_a = like (a), but including all rectangles, not just the ones parallel to the axes of the coordinate system.

Answer: at least 5
[diagram of five example points omitted]
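For illustration only (not part of the original handout), here is a small Matlab/Octave brute-force check of the part (a) answer: it enumerates all 16 labelings of a hypothetical four-point diamond configuration and verifies that each labeling's positive subset can be enclosed by an axis-parallel rectangle that excludes the negatives. The point set and all variable names are our own choices.

% Brute-force check that the four points below are shattered by
% axis-parallel rectangles (points inside the rectangle are positive).
P = [0 1; 1 0; 0 -1; -1 0];                 % hypothetical diamond arrangement, one point per row
n = size(P, 1);
shattered = true;
for labeling = 0:2^n - 1
    pos = logical(bitget(labeling, 1:n));   % which points must be labeled positive
    if ~any(pos), continue; end             % all-negative labeling: use an empty rectangle
    lo = min(P(pos, :), [], 1) - 0.1;       % enlarge the bounding box a little so the
    hi = max(P(pos, :), [], 1) + 0.1;       % positive points lie strictly inside it
    for j = find(~pos)                      % no negative point may fall inside the box
        if all(P(j, :) > lo & P(j, :) < hi)
            shattered = false;
        end
    end
end
fprintf('Diamond point set shattered by axis-parallel rectangles: %d\n', shattered);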

(c) H_d = real-valued, depth-2 decision trees. For example, a tree that first splits on x1 (e.g. x1 > x10), then splits each branch on a second attribute with its own threshold, and assigns + or - at each of the four leaves, is in H_d.

Answer 1: at least 5 (assuming the attributes used in the 1st and 2nd splits must be different)
Answer 2: at least 6 otherwise

(e) (Zero points) H = hypotheses of the form f_θ(sin(α x)), with x, α ∈ R, where f_θ(z) = 1 iff z > 0 and 0 otherwise. You can consider this question in 1D space. Zero points: Only Do This For Fun If You Would Like To And If You Have Time.

Answer: infinite (read Burges' tutorial for details).

2. Support Vector Machines (45 points)

The following question requires you to use Matlab, but it has been designed to be just as easy for a Matlab novice as for a Matlab expert. Please read the appendix for how to use Matlab. We will investigate Support Vector Machines with two toy datasets. The file d.clean contains examples, each of which has a real-valued input attribute x_i and a class label y_i. The data set is generated in the following way: x_i ~ p(x), where p(x) = 0.5 iff -1 ≤ x ≤ 1 and p(x) = 0 otherwise; the class label is y_i = +1 if x_i ≥ 0 and y_i = -1 if x_i < 0. In addition, we add noise to this dataset by negating the class of some examples, which produces the noisy dataset d.noise.

Training an SVM involves setting up a convex quadratic optimization problem and solving it. Matlab is a common mathematical programming language that facilitates this by providing quadratic programming functions, qp or quadprog. A Matlab program, svm.m, has been prepared for you that trains the SVM on each of the datasets and outputs results including margin width, training error, number of support vectors and number of misclassifications. In this assignment, you are asked to investigate the impact of the trade-off weight C on margin width and training error. Consider the objective function for the non-separable case:

    L = (1/2) ||w||^2 + C Σ_i ξ_i

The margin width is defined as Margin = 2 / ||w||, and the training set error is defined as Error = (1/n) Σ_i ξ_i.

(a) Read the program and complete the setting up of the Hessian matrix H, and the lower and upper bounds for the Lagrange multipliers α (Alpha). Note that the Hessian matrix H is exactly the Q matrix in our slides.

Answer:
    H(i,j) = X(i,:) * X(j,:)' * Y(i) * Y(j)
    LB = zeros(nsample, 1)
    UB = C * ones(nsample, 1)
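The handout program svm.m is not reproduced here, so the following is only a minimal sketch (Matlab/Octave, using the Optimization Toolbox function quadprog) of how the completed H, LB and UB feed into the dual problem. The names X, Y, C, nsample and Alpha follow the answer above; everything else is our own choice and is not taken from svm.m.

% Minimal sketch of the soft-margin SVM dual solved with quadprog.
% Assumes X is nsample-by-d, Y is nsample-by-1 with entries +1/-1, and C is given.
nsample = size(X, 1);
H   = (Y * Y') .* (X * X');                    % H(i,j) = Y(i)*Y(j)*X(i,:)*X(j,:)' (the Q matrix)
f   = -ones(nsample, 1);                       % minimize 0.5*a'*H*a - sum(a)
LB  = zeros(nsample, 1);                       % 0 <= alpha_i
UB  = C * ones(nsample, 1);                    % alpha_i <= C
Aeq = Y'; beq = 0;                             % sum_i alpha_i * y_i = 0
Alpha = quadprog(H, f, [], [], Aeq, beq, LB, UB);
w  = X' * (Alpha .* Y);                        % w = sum_i alpha_i y_i x_i
sv = find(Alpha > 1e-6 & Alpha < C - 1e-6);    % margin support vectors
if isempty(sv), sv = find(Alpha > 1e-6); end   % approximate fallback for extreme C
b  = mean(Y(sv) - X(sv, :) * w);               % bias from y_i*(w'*x_i + b) = 1 on margin SVs
margin = 2 / norm(w);                          % margin width as defined above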

(b) Measure the impact of the trade-off weight C on the margin width and training error with the clean dataset d.clean. Turn in plots showing how margin width and training error vary with C, including the values 0.01, 0.1, 1, 5, 10, 20, 50, 100, 200, 500, 1000, 2000, 5000, 10000 and inf.

(c) Measure the impact of the trade-off weight C on the margin width and training error with the noisy dataset d.noise. Turn in plots showing how margin width and training error vary with C, including the values 0.01, 0.1, 1, 5, 10, 20, 50, 100, 200, 500, 1000, 2000, 5000, 10000 and inf.

(d) Briefly explain your findings. Some hints:
- When C is small (C = 0.01, 0.1, 1, 5): because w = Σ_i α_i y_i x_i and 0 ≤ α_i ≤ C => w is small => large margin 2/||w||. Please note that the margin value calculated from the formula 2/||w|| may be wrong when C is small.
- When C gets too large (C = inf): the increase of margin and training error for the noisy data when C = inf is probably due to numerical instability. Moreover, setting C to inf is equivalent to assuming the dataset is separable, which isn't true for the noisy data.
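A minimal sketch of such a C sweep is shown below. This is our own code, not the handout's svm.m; the dataset generation follows our reading of the description above, and names such as Cs, margins and trainerr are arbitrary.

% Sketch of the C sweep for parts (b)-(d): for each C, train the SVM and
% record the margin width 2/||w|| and the training misclassification rate.
Cs = [0.01 0.1 1 5 10 20 50 100 200 500 1000 2000 5000 10000];   % C = inf omitted in this sketch
n  = 100;                                 % sample size (our choice)
x  = 2 * rand(n, 1) - 1;                  % x uniform on [-1, 1], density 0.5
y  = sign(x); y(y == 0) = 1;              % clean labels
% flip = rand(n, 1) < 0.1; y(flip) = -y(flip);   % uncomment to make a noisy version
margins  = zeros(size(Cs));
trainerr = zeros(size(Cs));
for k = 1:numel(Cs)
    C     = Cs(k);
    H     = (y * y') .* (x * x');
    Alpha = quadprog(H, -ones(n, 1), [], [], y', 0, zeros(n, 1), C * ones(n, 1));
    w     = x' * (Alpha .* y);
    sv    = find(Alpha > 1e-6 & Alpha < C - 1e-6);
    if isempty(sv), sv = find(Alpha > 1e-6); end
    b     = mean(y(sv) - x(sv) * w);
    margins(k)  = 2 / abs(w);                       % 1-D input, so ||w|| = |w|
    trainerr(k) = mean(sign(x * w + b) ~= y);       % training misclassification rate
end
semilogx(Cs, margins, '-o');  xlabel('C'); ylabel('margin width 2/||w||');
figure;
semilogx(Cs, trainerr, '-o'); xlabel('C'); ylabel('training error');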

3. K-Nearest Neighbor in Regression (25 points)

Suppose we have a real-valued training dataset {(x_1, y_1), ..., (x_N, y_N)}, x_i ∈ R, y_i ∈ R, which is generated using the following distribution:

    y_i ~ N(μ, σ²), where μ is unknown to us. Note that we assume the variance σ² is known.
    x_i ~ p(x), where p(x) = 1 for 0 ≤ x ≤ 1, otherwise p(x) = 0.

Our task is to compare the performance of the following two regression algorithms.

Alg 1: Use Maximum Likelihood Estimation (MLE) to learn μ from the dataset. The MLE results in μ̂ = (1/N) Σ_i y_i. For any input x, the output is simply μ̂.

Alg 2: Use 1-Nearest Neighbor to predict. Namely, ŷ(x) = y_j, where y_j is the output value associated with the training set datapoint x_j that is the nearest neighbor of x. If there is a tie among multiple training datapoints for being the nearest neighbor of x, then we just randomly select one of them.

(a) Assume N → ∞. What is the expected squared error of Alg 1 and Alg 2 on the training set?

For Alg 1, lim_{N→∞} (1/N) Σ_i (ŷ(x_i) - y_i)² = ?
For Alg 2, lim_{N→∞} (1/N) Σ_i (ŷ(x_i) - y_i)² = ?

For Alg 1:
Answer: lim_{N→∞} (1/N) Σ_i (ŷ(x_i) - y_i)² = lim_{N→∞} (1/N) Σ_i (μ̂ - y_i)² = σ²
(as N → ∞, μ̂ → μ, and the average squared deviation of the y_i around μ converges to σ²)

For Alg 2:
Answer 1: Because x is a real-valued continuous variable, theoretically |x_i - x_j| > 0 for any two training examples x_i and x_j, i ≠ j. Namely, no two training examples have the same x value, so each training point is its own nearest neighbor. Therefore,
    lim_{N→∞} (1/N) Σ_i (ŷ(x_i) - y_i)² = 0

Answer 2: When we use a computer to generate x, we will find that x becomes discrete due to the accuracy loss of the computer. Moreover, there will be an infinite number of training examples with the same x value when N → ∞. In this case the tie is broken at random, the chance of selecting x_i itself vanishes, and a tied neighbor j ≠ i contributes E[(y_j - y_i)²] = 2σ², so
    lim_{N→∞} (1/N) Σ_i (ŷ(x_i) - y_i)² = 2σ²

(b) Assume N → ∞. What is the expected squared error of Alg 1 and Alg 2 for predicting the output of a future data point (x, y), generated in the same way as the training data?

For Alg 1, E[(ŷ(x) - y)²] = ?
For Alg 2, E[(ŷ(x) - y)²] = ?

For Alg 1:
Answer: E[(ŷ(x) - y)²] = E[(μ̂ - y)²] = σ²   (as N → ∞, μ̂ → μ)

For Alg 2:
Answer:

E[(ŷ(x) - y)²] = E[(y_NN - y)²]
               = E[(y_NN - μ)²] + E[(y - μ)²]   (y_NN and y are independent draws from N(μ, σ²), so the cross term vanishes)
               = σ² + σ²
               = 2σ²
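The 2σ² result can also be checked empirically. Below is a small simulation sketch (our own, Matlab/Octave; μ, σ, N and the number of trials are arbitrary choices) comparing the two algorithms' squared error on a future data point.

% Simulation sketch for part (b): with y ~ N(mu, sigma^2) drawn independently
% of x ~ Uniform(0, 1), the mean predictor's expected squared error on a new
% point approaches sigma^2 while the 1-NN predictor's approaches 2*sigma^2.
mu = 3; sigma = 1; N = 5000; trials = 2000;
err_mle = zeros(trials, 1);
err_nn  = zeros(trials, 1);
for t = 1:trials
    x  = rand(N, 1);  y  = mu + sigma * randn(N, 1);   % training set
    xq = rand;        yq = mu + sigma * randn;          % future data point
    yhat_mle = mean(y);                                  % Alg 1: MLE of the mean
    [~, j]   = min(abs(x - xq));                         % Alg 2: nearest neighbor in x
    yhat_nn  = y(j);
    err_mle(t) = (yhat_mle - yq)^2;
    err_nn(t)  = (yhat_nn  - yq)^2;
end
fprintf('Alg 1 (MLE):  %.3f   (theory: sigma^2   = %.3f)\n', mean(err_mle), sigma^2);
fprintf('Alg 2 (1-NN): %.3f   (theory: 2*sigma^2 = %.3f)\n', mean(err_nn),  2 * sigma^2);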