Linear Classifiers III


1 Universität Potsdam, Institut für Informatik, Lehrstuhl Maschinelles Lernen. Linear Classifiers III. Blaine Nelson, Tobias Scheffer

2 Contents: Classification Problem; Bayesian Classifier / Decision; Linear Classifiers, MAP Models; Logistic Regression; Regularized Empirical Risk Minimization; Kernel Perceptron, Support Vector Machine; Ridge Regression, LASSO; Representer Theorem; Dualized Perceptron, Dual SVM; Mercer Map; Learning with Structured Input & Output; Taxonomy, Sequences, Ranking; Decoder, Cutting Plane Algorithm

3 Review: Linear Models. Linear classifiers: binary classifier $f_\theta(x) = \phi(x)^T \theta + b$; multiclass classifier $f_\theta(x, y) = \phi(x)^T \theta_y + b_y$. Many learning methods minimize the sum of loss functions over the training data plus a regularizer: $\arg\min_\theta \sum_{i=1}^n \ell(f_\theta(x_i), y_i) + c\, \Omega(\theta)$. The choice of loss & regularizer gives different methods: logistic regression, perceptron, SVM.

4 Review: Feature Mappings. All considered linear methods can be made nonlinear by means of a feature mapping $\phi$. Better separations can be obtained in feature space, e.g., $\phi(x_1, x_2) = (x_1 x_2, x_1^2, x_2^2)$. A hyperplane in feature space corresponds to a nonlinear surface in the original space.

5 Dual Form Linear Model: Motivation. The feature mapping $\phi(x)$ can be high dimensional; the size of the estimated parameter vector $\theta$ depends on the dimensionality of $\phi$, which could even be infinite! Computation of $\phi(x)$ is expensive: $\phi$ must be computed for each training point $x_i$ and for each prediction $x$. This incurs high computational & memory requirements. How can we adapt linear methods to efficiently incorporate a high-dimensional $\phi$?

6 Dual Form Linear Model. Representer Theorem: if $g$ is strictly monotonically increasing, then the $\theta$ that minimizes $L(\theta) = \sum_{i=1}^n \ell(f_\theta(x_i), y_i) + g(\|\theta\|^2)$ has the form $\theta = \sum_{i=1}^n \alpha_i \phi(x_i)$ with $\alpha_i \in \mathbb{R}$, and hence $f_\theta(x) = \sum_{i=1}^n \alpha_i \phi(x_i)^T \phi(x)$. The inner product is a measure of similarity between samples. In general $\theta$ could be any vector in the feature space $\Phi$, but we show it must lie in the span of the data.

7 Representer Theorem: Proof. Orthogonal decomposition: $\theta = \theta_\parallel + \theta_\perp$, with $\theta_\parallel \in \Theta_\parallel = \{\sum_{i=1}^n \alpha_i \phi(x_i) : \alpha_i \in \mathbb{R}\}$ and $\theta_\perp \in \Theta_\perp = \{\theta_\perp : \theta_\parallel^T \theta_\perp = 0 \text{ for all } \theta_\parallel \in \Theta_\parallel\}$. Recall $L(\theta) = \sum_{i=1}^n \ell(f_\theta(x_i), y_i) + g(\|\theta\|^2)$.

8 Representer Theorem: Proof. With the orthogonal decomposition $\theta = \theta_\parallel + \theta_\perp$ as above, for any training point $x_i$ it follows that $f_\theta(x_i) = \theta_\parallel^T \phi(x_i) + \theta_\perp^T \phi(x_i) = \theta_\parallel^T \phi(x_i)$. Why is $\theta_\perp^T \phi(x_i) = 0$? Because $\phi(x_i) \in \Theta_\parallel$ and $\theta_\perp$ is orthogonal to all of $\Theta_\parallel$.

9 Representer Theorem: Proof. Since $f_\theta(x_i) = \theta_\parallel^T \phi(x_i) + \theta_\perp^T \phi(x_i) = \theta_\parallel^T \phi(x_i)$ for every training point, the loss term $\sum_{i=1}^n \ell(f_\theta(x_i), y_i)$ is independent of $\theta_\perp$. For the regularizer, $g(\|\theta\|^2) = g(\|\theta_\parallel + \theta_\perp\|^2) = g(\|\theta_\parallel\|^2 + \|\theta_\perp\|^2) \ge g(\|\theta_\parallel\|^2)$, using $\theta_\parallel^T \theta_\perp = 0$ (Pythagorean theorem) and the fact that $g$ is strictly monotonically increasing. Hence the minimizer satisfies $\theta_\perp = 0$.

10 Representer Theorem. Given training data $T = \{(x_1, y_1), \dots, (x_n, y_n)\}$ and a feature mapping $\phi(x)$, we construct a linear function $f_\theta(x) = \theta^T \phi(x)$; i.e., we find a hyperplane $\theta$. The hyperplane $\theta$ that minimizes $L(\theta) = \sum_{i=1}^n \ell(f_\theta(x_i), y_i) + g(\|\theta\|^2)$ can be represented as $f_\theta(x) = \theta^T \phi(x) = f_\alpha(x) = \sum_{i=1}^n \alpha_i \phi(x_i)^T \phi(x)$. Primal view: $f_\theta(x) = \theta^T \phi(x)$; the hypothesis $\theta$ has as many parameters as the dimensionality of $\phi(x)$. Dual view: $f_\alpha(x) = \sum_{i=1}^n \alpha_i \phi(x_i)^T \phi(x)$; the hypothesis has as many parameters $\alpha_i$ as there are samples.

11 Representer Theorem. Primal view: $f_\theta(x) = \theta^T \phi(x)$; the hypothesis $\theta$ has as many parameters as the dimensionality of $\phi(x)$. Good if there are many samples with few attributes. Dual view: $f_\alpha(x) = \sum_{i=1}^n \alpha_i \phi(x_i)^T \phi(x)$; the hypothesis has as many parameters $\alpha_i$ as there are samples. Good if there are few samples with high dimensionality. The representation $\phi(x)$ can even be infinite dimensional, as long as the inner product can be computed efficiently, i.e., by a kernel function.

12 Dual Form of a Linear Model. A parameter vector $\theta$ that minimizes a regularized loss function is always a linear combination of the training samples: $\theta = \sum_{i=1}^n \alpha_i \phi(x_i)$. The dual form $\alpha$ has as many parameters $\alpha_i$ as there are training samples; dual decision function: $f_\alpha(x) = \sum_{i=1}^n \alpha_i \phi(x_i)^T \phi(x)$. The primal form $\theta$ has as many parameters $\theta_i$ as the dimensionality of the feature mapping $\phi(x)$; primal decision function: $f_\theta(x) = \theta^T \phi(x)$. The dual form is advantageous if there are few samples and many attributes.
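
As a concrete illustration of the two views, here is a minimal numpy sketch (the feature map, data, and coefficients are invented for illustration, not taken from the slides) checking that the primal and dual decision functions coincide once $\theta = \sum_i \alpha_i \phi(x_i)$:

```python
import numpy as np

rng = np.random.default_rng(0)

def phi(X):
    """Example feature map (x1, x2, x1*x2, x1^2, x2^2), applied row-wise."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([x1, x2, x1 * x2, x1**2, x2**2])

X = rng.normal(size=(5, 2))            # 5 training points in 2-D
alpha = rng.normal(size=5)             # arbitrary dual coefficients

theta = phi(X).T @ alpha               # primal parameters: theta = sum_i alpha_i phi(x_i)

x_new = rng.normal(size=(1, 2))
f_primal = (phi(x_new) @ theta)[0]                  # theta^T phi(x)
f_dual = alpha @ (phi(X) @ phi(x_new).T).ravel()    # sum_i alpha_i phi(x_i)^T phi(x)
assert np.isclose(f_primal, f_dual)
```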

13 DUAL PERCEPTRON

14 Dualized Perceptron. Perceptron classification: $f_\theta(x_i) = \theta^T \phi(x_i)$. The algorithm halts when the following holds for all samples: $y_i f_\theta(x_i) > 0$, i.e., the sample lies on the correct side of the hyperplane. Update step: $\theta = \theta + y_i \phi(x_i)$.

15 Dualized Perceptron. Perceptron classification: $f_\theta(x_i) = \theta^T \phi(x_i)$, dually $f_\alpha(x_i) = \sum_{j=1}^n \alpha_j \phi(x_j)^T \phi(x_i)$. The algorithm halts when the following holds for all samples: $y_i f_\theta(x_i) > 0 \Leftrightarrow y_i \sum_{j=1}^n \alpha_j \phi(x_j)^T \phi(x_i) > 0$, i.e., the sample lies on the correct side of the hyperplane. Update step: $\theta = \theta + y_i \phi(x_i)$.

16 Dualized Perceptron. Perceptron classification: $f_\theta(x_i) = \theta^T \phi(x_i)$, dually $f_\alpha(x_i) = \sum_{j=1}^n \alpha_j \phi(x_j)^T \phi(x_i)$. The algorithm halts when $y_i f_\theta(x_i) > 0 \Leftrightarrow y_i \sum_{j=1}^n \alpha_j \phi(x_j)^T \phi(x_i) > 0$ for all samples. The primal update step $\theta = \theta + y_i \phi(x_i)$ becomes, in dual form: $\sum_{j=1}^n \alpha_j^{\text{new}} \phi(x_j) = \sum_{j=1}^n \alpha_j^{\text{old}} \phi(x_j) + y_i \phi(x_i)$, so $\alpha_i^{\text{new}} \phi(x_i) = \alpha_i^{\text{old}} \phi(x_i) + y_i \phi(x_i)$ and thus $\alpha_i^{\text{new}} = \alpha_i^{\text{old}} + y_i$.

17 Dualized Perceptron Algorithm.

Perceptron(Instances $(x_i, y_i)$):
  Set $\alpha = 0$
  DO
    FOR $i = 1, \dots, n$:
      IF $y_i f_\alpha(x_i) \le 0$ THEN $\alpha_i = \alpha_i + y_i$
  WHILE $\alpha$ changes
  RETURN $\alpha$

Decision function: $f_\alpha(x) = \sum_{i=1}^n \alpha_i \phi(x_i)^T \phi(x)$.

18 Dualized Perceptron. Perceptron loss, no regularizer. Dual form of the decision function: $f_\alpha(x) = \sum_{i=1}^n \alpha_i \phi(x_i)^T \phi(x)$. Dual form of the update rule: if $y_i f_\alpha(x_i) \le 0$, then $\alpha_i = \alpha_i + y_i$. Equivalent to the primal form of the perceptron; advantageous to use instead of the primal perceptron if there are few samples and $\phi(x)$ is high dimensional.
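
The dual perceptron only ever touches the data through inner products, so the algorithm on slide 17 can be run directly on a precomputed Gram matrix. A minimal Python sketch, assuming an RBF kernel and toy data (both are placeholder choices, not from the slides):

```python
import numpy as np

def rbf(A, B, gamma=1.0):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_perceptron(X, y, kernel, max_epochs=100):
    """Dual perceptron: one coefficient alpha_i per training sample;
    f_alpha(x_i) = sum_j alpha_j k(x_j, x_i), update alpha_i += y_i on mistakes."""
    n = len(y)
    K = kernel(X, X)                           # precomputed n x n Gram matrix
    alpha = np.zeros(n)
    for _ in range(max_epochs):
        changed = False
        for i in range(n):
            if y[i] * (alpha @ K[:, i]) <= 0:  # misclassified (or on the boundary)
                alpha[i] += y[i]
                changed = True
        if not changed:                        # all samples on the correct side
            break
    return alpha

# toy usage: XOR-like data, not linearly separable in the input space
X = np.array([[0., 0.], [1., 1.], [0., 1.], [1., 0.]])
y = np.array([-1., -1., 1., 1.])
alpha = kernel_perceptron(X, y, rbf)
predictions = np.sign(alpha @ rbf(X, X))       # f_alpha evaluated on the training points
```

Replacing `rbf` with any other positive definite kernel leaves the algorithm unchanged; this is exactly the kernel perceptron from the summary slide.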

19 DUAL SUPPORT VECTOR MACHINE

20 Dualized Support Vector Machine. Primal: $\min_\theta \sum_{i=1}^n \max(0,\, 1 - y_i \phi(x_i)^T \theta) + \frac{1}{2\lambda} \theta^T \theta$. Equivalent optimization problem with side constraints: $\min_{\theta,\xi} \lambda \sum_{i=1}^n \xi_i + \frac{1}{2} \theta^T \theta$ such that $y_i \phi(x_i)^T \theta \ge 1 - \xi_i$ and $\xi_i \ge 0$. Goal: a dual formulation of the optimization problem.
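
To see why the two problems are equivalent, note that at the optimum each slack variable is tight against the hinge loss:

```latex
\xi_i = \max\bigl(0,\; 1 - y_i\, \phi(x_i)^T \theta\bigr)
```

Substituting this back turns $\lambda \sum_i \xi_i + \frac{1}{2}\theta^T\theta$ into $\lambda$ times the unconstrained hinge-loss objective above, so both problems have the same minimizer.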

21 Dualized Support Vector Machine. Optimization problem with side constraints: $\min_{\theta,\xi} \lambda \sum_{i=1}^n \xi_i + \frac{1}{2} \theta^T \theta$ such that $y_i \phi(x_i)^T \theta \ge 1 - \xi_i$ and $\xi_i \ge 0$. Lagrange function with Lagrange multipliers $\beta_i \ge 0$ and $\beta_i^0 \ge 0$ for the side constraints: $L(\theta, \xi, \beta, \beta^0) = \lambda \sum_{i=1}^n \xi_i + \frac{\theta^T \theta}{2} - \sum_{i=1}^n \beta_i (y_i \phi(x_i)^T \theta - 1 + \xi_i) - \sum_{i=1}^n \beta_i^0 \xi_i$. In general: goal function $Z(\theta, \xi)$, side constraints $g(\theta, \xi) \ge 0$, Lagrange function $Z(\theta, \xi) - \beta g(\theta, \xi)$. Optimization problem without side constraints: $\min_{\theta,\xi} \max_{\beta, \beta^0 \ge 0} L(\theta, \xi, \beta, \beta^0)$.

22 Dualized Support Vector Machine. Lagrange function: $L(\theta, \xi, \beta, \beta^0) = \lambda \sum_{i=1}^n \xi_i + \frac{\theta^T \theta}{2} - \sum_{i=1}^n \beta_i (y_i \phi(x_i)^T \theta - 1 + \xi_i) - \sum_{i=1}^n \beta_i^0 \xi_i$. Since it is convex in $\theta, \xi$, the strong duality theorem gives: $\min_{\theta,\xi} \max_{\beta,\beta^0} L(\theta, \xi, \beta, \beta^0) = \max_{\beta,\beta^0} \min_{\theta,\xi} L(\theta, \xi, \beta, \beta^0)$. For the inner minimum, set the derivatives of $L$ w.r.t. $\theta, \xi$ to zero: $\frac{\partial L}{\partial \theta} = 0 \Rightarrow \theta = \sum_{i=1}^n \beta_i y_i \phi(x_i) = \sum_{i=1}^n \alpha_i \phi(x_i)$; $\frac{\partial L}{\partial \xi_i} = 0 \Rightarrow \lambda = \beta_i + \beta_i^0$. This is the relation between primal and dual parameters: the Representer Theorem.

23 Dualized Support Vector Machine. Substitute the derived parameters $\theta = \sum_{i=1}^n \beta_i y_i \phi(x_i)$ and $\lambda = \beta_i + \beta_i^0$ into the Lagrange function: $L(\theta, \xi, \beta, \beta^0) = \frac{1}{2} \theta^T \theta - \sum_{i=1}^n \beta_i (y_i \phi(x_i)^T \theta - 1 + \xi_i) - \sum_{i=1}^n \beta_i^0 \xi_i + \lambda \sum_{i=1}^n \xi_i$.

24 Dualized Support Vector Machine. Substituting $\theta = \sum_{i=1}^n \beta_i y_i \phi(x_i)$: $L = \frac{1}{2} \left(\sum_{i=1}^n \beta_i y_i \phi(x_i)\right)^T \left(\sum_{j=1}^n \beta_j y_j \phi(x_j)\right) - \sum_{i=1}^n \beta_i \left(y_i \phi(x_i)^T \left(\sum_{j=1}^n \beta_j y_j \phi(x_j)\right) - 1 + \xi_i\right) - \sum_{i=1}^n \beta_i^0 \xi_i + \lambda \sum_{i=1}^n \xi_i$.

25 Dualized Support Vector Machine. Expanding the products: $L = \frac{1}{2} \sum_{i,j=1}^n \beta_i \beta_j y_i y_j \phi(x_i)^T \phi(x_j) - \sum_{i,j=1}^n \beta_i \beta_j y_i y_j \phi(x_i)^T \phi(x_j) + \sum_{i=1}^n \beta_i - \sum_{i=1}^n (\beta_i + \beta_i^0) \xi_i + \lambda \sum_{i=1}^n \xi_i$, where $\beta_i + \beta_i^0 = \lambda$.

26 Dualized Support Vector Machine. Since $\beta_i + \beta_i^0 = \lambda$, the $\xi$ terms cancel, leaving $L = \sum_{i=1}^n \beta_i - \frac{1}{2} \sum_{i,j=1}^n \beta_i \beta_j y_i y_j \phi(x_i)^T \phi(x_j)$.

27 Dualized Support Vector Machine. Substituting the derived parameters into the Lagrange function yields $L(\beta) = \sum_{i=1}^n \beta_i - \frac{1}{2} \sum_{i,j=1}^n \beta_i \beta_j y_i y_j \phi(x_i)^T \phi(x_j)$. Since $\beta_i^0 \ge 0$ and $\lambda = \beta_i + \beta_i^0$, it follows that $0 \le \beta_i \le \lambda$. Optimization criterion of the dual SVM: $\max_\beta \sum_{i=1}^n \beta_i - \frac{1}{2} \sum_{i,j=1}^n \beta_i \beta_j y_i y_j \phi(x_i)^T \phi(x_j)$ such that $0 \le \beta_i \le \lambda$. The linear term acts as an L1 regularizer on $\beta$ (sparse solutions); the quadratic term is large if $\beta_i, \beta_j > 0$ for similar samples of different classes.

28 Dualized Support Vector Machine. The multipliers correspond to the constraints: $\beta_i$: $y_i \phi(x_i)^T \theta \ge 1 - \xi_i$; $\beta_i^0$: $\xi_i \ge 0$; and $\lambda = \beta_i + \beta_i^0$. A Lagrange multiplier is greater than 0 exactly when its corresponding constraint is fulfilled with equality. $\beta_i = 0$ ($\beta_i^0 = \lambda$): we have $y_i \phi(x_i)^T \theta > 1 - \xi_i$ and $\xi_i = 0$ (the distance to the hyperplane exceeds the margin). $\beta_i = \lambda$ ($\beta_i^0 = 0$): we have $y_i \phi(x_i)^T \theta = 1 - \xi_i$ and $\xi_i > 0$ (the sample violates the margin). $0 < \beta_i < \lambda$: we have $y_i \phi(x_i)^T \theta = 1 - \xi_i$ and $\xi_i = 0$ (the sample lies on the margin).

29 Dualized Support Vector Machine. A Lagrange multiplier is greater than 0 exactly when its corresponding constraint is fulfilled with equality. $\beta_i = 0$ ($\beta_i^0 = \lambda$): we have $y_i \phi(x_i)^T \theta > 1 - \xi_i$ and $\xi_i = 0$ (the distance to the hyperplane exceeds the margin).

30 Dualized Support Vector Machine. A Lagrange multiplier is greater than 0 exactly when its corresponding constraint is fulfilled with equality. $\beta_i = \lambda$ ($\beta_i^0 = 0$): we have $y_i \phi(x_i)^T \theta = 1 - \xi_i$ and $\xi_i > 0$ (the sample violates the margin).

31 Dualized Support Vector Machine. A Lagrange multiplier is greater than 0 exactly when its corresponding constraint is fulfilled with equality. $0 < \beta_i < \lambda$: we have $y_i \phi(x_i)^T \theta = 1 - \xi_i$ and $\xi_i = 0$ (the sample lies on the margin).

32 Dualized Support Vector Machine. Optimization criterion of the dual SVM: $\max_\beta \sum_{i=1}^n \beta_i - \frac{1}{2} \sum_{i,j=1}^n \beta_i \beta_j y_i y_j \phi(x_i)^T \phi(x_j)$ such that $0 \le \beta_i \le \lambda$. Optimization over $n$ parameters $\beta$; the solution is found with a QP solver in $O(n^2)$. Sparse solution. Samples only appear as pairwise inner products.
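
The box-constrained dual is a standard quadratic program. A sketch using the cvxopt QP solver, which minimizes $\frac{1}{2}\beta^T P \beta + q^T \beta$ subject to $G\beta \le h$ (the linear-kernel toy data at the bottom is an assumption for illustration):

```python
import numpy as np
from cvxopt import matrix, solvers

def dual_svm(K, y, lam):
    """Maximize sum_i beta_i - 1/2 sum_ij beta_i beta_j y_i y_j K_ij
    subject to 0 <= beta_i <= lam, posed as the equivalent minimization
    of 1/2 beta^T P beta + q^T beta with P_ij = y_i y_j K_ij and q = -1."""
    n = len(y)
    P = matrix(np.outer(y, y) * K)
    q = matrix(-np.ones(n))
    G = matrix(np.vstack([-np.eye(n), np.eye(n)]))        # encodes -beta <= 0, beta <= lam
    h = matrix(np.hstack([np.zeros(n), lam * np.ones(n)]))
    solvers.options["show_progress"] = False
    return np.array(solvers.qp(P, q, G, h)["x"]).ravel()

# toy usage with a linear kernel (illustration only)
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
y = np.where(X[:, 0] > 0, 1.0, -1.0)
beta = dual_svm(X @ X.T, y, lam=1.0)
support = np.flatnonzero(beta > 1e-6)    # support vectors: beta_i > 0
```

Note that this formulation has no bias term $b$, matching the primal on slide 20, so the equality constraint $\sum_i \beta_i y_i = 0$ that appears in textbook SVM duals with a bias is absent here.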

33 Dualized Support Vector Machine. The primal and dual optimization problems have the same solution: $\theta = \sum_{x_i \in SV} \beta_i y_i \phi(x_i)$, where the support vectors $SV$ are the samples with $\beta_i > 0$. Dual form of the decision function: $f_\beta(x) = \sum_{x_i \in SV} \beta_i y_i \phi(x_i)^T \phi(x)$. Primal SVM: the solution is a vector $\theta$ in the space of the attributes. Dual SVM: the same solution is represented as weights $\beta_i$ on the samples.

34 Dualized Support Vector Machine. Hinge loss, L2 regularization. Dual form of the decision function: $f_\beta(x) = \sum_{x_i \in SV} \beta_i y_i \phi(x_i)^T \phi(x)$. Dual form of the optimization problem: $\max_\beta \sum_{i=1}^n \beta_i - \frac{1}{2} \sum_{i,j=1}^n \beta_i \beta_j y_i y_j \phi(x_i)^T \phi(x_j)$ such that $0 \le \beta_i \le \lambda$. Primal and dual optimization problems have identical solutions but different forms; the dual is advantageous if there are few samples and $\phi(x)$ is high dimensional.

35 Kernel Support Vector Machine. Optimization criterion of the kernel SVM: $\max_\beta \sum_{i=1}^n \beta_i - \frac{1}{2} \sum_{i,j=1}^n \beta_i \beta_j y_i y_j k(x_i, x_j)$ such that $0 \le \beta_i \le \lambda$, where $k$ is the inner product function. Decision function: $f_\beta(x) = \sum_{x_i \in SV} \beta_i y_i k(x_i, x)$. Samples only interact through the kernel function $k(x_i, x_j)$; the feature mapping $\phi$ no longer appears in the optimization problem or the decision function.
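
In practice one rarely hand-codes the solver; for example, scikit-learn's SVC accepts a precomputed Gram matrix. A small sketch (toy data and kernel parameters are assumptions; sklearn's C plays a role analogous to the box bound $\lambda$ above, up to scaling conventions):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_train = rng.normal(size=(20, 3))
X_test = rng.normal(size=(5, 3))
y_train = np.where(X_train[:, 0] > 0, 1, -1)   # labels from a simple rule

def rbf(A, B, gamma=0.5):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

clf = SVC(kernel="precomputed", C=1.0)
clf.fit(rbf(X_train, X_train), y_train)        # n_train x n_train Gram matrix
y_pred = clf.predict(rbf(X_test, X_train))     # n_test x n_train kernel evaluations
```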

36 KERNELS

37 Kernels and Kernel Methods. The feature mapping $\phi(x)$ can be high dimensional; the number of estimated parameters $\theta$ depends on $\phi$, and computing $\phi(x)$ is expensive. Previously: given $\phi(x)$, $\phi(x)^T \phi(x')$ measures the similarity between samples, and many methods can be formulated so that samples only appear as pairwise inner products. Idea: replace the inner product with any similarity measure $k(x, x') = \phi(x)^T \phi(x')$ and map samples only implicitly. For which functions $k$ does there exist a mapping $\phi(x)$ such that $k$ represents an inner product?

38 Kernel Functions: Motivation. Can we simply choose $k(x, x')$ to be any function? We need $k$ to be an inner product in some feature space; otherwise we lose both meaning and convexity. Optimization criterion of the kernel SVM: $\max_{0 \le \beta \le \lambda} \sum_{i=1}^n \beta_i - \frac{1}{2} \sum_{i,j=1}^n \beta_i \beta_j y_i y_j k(x_i, x_j) = \max_{0 \le \beta \le \lambda} \mathbf{1}^T \beta - \frac{1}{2} (y \circ \beta)^T K (y \circ \beta)$, with $K_{ij} = k(x_i, x_j)$. This optimization is convex (with a unique solution) if $K$ is positive semi-definite (non-negative eigenvalues).

39 Recap: Positive Definiteness. A matrix $K$ is called positive semi-definite (PSD) if $x^T K x \ge 0$ holds for all $x$; it is called positive definite if equality holds only at $x = 0$. A function $k$ is called positive semi-definite (PSD) if $\int\!\!\int z(x)\, k(x, x')\, z(x')\, dx\, dx' \ge 0$ holds for all continuous functions $z$.
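
A quick numerical check of the matrix criterion, via the equivalent condition that all eigenvalues are non-negative (a small sketch; the tolerance is an arbitrary choice):

```python
import numpy as np

def is_psd(K, tol=1e-10):
    """Matrix criterion: K is PSD iff all eigenvalues are non-negative."""
    K = (K + K.T) / 2                      # symmetrize against round-off
    return bool(np.all(np.linalg.eigvalsh(K) >= -tol))

Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])             # a valid covariance matrix
assert is_psd(Sigma)
assert not is_psd(np.array([[0.0, 1.0],
                            [1.0, 0.0]]))  # eigenvalues +1 and -1
```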

40 Recap: Positive Definiteness. A matrix $K$ is called positive semi-definite if $x^T K x \ge 0$ for all $x$. Example: a covariance matrix $\Sigma$, as in the Gaussian density $\mathcal{N}(x; \mu, \Sigma) = \frac{1}{\sqrt{(2\pi)^m |\Sigma|}} e^{-\frac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)}$. Positive definite matrices are invertible, and their inverses are also positive definite. (Figure: contour plot of a two-dimensional Gaussian density.)

41 Recap: Positive Definiteness. A matrix $K$ is called positive semi-definite if $x^T K x \ge 0$ for all $x$. Example: a covariance matrix $\Sigma$, as in $\mathcal{N}(x; \mu, \Sigma) = \frac{1}{\sqrt{(2\pi)^m |\Sigma|}} e^{-\frac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)}$. Positive definiteness implies a norm: $\|x\| = \sqrt{x^T \Sigma^{-1} x}$.

42 Recap: Positive Definiteness. A matrix $K$ is called positive semi-definite if $x^T K x \ge 0$ for all $x$. Example: a covariance matrix $\Sigma$, as in $\mathcal{N}(x; \mu, \Sigma) = \frac{1}{\sqrt{(2\pi)^m |\Sigma|}} e^{-\frac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)}$. Positive definiteness implies a norm $\|x\| = \sqrt{x^T \Sigma^{-1} x}$ and the Mahalanobis distance $d(x, x') = \sqrt{(x - x')^T \Sigma^{-1} (x - x')}$.

43 Kernels. Theorem: for every positive definite function $k$ there exists a mapping $\phi(x)$ such that $k(x, x') = \phi(x)^T \phi(x')$ for all $x$ and $x'$. This mapping is not unique: for example, consider $\phi_1(x) = x$ and $\phi_2(x) = -x$; then $\phi_1(x)^T \phi_1(x') = x^T x' = (-x)^T (-x') = \phi_2(x)^T \phi_2(x')$. Gram matrix or kernel matrix $K$, with $K_{ij} = k(x_i, x_j)$: the matrix of inner products = pairwise similarities between samples; an $n \times n$ matrix. $k(x, x')$ is PSD iff $K$ is a PSD matrix for every dataset.

44 Kernels. Theorem: for every positive definite function $k$ there exists a mapping $\phi(x)$ such that $k(x, x') = \phi(x)^T \phi(x')$ for all $x$ and $x'$. Constructive proofs: (1) Reproducing Kernel Hilbert Space (RKHS). Idea: define the mapping as the function $\phi(x) = k(x, \cdot)$, define an inner product $\langle \cdot, \cdot \rangle$ between functions, and show $k(x, x') = \langle k(x, \cdot), k(x', \cdot) \rangle$. (2) Mercer mapping. Idea: decomposition of $k$ in terms of its eigenfunctions; practically relevant: the finite case.

45 MERCER MAP

46 Mercer Map. Eigenvalue decomposition: every symmetric matrix $K$ can be decomposed in terms of its eigenvectors $u_i$ and eigenvalues $\lambda_i$: $K = U \Lambda U^{-1}$, with $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$ and $U = (u_1 \cdots u_n)$. If $K$ is positive semi-definite, then $\lambda_i \in \mathbb{R}_{\ge 0}$. The eigenvectors are orthonormal ($u_i^T u_i = 1$ and $u_i^T u_j = 0$ for $i \ne j$) and $U$ is orthogonal: $U^T = U^{-1}$.

47 Mercer Map. Thus the eigenvalue decomposition gives $K = U \Lambda U^T = U \Lambda^{1/2} \Lambda^{1/2} U^T = (U \Lambda^{1/2})(U \Lambda^{1/2})^T$, where $\Lambda^{1/2}$ is the diagonal matrix with entries $\sqrt{\lambda_i}$. A feature mapping for the training data can then be defined as $(\phi(x_1) \cdots \phi(x_n)) = (U \Lambda^{1/2})^T$.

48 Mercer Map. With the training features $(\phi(x_1) \cdots \phi(x_n)) = (U \Lambda^{1/2})^T$, the kernel matrix between training and test data is $K_{\text{test}} = \Phi(X_{\text{train}})^T \Phi(X_{\text{test}}) = U \Lambda^{1/2} \Phi(X_{\text{test}})$. Solving this equation yields a mapping of the test data: $\Phi(X_{\text{test}}) = (U \Lambda^{1/2})^{-1} K_{\text{test}} = \Lambda^{-1/2} U^T K_{\text{test}}$, using $U^T = U^{-1}$.

49 Mercer Map. Useful if a learning problem is given as a kernel function but learning should take place in the primal, for example if the kernel matrix would be too large (quadratic memory consumption!).
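
A numpy sketch of the Mercer map: factor the training kernel matrix with an eigendecomposition and project test points into the same finite-dimensional feature space (function name and tolerance are my own choices):

```python
import numpy as np

def mercer_map(K_train, K_test, tol=1e-10):
    """K_train: (n, n) PSD kernel matrix on the training data.
    K_test: (n, m) kernel values between training and test points.
    Returns Phi_train (d, n) and Phi_test (d, m) with
    Phi_train^T Phi_train = K_train (up to dropped near-zero eigenvalues)."""
    lam, U = np.linalg.eigh(K_train)              # K = U diag(lam) U^T
    keep = lam > tol                              # discard numerically zero eigenvalues
    lam, U = lam[keep], U[:, keep]
    Phi_train = np.sqrt(lam)[:, None] * U.T                    # Lambda^{1/2} U^T
    Phi_test = (1.0 / np.sqrt(lam))[:, None] * (U.T @ K_test)  # Lambda^{-1/2} U^T K_test
    return Phi_train, Phi_test

# sanity check: the explicit features reproduce the Gram matrix
X = np.random.default_rng(0).normal(size=(6, 2))
K = np.exp(-((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))    # RBF Gram matrix
Phi_train, Phi_test = mercer_map(K, K[:, :2])
assert np.allclose(Phi_train.T @ Phi_train, K)
```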

50 Kernel Functions. Polynomial kernels: $k_{\text{poly}}(x_i, x_j) = (x_i^T x_j + 1)^p$. Radial basis functions: $k_{\text{RBF}}(x_i, x_j) = e^{-\gamma \|x_i - x_j\|^2}$. Also sigmoid kernels, string kernels (e.g., for classification of gene sequences), and graph kernels for learning with structured instances. Further literature: B. Schölkopf, A. J. Smola: Learning with Kernels.
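
The two standard kernels from this slide, written out in numpy (parameter defaults are arbitrary):

```python
import numpy as np

def k_poly(A, B, p=2):
    """Polynomial kernel (x_i^T x_j + 1)^p for all pairs of rows of A and B."""
    return (A @ B.T + 1.0) ** p

def k_rbf(A, B, gamma=1.0):
    """RBF kernel exp(-gamma ||x_i - x_j||^2) for all pairs of rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)
```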

51 Polynomial Kernels. Kernel function: $k_{\text{poly}}(x_i, x_j) = (x_i^T x_j + 1)^p$. Which transformation $\phi$ corresponds to this kernel? Example: 2-D input space, $p = 2$.

52 Polynomial Kernels. Kernel: $k_{\text{poly}}(x_i, x_j) = (x_i^T x_j + 1)^p$, 2-D input, $p = 2$: $k_{\text{poly}}(x_i, x_j) = (x_i^T x_j + 1)^2 = ((x_{i1}, x_{i2})(x_{j1}, x_{j2})^T + 1)^2 = (x_{i1} x_{j1} + x_{i2} x_{j2} + 1)^2$.

53 Polynomial Kernels. Kernel: $k_{\text{poly}}(x_i, x_j) = (x_i^T x_j + 1)^p$, 2-D input, $p = 2$: $(x_{i1} x_{j1} + x_{i2} x_{j2} + 1)^2 = x_{i1}^2 x_{j1}^2 + x_{i2}^2 x_{j2}^2 + 2 x_{i1} x_{j1} x_{i2} x_{j2} + 2 x_{i1} x_{j1} + 2 x_{i2} x_{j2} + 1 = (x_{i1}^2,\, x_{i2}^2,\, \sqrt{2} x_{i1} x_{i2},\, \sqrt{2} x_{i1},\, \sqrt{2} x_{i2},\, 1)\,(x_{j1}^2,\, x_{j2}^2,\, \sqrt{2} x_{j1} x_{j2},\, \sqrt{2} x_{j1},\, \sqrt{2} x_{j2},\, 1)^T = \phi(x_i)^T \phi(x_j)$: all monomials of degree $\le 2$ over the input attributes.

54 Polynomial Kernels. Kernel: $k_{\text{poly}}(x_i, x_j) = (x_i^T x_j + 1)^p$, 2-D input, $p = 2$: $(x_{i1} x_{j1} + x_{i2} x_{j2} + 1)^2 = x_{i1}^2 x_{j1}^2 + x_{i2}^2 x_{j2}^2 + 2 x_{i1} x_{j1} x_{i2} x_{j2} + 2 x_{i1} x_{j1} + 2 x_{i2} x_{j2} + 1 = \phi(x_i)^T \phi(x_j)$, which can be written compactly as $(x_i \otimes x_i,\, \sqrt{2}\, x_i,\, 1)^T (x_j \otimes x_j,\, \sqrt{2}\, x_j,\, 1)$: all monomials of degree $\le 2$ over the input attributes.
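
The expansion is easy to verify numerically: the explicit six-dimensional feature map below (2-D input, $p = 2$, following the slide) reproduces the kernel value exactly.

```python
import numpy as np

def phi_poly2(x):
    """Explicit feature map for (x^T z + 1)^2 with 2-D input, as derived above."""
    x1, x2 = x
    s = np.sqrt(2.0)
    return np.array([x1**2, x2**2, s * x1 * x2, s * x1, s * x2, 1.0])

xi = np.array([0.3, -1.2])
xj = np.array([2.0, 0.7])
assert np.isclose((xi @ xj + 1.0) ** 2, phi_poly2(xi) @ phi_poly2(xj))
```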

55 RBF Kernel. Kernel: $k_{\text{RBF}}(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2)$. Which transformation $\phi$ corresponds to this kernel?
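
The next slide notes that the answer is an infinite-dimensional mapping. For the one-dimensional case this can be made explicit by a power-series expansion of the cross term (a standard derivation, added here for completeness rather than taken from the slides):

```latex
k_{\mathrm{RBF}}(x, x') = e^{-\gamma (x - x')^2}
  = e^{-\gamma x^2}\, e^{-\gamma x'^2}\, e^{2\gamma x x'}
  = \sum_{k=0}^{\infty}
      \left( e^{-\gamma x^2} \sqrt{\tfrac{(2\gamma)^k}{k!}}\, x^k \right)
      \left( e^{-\gamma x'^2} \sqrt{\tfrac{(2\gamma)^k}{k!}}\, x'^k \right)
```

Each term defines one coordinate $\varphi_k(x) = e^{-\gamma x^2} \sqrt{(2\gamma)^k / k!}\; x^k$ for $k = 0, 1, 2, \dots$, so the feature space has infinitely many dimensions.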

56 Kernels. The kernel function $k(x, x') = \phi(x)^T \phi(x')$ computes the inner product of the feature mappings of two instances. The kernel function can often be computed without an explicit representation $\phi(x)$, e.g., the polynomial kernel $k_{\text{poly}}(x_i, x_j) = (x_i^T x_j + 1)^p$. Infinite-dimensional feature mappings are possible, e.g., the RBF kernel $k_{\text{RBF}}(x_i, x_j) = e^{-\gamma \|x_i - x_j\|^2}$. For every positive definite kernel there is a feature mapping $\phi(x)$ such that $k(x, x') = \phi(x)^T \phi(x')$; for a given kernel matrix, the Mercer map provides such a feature mapping.

57 Summary. Representer Theorem: $f_\theta(x) = \sum_{i=1}^n \alpha_i \phi(x_i)^T \phi(x)$; samples only interact through inner products. Kernel functions: positive definite functions $k(x, x')$ are an inner product for some feature space; feature mappings are done implicitly. Linear model: $f_\theta(x) = \sum_{i=1}^n \alpha_i k(x_i, x)$.

Kernel Perceptron:
Perceptron(Instances $(x_i, y_i)$):
  Set $\alpha = 0$
  DO
    FOR $i = 1, \dots, n$:
      IF $y_i f_\alpha(x_i) \le 0$ THEN $\alpha_i = \alpha_i + y_i$
  WHILE $\alpha$ changes
  RETURN $\alpha$

Kernel SVM: $\max_\beta \sum_{i=1}^n \beta_i - \frac{1}{2} \sum_{i,j=1}^n \beta_i \beta_j y_i y_j k(x_i, x_j)$ such that $0 \le \beta_i \le \lambda$, with decision function $f_\beta(x) = \sum_{x_i \in SV} \beta_i y_i k(x_i, x)$.

58 Merry Christmas & a Happy New Year! Next time: kernels for structured data & learning with structured outputs.
