Machine Learning for Signal Processing Linear Gaussian Models
1 Machine Learning for Signal Processing: Linear Gaussian Models. Class, Oct 2014. Instructor: Bhiksha Raj. 11-755/18-797.
2 Recap: MAP Estimators. MAP (Maximum A Posteriori): find the best guess for y (statistically), given known x: ŷ = argmax_y P(y|x).
3 Recap: MAP estimation. x and y are jointly Gaussian. Stack them as z = [x; y], with mean μ_z = [μ_x; μ_y] and covariance C_zz = Var(z) = [[C_xx, C_xy], [C_yx, C_yy]], where C_xy = E[(x − μ_x)(y − μ_y)^T]. Then P(z) = N(z; μ_z, C_zz) ∝ exp(−0.5 (z − μ_z)^T C_zz^{-1} (z − μ_z)): z is Gaussian.
4 MAP estimation: Gaussian PDF. [Figure: the joint Gaussian PDF over x and y.]
5 MAP estimation: the Gaussian at a particular value of x. [Figure: slice of the joint density at x = x_0.]
6 Conditional probability of y given x: P(y|x) = N(y; μ_y + C_yx C_xx^{-1}(x − μ_x), C_yy − C_yx C_xx^{-1} C_xy). The conditional probability of y given x is also Gaussian: the slice in the figure is Gaussian. The mean of this Gaussian is a function of x. The variance of y is reduced if x is known: uncertainty is reduced.
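A scalar numerical sketch of the conditioning formulas above; the particular numbers and variable names are our own illustrative choices, not from the slides.

```python
import numpy as np

# Joint Gaussian over z = [x; y], with x and y scalar for simplicity.
mu = np.array([1.0, 2.0])          # [mu_x, mu_y]
C = np.array([[2.0, 1.2],
              [1.2, 1.5]])         # [[C_xx, C_xy], [C_yx, C_yy]]
x0 = 3.0                           # the observed value of x

# Conditional mean and variance of y given x = x0:
mu_y_x = mu[1] + C[1, 0] / C[0, 0] * (x0 - mu[0])    # 2 + 0.6*2 = 3.2
var_y_x = C[1, 1] - C[1, 0] * C[0, 1] / C[0, 0]      # 1.5 - 0.72 = 0.78
print(mu_y_x, var_y_x)
```

Note that var_y_x is always at most C_yy: observing x can only reduce the uncertainty about y.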
7 MAP estimation: the Gaussian at a particular value of x. Most likely value: ŷ = μ_y + C_yx C_xx^{-1}(x_0 − μ_x). [Figure: the conditional slice at x = x_0, with its peak marked.]
8 MAP Estimation of a Gaussian RV: ŷ = argmax_y P(y|x_0), which for a Gaussian is the conditional mean E[y|x_0].
9 It's also a minimum-mean-squared-error estimate. Minimize the error: Err = E[(y − ŷ)^T(y − ŷ)] = E[y^T y − 2ŷ^T y + ŷ^T ŷ] = E[y^T y] − 2ŷ^T E[y] + ŷ^T ŷ. Differentiating and equating to 0: d Err/dŷ = −2E[y] + 2ŷ = 0, so ŷ = E[y]. The MMSE estimate is the mean of the distribution.
10 For the Gaussian: MAP = MMSE. The most likely value is also the MEAN value. This would be true of any symmetric distribution.
11 A Likelihood Perspective: y = ax + e. y is a noisy reading of ax. The error e is Gaussian: e ~ N(0, σ²I). Estimate a from the data Y = [y_1 y_2 ... y_N], X = [x_1 x_2 ... x_N].
12 The Likelihood of the data: y = ax + e, e ~ N(0, σ²I). Probability of observing a specific y, given x, for a particular matrix a: P(y|x; a) = N(y; ax, σ²I). Probability of the collection: P(Y|X; a) = ∏_i N(y_i; a x_i, σ²I), assuming IID for convenience (not necessary). Y = [y_1 y_2 ... y_N], X = [x_1 x_2 ... x_N].
13 A Maximum Likelihood Estimate: Y = aX + e, e ~ N(0, σ²I). P(Y|X; a) ∝ exp(−(1/(2σ²)) trace[(Y − aX)^T (Y − aX)]), so log P(Y|X; a) = C − (1/(2σ²)) trace[(Y − aX)^T (Y − aX)]. Maximizing the log probability is identical to minimizing the least squared error.
14 A problem with regressions: the ML fit â = YX^T (XX^T)^{-1} is sensitive. The error is squared, so small variations in the data produce large variations in the weights; outliers affect it adversely; it is unstable. If the dimension of X ≥ the number of instances, XX^T is not invertible.
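A toy numerical sketch (our own example, not from the slides) of both failure modes above: a rank-deficient correlation matrix when dimension exceeds instance count, and exploding weights under near-collinearity.

```python
import numpy as np

rng = np.random.default_rng(1)

# Case 1: dimension (5) exceeds the number of instances (3), so the 5x5
# correlation matrix X X^T has rank 3 and cannot be inverted.
X_wide = rng.standard_normal((5, 3))
rank = np.linalg.matrix_rank(X_wide @ X_wide.T)

# Case 2: nearly collinear data -> tiny perturbations, huge weights.
X = np.array([[1.0, 1.0,    1.0,    1.0],
              [1.0, 1.0001, 0.9999, 1.0]])
Y = np.array([[1.0, 2.0, 3.0, 4.0]])
a = Y @ X.T @ np.linalg.inv(X @ X.T)
print(rank, np.abs(a).max())   # rank 3 < 5; weight magnitudes in the thousands
```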
15 MAP estimation of weights: y = ax + e. Assume the weights are drawn from a Gaussian: P(a) = N(0, σ_a²I). Maximum likelihood estimate: â = argmax_a log P(Y|X; a). Maximum a posteriori estimate: â = argmax_a log P(a|Y, X) = argmax_a [log P(Y|X, a) + log P(a)].
16 MAP estimation of weights: with P(a) = N(0, σ_a²I), log P(a) = C − ||a||²/(2σ_a²), so â = argmax_a [C′ − (1/(2σ²)) trace((Y − aX)^T(Y − aX)) − ||a||²/(2σ_a²)]. Similar to the ML estimate, with an additional term.
17 MAP estimate of weights: dL/da = 2aXX^T − 2YX^T + 2λa = 0 gives â = YX^T (XX^T + λI)^{-1}. This is equivalent to diagonal loading of the correlation matrix: it improves the condition number of the correlation matrix, which can then be inverted with greater stability, and it will not affect the estimation from well-conditioned data. Also called Tikhonov regularization. Dual form: ridge regression. (This is a MAP estimate of the weights, not to be confused with a MAP estimate of Y.)
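A small sketch of the closed form above, â = YX^T (XX^T + λI)^{-1}; the data, noise level, and the choice λ = 0.1 are our own illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
a_true = np.array([[1.0, -2.0, 0.5]])
X = rng.standard_normal((3, 50))
Y = a_true @ X + 0.1 * rng.standard_normal((1, 50))

lam = 0.1
R = X @ X.T                                     # correlation matrix
a_ridge = Y @ X.T @ np.linalg.inv(R + lam * np.eye(3))

# Diagonal loading never worsens the condition number of R:
better = np.linalg.cond(R + lam * np.eye(3)) <= np.linalg.cond(R)
print(better)
```

For well-conditioned data such as this, the small diagonal load barely changes the estimate, which stays close to a_true; its benefit shows up when R is near-singular.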
18 MAP estimate priors. [Figure. Left: Gaussian prior on the weights. Right: Laplacian prior.]
19 MAP estimation of weights with a Laplacian prior: assume the weights are drawn from a Laplacian, P(a) ∝ exp(−λ||a||_1). Maximum a posteriori estimate: â = argmax_a [C′ − trace((Y − aX)^T(Y − aX)) − λ||a||_1]. No closed-form solution; a quadratic-programming solution is required. Non-trivial.
20 MAP estimation of weights with a Laplacian prior: assume the weights are drawn from a Laplacian, P(a) ∝ exp(−λ||a||_1). Maximum a posteriori estimate: â = argmax_a [C′ − trace((Y − aX)^T(Y − aX)) − λ||a||_1]. This is identical to L1-regularized least-squares estimation.
21 L1-regularized LS: â = argmax_a [C′ − trace((Y − aX)^T(Y − aX)) − λ||a||_1]. No closed-form solution; quadratic-programming solutions are required. Dual formulation: â = argmin_a trace((Y − aX)^T(Y − aX)) subject to ||a||_1 ≤ t. This is the LASSO: Least Absolute Shrinkage and Selection Operator.
22 LASSO Algorithms: various convex optimization algorithms exist. LARS: least angle regression. Pathwise coordinate descent, etc. Matlab code is available from the web.
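A minimal coordinate-descent sketch for the scalar-output LASSO, min_a 0.5||y − aX||² + λ||a||_1. This is our own toy implementation of the soft-thresholding update (not the LARS algorithm mentioned above); the data and λ are illustrative.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink v toward zero by t: the scalar LASSO update."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_cd(X, y, lam, iters=200):
    """Coordinate descent: cycle over weights, solving each 1-D problem."""
    D, N = X.shape
    a = np.zeros(D)
    for _ in range(iters):
        for j in range(D):
            r = y - a @ X + a[j] * X[j]        # residual excluding feature j
            a[j] = soft_threshold(X[j] @ r, lam) / (X[j] @ X[j])
    return a

rng = np.random.default_rng(3)
X = rng.standard_normal((5, 100))
a_true = np.array([2.0, 0.0, 0.0, -1.5, 0.0])  # sparse weights
y = a_true @ X + 0.05 * rng.standard_normal(100)

a_hat = lasso_cd(X, y, lam=5.0)
print(np.abs(a_hat) > 0.1)    # the recovered sparsity pattern
```

The L1 penalty zeroes out the inactive weights exactly, which is the "selection" in the LASSO acronym.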
23 Regularized least squares. [Figure; image credit: Tibshirani.] Regularization results in selection of a suboptimal (in the least-squares sense) solution, one of the loci outside the center. Tikhonov regularization selects the shortest solution; L1 regularization selects the sparsest solution.
24 LASSO and Compressive Sensing: Y = aX. Given Y and X, estimate the sparse a. In LASSO terms: X = explanatory variable, Y = dependent variable, a = weights of regression. In CS terms: X = measurement matrix, Y = measurement, a = data.
25 MAP / ML / MMSE: general statistical estimators, all used to predict a variable based on other parameters related to it. Most common assumption: the data are Gaussian; all RVs are Gaussian. Other probability densities may also be used. For Gaussians the relationships are linear, as we saw.
26 Gaussians and more Gaussians: Linear Gaussian Models. But first, a recap.
27 A Brief Recap: D ≈ BC. Principal component analysis: find the K bases that best explain the given data, i.e., find B and C such that the difference between D and BC is minimum, while constraining the columns of B to be orthonormal.
28 Remember Eigenfaces: approximate every face f as f = w_{f,1} V_1 + w_{f,2} V_2 + w_{f,3} V_3 + ... + w_{f,K} V_K. Estimate V to minimize the squared error. The error is unexplained by V_1 .. V_K; the error is orthogonal to the eigenfaces.
29–32 Karhunen-Loève vs. PCA (built up over several slides). Eigenvectors of the correlation matrix: principal directions of the tightest ellipse centered on the origin; directions that retain maximum energy. Eigenvectors of the covariance matrix: principal directions of the tightest ellipse centered on the data; directions that retain maximum variance.
33 Karhunen-Loève vs. PCA: if the data are naturally centered at the origin, KL == PCA. The following slides refer to PCA! Assume data centered at the origin for simplicity; this is not essential, as we'll see.
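A quick numerical sketch of the distinction above: the second-moment (correlation) matrix and the covariance matrix coincide for zero-mean data and diverge once the data are shifted off the origin. The data are our own synthetic example.

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((2, 1000)) * np.array([[3.0], [1.0]])  # zero-mean

corr = X @ X.T / X.shape[1]                    # second-moment matrix (KL)
cov = np.cov(X)                                # centered covariance (PCA)
same = np.allclose(corr, cov, atol=0.1)        # nearly identical here

X_shift = X + np.array([[10.0], [0.0]])        # move the data off the origin
corr_s = X_shift @ X_shift.T / X_shift.shape[1]
differ = not np.allclose(corr_s, np.cov(X_shift), atol=0.1)
print(same, differ)
```

After the shift, the correlation matrix is dominated by the mean term (its eigenvectors point toward the data cloud), while the covariance and its principal directions are unchanged.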
34 Remember Eigenfaces: approximate every face f as f = w_{f,1} V_1 + w_{f,2} V_2 + ... + w_{f,K} V_K. Estimate V to minimize the squared error. The error is unexplained by V_1 .. V_K; the error is orthogonal to the eigenfaces.
35 Eigen Representation: x = wV + e. [Illustration assuming 3D space.] K-dimensional representation; the error is orthogonal to the representation; weight and error are specific to the data instance.
36 Representation: the error is at 90° to the eigenface. x_2 = w_2 V + e_2. [Illustration assuming 3D space.] K-dimensional representation; the error is orthogonal to the representation; weight and error are specific to the data instance.
37 Representation: all data with the same representation wV lie on a plane orthogonal to V. K-dimensional representation; the error is orthogonal to the representation.
38–39 With 2 bases: the error is at 90° to the eigenfaces. x = w_1 V_1 + w_2 V_2 + e. [Illustration assuming 3D space.] K-dimensional representation; the error is orthogonal to the representation; weight and error are specific to the data instance.
40 In vector form: the error is at 90° to the eigenfaces. x = w_1 V_1 + w_2 V_2 + e, i.e., x = [V_1 V_2][w_1; w_2] + e = Vw + e, with V a D×2 matrix. K-dimensional representation; the error is orthogonal to the representation; weight and error are specific to the data instance.
41 In vector form: x = Vw + e, where x is a D-dimensional vector, V is a D×K matrix, w is a K-dimensional vector, and e is a D-dimensional vector. The error is at 90° to the eigenfaces.
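A numerical check of x = Vw + e with the error orthogonal to the bases, using an arbitrary orthonormal V of our own making (D = 3, K = 2).

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.standard_normal(3)                 # a D-dimensional data point

# Orthonormal basis V (D x K), obtained via QR on a random matrix.
V, _ = np.linalg.qr(rng.standard_normal((3, 2)))

w = V.T @ x                                # K-dimensional weights
e = x - V @ w                              # D-dimensional error
print(np.allclose(V.T @ e, 0.0))           # error orthogonal to the bases
```

The projection residual of any x onto the column space of an orthonormal V is always orthogonal to that space, which is exactly the 90° property in the figures.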
42 Learning PCA: for the given data, find the K-dimensional subspace that captures most of the variance in the data; the variance in the remaining subspace is minimal.
43 Constraints: the error is at 90° to the eigenfaces. V^T V = I: the eigenvectors are orthogonal to each other. For every vector, the error is orthogonal to the eigenvectors: e^T V = 0. Over the collection of data: the average of w w^T is diagonal (the eigen representations are uncorrelated); E[e^T e] is minimum (the error variance is minimum); the mean of the error is 0.
44 A Statistical Formulation of PCA: x = Vw + e, w ~ N(0, B), e ~ N(0, E). x is a random variable generated according to a linear relation; w is drawn from a K-dimensional Gaussian with diagonal covariance; e is drawn from a 0-mean, (D−K)-rank, D-dimensional Gaussian. Estimate V (and B, E) given examples of x.
45–46 Linear Gaussian Models!! x = Vw + e, w ~ N(0, B), e ~ N(0, E). x is a random variable generated according to a linear relation; w is drawn from a Gaussian; e is drawn from a 0-mean Gaussian. Estimate V given examples of x; in the process, also estimate B and E.
47 Linear Gaussian Models: x = μ + Vw + e, w ~ N(0, B), e ~ N(0, E). Observations are linear functions of two uncorrelated Gaussian random variables: a weight variable w and an error variable e. The error is not correlated with the weight: E[e^T w] = 0. Learning LGMs: estimate the parameters of the model given instances of x. This is the problem of learning the distribution of a Gaussian RV.
48 LGMs: Probability Density. x = μ + Vw + e, w ~ N(0, B), e ~ N(0, E). The mean of x: E[x] = μ + V E[w] + E[e] = μ. The covariance of x: E[(x − μ)(x − μ)^T] = VBV^T + E.
49 The probability of x: x = μ + Vw + e, w ~ N(0, B), e ~ N(0, E), so x ~ N(μ, VBV^T + E): P(x) = (2π)^{-D/2} |VBV^T + E|^{-1/2} exp(−0.5 (x − μ)^T (VBV^T + E)^{-1} (x − μ)). x is a linear function of Gaussians, so x is also Gaussian; its mean and variance are as given.
50 Estimating the variables of the model: x = μ + Vw + e, w ~ N(0, B), e ~ N(0, E), x ~ N(μ, VBV^T + E). Estimating the variables of the LGM is equivalent to estimating P(x). The variables are μ, V, B and E.
51 Estimating the model: the model is indeterminate: Vw = VCC^{-1}w = (VC)(C^{-1}w) for any invertible C. We need extra constraints to make the solution unique. Usual constraint: B = I, i.e., the variance of w is an identity matrix.
52 Estimating the variables of the model: x = μ + Vw + e, w ~ N(0, I), e ~ N(0, E), x ~ N(μ, VV^T + E). Estimating the variables of the LGM is equivalent to estimating P(x). The variables are μ, V, and E.
53 The Maximum Likelihood Estimate: x ~ N(μ, VV^T + E). Given a training set x_1, x_2, .., x_N, find μ, V, E. The ML estimate of μ does not depend on the covariance of the Gaussian: μ = (1/N) Σ_i x_i.
54 Centered Data: we can safely assume centered data, μ = 0. If the data are not centered, center them: estimate the mean of the data (which is the maximum likelihood estimate) and subtract it from the data.
55 Simplified Model: x = Vw + e, w ~ N(0, I), e ~ N(0, E), x ~ N(0, VV^T + E). Estimating the variables of the LGM is equivalent to estimating P(x). The variables are V and E.
56 Estimating the model: x = Vw + e, x ~ N(0, VV^T + E). Given a collection of terms x_1, x_2, .., x_N, estimate V and E. w is unknown for each x. But if we assume we know w for each x, what do we get?
57 Estimating the Parameters: x = Vw + e, P(e) = N(0, E), so P(x|w) = N(Vw, E) = (2π)^{-D/2} |E|^{-1/2} exp(−0.5 (x − Vw)^T E^{-1} (x − Vw)). We'll use a maximum-likelihood estimate. The log-likelihood of x_1 .. x_N, knowing their w: log P(x_1 .. x_N | w_1 .. w_N) = C − 0.5 N log|E| − 0.5 Σ_i (x_i − Vw_i)^T E^{-1} (x_i − Vw_i).
58 Maximizing the log-likelihood. Differentiating w.r.t. V and setting to 0: Σ_i E^{-1}(x_i − Vw_i) w_i^T = 0, so V = (Σ_i x_i w_i^T)(Σ_i w_i w_i^T)^{-1}. Differentiating w.r.t. E^{-1} and setting to 0: E = (1/N) Σ_i (x_i − Vw_i)(x_i − Vw_i)^T.
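A sketch of the closed-form estimates above for the case where the w_i are known, on synthetic data of our own choosing so recovery can be checked.

```python
import numpy as np

rng = np.random.default_rng(6)
D, K, N = 4, 2, 5000
V_true = rng.standard_normal((D, K))
W = rng.standard_normal((K, N))            # the "known" w_i, as columns
X = V_true @ W + 0.1 * rng.standard_normal((D, N))   # noise cov 0.01*I

# V = (sum_i x_i w_i^T)(sum_i w_i w_i^T)^{-1}
V_hat = (X @ W.T) @ np.linalg.inv(W @ W.T)
# E = (1/N) sum_i (x_i - V w_i)(x_i - V w_i)^T
E_hat = (X - V_hat @ W) @ (X - V_hat @ W).T / N
print(np.abs(V_hat - V_true).max())        # small: V is recovered
```

With the weights observed, the problem reduces to ordinary multivariate least squares, which is why both estimates are closed-form.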
59 Estimating LGMs: if we know w, then V = (Σ_i x_i w_i^T)(Σ_i w_i w_i^T)^{-1}. But in reality we don't know the w for each x. So how do we deal with this? EM.
60 Recall EM: instance from blue dice, instance from red dice, dice unknown; collections of blue and red numbers. We figured out how to compute the parameters if we knew the missing information. We then fragmented the observations according to the posterior probability P(z|x) and counted as usual. In effect, we took the expectation with respect to the a posteriori probability of the missing data, P(z|x).
61–62 EM for LGMs: replace the unseen data terms with expectations taken w.r.t. P(w|x): V = (Σ_i x_i E[w_i|x_i]^T)(Σ_i E[w_i w_i^T | x_i])^{-1}, E = (1/N) Σ_i E[(x_i − Vw_i)(x_i − Vw_i)^T | x_i].
63 Expected Value of w given x: x = Vw + e, P(e) = N(0, E), P(w) = N(0, I), P(x) = N(0, VV^T + E). x and w are jointly Gaussian! x is Gaussian, w is Gaussian, and they are linearly related: z = [w; x], P(z) = N(μ_z, C_zz).
64 Expected Value of w given x: C_xw = E[x w^T] = E[(Vw + e) w^T] = V. With z = [w; x]: C_zz = [[I, V^T], [V, VV^T + E]]. x and w are jointly Gaussian.
65 The conditional expectation of w given x: P(w|x) is a Gaussian. Using the conditional-Gaussian formulas: P(w|x) = N(V^T(VV^T + E)^{-1} x, I − V^T(VV^T + E)^{-1} V). So E[w|x] = V^T(VV^T + E)^{-1} x, and E[w w^T|x] = Var(w|x) + E[w|x] E[w|x]^T = I − V^T(VV^T + E)^{-1} V + E[w|x] E[w|x]^T.
66 LGM: the complete EM algorithm. Initialize V and E. E step: E[w_i|x_i] = V^T(VV^T + E)^{-1} x_i; E[w_i w_i^T|x_i] = I − V^T(VV^T + E)^{-1} V + E[w_i|x_i] E[w_i|x_i]^T. M step: V = (Σ_i x_i E[w_i|x_i]^T)(Σ_i E[w_i w_i^T|x_i])^{-1}; E = (1/N) Σ_i E[(x_i − Vw_i)(x_i − Vw_i)^T | x_i].
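A compact sketch of the EM recursion above. To keep this toy example identifiable we additionally constrain E to be diagonal (the factor-analysis convention, our choice); initialization and data are also our own.

```python
import numpy as np

def lgm_em(X, K, n_iter=100):
    """Fit x = Vw + e, w ~ N(0, I), e ~ N(0, E diagonal), to columns of X."""
    D, N = X.shape
    rng = np.random.default_rng(7)
    V = rng.standard_normal((D, K))
    E = np.eye(D)
    for _ in range(n_iter):
        # E step: posterior moments of w for every column of X.
        G = np.linalg.inv(V @ V.T + E)                  # (V V^T + E)^{-1}
        Ew = V.T @ G @ X                                 # K x N: E[w_i|x_i]
        S = N * (np.eye(K) - V.T @ G @ V) + Ew @ Ew.T    # sum E[w_i w_i^T|x_i]
        # M step: closed-form re-estimates.
        V = (X @ Ew.T) @ np.linalg.inv(S)
        E = np.diag(np.diag(X @ X.T - V @ Ew @ X.T)) / N
    return V, E

rng = np.random.default_rng(8)
V_true = np.array([[2.0], [1.0], [0.0]])
X = V_true @ rng.standard_normal((1, 2000)) + 0.1 * rng.standard_normal((3, 2000))

V_hat, _ = lgm_em(X, K=1)
d = V_hat[:, 0] / np.linalg.norm(V_hat)
print(d)   # should align, up to sign, with V_true / ||V_true||
```

V is identifiable only up to a rotation (and sign), so we compare the spanned direction rather than the matrix entries.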
67 So what have we achieved? We employed a complicated EM algorithm to learn a Gaussian PDF for a variable. What have we gained? Next class: PCA, Sensible PCA, EM algorithms for PCA, Factor Analysis, FA for feature extraction.
68 LGMs: Application 1. Learning principal components: x = Vw + e, w ~ N(0, I), e ~ N(0, E). Find the directions that capture most of the variation in the data; the error is orthogonal to these variations.
69 LGMs: Application 2. Learning with insufficient data. [Figure: a full covariance matrix.] The full covariance matrix of a Gaussian has D² terms. Full covariance captures the relationships between variables. Problem: it needs a lot of data to estimate robustly.
70 To be continued: other applications, next class.
More informationLecture 3 Specification
Lecture 3 Specfcaton 1 OLS Estmaton - Assumptons CLM Assumptons (A1) DGP: y = X + s correctly specfed. (A) E[ X] = 0 (A3) Var[ X] = σ I T (A4) X has full column rank rank(x)=k-, where T k. Q: What happens
More informationβ0 + β1xi. You are interested in estimating the unknown parameters β
Ordnary Least Squares (OLS): Smple Lnear Regresson (SLR) Analytcs The SLR Setup Sample Statstcs Ordnary Least Squares (OLS): FOCs and SOCs Back to OLS and Sample Statstcs Predctons (and Resduals) wth OLS
More informationNegative Binomial Regression
STATGRAPHICS Rev. 9/16/2013 Negatve Bnomal Regresson Summary... 1 Data Input... 3 Statstcal Model... 3 Analyss Summary... 4 Analyss Optons... 7 Plot of Ftted Model... 8 Observed Versus Predcted... 10 Predctons...
More informationProbabilistic Classification: Bayes Classifiers 2
CSC Machne Learnng Lecture : Classfcaton II September, Sam Rowes Probablstc Classfcaton: Baes Classfers Generatve model: p(, ) = p()p( ). p() are called class prors. p( ) are called class-condtonal feature
More informationStatistical pattern recognition
Statstcal pattern recognton Bayes theorem Problem: decdng f a patent has a partcular condton based on a partcular test However, the test s mperfect Someone wth the condton may go undetected (false negatve
More informationSupport Vector Machines
/14/018 Separatng boundary, defned by w Support Vector Machnes CISC 5800 Professor Danel Leeds Separatng hyperplane splts class 0 and class 1 Plane s defned by lne w perpendcular to plan Is data pont x
More informationRockefeller College University at Albany
Rockefeller College Unverst at Alban PAD 705 Handout: Maxmum Lkelhood Estmaton Orgnal b Davd A. Wse John F. Kenned School of Government, Harvard Unverst Modfcatons b R. Karl Rethemeer Up to ths pont n
More informationFinite Mixture Models and Expectation Maximization. Most slides are from: Dr. Mario Figueiredo, Dr. Anil Jain and Dr. Rong Jin
Fnte Mxture Models and Expectaton Maxmzaton Most sldes are from: Dr. Maro Fgueredo, Dr. Anl Jan and Dr. Rong Jn Recall: The Supervsed Learnng Problem Gven a set of n samples X {(x, y )},,,n Chapter 3 of
More informationMachine Learning & Data Mining CS/CNS/EE 155. Lecture 4: Regularization, Sparsity & Lasso
Machne Learnng Data Mnng CS/CS/EE 155 Lecture 4: Regularzaton, Sparsty Lasso 1 Recap: Complete Ppelne S = {(x, y )} Tranng Data f (x, b) = T x b Model Class(es) L(a, b) = (a b) 2 Loss Functon,b L( y, f
More informationMachine learning: Density estimation
CS 70 Foundatons of AI Lecture 3 Machne learnng: ensty estmaton Mlos Hauskrecht mlos@cs.ptt.edu 539 Sennott Square ata: ensty estmaton {.. n} x a vector of attrbute values Objectve: estmate the model of
More informationLogistic Regression Maximum Likelihood Estimation
Harvard-MIT Dvson of Health Scences and Technology HST.951J: Medcal Decson Support, Fall 2005 Instructors: Professor Lucla Ohno-Machado and Professor Staal Vnterbo 6.873/HST.951 Medcal Decson Support Fall
More informationThe Ordinary Least Squares (OLS) Estimator
The Ordnary Least Squares (OLS) Estmator 1 Regresson Analyss Regresson Analyss: a statstcal technque for nvestgatng and modelng the relatonshp between varables. Applcatons: Engneerng, the physcal and chemcal
More information6 Supplementary Materials
6 Supplementar Materals 61 Proof of Theorem 31 Proof Let m Xt z 1:T : l m Xt X,z 1:t Wethenhave mxt z1:t ˆm HX Xt z 1:T mxt z1:t m HX Xt z 1:T + mxt z 1:T HX We consder each of the two terms n equaton
More informationDr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur
Analyss of Varance and Desgn of Exerments-I MODULE III LECTURE - 2 EXPERIMENTAL DESIGN MODELS Dr. Shalabh Deartment of Mathematcs and Statstcs Indan Insttute of Technology Kanur 2 We consder the models
More informationRegression Analysis. Regression Analysis
Regresson Analyss Smple Regresson Multvarate Regresson Stepwse Regresson Replcaton and Predcton Error 1 Regresson Analyss In general, we "ft" a model by mnmzng a metrc that represents the error. n mn (y
More informationSupport Vector Machines. Vibhav Gogate The University of Texas at dallas
Support Vector Machnes Vbhav Gogate he Unversty of exas at dallas What We have Learned So Far? 1. Decson rees. Naïve Bayes 3. Lnear Regresson 4. Logstc Regresson 5. Perceptron 6. Neural networks 7. K-Nearest
More informationU.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017
U.C. Berkeley CS94: Beyond Worst-Case Analyss Handout 4s Luca Trevsan September 5, 07 Summary of Lecture 4 In whch we ntroduce semdefnte programmng and apply t to Max Cut. Semdefnte Programmng Recall that
More informationLaboratory 3: Method of Least Squares
Laboratory 3: Method of Least Squares Introducton Consder the graph of expermental data n Fgure 1. In ths experment x s the ndependent varable and y the dependent varable. Clearly they are correlated wth
More informationLINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity
LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 30 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 2 Remedes for multcollnearty Varous technques have
More informationLecture 10 Support Vector Machines II
Lecture 10 Support Vector Machnes II 22 February 2016 Taylor B. Arnold Yale Statstcs STAT 365/665 1/28 Notes: Problem 3 s posted and due ths upcomng Frday There was an early bug n the fake-test data; fxed
More informationChapter 7 Generalized and Weighted Least Squares Estimation. In this method, the deviation between the observed and expected values of
Chapter 7 Generalzed and Weghted Least Squares Estmaton The usual lnear regresson model assumes that all the random error components are dentcally and ndependently dstrbuted wth constant varance. When
More informationb ), which stands for uniform distribution on the interval a x< b. = 0 elsewhere
Fall Analyss of Epermental Measurements B. Esensten/rev. S. Errede Some mportant probablty dstrbutons: Unform Bnomal Posson Gaussan/ormal The Unform dstrbuton s often called U( a, b ), hch stands for unform
More informationMean Field / Variational Approximations
Mean Feld / Varatonal Appromatons resented by Jose Nuñez 0/24/05 Outlne Introducton Mean Feld Appromaton Structured Mean Feld Weghted Mean Feld Varatonal Methods Introducton roblem: We have dstrbuton but
More informationTHE ROYAL STATISTICAL SOCIETY 2006 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE
THE ROYAL STATISTICAL SOCIETY 6 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE PAPER I STATISTICAL THEORY The Socety provdes these solutons to assst canddates preparng for the eamnatons n future years and for
More informationCell Biology. Lecture 1: 10-Oct-12. Marco Grzegorczyk. (Gen-)Regulatory Network. Microarray Chips. (Gen-)Regulatory Network. (Gen-)Regulatory Network
5.0.202 Genetsche Netzwerke Wntersemester 202/203 ell ology Lecture : 0-Oct-2 Marco Grzegorczyk Gen-Regulatory Network Mcroarray hps G G 2 G 3 2 3 metabolte metabolte Gen-Regulatory Network Gen-Regulatory
More informationFall 2012 Analysis of Experimental Measurements B. Eisenstein/rev. S. Errede
Fall 0 Analyss of Expermental easurements B. Esensten/rev. S. Errede We now reformulate the lnear Least Squares ethod n more general terms, sutable for (eventually extendng to the non-lnear case, and also
More information