Topic 5: Unsupervised Learning: CLUSTERING. February-May 2005
- Camilla Bridges
Transcription
1 Topic 5: Unsupervised Learning: CLUSTERING. February-May 2005
2 SUPERVISED METHODS: LABELED Database
- Labeled database divided into training and test sets.
- Choose the algorithm: MAP, ML, K-Nearest, LD, SVC, NN, Tree, ...
- Training the algorithm, or determining the function.
- Evaluating the classifier.
- Reducing the space dimension d by linear methods such as PCA, MDA, ICA.
- Reducing the space dimension d by feature selection (independent of the machine learning algorithm).
3 UNSUPERVISED METHODS: UNLABELED Database
- Unlabeled database.
- Choose the algorithm: cluster initialization; E-step: classifying samples; M-step: updating parameters or evaluating criterion functions.
- Reducing the space dimension d by linear methods such as PCA, ICA.
- Reducing the space dimension d by feature selection (independent of the machine learning algorithm).
4 LOOKING FOR STRUCTURE INSIDE THE DATA
- Parametric methods: they assume some p.d.f. for the clusters.
- Non-parametric methods: formal clustering procedures.
5 INDEX (Parametric Methods)
1 MIXTURE DENSITIES AND IDENTIFIABILITY
2 MAXIMUM LIKELIHOOD ESTIMATES: EM
3 K-Means Clustering
6 MIXTURE DENSITIES AND IDENTIFIABILITY
Assumptions:
1. The samples come from a known number c of classes.
2. The prior probabilities of each class are known (mixing parameters): $\Pr\{\omega_i\}$, $i = 1, \dots, c$.
3. The form of the class-conditional probability densities is known: $f_x(x \mid \omega_i, \theta_i)$, $i = 1, \dots, c$.
4. The values of the parameters $\theta_i$, $i = 1, \dots, c$, are unknown.
5. The category labels are unknown: UNSUPERVISED.
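7 MIXTURE DENSITIES AND IDENTIFIABILITY
MIXTURE DENSITY:
$f_x(x \mid \theta) = \sum_{i=1}^{c} f_x(x \mid \omega_i, \theta_i) \Pr\{\omega_i\}$, with $\theta = (\theta_1, \dots, \theta_c)^T$
1. For the moment it is assumed that only the parameter vector $\theta$ is unknown.
2. Necessary condition for identifiability: $f_x(x \mid \theta) \neq f_x(x \mid \theta')$ whenever $\theta \neq \theta'$.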
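8 MIXTURE DENSITIES AND IDENTIFIABILITY
Example of an identifiability problem: BINARY (SYMMETRIC) CHANNEL
Priors: $P_0 = \Pr\{\omega_0\}$ (bit = 0), $P_1 = \Pr\{\omega_1\}$ (bit = 1), $P_0 + P_1 = 1$.
Transition probabilities: $\theta_0 = \Pr\{x = 0 \mid \omega_0\}$, $1 - \theta_0 = \Pr\{x = 1 \mid \omega_0\}$, $\theta_1 = \Pr\{x = 1 \mid \omega_1\}$, $1 - \theta_1 = \Pr\{x = 0 \mid \omega_1\}$.
[Figure: channel diagram from the inputs $\omega_0$, $\omega_1$ to the outputs $x = 0$, $x = 1$]
9 MIXTURE DENSITIES AND IDENTIFIABILITY
Example of an identifiability problem: BINARY (SYMMETRIC) CHANNEL
Parameter vector: $\theta = (\theta_0, \theta_1)^T$
MIXTURE DENSITY (PROBABILITY):
$\Pr\{x \mid \theta\} = P_0\, \theta_0^{1-x} (1 - \theta_0)^{x} + P_1\, \theta_1^{x} (1 - \theta_1)^{1-x}$, $x \in \{0, 1\}$
The observations determine only the single number $\Pr\{x = 1 \mid \theta\} = P_0 (1 - \theta_0) + P_1 \theta_1$, so $\theta_0$ and $\theta_1$ cannot be recovered separately.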
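The identifiability problem of the binary channel can be checked numerically. The sketch below assumes the convention theta0 = Pr{x=0 | w0} and theta1 = Pr{x=1 | w1}; the function name and the numeric values are illustrative choices, not taken from the slides. Two different parameter vectors produce exactly the same mixture probabilities.

```python
def mixture_pmf(x, p0, theta0, theta1):
    """Pr{x | theta} for a binary channel, x in {0, 1}.

    Assumed convention: theta0 = Pr{x=0 | w0}, theta1 = Pr{x=1 | w1}.
    """
    p1 = 1.0 - p0
    like_w0 = theta0 if x == 0 else 1.0 - theta0
    like_w1 = theta1 if x == 1 else 1.0 - theta1
    return p0 * like_w0 + p1 * like_w1


# Two different parameter vectors with equal priors P0 = P1 = 0.5.
pmf_a = [mixture_pmf(x, 0.5, 0.9, 0.7) for x in (0, 1)]
pmf_b = [mixture_pmf(x, 0.5, 0.7, 0.5) for x in (0, 1)]

# Both give the same distribution over x, so theta is not identifiable.
same_mixture = all(abs(a - b) < 1e-12 for a, b in zip(pmf_a, pmf_b))
print(pmf_a, pmf_b, same_mixture)
```

Any pair (theta0, theta1) with the same value of P0 (1 - theta0) + P1 theta1 would behave the same way.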
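10 2 MAXIMUM LIKELIHOOD ESTIMATES
Likelihood of the statistically independent observed samples $D = \{x_1, x_2, \dots, x_n\}$:
$f(D \mid \theta) = \prod_{k=1}^{n} f_x(x_k \mid \theta)$
$l = \ln f(D \mid \theta) = \sum_{k=1}^{n} \ln f_x(x_k \mid \theta)$
$f_x(x_k \mid \theta) = \sum_{i=1}^{c} f_x(x_k \mid \omega_i, \theta_i) \Pr\{\omega_i\}$
Assuming statistical independence between $\theta_i$ and $\theta_j$ ($i \neq j$), the ML solution $\hat\theta$ is one of the multiple solutions of:
$\sum_{k=1}^{n} \Pr\{\omega_i \mid x_k, \hat\theta\}\, \nabla_{\theta_i} \ln f_x(x_k \mid \omega_i, \hat\theta_i) = 0$, $i = 1, \dots, c$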
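11 2 MAXIMUM LIKELIHOOD ESTIMATES
Proof:
$\nabla_{\theta_i} l = \sum_{k=1}^{n} \frac{1}{f_x(x_k \mid \theta)} \nabla_{\theta_i} \left[ \sum_{j=1}^{c} f_x(x_k \mid \omega_j, \theta_j) \Pr(\omega_j) \right]$
$= \sum_{k=1}^{n} \frac{1}{f_x(x_k \mid \theta)} \nabla_{\theta_i} \left( f_x(x_k \mid \omega_i, \theta_i) \Pr(\omega_i) \right)$
$= \sum_{k=1}^{n} \Pr\{\omega_i \mid x_k, \theta\}\, \nabla_{\theta_i} \ln f_x(x_k \mid \omega_i, \theta_i) = 0$, $i = 1, \dots, c$
where $\Pr\{\omega_i \mid x_k, \theta\} = f_x(x_k \mid \omega_i, \theta_i) \Pr(\omega_i) / f_x(x_k \mid \theta)$.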
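12 2 MAXIMUM LIKELIHOOD ESTIMATES
Generalizing to the case of unknown prior probabilities (no proof is included here):
1. Compute the prior probability estimates:
$\hat\Pr\{\omega_i\} = \frac{1}{n} \sum_{k=1}^{n} \hat\Pr\{\omega_i \mid x_k, \hat\theta\}$
2. Compute the parameter vector estimates:
$\sum_{k=1}^{n} \hat\Pr\{\omega_i \mid x_k, \hat\theta\}\, \nabla_{\theta_i} \ln f_x(x_k \mid \omega_i, \hat\theta_i) = 0$, $i = 1, \dots, c$
3. Compute the conditional probabilities of the classes:
$\hat\Pr\{\omega_i \mid x_k, \hat\theta\} = \frac{f_x(x_k \mid \omega_i, \hat\theta_i)\, \hat\Pr\{\omega_i\}}{\sum_{j=1}^{c} f_x(x_k \mid \omega_j, \hat\theta_j)\, \hat\Pr\{\omega_j\}}$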
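13 2 MAXIMUM LIKELIHOOD ESTIMATES
For Gaussian distributions:
$\ln f_x(x_k \mid \omega_i, \theta_i) = -\ln\left( (2\pi)^{d/2} |\Sigma_i|^{1/2} \right) - \frac{1}{2} (x_k - \mu_i)^T \Sigma_i^{-1} (x_k - \mu_i)$
Parameters to estimate: $\theta = (\theta_1, \dots, \theta_c)$, with $\theta_i = (\mu_i, \Sigma_i)$.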
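14 2 MAXIMUM LIKELIHOOD ESTIMATES
ML is solved by applying the SOFT Expectation-Maximization algorithm (soft assignment). Iterations stop when the p.d.f. does not vary.
1. Expectation (E-step):
$\hat\Pr\{\omega_i \mid x_k, \hat\theta\} = \frac{|\hat\Sigma_i|^{-1/2} \exp\left( -\frac{1}{2} (x_k - \hat\mu_i)^T \hat\Sigma_i^{-1} (x_k - \hat\mu_i) \right) \hat\Pr\{\omega_i\}}{\sum_{j=1}^{c} |\hat\Sigma_j|^{-1/2} \exp\left( -\frac{1}{2} (x_k - \hat\mu_j)^T \hat\Sigma_j^{-1} (x_k - \hat\mu_j) \right) \hat\Pr\{\omega_j\}}$
2. Maximization (M-step):
$\hat\Pr\{\omega_i\} = \frac{1}{n} \sum_{k=1}^{n} \hat\Pr\{\omega_i \mid x_k, \hat\theta\}$
$\hat\mu_i = \frac{\sum_{k=1}^{n} \hat\Pr\{\omega_i \mid x_k, \hat\theta\}\, x_k}{\sum_{k=1}^{n} \hat\Pr\{\omega_i \mid x_k, \hat\theta\}}$
$\hat\Sigma_i = \frac{\sum_{k=1}^{n} \hat\Pr\{\omega_i \mid x_k, \hat\theta\}\, (x_k - \hat\mu_i)(x_k - \hat\mu_i)^T}{\sum_{k=1}^{n} \hat\Pr\{\omega_i \mid x_k, \hat\theta\}}$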
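The E-step and M-step can be sketched for the simplest case of a one-dimensional Gaussian mixture, where each covariance matrix reduces to a scalar variance. This is a minimal illustration under that assumption, not the course implementation: the function names, the toy data, the starting point, and the log-likelihood stopping rule are all choices made here.

```python
import math


def gauss(x, mu, var):
    """Univariate normal density N(x; mu, var)."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def e_step(xs, priors, mus, variances):
    """Posteriors Pr{w_i | x_k, theta}: one row per sample, one column per class."""
    post = []
    for x in xs:
        joint = [p * gauss(x, m, v) for p, m, v in zip(priors, mus, variances)]
        total = sum(joint)
        post.append([j / total for j in joint])
    return post


def m_step(xs, post):
    """Re-estimate priors, means and variances from the soft assignments."""
    n, c = len(xs), len(post[0])
    priors, mus, variances = [], [], []
    for i in range(c):
        w = [row[i] for row in post]
        sw = sum(w)
        mu = sum(wk * x for wk, x in zip(w, xs)) / sw
        var = sum(wk * (x - mu) ** 2 for wk, x in zip(w, xs)) / sw
        priors.append(sw / n)
        mus.append(mu)
        variances.append(var)
    return priors, mus, variances


def log_likelihood(xs, priors, mus, variances):
    return sum(
        math.log(sum(p * gauss(x, m, v) for p, m, v in zip(priors, mus, variances)))
        for x in xs
    )


def em(xs, priors, mus, variances, tol=1e-9, max_iter=200):
    """Alternate E and M steps until the log-likelihood stops improving."""
    history = [log_likelihood(xs, priors, mus, variances)]
    for _ in range(max_iter):
        post = e_step(xs, priors, mus, variances)
        priors, mus, variances = m_step(xs, post)
        history.append(log_likelihood(xs, priors, mus, variances))
        if history[-1] - history[-2] < tol:
            break
    return priors, mus, variances, history


# Toy data: two well separated groups (illustrative values).
xs = [0.0, 0.2, -0.1, 0.1, 5.0, 5.2, 4.9, 5.1]
priors, mus, variances, history = em(xs, [0.5, 0.5], [1.0, 4.0], [1.0, 1.0])
print(priors, mus, variances)
```

The sketch has no safeguard against a component collapsing onto a single sample (zero variance) or receiving zero total weight; real data would need one.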
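15 2 MAXIMUM LIKELIHOOD ESTIMATES
Problem: the starting point. Brain images: full database vs. labeled database (Pr > 0.95).
Initial values: $\hat\mu_i[0]$ (dim: $d \times 1$), $\hat\Sigma_i[0]$ (dim: $d \times d$), $\hat\Pr\{\omega_i\}[0]$, $i = 1, 2, 3$.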
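16 2 MAXIMUM LIKELIHOOD ESTIMATES
E-step: for a given $x_k$, estimate:
$\hat\Pr\{\omega_i \mid x_k, \hat\theta\} = \frac{f_x(x_k \mid \omega_i, \hat\theta_i)\, \hat\Pr\{\omega_i\}}{\sum_{j=1}^{c} f_x(x_k \mid \omega_j, \hat\theta_j)\, \hat\Pr\{\omega_j\}}$
[Figure: a sample $x$ and the current means $\hat\mu_1$, $\hat\mu_2$, $\hat\mu_3$]
M-step: the parameters are updated (ML estimation).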
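17 3. K-Means Clustering
HARD classification: a simplification of the ML (EM) estimates for a multivariate normal (optimal for the $\Sigma_i = \sigma^2 I$ case of the multivariate Gaussian seen with MAP).
$\theta_i = \mu_i$, $\Sigma_i = \sigma^2 I$
$\hat\Pr\{\omega_i \mid x_k, \hat\mu\} = 1$ if $d_e(x_k, \hat\mu_i) < d_e(x_k, \hat\mu_j)$ for all $j \neq i$; $= 0$ otherwise
$\hat\Pr\{\omega_i\} = \frac{1}{n} \sum_{k=1}^{n} \hat\Pr\{\omega_i \mid x_k, \hat\theta\} = \frac{n_i}{n}$
$\hat\mu_i = \frac{1}{n_i} \sum_{x_k \in D_i} x_k$ (centroid)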
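The hard-assignment rule and the centroid update can be sketched as a short loop. The code below is a minimal illustration in plain Python; the function names, the toy samples, the starting centroids, and the choice to keep the old centroid when a cluster becomes empty are assumptions made here.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def k_means(samples, centroids, max_iter=100):
    """Alternate nearest-centroid assignment and centroid update."""
    centroids = [tuple(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        # Hard E-step: each sample goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda i: sq_dist(x, centroids[i]))
            for x in samples
        ]
        if new_labels == labels:
            break  # assignments no longer change
        labels = new_labels
        # M-step: each centroid becomes the mean of its members.
        for i in range(len(centroids)):
            members = [x for x, lab in zip(samples, labels) if lab == i]
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centroids = k_means(samples, [(0.0, 0.0), (10.0, 10.0)])
print(labels, centroids)
```

As with EM, the result depends on the starting centroids, so several starting points are usually tried.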
18 3. K-Means Clustering [Figure: K-Means Clustering]
19 3. K-Means Clustering [Figure: K-Means Clustering]
20 3. K-Means Clustering [Figure: K-Means Clustering]
21 3. K-Means Clustering [Figure: K-Means Clustering]
22 Brain Images
23 Brain Images: K-Means, Different Starting Points
24 Brain Images: Expectation-Maximization, Different Starting Points
25 Brain Images: NN
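26 3. K-Means Clustering
APPLICATION: vector quantization of an n-dimensional real-valued vector. See Proakis, Digital Communications, Chapter 3: Source Coding.
FUZZY K-Means: soft classification. b is a free blending parameter.
$J_{fuzzy} = \sum_{i=1}^{c} \sum_{k=1}^{n} \left( \hat\Pr\{\omega_i \mid x_k, \hat\theta\} \right)^b d_e^2(x_k, \hat\mu_i)$
$\hat\mu_i = \frac{\sum_{k=1}^{n} \left( \hat\Pr\{\omega_i \mid x_k, \hat\theta\} \right)^b x_k}{\sum_{k=1}^{n} \left( \hat\Pr\{\omega_i \mid x_k, \hat\theta\} \right)^b}$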
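The effect of the blending parameter b on the fuzzy centroid can be seen with a few numbers. The sketch below only evaluates the weighted-centroid formula for given memberships; the membership values and the function name are illustrative assumptions, and the membership update itself is not shown.

```python
def fuzzy_centroid(samples, memberships, b):
    """Centroid weighted by memberships raised to the power b.

    memberships[k] plays the role of Pr{w_i | x_k, theta} for one cluster i.
    """
    weights = [p ** b for p in memberships]
    total = sum(weights)
    dim = len(samples[0])
    return tuple(
        sum(w * x[l] for w, x in zip(weights, samples)) / total for l in range(dim)
    )


samples = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0)]

# Hard memberships reproduce the ordinary K-means centroid.
hard = fuzzy_centroid(samples, [1.0, 1.0, 0.0], b=2)

# Soft memberships: the distant sample pulls the centroid a little.
soft_b1 = fuzzy_centroid(samples, [0.9, 0.9, 0.1], b=1)
soft_b2 = fuzzy_centroid(samples, [0.9, 0.9, 0.1], b=2)
print(hard, soft_b1, soft_b2)
```

Raising b shrinks the weight of low-membership samples, so the centroid for b = 2 lies closer to the hard centroid than the one for b = 1.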
27 INDEX: Formal Clustering Procedures
1 INTRODUCTION: FORMAL CLUSTERING PROCEDURES
2 SIMILARITY MEASURES
3 CRITERION FUNCTIONS
4 ITERATIVE OPTIMIZATION
5 CONCLUSIONS
28 1. INTRODUCTION
Clusters may form clouds of points in a d-dimensional space.
Normal distribution: the sample mean and the sample covariance matrix form a sufficient statistic.
- Sample mean m: locates the center of gravity of the cloud. It best represents all of the data in the sense of minimizing the sum of squared distances from m to the samples.
- Sample covariance matrix C: describes how much the data scatter along the various directions around m.
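29 1. INTRODUCTION
The sample mean vector and the sample covariance matrix are not a sufficient statistic in the general case: distributions with very different structure can have identical mean and covariance.
$m = \frac{1}{N} \sum_{k=1}^{N} x_k$
$C = \frac{1}{N} \sum_{k=1}^{N} (x_k - m)(x_k - m)^T$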
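A small numerical check shows that the sample mean and covariance do not pin down the data. The sketch below, in plain Python with illustrative point sets chosen here, computes m and C with the 1/N normalization for two different sample sets and finds the same values for both.

```python
import math


def sample_mean(xs):
    n = len(xs)
    return [sum(col) / n for col in zip(*xs)]


def sample_cov(xs):
    """Sample covariance with 1/N normalization."""
    m = sample_mean(xs)
    n, d = len(xs), len(m)
    return [
        [sum((x[r] - m[r]) * (x[c] - m[c]) for x in xs) / n for c in range(d)]
        for r in range(d)
    ]


s = math.sqrt(2.0)
set_a = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]  # corners of a square
set_b = [(-s, 0.0), (s, 0.0), (0.0, -s), (0.0, s)]            # points on the axes

mean_a, cov_a = sample_mean(set_a), sample_cov(set_a)
mean_b, cov_b = sample_mean(set_b), sample_cov(set_b)
print(mean_a, cov_a)
print(mean_b, cov_b)
```

Both sets have zero mean and identity covariance although no point of one set appears in the other.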
30 1. INTRODUCTION
Formal clustering procedures: two key steps
- Data are grouped into clusters, that is, groups of data points that possess strong internal similarities.
- A criterion function is used to seek the grouping that extremizes it.
To evaluate the partitioning of a set of samples into clusters, the similarity between samples is measured.
31 2. SIMILARITY MEASURES
Similarity is measured using the distance between samples. Example: Euclidean distance $d_e(x_i, x_j)$.
$d_e^2(x_i, x_j) = \| x_i - x_j \|^2 = \sum_{l=1}^{d} \left( x_i[l] - x_j[l] \right)^2$
Two samples belong to the same cluster if $d_e(x_i, x_j) < d_0$. The threshold $d_0$ is critical.
32 2. SIMILARITY MEASURES
The distance threshold affects the number and size of the clusters:
typical within-cluster distance < $d_0$ < typical between-cluster distance
33 2. SIMILARITY MEASURES
Euclidean distance $d_e$:
- Clusters are invariant to rotation.
- Clusters are invariant to translation.
- Clusters are not invariant to linear transformations in general.
34 2. SIMILARITY MEASURES
Normalization prior to clustering:
- Each feature is translated to have zero mean.
- Each feature is scaled to have unit variance.
(These two actions are also recommended with neural nets.)
- PCA, Principal Components Analysis (the axes coincide with the eigenvectors of the sample covariance matrix).
AFTER NORMALIZATION AND PCA, CLUSTERS ARE INVARIANT TO DISPLACEMENTS, SCALE CHANGES AND ROTATIONS.
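The first two normalization steps (zero mean, unit variance per feature) can be sketched in a few lines of plain Python. The function name and the toy data are illustrative assumptions; the PCA rotation is not included because it needs an eigendecomposition routine.

```python
import math


def standardize(samples):
    """Shift each feature to zero mean and scale it to unit variance."""
    n = len(samples)
    cols = list(zip(*samples))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n) for c, m in zip(cols, means)]
    return [tuple((v - m) / s for v, m, s in zip(x, means, stds)) for x in samples]


# Two features on very different scales (illustrative values).
raw = [(1.0, 100.0), (2.0, 300.0), (3.0, 200.0)]
normalized = standardize(raw)

col_means = [sum(c) / len(normalized) for c in zip(*normalized)]
col_vars = [sum(v * v for v in c) / len(normalized) for c in zip(*normalized)]
print(normalized, col_means, col_vars)
```

A feature with zero variance would cause a division by zero here; such features would have to be dropped or handled separately.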
35 2. SIMILARITY MEASURES
Other metrics:
- Minkowski distance: $d_q(x_i, x_j) = \left( \sum_{l=1}^{d} \left| x_i[l] - x_j[l] \right|^q \right)^{1/q}$
- Mahalanobis distance: $d_M^2(x_i, x_j) = (x_i - x_j)^T \Sigma^{-1} (x_i - x_j)$
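Both metrics are easy to evaluate by hand on small vectors. The sketch below uses plain Python and takes the inverse covariance matrix as an input rather than inverting it; the function names and the numeric values are illustrative.

```python
def minkowski(a, b, q):
    """Minkowski distance of order q (q=1 city block, q=2 Euclidean)."""
    return sum(abs(ai - bi) ** q for ai, bi in zip(a, b)) ** (1.0 / q)


def mahalanobis_sq(a, b, inv_cov):
    """Squared Mahalanobis distance given the inverse covariance matrix."""
    diff = [ai - bi for ai, bi in zip(a, b)]
    d = len(diff)
    return sum(diff[r] * inv_cov[r][c] * diff[c] for r in range(d) for c in range(d))


x_i, x_j = (0.0, 0.0), (3.0, 4.0)
d1 = minkowski(x_i, x_j, 1)
d2 = minkowski(x_i, x_j, 2)

identity = [[1.0, 0.0], [0.0, 1.0]]
scaled = [[0.25, 0.0], [0.0, 1.0]]  # inverse of diag(4, 1)
m_identity = mahalanobis_sq(x_i, x_j, identity)
m_scaled = mahalanobis_sq(x_i, x_j, scaled)
print(d1, d2, m_identity, m_scaled)
```

With the identity matrix the Mahalanobis distance reduces to the squared Euclidean distance; a larger variance along the first axis makes differences along that axis count less.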
36 2. SIMILARITY MEASURES
Similarity functions: they compare two vectors.
$s_e(x_i, x_j) = \frac{x_i^T x_j}{\| x_i \| \, \| x_j \|}$
- It is invariant to rotation and dilation.
- It is not invariant to translation or to general linear transformations.
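The invariance properties of the normalized inner product can be verified directly. The sketch below, with illustrative vectors and a hypothetical function name, compares the similarity before and after a dilation and after a translation.

```python
import math


def cosine_similarity(a, b):
    """Normalized inner product a^T b / (|a| |b|)."""
    dot = sum(ai * bi for ai, bi in zip(a, b))
    norm_a = math.sqrt(sum(ai * ai for ai in a))
    norm_b = math.sqrt(sum(bi * bi for bi in b))
    return dot / (norm_a * norm_b)


x, y = (1.0, 0.0), (1.0, 1.0)
s_original = cosine_similarity(x, y)

# Dilation: both vectors multiplied by 3.
s_dilated = cosine_similarity((3.0, 0.0), (3.0, 3.0))

# Translation: the offset (1, 1) added to both vectors.
s_translated = cosine_similarity((2.0, 1.0), (2.0, 2.0))
print(s_original, s_dilated, s_translated)
```

The dilated pair keeps the same similarity, while the translated pair does not.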
37 2. SIMILARITY MEASURES
If the clusters found are later used in a classification problem:
- the metric (distance) is used as the classification criterion, or
- the similarity function is used as the classification criterion.
38 3. CRITERION FUNCTIONS
Criterion functions for clustering:
- Initial set: $D = \{x_1, x_2, \dots, x_n\}$
- Partition into exactly c subsets: $D_1, D_2, \dots, D_c$
Objective: to find the partition that extremizes the criterion function.
39 3. CRITERION FUNCTIONS
3.1 Criterion function: sum-of-squared-error criterion
$J_e = \sum_{i=1}^{c} \sum_{x \in D_i} \| x - m_i \|^2$
$m_i$ is the best representative of the samples in $D_i$. The criterion is appropriate when the clusters form compact clouds with a roughly uniform number of samples per cluster.
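40 3. CRITERION FUNCTIONS
Related minimum variance criteria:
$J_e = \frac{1}{2} \sum_{i=1}^{c} n_i \bar{s}_i$, with $\bar{s}_i = \frac{1}{n_i^2} \sum_{x \in D_i} \sum_{x' \in D_i} \| x - x' \|^2$
Suggestions for obtaining other criterion functions:
$\bar{s}_i = \max_{x, x' \in D_i} d_e(x, x')$; $\bar{s}_i = \frac{1}{n_i^2} \sum_{x \in D_i} \sum_{x' \in D_i} s_e(x, x')$; $\bar{s}_i = \min_{x, x' \in D_i} s_e(x, x')$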
41 3. CRITERION FUNCTIONS
3.2 Scatter criteria: mean vectors and scatter matrices used in clustering criteria
- Mean vector for cluster i: $m_i = \frac{1}{n_i} \sum_{x \in D_i} x$
- Total mean vector: $m = \frac{1}{n} \sum_{x \in D} x = \frac{1}{n} \sum_{i=1}^{c} n_i m_i$
- Scatter matrix for cluster i: $S_i = \sum_{x \in D_i} (x - m_i)(x - m_i)^T$
- Within-cluster scatter matrix: $S_W = \sum_{i=1}^{c} S_i$
- Between-cluster scatter matrix: $S_B = \sum_{i=1}^{c} n_i (m_i - m)(m_i - m)^T$
- Total scatter matrix: $S_T = \sum_{x \in D} (x - m)(x - m)^T = S_W + S_B$
42 3. CRITERION FUNCTIONS
3.2 Scatter criteria: TRACE CRITERION
It measures the square of the scattering radius. Minimize the trace of the within-cluster scatter matrix:
$\mathrm{Tr}[S_W] = \sum_{i=1}^{c} \mathrm{Tr}[S_i] = \sum_{i=1}^{c} \sum_{x \in D_i} \| x - m_i \|^2 = J_e$
The result is the function $J_e$. This is equivalent to maximizing the trace of the between-cluster scatter matrix:
$\mathrm{Tr}[S_W] = \mathrm{Tr}[S_T] - \mathrm{Tr}[S_B]$
$\mathrm{Tr}[S_B] = \sum_{i=1}^{c} n_i \| m_i - m \|^2$
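The scatter-matrix identities can be checked on a tiny example. The sketch below builds S_W, S_B and S_T for two illustrative clusters in plain Python (matrices as nested lists; all helper names are choices made here) and compares the total scatter with S_W + S_B and the trace of S_W with J_e.

```python
def mean(points):
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def scatter(points, center, weight=1.0):
    """weight * sum over points of (x - center)(x - center)^T."""
    d = len(center)
    return [
        [weight * sum((x[r] - center[r]) * (x[c] - center[c]) for x in points)
         for c in range(d)]
        for r in range(d)
    ]


def mat_add(a, b):
    return [[ar + br for ar, br in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


clusters = [[(0.0, 0.0), (2.0, 0.0)], [(5.0, 5.0), (5.0, 7.0), (8.0, 6.0)]]
all_points = [x for cl in clusters for x in cl]
m = mean(all_points)
d = len(m)

S_W = [[0.0] * d for _ in range(d)]
S_B = [[0.0] * d for _ in range(d)]
J_e = 0.0
for cl in clusters:
    m_i = mean(cl)
    S_W = mat_add(S_W, scatter(cl, m_i))                      # within-cluster part
    S_B = mat_add(S_B, scatter([m_i], m, weight=len(cl)))     # between-cluster part
    J_e += sum(sum((xl - ml) ** 2 for xl, ml in zip(x, m_i)) for x in cl)

S_T = scatter(all_points, m)
print(S_W, S_B, S_T, J_e)
```

For these points the trace of S_W and J_e both equal 10, and S_T matches S_W + S_B entry by entry.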
43 3. CRITERION FUNCTIONS
3.2 Scatter criteria: DETERMINANT CRITERION
It measures the square of the scattering volume.
- $S_B$ is singular if $c \leq d$; $\mathrm{rank}(S_B) \leq c - 1$.
- $S_W$ is singular if $n - c < d$.
Assuming $n > d + c$:
$J_d = |S_W| = \left| \sum_{i=1}^{c} S_i \right|$
The result does not change if the axes are scaled.
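44 3. CRITERION FUNCTIONS
3.2 Scatter criteria: invariant criteria
The eigenvalues $\lambda_1, \dots, \lambda_d$ of $S_W^{-1} S_B$ are invariant to nonsingular linear transformations of the data.
Proposed criteria:
- max: $\mathrm{Tr}[S_W^{-1} S_B] = \sum_{i=1}^{d} \lambda_i$
- min: $J_f = \mathrm{Tr}[S_T^{-1} S_W] = \sum_{i=1}^{d} \frac{1}{1 + \lambda_i}$
- min: $\frac{|S_W|}{|S_T|} = \prod_{i=1}^{d} \frac{1}{1 + \lambda_i}$
They are equivalent for c = 2.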
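45 3. CRITERION FUNCTIONS
3.2 Invariant criteria. Proof:
$\lambda_1, \dots, \lambda_d$ = eigenvalues of $S_W^{-1} S_B$; $\frac{1}{1 + \lambda_1}, \dots, \frac{1}{1 + \lambda_d}$ = eigenvalues of $S_T^{-1} S_W$.
$S_W^{-1} S_B v = \lambda v \Rightarrow S_B v = \lambda S_W v$
$S_T v = S_B v + S_W v = \lambda S_W v + S_W v = (1 + \lambda) S_W v$
$v = (1 + \lambda) S_T^{-1} S_W v \Rightarrow S_T^{-1} S_W v = \frac{1}{1 + \lambda} v$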
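The eigenvalue relation can be verified numerically for a two-dimensional case, where eigenvalues follow from the trace and determinant. The scatter matrices below are illustrative values for a two-cluster example (so S_B has rank 1), and the 2x2 helpers are written out by hand to keep the sketch free of external libraries.

```python
import math


def inv2(a):
    """Inverse of a 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


def mul2(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)] for r in range(2)]


def eig2(a):
    """Real eigenvalues of a 2x2 matrix from its trace and determinant."""
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return sorted([(tr - disc) / 2.0, (tr + disc) / 2.0])


S_W = [[8.0, 0.0], [0.0, 2.0]]
S_B = [[30.0, 36.0], [36.0, 43.2]]
S_T = [[S_W[r][c] + S_B[r][c] for c in range(2)] for r in range(2)]

lam = eig2(mul2(inv2(S_W), S_B))          # eigenvalues of S_W^-1 S_B
mu = eig2(mul2(inv2(S_T), S_W))           # eigenvalues of S_T^-1 S_W
expected = sorted(1.0 / (1.0 + l) for l in lam)
print(lam, mu, expected)
```

With c = 2 only one eigenvalue of S_W^-1 S_B is nonzero, which is why the proposed criteria rank partitions identically in that case.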
46 3. CRITERION FUNCTIONS
3.2 Scatter criteria: invariant criteria
[Figure panels: Trace Criterion, Determinant Criterion, Invariant Criterion]
47 CLUSTERING PROCEDURES: CONCLUSIONS
- Underlying model: it assumes that the samples form c fairly well separated clouds of points.
- $S_W$ measures the compactness of these clouds.
- Problem: the computational cost of evaluating every possible partition is impracticable.
48 4 ITERATIVE OPTIMIZATION
Direct partitioning: approximately $c^n / c!$ possible partitions.
Practical solution: start with some reasonable partition and move samples from one group to another if such a move improves the value of the criterion function.
This guarantees local but not global optimization.
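The size of the search space can be made concrete. The exact number of ways to split n labelled samples into c non-empty subsets is the Stirling number of the second kind, and c^n / c! is its leading term. The sketch below computes the exact count with the standard recurrence and compares the two; n = 100 and c = 5 are illustrative values chosen here.

```python
from math import factorial


def stirling2(n, c):
    """Number of partitions of n labelled samples into c non-empty subsets.

    Uses the recurrence S(n, k) = k * S(n-1, k) + S(n-1, k-1).
    """
    row = [1] + [0] * c  # S(0, 0..c)
    for _ in range(n):
        new = [0] * (c + 1)
        for k in range(1, c + 1):
            new[k] = k * row[k] + row[k - 1]
        row = new
    return row[c]


exact = stirling2(100, 5)
approx = 5 ** 100 / factorial(5)
ratio = exact / approx
print(len(str(exact)), ratio)
```

Even for 100 samples and 5 clusters the count has dozens of digits, which is why exhaustive search is replaced by iterative improvement.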
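49 4 ITERATIVE OPTIMIZATION
Iterative improvement to minimize the sum-of-squared-error criterion $J_e$. Effective error per cluster $J_i$:
$J_e = \sum_{i=1}^{c} J_i$, with $J_i = \sum_{x \in D_i} \| x - m_i \|^2$
A sample $\hat{x}$ is moved from cluster $D_i$ to cluster $D_j$:
$m_j^* = m_j + \frac{\hat{x} - m_j}{n_j + 1}$, $n_j^* = n_j + 1$
$m_i^* = m_i - \frac{\hat{x} - m_i}{n_i - 1}$, $n_i^* = n_i - 1$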
50 4 ITERATIVE OPTIMIZATION
Increase of the effective error in the receiving cluster $D_j$ (PROVE AS AN EXERCISE):
$J_j^* = \sum_{x \in D_j} \| x - m_j^* \|^2 + \| \hat{x} - m_j^* \|^2 = J_j + \frac{n_j}{n_j + 1} \| \hat{x} - m_j \|^2$
51 4 ITERATIVE OPTIMIZATION
Decrease of the effective error in the originating cluster $D_i$ (PROVE AS AN EXERCISE):
$J_i^* = \sum_{x \in D_i} \| x - m_i^* \|^2 - \| \hat{x} - m_i^* \|^2 = J_i - \frac{n_i}{n_i - 1} \| \hat{x} - m_i \|^2$
52 4 ITERATIVE OPTIMIZATION
Moving the sample $\hat{x}$ from cluster $D_i$ to cluster $D_j$ is advantageous if
$\frac{n_i}{n_i - 1} \| \hat{x} - m_i \|^2 > \frac{n_j}{n_j + 1} \| \hat{x} - m_j \|^2$
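The transfer test can be checked against a brute-force recomputation of the squared error. In the sketch below the decrease in the originating cluster is n_i/(n_i - 1) times the squared distance to its mean, and the increase in the receiving cluster is n_j/(n_j + 1) times the squared distance to its mean; the clusters, the moved sample, and the helper names are illustrative assumptions.

```python
def mean(points):
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def sse(points):
    """Sum of squared distances of the points to their own mean."""
    m = mean(points)
    return sum(sq_dist(x, m) for x in points)


D_i = [(0.0, 0.0), (2.0, 0.0), (4.0, 4.0)]   # originating cluster
D_j = [(5.0, 5.0), (5.0, 7.0), (8.0, 6.0)]   # receiving cluster
x_hat = (4.0, 4.0)                           # candidate sample to move

n_i, n_j = len(D_i), len(D_j)
decrease = n_i / (n_i - 1) * sq_dist(x_hat, mean(D_i))
increase = n_j / (n_j + 1) * sq_dist(x_hat, mean(D_j))
advantageous = decrease > increase

# Brute force: recompute the total squared error after the move.
J_before = sse(D_i) + sse(D_j)
D_i_new = [x for x in D_i if x != x_hat]
D_j_new = D_j + [x_hat]
J_after = sse(D_i_new) + sse(D_j_new)
print(advantageous, J_before - J_after, decrease - increase)
```

The predicted change and the recomputed change agree, so the test lets the procedure decide on a move without recomputing the whole criterion.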
53 4 ITERATIVE OPTIMIZATION: BASIC ITERATIVE MINIMUM SQUARED ERROR CLUSTERING
54 4 ITERATIVE OPTIMIZATION
55 7 CONCLUSIONS
When the underlying distribution comes from a mixture of component densities described by a set of unknown parameters, these parameters can be estimated by Bayesian or ML (EM algorithm) methods.
Clustering is a more general approach.
56 7 CONCLUSIONS: OTHER TOPICS
- Hierarchical methods to reveal clusters and sub-clusters: taxonomy.
- Estimation of the number of clusters.
- Self-Organizing Feature Maps (SOFM): they preserve neighborhoods to reduce dimensionality (Kohonen maps).
57 Laboratory Classes
- Lab 0: Inspection of the Brain and Gauss databases.
- Lab 1: Application of MAP methods (ldc, qdc) to GAUSS.
- Lab 2: Application of MAP methods (ldc, qdc) to PHONEME, SPAM.
- Lab 3: Application of PCA and MDA to GAUSS.
- Lab 4: ICA as blind separation of audio sources.
- Lab 5: k-Nearest Neighbour on ZIP.
- (Lab 6: Linear discriminants (LMS-MMSE and Perceptron) on GAUSS and ZIP.)
- Lab 7: (NN, Decision Trees and K-means) MULTILAYER NEURAL NETWORKS, TREE CLASSIFIERS and UNSUPERVISED methods applied to PET and Magnetic Resonance BRAIN images.
Mache Learg CSE6740/CS764/ISYE6740, Fall 0 Itroducto to Regresso Le Sog Lecture 4, August 30, 0 Based o sldes from Erc g, CMU Readg: Chap. 3, CB Mache learg for apartmet hutg Suppose ou are to move to
More informationL5 Polynomial / Spline Curves
L5 Polyomal / Sple Curves Cotets Coc sectos Polyomal Curves Hermte Curves Bezer Curves B-Sples No-Uform Ratoal B-Sples (NURBS) Mapulato ad Represetato of Curves Types of Curve Equatos Implct: Descrbe a
More informationLecture 02: Bounding tail distributions of a random variable
CSCI-B609: A Theorst s Toolkt, Fall 206 Aug 25 Lecture 02: Boudg tal dstrbutos of a radom varable Lecturer: Yua Zhou Scrbe: Yua Xe & Yua Zhou Let us cosder the ubased co flps aga. I.e. let the outcome
More informationDr. Shalabh. Indian Institute of Technology Kanpur
Aalyss of Varace ad Desg of Expermets-I MODULE -I LECTURE - SOME RESULTS ON LINEAR ALGEBRA, MATRIX THEORY AND DISTRIBUTIONS Dr. Shalabh Departmet t of Mathematcs t ad Statstcs t t Ida Isttute of Techology
More informationRegresso What s a Model? 1. Ofte Descrbe Relatoshp betwee Varables 2. Types - Determstc Models (o radomess) - Probablstc Models (wth radomess) EPI 809/Sprg 2008 9 Determstc Models 1. Hypothesze
More informationCS 2750 Machine Learning Lecture 5. Density estimation. Density estimation
CS 750 Mache Learg Lecture 5 esty estmato Mlos Hausrecht mlos@tt.edu 539 Seott Square esty estmato esty estmato: s a usuervsed learg roblem Goal: Lear a model that rereset the relatos amog attrbutes the
More informationSingular Value Decomposition. Linear Algebra (3) Singular Value Decomposition. SVD and Eigenvectors. Solving LEs with SVD
Sgular Value Decomosto Lear Algera (3) m Cootes Ay m x matrx wth m ca e decomosed as follows Dagoal matrx A UWV m x x Orthogoal colums U U I w1 0 0 w W M M 0 0 x Orthoormal (Pure rotato) VV V V L 0 L 0
More informationNew Schedule. Dec. 8 same same same Oct. 21. ^2 weeks ^1 week ^1 week. Pattern Recognition for Vision
ew Schedule Dec. 8 same same same Oct. ^ weeks ^ week ^ week Fall 004 Patter Recogto for Vso 9.93 Patter Recogto for Vso Classfcato Berd Hesele Fall 004 Overvew Itroducto Lear Dscrmat Aalyss Support Vector
More informationThe Selection Problem - Variable Size Decrease/Conquer (Practice with algorithm analysis)
We have covered: Selecto, Iserto, Mergesort, Bubblesort, Heapsort Next: Selecto the Qucksort The Selecto Problem - Varable Sze Decrease/Coquer (Practce wth algorthm aalyss) Cosder the problem of fdg the
More informationChapter 14 Logistic Regression Models
Chapter 4 Logstc Regresso Models I the lear regresso model X β + ε, there are two types of varables explaatory varables X, X,, X k ad study varable y These varables ca be measured o a cotuous scale as
More informationChapter 4 Multiple Random Variables
Revew o BST 63: Statstcal Theory I Ku Zhag, /0/008 Revew for Chapter 4-5 Notes: Although all deftos ad theorems troduced our lectures ad ths ote are mportat ad you should be famlar wth, but I put those
More informationIII-16 G. Brief Review of Grand Orthogonality Theorem and impact on Representations (Γ i ) l i = h n = number of irreducible representations.
III- G. Bref evew of Grad Orthogoalty Theorem ad mpact o epresetatos ( ) GOT: h [ () m ] [ () m ] δδ δmm ll GOT puts great restrcto o form of rreducble represetato also o umber: l h umber of rreducble
More informationLecture 8: Linear Regression
Lecture 8: Lear egresso May 4, GENOME 56, Sprg Goals Develop basc cocepts of lear regresso from a probablstc framework Estmatg parameters ad hypothess testg wth lear models Lear regresso Su I Lee, CSE
More informationLecture 7. Confidence Intervals and Hypothesis Tests in the Simple CLR Model
Lecture 7. Cofdece Itervals ad Hypothess Tests the Smple CLR Model I lecture 6 we troduced the Classcal Lear Regresso (CLR) model that s the radom expermet of whch the data Y,,, K, are the outcomes. The
More informationENGI 4421 Propagation of Error Page 8-01
ENGI 441 Propagato of Error Page 8-01 Propagato of Error [Navd Chapter 3; ot Devore] Ay realstc measuremet procedure cotas error. Ay calculatos based o that measuremet wll therefore also cota a error.
More informationThe number of observed cases The number of parameters. ith case of the dichotomous dependent variable. the ith case of the jth parameter
LOGISTIC REGRESSION Notato Model Logstc regresso regresses a dchotomous depedet varable o a set of depedet varables. Several methods are mplemeted for selectg the depedet varables. The followg otato s
More informationChapter 13, Part A Analysis of Variance and Experimental Design. Introduction to Analysis of Variance. Introduction to Analysis of Variance
Chapter, Part A Aalyss of Varace ad Epermetal Desg Itroducto to Aalyss of Varace Aalyss of Varace: Testg for the Equalty of Populato Meas Multple Comparso Procedures Itroducto to Aalyss of Varace Aalyss
More informationParametric Density Estimation: Bayesian Estimation. Naïve Bayes Classifier
arametrc Dest Estmato: Baesa Estmato. Naïve Baes Classfer Baesa arameter Estmato Suppose we have some dea of the rage where parameters θ should be Should t we formalze such pror owledge hopes that t wll
More informationConvergence of the Desroziers scheme and its relation to the lag innovation diagnostic
Covergece of the Desrozers scheme ad ts relato to the lag ovato dagostc chard Méard Evromet Caada, Ar Qualty esearch Dvso World Weather Ope Scece Coferece Motreal, August 9, 04 o t t O x x x y x y Oservato
More informationESS Line Fitting
ESS 5 014 17. Le Fttg A very commo problem data aalyss s lookg for relatoshpetwee dfferet parameters ad fttg les or surfaces to data. The smplest example s fttg a straght le ad we wll dscuss that here
More informationCLASS NOTES. for. PBAF 528: Quantitative Methods II SPRING Instructor: Jean Swanson. Daniel J. Evans School of Public Affairs
CLASS NOTES for PBAF 58: Quattatve Methods II SPRING 005 Istructor: Jea Swaso Dael J. Evas School of Publc Affars Uversty of Washgto Ackowledgemet: The structor wshes to thak Rachel Klet, Assstat Professor,
More information