Machine Learning 2nd Edition

INTRODUCTION TO Machine Learning, 2nd Edition. Lecture Slides by ETHEM ALPAYDIN, modified by Leonardo Bobadilla, with some parts from http://www.cs.tau.ac.il/~apartzin/machinelearning/. The MIT Press, 2010. alpaydin@boun.edu.tr http://www.cmpe.boun.edu.tr/~ethem/i2ml2e

Outline. This class covers Ch 5: Multivariate Methods: Multivariate Data, Parameter Estimation, Estimation of Missing Values, Multivariate Classification. Lecture Notes for E. Alpaydın 2010 Introduction to Machine Learning 2e, The MIT Press (V1.0)

CHAPTER 5: Multivariate Methods

Multivariate Distribution. Assume all members of a class came from a joint distribution. We can learn the distribution P(x|C) from data and assign a new instance to the most probable class P(C|x) using Bayes' rule. An instance is described by a vector of correlated parameters, the realm of multivariate distributions, e.g., the multivariate normal. Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Multivariate Data. Multiple measurements (sensors); d inputs/features/attributes (d-variate); N instances/observations/examples:
$$\mathbf{X} = \begin{bmatrix} X_1^1 & X_2^1 & \cdots & X_d^1 \\ X_1^2 & X_2^2 & \cdots & X_d^2 \\ \vdots & \vdots & & \vdots \\ X_1^N & X_2^N & \cdots & X_d^N \end{bmatrix}$$
Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Multivariate Parameters.
Mean: $E[\mathbf{x}] = \boldsymbol{\mu} = [\mu_1, \ldots, \mu_d]^T$
Covariance: $\sigma_{ij} \equiv \operatorname{Cov}(X_i, X_j)$,
$$\boldsymbol{\Sigma} \equiv \operatorname{Cov}(\mathbf{X}) = E\left[(\mathbf{X}-\boldsymbol{\mu})(\mathbf{X}-\boldsymbol{\mu})^T\right] = \begin{bmatrix} \sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1d} \\ \sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2d} \\ \vdots & \vdots & & \vdots \\ \sigma_{d1} & \sigma_{d2} & \cdots & \sigma_d^2 \end{bmatrix}$$
Correlation: $\operatorname{Corr}(X_i, X_j) \equiv \rho_{ij} = \dfrac{\sigma_{ij}}{\sigma_i \sigma_j}$
Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Parameter Estimation.
Sample mean $\mathbf{m}$: $m_i = \dfrac{1}{N}\sum_{t=1}^{N} x_i^t$, $i = 1, \ldots, d$
Covariance matrix $\mathbf{S}$: $s_{ij} = \dfrac{1}{N}\sum_{t=1}^{N} (x_i^t - m_i)(x_j^t - m_j)$
Correlation matrix $\mathbf{R}$: $r_{ij} = \dfrac{s_{ij}}{s_i s_j}$
Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)
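As an illustration, a minimal NumPy sketch of these three estimators (the names X, m, S, R are mine, not from the slides); note the 1/N normalization above, whereas np.cov defaults to 1/(N-1):

```python
import numpy as np

def estimate_parameters(X):
    """X: N x d data matrix. Returns sample mean, covariance, correlation."""
    N, d = X.shape
    m = X.mean(axis=0)                  # sample mean, one entry per feature
    Xc = X - m                          # center the data
    S = (Xc.T @ Xc) / N                 # covariance with 1/N, as in the slides
    s = np.sqrt(np.diag(S))             # per-feature standard deviations
    R = S / np.outer(s, s)              # correlation r_ij = s_ij / (s_i s_j)
    return m, S, R

# toy usage on random 2-feature data
X = np.random.randn(100, 2)
m, S, R = estimate_parameters(X)
```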

Estimation of Missing Values. What to do if certain instances have missing attributes? Ignore those instances: not a good idea if the sample is small. Use "missing" as an attribute value: it may carry information. Imputation: fill in the missing value. Mean imputation: use the most likely value (e.g., the mean). Imputation by regression: predict the missing value based on the other attributes. Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)
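A small sketch of mean imputation, assuming missing entries are encoded as NaN (that encoding is my assumption, not from the slides):

```python
import numpy as np

def mean_impute(X):
    """Replace NaN entries in each column by that column's mean."""
    X = X.copy()
    col_means = np.nanmean(X, axis=0)      # per-feature mean, ignoring NaNs
    idx = np.where(np.isnan(X))            # positions of missing entries
    X[idx] = np.take(col_means, idx[1])    # fill each hole with its column mean
    return X
```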

Multivariate Normal. We have d attributes, and often we can assume each one is distributed normally; the attributes might be dependent/correlated. What is the joint distribution of several correlated variables, P(X_1 = x_1, X_2 = x_2, ..., X_d = x_d)? Recall the univariate case: X is normally distributed with mean µ and variance σ². Lecture Notes for E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Multivariate Normal.
$$\mathbf{x} \sim \mathcal{N}_d(\boldsymbol{\mu}, \boldsymbol{\Sigma}), \qquad p(\mathbf{x}) = \frac{1}{(2\pi)^{d/2}|\boldsymbol{\Sigma}|^{1/2}} \exp\left[-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x}-\boldsymbol{\mu})\right]$$
Mahalanobis distance: $(\mathbf{x}-\boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x}-\boldsymbol{\mu})$. Because the variables are correlated, the distance is weighted by the inverse of the covariance: directions of large variance contribute less to the Mahalanobis distance and hence more to the probability. Lecture Notes for E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)
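A sketch of evaluating this density directly from the formula (the function name is mine; scipy.stats.multivariate_normal provides the same computation):

```python
import numpy as np

def mvn_pdf(x, mu, Sigma):
    """Multivariate normal density, computed via the Mahalanobis distance."""
    d = len(mu)
    diff = x - mu
    maha2 = diff @ np.linalg.solve(Sigma, diff)   # (x-mu)^T Sigma^{-1} (x-mu)
    norm = (2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(Sigma))
    return np.exp(-0.5 * maha2) / norm
```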

Bivariate Normal (figure). Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Multivariate Normal Distribution. The Mahalanobis distance $(\mathbf{x}-\boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x}-\boldsymbol{\mu})$ measures the distance from x to μ in terms of Σ (it normalizes for differences in variances and correlations). Bivariate case (d = 2):
$$\boldsymbol{\Sigma} = \begin{bmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{bmatrix}$$
$$p(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\left[-\frac{1}{2(1-\rho^2)}\left(z_1^2 - 2\rho z_1 z_2 + z_2^2\right)\right], \qquad z_i = \frac{x_i - \mu_i}{\sigma_i}$$
Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Bivariate Normal (figure). Lecture Notes for E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Bivariate Normal (figure). Lecture Notes for E. Alpaydın 2010 Introduction to Machine Learning 2e, The MIT Press (V1.0)

Independent Inputs: Naive Bayes. If the $x_i$ are independent, the off-diagonals of $\boldsymbol{\Sigma}$ are 0, and the Mahalanobis distance reduces to a weighted (by $1/\sigma_i$) Euclidean distance:
$$p(\mathbf{x}) = \prod_{i=1}^{d} p_i(x_i) = \frac{1}{(2\pi)^{d/2}\prod_{i=1}^{d}\sigma_i} \exp\left[-\frac{1}{2}\sum_{i=1}^{d}\left(\frac{x_i-\mu_i}{\sigma_i}\right)^2\right]$$
If the variances are also equal, it reduces to the Euclidean distance. Based on Introduction to Machine Learning, The MIT Press (V1.1)

Projection Distribution. Example: a vector of 3 features with a multivariate normal distribution. Project it onto a 2-dimensional space (e.g., the XY plane): the projected vectors of 2 features also follow a multivariate normal distribution. In general, the projection of a d-dimensional normal onto a k-dimensional space is k-dimensional normal. Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

2D projection (figure). Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Multivariate Classification. Assume the members of a class come from a single multivariate distribution. The multivariate normal is a good choice: it is easy to analyze, it models many natural phenomena, and it models a class as having a single prototype source (the mean) that is slightly randomly perturbed. Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Example: matching cars to customers. Each car defines a class of matching customers. Customers are described by (age, income), and there is a correlation between age and income. Assume each class is multivariate normal. We need to learn P(x|C) from data, then use Bayes' rule to compute P(C|x). Lecture Notes for E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Parametric Classification. If $p(\mathbf{x}|C_i) \sim \mathcal{N}(\boldsymbol{\mu}_i, \boldsymbol{\Sigma}_i)$:
$$p(\mathbf{x}|C_i) = \frac{1}{(2\pi)^{d/2}|\boldsymbol{\Sigma}_i|^{1/2}} \exp\left[-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_i)^T \boldsymbol{\Sigma}_i^{-1} (\mathbf{x}-\boldsymbol{\mu}_i)\right]$$
The discriminant functions are
$$g_i(\mathbf{x}) = \log P(C_i|\mathbf{x}) = \log \frac{p(\mathbf{x}|C_i)\,P(C_i)}{p(\mathbf{x})} = \log p(\mathbf{x}|C_i) + \log P(C_i) - \log p(\mathbf{x})$$
$$= -\frac{d}{2}\log 2\pi - \frac{1}{2}\log|\boldsymbol{\Sigma}_i| - \frac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_i)^T \boldsymbol{\Sigma}_i^{-1} (\mathbf{x}-\boldsymbol{\mu}_i) + \log P(C_i) - \log p(\mathbf{x})$$
We need to know the covariance matrix and mean to compute the discriminant functions; we can ignore $p(\mathbf{x})$ since it is the same for all classes. Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Estimation of Parameters.
$$\hat{P}(C_i) = \frac{\sum_t r_i^t}{N}, \qquad \mathbf{m}_i = \frac{\sum_t r_i^t \mathbf{x}^t}{\sum_t r_i^t}, \qquad \mathbf{S}_i = \frac{\sum_t r_i^t (\mathbf{x}^t - \mathbf{m}_i)(\mathbf{x}^t - \mathbf{m}_i)^T}{\sum_t r_i^t}$$
where $r_i^t = 1$ if $\mathbf{x}^t \in C_i$ and 0 otherwise. Substituting the estimates into the discriminant:
$$g_i(\mathbf{x}) = -\frac{1}{2}\log|\mathbf{S}_i| - \frac{1}{2}(\mathbf{x}-\mathbf{m}_i)^T \mathbf{S}_i^{-1} (\mathbf{x}-\mathbf{m}_i) + \log\hat{P}(C_i)$$
Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Covariance Matrix per Class: quadratic discriminant. Expanding the quadratic form gives
$$g_i(\mathbf{x}) = \mathbf{x}^T \mathbf{W}_i \mathbf{x} + \mathbf{w}_i^T \mathbf{x} + w_{i0}$$
where
$$\mathbf{W}_i = -\frac{1}{2}\mathbf{S}_i^{-1}, \qquad \mathbf{w}_i = \mathbf{S}_i^{-1}\mathbf{m}_i, \qquad w_{i0} = -\frac{1}{2}\mathbf{m}_i^T \mathbf{S}_i^{-1}\mathbf{m}_i - \frac{1}{2}\log|\mathbf{S}_i| + \log\hat{P}(C_i)$$
This requires estimating $K \cdot d(d+1)/2$ parameters for the covariance matrices. Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)
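Putting the two slides above together, a minimal sketch of per-class estimation and the quadratic discriminant (all names are mine, not the book's):

```python
import numpy as np

def fit_quadratic(X, y):
    """Estimate priors, means, and per-class covariances S_i."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        prior = len(Xc) / len(X)
        m = Xc.mean(axis=0)
        S = (Xc - m).T @ (Xc - m) / len(Xc)
        params[c] = (prior, m, S)
    return params

def g(x, prior, m, S):
    """Quadratic discriminant g_i(x) for one class."""
    diff = x - m
    return (-0.5 * np.log(np.linalg.det(S))
            - 0.5 * diff @ np.linalg.solve(S, diff)
            + np.log(prior))

def predict(x, params):
    """Assign x to the class with the largest discriminant."""
    return max(params, key=lambda c: g(x, *params[c]))
```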

(Figure: class likelihoods, the discriminant P(C_1|x) = 0.5, and the posterior for C_1.) Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Common Covariance Matrix S. If there is not enough data, we can assume all classes share the same common sample covariance matrix:
$$\mathbf{S} = \sum_i \hat{P}(C_i)\,\mathbf{S}_i$$
The discriminant reduces to a linear discriminant ($\mathbf{x}^T\mathbf{S}^{-1}\mathbf{x}$ is common to all discriminants and can be removed):
$$g_i(\mathbf{x}) = -\frac{1}{2}(\mathbf{x}-\mathbf{m}_i)^T \mathbf{S}^{-1} (\mathbf{x}-\mathbf{m}_i) + \log\hat{P}(C_i)$$
$$g_i(\mathbf{x}) = \mathbf{w}_i^T\mathbf{x} + w_{i0}, \qquad \mathbf{w}_i = \mathbf{S}^{-1}\mathbf{m}_i, \qquad w_{i0} = -\frac{1}{2}\mathbf{m}_i^T\mathbf{S}^{-1}\mathbf{m}_i + \log\hat{P}(C_i)$$
Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)
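A sketch of the shared-covariance (linear) case, reusing the per-class estimates from the quadratic sketch above (names are mine):

```python
import numpy as np

def fit_linear(params):
    """Pool covariances: S = sum_i P(C_i) S_i, then compute w_i and w_i0."""
    S = sum(prior * S_i for prior, _, S_i in params.values())
    weights = {}
    for c, (prior, m, _) in params.items():
        w = np.linalg.solve(S, m)             # w_i = S^{-1} m_i
        w0 = -0.5 * m @ w + np.log(prior)     # w_i0
        weights[c] = (w, w0)
    return weights

def predict_linear(x, weights):
    """Linear discriminant: pick the class maximizing w_i^T x + w_i0."""
    return max(weights, key=lambda c: weights[c][0] @ x + weights[c][1])
```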

Common Covariance Matrix S (figure). Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Diagonal S. When the $x_j$, $j = 1, \ldots, d$, are independent, $\boldsymbol{\Sigma}$ is diagonal: $p(\mathbf{x}|C_i) = \prod_j p_j(x_j|C_i)$ (the Naive Bayes assumption), and
$$g_i(\mathbf{x}) = -\frac{1}{2}\sum_{j=1}^{d}\left(\frac{x_j - m_{ij}}{s_j}\right)^2 + \log\hat{P}(C_i)$$
Classify based on the weighted (in $s_j$ units) Euclidean distance to the nearest mean. Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Diagonal S (figure): variances may be different. Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Diagonal S, equal variances. Nearest mean classifier: classify based on the Euclidean distance to the nearest mean:
$$g_i(\mathbf{x}) = -\frac{\|\mathbf{x}-\mathbf{m}_i\|^2}{2s^2} + \log P(C_i) = -\frac{1}{2s^2}\sum_{j=1}^{d}(x_j - m_{ij})^2 + \log P(C_i)$$
Each mean can be considered a prototype or template, and this is template matching. Lecture Notes for E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)
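With equal priors the nearest-mean rule collapses to a one-liner; a minimal sketch (names mine):

```python
import numpy as np

def nearest_mean(x, means):
    """means: dict class -> prototype vector. Equal priors assumed."""
    return min(means, key=lambda c: np.sum((x - means[c]) ** 2))
```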

Diagonal S, equal variances (figure). Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Model Selection.

Assumption                    Covariance matrix              No. of parameters
Shared, hyperspheric          S_i = S = s^2 I                1
Shared, axis-aligned          S_i = S, with s_ij = 0         d
Shared, hyperellipsoidal      S_i = S                        d(d+1)/2
Different, hyperellipsoidal   S_i                            K * d(d+1)/2

As we increase complexity (a less restricted S), bias decreases and variance increases. Assume simple models (allow some bias) to control variance (regularization). Lecture Notes for E. Alpaydın 2010 Introduction to Machine Learning 2e, The MIT Press (V1.0)

Model Selection. A different covariance matrix for each class means many parameters to estimate: small bias, large variance. Common covariance matrices, diagonal covariance, etc. reduce the number of parameters: increased bias but controlled variance. Are there in-between states? Lecture Notes for E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Regularized Discriminant Analysis (RDA). Write the covariance as a combination
$$\mathbf{S}_i' = \alpha\,\sigma^2\mathbf{I} + \beta\,\mathbf{S} + (1 - \alpha - \beta)\,\mathbf{S}_i$$
α = β = 0: quadratic classifier. α = 0, β = 1: shared covariance, linear classifier. α = 1, β = 0: diagonal covariance. Choose the best α, β by cross-validation. Lecture Notes for E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)
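A sketch of the regularized covariance combination written above (the formula is my reconstruction of the book's RDA from the three special cases on the slide; names and the σ² estimate are my choices):

```python
import numpy as np

def rda_covariance(S_i, S_shared, alpha, beta):
    """S_i' = alpha * sigma^2 * I + beta * S + (1 - alpha - beta) * S_i."""
    d = S_shared.shape[0]
    sigma2 = np.trace(S_shared) / d   # one possible spherical variance estimate
    return (alpha * sigma2 * np.eye(d)
            + beta * S_shared
            + (1 - alpha - beta) * S_i)
```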

Model Selection: Example (figure). Lecture Notes for E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Model Selection (figure). Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Discrete Features. Binary features: $p_{ij} \equiv p(x_j = 1 | C_i)$. If the $x_j$ are independent (Naive Bayes),
$$p(\mathbf{x}|C_i) = \prod_{j=1}^{d} p_{ij}^{x_j}\,(1 - p_{ij})^{(1 - x_j)}$$
and the discriminant is linear:
$$g_i(\mathbf{x}) = \log p(\mathbf{x}|C_i) + \log P(C_i) = \sum_j \left[x_j \log p_{ij} + (1 - x_j)\log(1 - p_{ij})\right] + \log P(C_i)$$
Estimated parameters: $\hat{p}_{ij} = \dfrac{\sum_t x_j^t r_i^t}{\sum_t r_i^t}$. Lecture Notes for E. Alpaydın 2010 Introduction to Machine Learning 2e, The MIT Press (V1.0)
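A minimal Bernoulli naive Bayes sketch following these formulas (names are mine; the clipping of p_ij is my addition to avoid log(0), not from the slides):

```python
import numpy as np

def fit_bernoulli_nb(X, y, eps=1e-9):
    """X: N x d binary matrix. Returns per-class (prior, p_ij)."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        p = Xc.mean(axis=0).clip(eps, 1 - eps)   # p_ij, clipped away from 0 and 1
        params[c] = (len(Xc) / len(X), p)
    return params

def g(x, prior, p):
    """Linear discriminant for binary features."""
    return np.sum(x * np.log(p) + (1 - x) * np.log(1 - p)) + np.log(prior)

def predict(x, params):
    return max(params, key=lambda c: g(x, *params[c]))
```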

Multivariate Regression. The multivariate linear model:
$$r^t = g(\mathbf{x}^t | w_0, w_1, \ldots, w_d) + \epsilon = w_0 + w_1 x_1^t + w_2 x_2^t + \cdots + w_d x_d^t + \epsilon$$
Minimize the squared error
$$E(w_0, w_1, \ldots, w_d | \mathcal{X}) = \frac{1}{2}\sum_t \left[r^t - (w_0 + w_1 x_1^t + \cdots + w_d x_d^t)\right]^2$$
Lecture Notes for E. Alpaydın 2010 Introduction to Machine Learning 2e, The MIT Press (V1.0)
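The slides stop at the error function; as an illustration, the minimizer can be computed with ordinary least squares (the use of np.linalg.lstsq is my choice, not the slides'):

```python
import numpy as np

def fit_multivariate_regression(X, r):
    """Minimize sum_t [r^t - (w_0 + w^T x^t)]^2 via least squares."""
    A = np.hstack([np.ones((len(X), 1)), X])   # prepend a column of 1s for w_0
    w, *_ = np.linalg.lstsq(A, r, rcond=None)
    return w                                   # w[0] = w_0, w[1:] = w_1..w_d
```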

CHAPTER 6: Dimensionality Reduction

Dimensionality of input: the number of observables (e.g., age and income). If the number of observables is increased: more time to compute; more memory to store inputs and intermediate results; more complicated explanations (knowledge extracted from learning), e.g., regression with 100 vs. 2 parameters; no simple visualization (2D vs. 10D graphs); and much more data is needed (the curse of dimensionality). Note that M 2-d inputs are not equal to one input of dimension 2M. Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Dimensionality reduction. Some features (dimensions) bear little or no useful information (e.g., hair color for car selection), so we can drop some features; we have to estimate from data which features can be dropped. Several features can also be combined without loss, or even with gain, of information (e.g., the income of all family members for a loan application); we have to estimate from data which features to combine. Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Feature Selection vs. Extraction. Feature selection: choose k < d important features, ignoring the remaining d − k (subset selection algorithms). Feature extraction: project the original x_i, i = 1, ..., d dimensions to new k < d dimensions z_j, j = 1, ..., k. Examples: Principal Components Analysis (PCA), Linear Discriminant Analysis (LDA), Factor Analysis (FA). Lecture Notes for E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Usage. We have data of dimension d and reduce the dimensionality to k < d by discarding unimportant features or combining several features into one. We then use the resulting k-dimensional data set for: learning a classification problem (e.g., the parameters of the probabilities P(x|C)); learning a regression problem (e.g., the parameters of a model y = g(x|θ)). Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Subset selection. Given an initial set of d features, there are 2^d possible subsets. We need a criterion to decide which subset is best, and a way to search over the possible subsets. Going over all 2^d possibilities is infeasible for large d, so we need some heuristics. Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Goodness of a feature set. Supervised: train using the selected subset and estimate the error on a validation data set. Unsupervised: look at the input only (e.g., age, income and savings) and select the subset that bears most of the information about the person. Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Mutual Information. Suppose we have 3 random variables (features) X, Y, Z and have to select which ones give the most information. If X and Y are correlated, then much of the information about Y is already in X, so it makes sense to select features that are uncorrelated. Mutual information (Kullback-Leibler divergence) is a more general measure of dependence, and it can be extended to n variables (the information that variables x_1, ..., x_n have about a variable x_{n+1}). Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Subset selection. Forward search: start from an empty set of features; for each remaining feature, estimate the classification/regression error of adding that specific feature; select the feature that gives the maximum improvement in validation error; stop when there is no significant improvement. Backward search: start with the original set of size d and drop the features with the smallest impact on error. Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Subset Selection. There are 2^d subsets of d features. Forward search: add the best feature at each step. Start with the set of features F = Ø; at each iteration, find the best new feature j = argmin_j E(F ∪ {x_j}) and add x_j to F if E(F ∪ {x_j}) < E(F). This is a hill-climbing O(d^2) algorithm; see the sketch below. Backward search: start with all features and remove one at a time, if possible. Floating search: add k, remove l. Lecture Notes for E. Alpaydın 2010 Introduction to Machine Learning 2e, The MIT Press (V1.0)
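A sketch of forward selection, assuming a caller-supplied validation-error function eval_error(features) (that callback and all names are my assumptions):

```python
def forward_selection(d, eval_error):
    """Greedy forward search over feature indices 0..d-1.
    eval_error(tuple_of_indices) -> validation error (caller-supplied)."""
    selected, best_err = [], eval_error(())
    while len(selected) < d:
        candidates = [j for j in range(d) if j not in selected]
        errs = {j: eval_error(tuple(selected + [j])) for j in candidates}
        j_best = min(errs, key=errs.get)
        if errs[j_best] >= best_err:    # stop: no improvement from any feature
            break
        selected.append(j_best)
        best_err = errs[j_best]
    return selected
```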

Floating Search. Forward and backward search are greedy algorithms: they select the best option at a single step and do not always achieve the optimum value. Floating search uses two types of steps, add k and remove l, at the cost of more computation. Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Feature Extraction. Face recognition problem: the training data consists of pairs of image + label (name); the classifier input is an image, and its output is a label (name). An image is a matrix of 256×256 = 65536 values in the range 0..255. Each pixel bears little information, so we cannot simply select the 100 best ones; instead, the average of pixels around specific positions may give an indication about, e.g., eye color. Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

Projection. Find a projection matrix W from d-dimensional to k-dimensional vectors that keeps the error low. Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

PCA: Motivation. Assume the d observables are linear combinations of k < d vectors: z_i = w_{i1} x_1 + ... + w_{id} x_d. We would like to work with this basis, as it has lower dimension yet carries all (or almost all) of the required information. What we expect from such a basis: uncorrelated components (otherwise it can be reduced further), and large variance (components with large variation bear information; otherwise they bear none). Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

PCA: Motivation (figure). Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

PCA: Motivation. Choose directions such that the total variance of the data will be maximal (maximize total variance), and choose directions that are orthogonal (minimize correlation). In short: choose k < d orthogonal directions which maximize the total variance. Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

PCA. Choosing one direction at a time: maximize the variance subject to a unit-length constraint using Lagrange multipliers. Taking derivatives yields an eigenvector equation; since we want to maximize the variance, we should choose the eigenvector with the largest eigenvalue. Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)
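The derivation the slide alludes to, reconstructed (this is the standard PCA argument; the notation is mine, not copied from the slides). Maximize $\mathbf{w}_1^T\boldsymbol{\Sigma}\mathbf{w}_1$ subject to $\mathbf{w}_1^T\mathbf{w}_1 = 1$:
$$\max_{\mathbf{w}_1}\; \mathbf{w}_1^T\boldsymbol{\Sigma}\mathbf{w}_1 - \alpha\,(\mathbf{w}_1^T\mathbf{w}_1 - 1) \;\Rightarrow\; 2\boldsymbol{\Sigma}\mathbf{w}_1 - 2\alpha\,\mathbf{w}_1 = 0 \;\Rightarrow\; \boldsymbol{\Sigma}\mathbf{w}_1 = \alpha\,\mathbf{w}_1$$
So $\mathbf{w}_1$ is an eigenvector of $\boldsymbol{\Sigma}$, and since $\operatorname{Var}(z_1) = \mathbf{w}_1^T\boldsymbol{\Sigma}\mathbf{w}_1 = \alpha$, the variance is maximized by the eigenvector with the largest eigenvalue.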

PCA. In a d-dimensional feature space, estimate the d × d symmetric covariance matrix from the samples. Select the k largest eigenvalues of the covariance matrix and the associated k eigenvectors. The first eigenvector is the direction with the largest variance. Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

What PCA does. z = W^T(x − m), where the columns of W are the eigenvectors of S and m is the sample mean: it centers the data at the origin and rotates the axes. Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

How to choose k? Proportion of Variance (PoV) explained:
$$\text{PoV} = \frac{\lambda_1 + \lambda_2 + \cdots + \lambda_k}{\lambda_1 + \lambda_2 + \cdots + \lambda_k + \cdots + \lambda_d}$$
where the λ_i are sorted in descending order. Typically, stop at PoV > 0.9. Alternatively, the scree graph plots PoV vs. k: stop at the elbow. Lecture Notes for E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)
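Putting the last three slides together, a minimal PCA sketch via eigendecomposition, including the PoV rule for choosing k (all names and the 0.9 default are mine):

```python
import numpy as np

def pca(X, pov_threshold=0.9):
    """PCA by eigendecomposition of the sample covariance; choose k by PoV."""
    m = X.mean(axis=0)
    Xc = X - m
    S = (Xc.T @ Xc) / len(X)             # sample covariance (1/N, as in Ch. 5)
    lam, W = np.linalg.eigh(S)           # eigh sorts ascending for symmetric S
    lam, W = lam[::-1], W[:, ::-1]       # re-sort descending by eigenvalue
    pov = np.cumsum(lam) / np.sum(lam)   # proportion of variance explained
    k = int(np.searchsorted(pov, pov_threshold) + 1)
    Z = Xc @ W[:, :k]                    # z = W^T (x - m), projected data
    return Z, W[:, :k], lam, k
```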

PCA. PCA is unsupervised: it does not take class information into account. To take classes into account, use the Karhunen-Loève expansion: estimate the covariance per class and take the average weighted by the priors. Common principal components: assume all classes have the same eigenvectors (directions) but different variances. Lecture Notes for E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)

PCA. PCA does not try to explain noise: large noise can become a new dimension or even the largest principal component. It seeks uncorrelated variables that explain a large portion of the total sample variance. Sometimes we are instead interested in explaining the shared variance (common factors) that affects the data. Based on E. Alpaydın 2004 Introduction to Machine Learning, The MIT Press (V1.1)