Assumptions. Motivation. Linear Transforms. Standard measures. Correlation. Cofactor. γ k



Principal Component Analysis
Yanjun Yan
//04, ELE79, Spring 2004

Outline
- Introduction of PCA
- Mathematical basis
- Calculation of PCA
- Applications

What is PCA?
Principal Component Analysis, originally developed by Hotelling (1933), is a mathematical procedure that transforms a number of (possibly) correlated variables into a (smaller) number of uncorrelated variables called principal components. The first principal component accounts for as much of the variability in the data as possible, and each succeeding component is uncorrelated with the former components and accounts for as much of the remaining variability as possible.

Objectives of PCA
- To discover or to reduce the dimensionality of the data set.
- To identify new meaningful underlying variables.

Motivation
1. Given observations $\{x_1, x_2, \ldots, x_n\}$, the $x$'s will ordinarily be correlated. Is there a fundamental uncorrelated set, perhaps fewer in number than the $x$'s, which determines the values that the $x$'s will take?
2. If $\gamma_1, \gamma_2, \ldots$ are such variables, we shall have a set of relations of the form
$$x_i = f_i(\gamma_1, \gamma_2, \ldots) \qquad (i = 1, 2, \ldots, n)$$

Assumptions
3. Consider only normally distributed systems of components having zero means and unit variances:
$$E\,\gamma_i = 0, \qquad E\,\gamma_i \gamma_j = \delta_{ij} \qquad (1)$$

Standard measures
4. In order to meet the assumption in 3, we can express the $x$'s in standard measures, by taking the deviation of each from its mean value and dividing by its standard deviation. Thus we obtain a set of quantities $\{z_1, z_2, \ldots, z_n\}$ for which our formulas will be simpler:
$$z_i = \frac{x_i - \bar{x}_i}{\sqrt{\operatorname{var}(x_i)}}$$

Linear Transforms
5. Confining ourselves to the case in which the functions $f_i$ are linear, then
$$z_i = \sum_{j=1}^{n} a_{ij}\gamma_j \qquad (2)$$
There might be fewer $\gamma$'s than $z$'s if there are fewer components than samples, and the above formula includes this special case when some $a_{ij} = 0$. However, we will first assume that this is not the case and that the determinant of $A = (a_{ij})$ is not zero, to see how we should determine the $a_{ij}$.

Cofactor
6. Let $A_{ij}$ denote the cofactor of $a_{ij}$ in $A$, divided by the determinant of $A$. Then
$$\sum_{i=1}^{n} a_{ij} A_{ik} = \delta_{jk}, \qquad \sum_{j=1}^{n} a_{ij} A_{kj} = \delta_{ik} \qquad (3)$$
7. Solve (2) for the $\gamma$'s by multiplying both sides by $A_{ik}$, summing with respect to $i$ from 1 to $n$, and using (3). Since $\sum_j \delta_{jk}\gamma_j$ is a sum consisting of terms which vanish except $\gamma_k$, therefore
$$\sum_{i=1}^{n} A_{ik} z_i = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} A_{ik} \gamma_j = \sum_{j=1}^{n} \delta_{jk}\gamma_j = \gamma_k \qquad (4)$$

Correlation
8. Let $r_{ik}$ be the correlation between $z_i$ and $z_k$, equal to unity if $i = k$:
$$r_{ik} = E\, z_i z_k$$
9. Substitute the value for $z_i$ given by (2). With the help of (1), we then obtain:
$$r_{ik} = E\, z_i z_k = E \sum_{j=1}^{n} a_{ij}\gamma_j \sum_{l=1}^{n} a_{kl}\gamma_l = \sum_{j=1}^{n} \sum_{l=1}^{n} a_{ij} a_{kl}\, E\,\gamma_j \gamma_l = \sum_{j=1}^{n} \sum_{l=1}^{n} a_{ij} a_{kl}\, \delta_{jl} = \sum_{j=1}^{n} a_{ij} a_{kj} \qquad (5)$$
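As a small illustration of the standard measures and the correlations $r_{ik} = E\, z_i z_k$ defined above, the sketch below standardizes a toy data set and estimates the correlation matrix by averaging products of $z$ values. The data values and the function names `standardize` and `correlation_matrix` are made up for this example, and the expectation is approximated by a plain sample average (dividing by the number of rows), which is an assumption rather than something the slides specify.

```python
import math

def standardize(rows):
    """Convert raw observations (a list of rows) to standard measures z."""
    n_obs = len(rows)
    n_var = len(rows[0])
    means = [sum(r[j] for r in rows) / n_obs for j in range(n_var)]
    # Standard deviation with divisor n_obs, so the average of z^2 is exactly 1.
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n_obs)
            for j in range(n_var)]
    return [[(r[j] - means[j]) / stds[j] for j in range(n_var)] for r in rows]

def correlation_matrix(z):
    """Estimate r_ik = E[z_i z_k] by averaging over the observations."""
    n_obs = len(z)
    n_var = len(z[0])
    return [[sum(row[i] * row[k] for row in z) / n_obs for k in range(n_var)]
            for i in range(n_var)]

# Hypothetical toy data: four observations of two correlated variables.
data = [[2.0, 1.0], [4.0, 3.0], [6.0, 2.0], [8.0, 6.0]]
z = standardize(data)
r = correlation_matrix(z)
for row in r:
    print(row)
```

The diagonal entries come out as unity, matching the remark that $r_{ik}$ equals unity when $i = k$, and the matrix is symmetric because $r_{ik} = r_{ki}$.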

Rigid Rotation
10. Since $r_{ik} = r_{ki}$, the number of equations (5) is only $n(n+1)/2$. They are therefore insufficient for determining the $n^2$ quantities $a_{ij}$ when the correlations between the samples are known. Thus systems of uncorrelated components may be chosen, consistently with the observed correlations, in $\infty^{n(n-1)/2}$ ways. This variety of choices of components corresponds to the $n(n-1)/2$ degrees of freedom of a rigid rotation in a space of $n$ dimensions.

Indeterminateness
11. The number of unknown $a_{ij}$ may be reduced by supposing that there are fewer than $n$ components, which amounts to setting some of the $a_{ij}$ equal to zero. Warning: if we arbitrarily specialize the $a_{ij}$, the number of components may even exceed the number of samples.

Goal in choosing the components
12. Begin with a component $\gamma_1$ whose contributions to the variances of the $x$'s have as great a total as possible; next take a component $\gamma_2$, uncorrelated with $\gamma_1$, whose contribution to the residual variance is as great as possible; and proceed in this way to determine the components, not exceeding $n$ in number, and perhaps neglecting those whose contributions to the total variance are small. This is called the method of principal components.

Picturization
13. If $z_1, z_2, \ldots, z_n$ are taken as rectangular coordinates in $n$ dimensions, each point represents a possible individual. If, as we assume, the population is normally distributed, the loci of uniform density are concentric, similar ellipsoids. The method of principal components is equivalent to choosing a set of coordinate axes coinciding with the principal axes of these ellipsoids.

Principal Components: Metric definition
14. Since the set of $x$'s is capable of transformations such as changes of units and other linear transformations, the ellipsoids may be squeezed and stretched in any way. Thus, for each $x_i$, a unit of measure of unique importance must exist. In other words, a metric (a definition of distance) must be assumed in the $n$-dimensional space. For different applications, different metrics would be suitable.
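The rigid-rotation indeterminacy described above can be checked numerically. The sketch below, for the two-variable case (one rotational degree of freedom), takes one loading matrix `A` that reproduces a given correlation, rotates it by an arbitrary angle, and confirms that equation (5), written in matrix form as $R = A A^{T}$, still yields the same correlations. The correlation value 0.6, the angle 0.7, and all function names are assumptions chosen only for illustration.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]

def rotation(theta):
    """2x2 rigid rotation by angle theta (an orthogonal matrix)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def implied_correlations(a):
    """Equation (5): r_ik = sum_j a_ij a_kj, i.e. R = A A^T."""
    return matmul(a, transpose(a))

r12 = 0.6  # hypothetical observed correlation between z_1 and z_2
# One loading matrix consistent with r12 (a triangular factor of R).
A = [[1.0, 0.0], [r12, math.sqrt(1.0 - r12 ** 2)]]

R_original = implied_correlations(A)
R_rotated = implied_correlations(matmul(A, rotation(0.7)))
print(R_original)
print(R_rotated)
```

Both printed matrices agree up to rounding, so the observed correlations alone cannot single out one set of components; the maximization criterion of the method of principal components is what removes this freedom.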

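Mathematical Setup
15. Given
$$z_i = \sum_{j=1}^{n} a_{ij}\gamma_j, \qquad (2)$$
the variance of $z_i$ may be written as
$$\operatorname{var}(z_i) = E(z_i z_i) = E\Big(\sum_{j=1}^{n} a_{ij}\gamma_j \sum_{l=1}^{n} a_{il}\gamma_l\Big) = \sum_{j=1}^{n} a_{ij}^2 = a_{i1}^2 + a_{i2}^2 + \cdots + a_{in}^2$$
The first term in the sum is the contribution of $\gamma_1$ to the variance of $z_i$. The sum of the contributions of $\gamma_1$ to the variances of all the $z$'s is
$$S_1 = \sum_{i=1}^{n} a_{i1}^2 \qquad (6)$$

Maximization
16. We want to maximize (6) subject to (5), $r_{ih} = \sum_{j} a_{ij} a_{hj}$. To this end we write
$$2T = S_1 - \sum_{i=1}^{n} \sum_{h=1}^{n} \lambda_{ih} \Big(\sum_{j=1}^{n} a_{ij} a_{hj} - r_{ih}\Big)$$
where the $\lambda_{ih} = \lambda_{hi}$ are Lagrange multipliers. Set
$$\frac{\partial T}{\partial a_{i1}} = a_{i1} - \sum_{h=1}^{n} \lambda_{ih} a_{h1} = 0 \qquad (7)$$
$$\frac{\partial T}{\partial a_{ij}} = -\sum_{h=1}^{n} \lambda_{ih} a_{hj} = 0 \qquad (j \neq 1) \qquad (8)$$

Final Formula for 1st PC
17. From (8) we can use the rank of the system to find an expression for the $\lambda_{ih}$; substituting it back into (7) gives the set of equations
$$(1-k)\,a_{11} + r_{12}\,a_{21} + \cdots + r_{1n}\,a_{n1} = 0$$
$$r_{21}\,a_{11} + (1-k)\,a_{21} + \cdots + r_{2n}\,a_{n1} = 0$$
$$\vdots$$
$$r_{n1}\,a_{11} + r_{n2}\,a_{21} + \cdots + (1-k)\,a_{n1} = 0$$

Final Formula for 1st PC (Con't)
18. The former formula is already very familiar to us. Let $a$ be the vector $[a_{11}, a_{21}, \ldots, a_{n1}]^{T}$ and let $R$ be the covariance matrix with 1's on the diagonal (that is, the correlation matrix of the $z$'s). Then
$$(R - kI)\,a = 0$$
so $a$ is an eigenvector of $R$, and $k$ is the eigenvalue corresponding to $a$.

Final Formula for succeeding PCs
19. Next we need to find a component $\gamma_2$ making a maximum contribution to the residual portion of the variance. Change the second subscript in (6), (7) and (8) from 1 to $2, 3, \ldots, n$. The arguments and procedure are virtually the same as before.

Meaning of k
20. For clarification, denote the $k$ for the first PC as $k_1$ and the succeeding $k$'s as $k_2, k_3, \ldots, k_n$. Then it can be shown that
$$\frac{y_1^2}{k_1} + \frac{y_2^2}{k_2} + \cdots + \frac{y_n^2}{k_n} = \text{constant}$$
The $y$'s are the PCs in the original coordinates. The $k$'s set the lengths of the axes of the ellipsoids: the semi-axis along $y_i$ is proportional to $\sqrt{k_i}$. If, instead of the $z$'s, the $\gamma$'s are taken as rectangular coordinates, the ellipsoids are squeezed and stretched into spheres.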
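To make the eigen-equation $(R - kI)\,a = 0$ concrete, here is a minimal sketch that finds the largest $k$ and its eigenvector for a small correlation matrix using power iteration. Power iteration is swapped in here only because it needs no external library; the slides do not prescribe a solver. The matrix entries and the names `matvec` and `leading_eigenpair` are hypothetical. The last step rescales the unit eigenvector by $\sqrt{k_1}$, on the assumption that the loadings are normalized so that their sum of squares $S_1$ equals $k_1$.

```python
import math

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def leading_eigenpair(r, iterations=200):
    """Power iteration: returns (largest eigenvalue, unit eigenvector)."""
    v = [1.0] + [0.0] * (len(r) - 1)
    for _ in range(iterations):
        w = matvec(r, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T R v gives the eigenvalue for a unit vector v.
    k = sum(vi * wi for vi, wi in zip(v, matvec(r, v)))
    return k, v

# Hypothetical correlation matrix with 1's on the diagonal.
R = [[1.0, 0.6, 0.3],
     [0.6, 1.0, 0.5],
     [0.3, 0.5, 1.0]]

k1, a_unit = leading_eigenpair(R)

# Check (R - k I) a = 0 componentwise.
residual = [ra - k1 * ai for ra, ai in zip(matvec(R, a_unit), a_unit)]

# Loadings scaled so that S_1 = sum_i a_i1^2 equals k_1 (assumed normalization).
loadings = [math.sqrt(k1) * x for x in a_unit]
print(k1, loadings)
```

For the succeeding components, the same routine could be applied after removing the first component's contribution from `R`; that deflation step is left out to keep the sketch short.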

Basic Calculation of PCA
1. Eigendecompose the signal's true (or estimated) covariance matrix.
2. Sort the eigenvalues from largest to smallest, and sort the eigenvectors correspondingly.
3. According to the application, select the several most significant eigenvectors, then use the weights in the eigenvectors to linearly combine the raw data into the corresponding principal components.

SVD
The singular value decomposition (SVD) of the $N \times p$ matrix $X$ has the form
$$X = U D V^{T}$$

Eigendecomposition
The sample covariance matrix of $X$ is given by $S = X^{T}X/N$, and
$$X^{T}X = V D^{2} V^{T}$$
which is the eigendecomposition of $X^{T}X$ (and of $S$, up to a factor $N$). The eigenvectors $\upsilon_j$ are called the principal component directions (or Karhunen-Loève directions) of $X$. The first principal component direction $\upsilon_1$ has the property that $z_1 = X\upsilon_1$ has the largest sample variance amongst all normalized linear combinations of the columns of $X$.

Reconstruction from PCs
For a given raw data sample, take its dot product with each PC to construct a vector of reconstruction weights. Given all PCs, the raw data sample can be reconstructed by linearly combining the PCs with the reconstruction weights.

My experience
Given a sample matrix, the estimation error in the covariance matrix may accumulate in the calculation of the principal components. The direct SVD of the sample matrix can yield better principal components, at least from the image reconstruction point of view.
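The reconstruction procedure described above can be sketched on a tiny example. The data matrix `X` below is hypothetical and already centered, and it was chosen so that its principal component directions are known in closed form (the two diagonals of the plane); the sketch takes those directions as given instead of computing an SVD. It first checks that each direction is an eigenvector of $X^{T}X$ by evaluating $\upsilon^{T} X^{T} X \upsilon$, which should equal the squared singular value $d^2$, and then reconstructs one sample from its weights.

```python
import math

def dot(u, v):
    """Dot product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

def gram(x):
    """Return X^T X for a data matrix given as a list of rows."""
    p = len(x[0])
    return [[sum(row[i] * row[j] for row in x) for j in range(p)]
            for i in range(p)]

def weights(sample, directions):
    """Reconstruction weights: dot product of the sample with each PC."""
    return [dot(sample, v) for v in directions]

def reconstruct(w, directions):
    """Linear combination of the PCs using the given weights."""
    p = len(directions[0])
    return [sum(wj * v[i] for wj, v in zip(w, directions)) for i in range(p)]

# Hypothetical centered data: N = 4 samples, p = 2 variables.
X = [[2.0, 1.0], [1.0, 2.0], [-2.0, -1.0], [-1.0, -2.0]]

# Principal component directions for this particular X (assumed known here).
s = 1.0 / math.sqrt(2.0)
V = [[s, s], [s, -s]]

G = gram(X)
d_squared = [dot(v, [dot(row, v) for row in G]) for v in V]  # v^T X^T X v

w = weights(X[0], V)
full = reconstruct(w, V)            # uses all PCs: recovers the sample
rank1 = reconstruct(w[:1], V[:1])   # uses only the first PC: an approximation
print(d_squared, full, rank1)
```

With all PCs the sample is recovered up to rounding; keeping only the first PC gives the projection of the sample onto the leading direction, which is the sense in which a few significant eigenvectors can stand in for the raw data.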

Applications
- Fisher classification
- Remote sensing multiband information extraction
- Optical character recognition (OCR) or handwriting recognition
- Face recognition (Eigenface)
- Cancer diagnosis

Applications, e.g. Handwriting
[Figure: raw data, PCA representation, principal component space]
$\upsilon_1$ (horizontal movement) mainly accounts for the lengthening of the lower tail of the three, while $\upsilon_2$ (vertical movement) accounts for character thickness.

e.g. Remote Sensing
[Figure: $p$ bands of $N \times M$ pixels are vectorized into a raw data matrix of $(M \times N)$ vectorized pixels by $p$ bands, which maps to a PCs matrix of vectorized pixels by $p$ PCs]
PCs represent new bands made up of linear combinations of the original, correlated bands.

e.g. Remote Sensing (con't)
[Figure: raw data matrix (vectorized pixels by $p$ bands); eigenvalues, 1st eigenvector and eigenvector scores (loadings), whose elements are correlation coefficients; false color image]
Suppose p > n; we can get p eigenvectors with length N.

Common Problems
- Image dimension mismatch. Appropriate compression or interpolation is needed.
- Calculation complexity and data scarcity. For an $N \times p$ matrix $X$, the sample covariance requires $O(Np^2)$ operations; the snapshot algorithm requires $O(N^3)$ operations; the EM algorithm requires $O(rNp)$ operations ($r$ is the number of leading eigenvectors).

Common Problems (con't)
- The discrimination power of principal components is not monotonically decreasing.
- One of the assumptions of the method is a linearity of correlation between samples. This is rarely met.
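The cost comparison above can be illustrated with a sketch of the idea usually meant by the snapshot approach, assuming it refers to working with the small $N \times N$ matrix $XX^{T}$ instead of the $p \times p$ matrix $X^{T}X$ when $N$ is much smaller than $p$. If $u$ is an eigenvector of $XX^{T}$ with eigenvalue $\lambda$, then $X^{T}u$ is an eigenvector of $X^{T}X$ with the same $\lambda$. The matrix `X`, the helper names, and the use of power iteration are all assumptions made for this example; centering is skipped to keep the arithmetic simple.

```python
import math

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def leading_eigenpair(sym, iterations=100):
    """Power iteration on a symmetric matrix: (eigenvalue, unit eigenvector)."""
    v = [1.0] + [0.0] * (len(sym) - 1)
    for _ in range(iterations):
        v = normalize(matvec(sym, v))
    lam = sum(a * b for a, b in zip(v, matvec(sym, v)))
    return lam, v

# Hypothetical data: N = 2 samples (rows), p = 4 variables (columns).
X = [[1.0, 2.0, 0.0, 1.0],
     [0.0, 1.0, 1.0, 2.0]]
Xt = transpose(X)

# Small N x N problem: X X^T.
small = [[sum(a * b for a, b in zip(ri, rj)) for rj in X] for ri in X]
lam, u = leading_eigenpair(small)

# Map back to a length-p direction: v is proportional to X^T u.
v = normalize(matvec(Xt, u))

# Verify that v is an eigenvector of the large p x p matrix X^T X.
XtXv = matvec(Xt, matvec(X, v))
residual = [a - lam * b for a, b in zip(XtXv, v)]
print(lam, v)
```

Only a 2 by 2 eigenproblem is solved here, yet the result is a valid leading direction in the 4-dimensional variable space, which is why this route is attractive when data are scarce relative to the dimension.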

Similar yet Different Techniques
Canonical Analysis (CA): Whereas PCA uses all pixels regardless of identity or class to derive the components, in CA one limits the pixels involved to those associated with pre-identified features/classes. This requires that those features can be recognized (by photointerpretation) in an image display (single band or color composite) in one to several areas within the scene. These pixels are "blocked out" as training sites. Their multiband values (within the site areas) are then processed in the manner of PCA. This selective approach is designed to optimize recognition and location of the same features elsewhere in the scene.

Similar yet Different Techniques (Con't)
Nonlinear Component Analysis (NCA): If the function $f_i$ is not linear, then we get Nonlinear Component Analysis.
Independent Component Analysis (ICA): ICA is a particular rotation method of factor analysis that makes the bases statistically independent rather than merely uncorrelated.

Bibliography
This time:
- H. Hotelling. Analysis of a complex of statistical variables into principal components. J. Educ. Psych., 24:417-441, 1933.
- T. Anderson. Asymptotic theory for principal component analysis. Ann. Math. Statist., vol. 34, pp. 122-148, 1963.
Next time:
- I. T. Jolliffe. Principal component analysis. Springer-Verlag, New York, 1986.
- J.-Y. Huang and P. M. Schultheiss. Block quantization of correlated Gaussian random variables. IEEE Trans. Comm., CS-11:289-296, Sep.
- H. P. Kramer and Max V. Mathews. A linear coding for transmitting a set of correlated signals. IEEE Trans. Information Theory, Vol. 2:3, pp. 41-46, September 1956.
- Daniel L. Swets and Juyang Weng. Using Discriminant Eigenfeatures for Image Retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(8), 1996.

Any questions for me?
Q: If you are allowed to use only one word, which word would you use to characterize PCA?
Thanks!
