Algebraic-Geometric and Probabilistic Approaches for Clustering and Dimension Reduction of Mixtures of Principal Component Subspaces

ECE842 Course Project Report
Changfang Zhu
Dec. 4, 2004

Abstract

Generalized Principal Component Analysis (GPCA) and Probabilistic Principal Component Analysis (PPCA) are two extensions of PCA to mixtures of principal subspaces. GPCA is an algebraic-geometric framework in which the collection of linear subspaces is represented by a set of homogeneous polynomials whose degree corresponds to the number of subspaces and whose factors (roots) encode the subspace parameters. PPCA is a probabilistic approach in which principal component analysis is viewed as a maximum-likelihood procedure based on a probability density model of the observed data. Both techniques are capable of estimating a mixture of subspaces from sample data points, and are therefore useful for data clustering and dimension reduction problems in multivariate data mining. The primary goal of this project is to carry out a conceptual study, to explore the principles and features of the algebraic-geometric and probabilistic approaches to mixtures of principal component subspaces, and to learn from hands-on experience through the computational implementation of these techniques. A polynomial factorization algorithm (PFA) for GPCA and an expectation-maximization (EM) algorithm for PPCA were implemented in MATLAB. The implemented algorithms were tested on synthetic data sets. It was shown that the PFA algorithm for GPCA can successfully identify the number of subspaces in the mixture and, when successful, estimate the normal vectors of the subspaces with a relatively high correlation. However, the implemented algorithm is not robust, as its success depends on the data; the potential problems of this implementation are discussed in the report. The implemented EM algorithm for PPCA showed that a probabilistic mixture model can identify the clusters and assign the cluster association of each data point correctly. Both techniques estimate component subspaces of lower dimensionality, so that the data dimension can be reduced and the underlying clusters recovered. In this project, the implemented algorithms were tested only on synthetic 3-dimensional data, not on higher-dimensional or real data, and they are far from comprehensive enough for practical use. Nevertheless, the computational implementation helped greatly in understanding the two approaches to mixtures of principal component subspaces.

1. Introduction

In the analysis of multivariate (multi-dimensional) data sets, group segmentation and cluster formation often reveal insight that is useful for knowledge discovery from complex data sets, which are often high-dimensional, multi-modal, and lacking in prior knowledge. Clustering decomposition may enable the use of relatively simple models for each of the local clustering structures, offering great ease of interpretation as well as the benefits of analytical and computational simplification [1]. On the other hand, although it is now possible to analyze large amounts of high-dimensional data through the use of high-performance computers, several problems occur in general when the number of dimensions becomes high. These problems include the explosion of execution time and the difficulty of selecting explanatory variables [2]. Therefore, data clustering and dimension reduction are important problems in multivariate data mining.

Data clustering and dimension reduction are correlated with each other. Usually not all of the data are useful for producing a desired clustering, i.e. some features may be redundant and some may be irrelevant. Many clustering algorithms fail when dealing with high-dimensional data. In this case, identifying and retaining only those features that are most relevant to the desired clustering would facilitate multi-dimensional data analysis. If the data clusters can be visualized in a lower-dimensional subspace, they allow better interpretation and demand less computation.

Principal Component Analysis (PCA) [2][3] is a very popular method for dimension reduction, data visualization and exploratory data analysis. The idea is that a d-dimensional data set can be reduced to a set of q-dimensional data using q linear combinations of the d-dimensional basis. The linear combination is considered a linear projection or linear transformation: the original d-dimensional feature space is transformed into a new q-dimensional (q < d) feature subspace, called the principal component subspace. The advantage of PCA is twofold: 1) the original data are represented by fewer variables with minimal mean-square error, which reduces the dimensionality of the data set; and 2) the transformation maximizes the separation of data clusters. However, one limitation of PCA is that it only defines a single global projection of the data; for more complex data, different clusters may require different projection directions. The other limitation is that the original data should have a linear or near-linear structure, to ensure the singularity of the data matrix; if the data have a non-linear structure, linear PCA may not be adequate for exploring the data.
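
As a concrete illustration of this global projection, the following is a minimal MATLAB sketch of PCA via an eigen-analysis of the sample covariance matrix (variable names such as T and q are illustrative and are not taken from the report's actual code; recent MATLAB with implicit expansion is assumed):

    % T: N-by-d data matrix (one observation per row); q: target dimension (q < d)
    mu = mean(T, 1);                      % sample mean
    S  = cov(T);                          % d-by-d sample covariance matrix
    [V, D] = eig(S);                      % eigenvectors (columns of V) and eigenvalues
    [~, idx] = sort(diag(D), 'descend');  % order eigenvalues from largest to smallest
    Uq = V(:, idx(1:q));                  % q dominant eigenvectors = principal axes
    X  = (T - mu) * Uq;                   % N-by-q projected data, x = Uq' * (t - mu)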

Many extensions of PCA have been developed to determine the principal subspaces. In this project, we studied two extensions of PCA to mixtures of subspaces: Generalized Principal Component Analysis (GPCA) [4][5] and Probabilistic Principal Component Analysis (PPCA) [6][7]. Generalized principal component analysis is an algebraic-geometric approach which has been proposed in the computer vision community, primarily in the context of 3-D motion segmentation. Extensive work on GPCA has been carried out by Vidal et al. [5], and two algorithms, the polynomial factorization algorithm (PFA) and the polynomial differentiation algorithm (PDA), have been proposed. Probabilistic principal component analysis is understood as a probabilistic formulation of PCA from a Gaussian latent variable model, which is closely related to statistical factor analysis [6].

The primary goal of this project is to explore the principles and features of the algebraic-geometric and probabilistic approaches for clustering and dimension reduction of mixtures of principal component subspaces, and to learn from hands-on experience through the computational implementation of these techniques. The polynomial factorization algorithm (PFA) for GPCA and an expectation-maximization (EM) algorithm for PPCA were implemented in MATLAB code.

2. Geometric approach to mixtures of principal component subspaces: GPCA

2.1. Principles of GPCA

In generalized principal component analysis, the sample data points {x_j in R^K}, j = 1, 2, ..., N, are drawn from n linear subspaces {S_i}, i = 1, ..., n, of R^K. The problem is to identify each subspace without knowing which sample points belong to which subspace. The union of these n linear subspaces of R^K can be viewed as corresponding to a projective algebraic set defined by one or more homogeneous polynomials of degree n in K variables. Hence, estimating a collection of subspaces is equivalent to estimating the algebraic variety defined by such a set of polynomials.

In the case when each subspace has dimension k_i = K - 1 (i.e., the subspaces are hyperplanes), Vidal et al. [4][5] have shown that the union of such subspaces is defined by a unique homogeneous polynomial p_n(x). The degree of p_n(x) is then the number n of hyperplanes, and each one of the n factors of p_n(x) corresponds to one of the hyperplanes. Therefore the problem of identifying a collection of hyperplanes is reduced to estimating and factoring p_n(x). Since every sample point x in R^K must lie on one of the subspaces S_i, every x must also satisfy p_n(x) = 0. Then one can retrieve p_n(x) directly from the given data samples without knowing the segmentation of the data points. Vidal [5] also showed that in fact the number of subspaces n is exactly the lowest degree such that p_n(x) = 0 for all sample points. This leads to a simple matrix rank condition which determines the number of hyperplanes. Given n, the polynomial is determined from the solution of a set of linear equations. Given p_n(x), the estimation of the hyperplanes is essentially equivalent to factoring p_n(x) into a product of n linear factors.

2.2. Representing mixtures of subspaces as algebraic sets and varieties

One of the important concepts underlying the GPCA problem is representing the mixture of subspaces as an algebraic set or variety. Notice that every (K-1)-dimensional subspace S_i of R^K can be represented by a nonzero normal vector b_i in R^K as S_i = {x in R^K : b_i^T x = 0}. Since the subspaces S_i are all distinct from each other, the normal vectors {b_i}, i = 1, ..., n, are pairwise linearly independent. Given that every sample point x in R^K lies on one of the subspaces S_i, such a point satisfies the formula (b_1^T x = 0) or (b_2^T x = 0) or ... or (b_n^T x = 0), which is equivalent to the following homogeneous polynomial of degree n in x with real coefficients:

p_n(x) = \prod_{i=1}^{n} (b_i^T x) = 0.
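
For instance, membership of a point in the union of the hyperplanes can be checked numerically by evaluating this product of linear forms; a minimal MATLAB sketch (names are illustrative) is:

    % B: K-by-n matrix whose columns are the normal vectors b_i; x: K-by-1 point
    p = prod(B' * x);   % p_n(x); it is (numerically) zero iff x lies on one of the hyperplanes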

This nonlinear equation is the product of n linear equations in x (an n-th order multivariate polynomial), and can be expressed in a linear form as

p_n(x) = v_n(x)^T c = \sum c_{n_1, n_2, ..., n_K} x_1^{n_1} x_2^{n_2} ... x_K^{n_K} = 0,

where v_n : [x_1, ..., x_K]^T -> [..., x_1^{n_1} x_2^{n_2} ... x_K^{n_K}, ...]^T is called the Veronese map of degree n. Each term x_1^{n_1} x_2^{n_2} ... x_K^{n_K} is a monomial with n_1 + n_2 + ... + n_K = n, the monomials being arranged in the degree-lexicographic order, and the coefficients c_{n_1, n_2, ..., n_K} are functions of the entries of {b_i}, i = 1, ..., n. The problem of GPCA is then to recover {b_i} given the coefficient vector c of the polynomial p_n(x).

The nonlinear Veronese map maps the original data {x_j}, j = 1, 2, ..., N, of dimension K into an embedded data space of higher dimension M_n = C(n + K - 1, K - 1), the number of degree-n monomials in K variables, which is very similar to the commonly used kernel approach. But its merit is that it transforms the nonlinear equation p_n(x) = 0 into a linear equation in the vector of coefficients c. When the number of subspaces n is unknown, it can be determined from the rank of the Veronese map (embedded data) matrix L_n = [v_n(x^1), v_n(x^2), ..., v_n(x^N)]^T. The monomials x_1^{n_1} x_2^{n_2} ... x_K^{n_K} can be calculated from the given data samples, so solving for c is actually a problem of solving the set of N linear equations L_n c = 0, where N is the total number of sample points. The remaining problem is to factorize the polynomial p_n(x) with coefficients c to find the entries of {b_i}, i = 1, ..., n. Each factor will give an estimate of one subspace (hyperplane).
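
To make the rank condition concrete, the following is a minimal MATLAB sketch for the noise-free, 3-dimensional case (K = 3) used later in the experiments; the function name veronese3 and all variable names are illustrative, not the report's actual code:

    % X: N-by-3 noise-free samples drawn from a union of planes through the origin
    nMax = 6;                             % largest number of planes considered
    for n = 1:nMax
        Ln = veronese3(X, n);             % N-by-Mn embedded data matrix
        Mn = nchoosek(n + 2, 2);          % number of degree-n monomials in 3 variables
        if rank(Ln) == Mn - 1             % rank condition: p_n is unique, so n is found
            c = null(Ln);                 % coefficient vector of p_n(x), up to scale
            c = c / norm(c);              % normalize so that ||c|| = 1
            break
        end
    end

    function V = veronese3(X, n)
    % Degree-n Veronese map of 3-dimensional points, one embedded point per row,
    % with monomials x1^a * x2^b * x3^c (a + b + c = n) in degree-lexicographic order.
    % (In older MATLAB, this function would live in its own file veronese3.m.)
    V = zeros(size(X, 1), nchoosek(n + 2, 2));
    col = 0;
    for a = n:-1:0
        for b = (n - a):-1:0
            col = col + 1;
            V(:, col) = X(:,1).^a .* X(:,2).^b .* X(:,3).^(n - a - b);
        end
    end
    end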

2.3. Polynomial factorization algorithm (PFA) for GPCA

Vidal et al. described the polynomial factorization algorithm for GPCA in detail [4]. In this project, the algorithm for the noise-free case, with each subspace of dimension k_i = K - 1, has been implemented. The algorithm implemented in this project is summarized as follows. Given sample points {x_j}, j = 1, 2, ..., N, lying on a collection of hyperplanes {S_i in R^K}, i = 1, ..., n, find the number of hyperplanes n and the normal vector to each hyperplane {b_i in R^K}, i = 1, ..., n, as follows:

1) Apply the Veronese map of order i, for i = 1, 2, ..., to the vectors {x_j}, j = 1, 2, ..., N, and form the matrix L_i. Calculate the rank of each obtained L_i. When rank(L_i) = M_i - 1, stop the Veronese mapping and set the number of hyperplanes n to the current i. Then solve for c from L_n c = 0 and normalize so that ||c|| = 1.

2) Get the coefficients of the univariate polynomial q_n(t) from the last n + 1 entries of c.

3) If the first l (0 <= l <= n) coefficients of q_n(t) are equal to zero, set (b_{i,K-1}, b_{i,K}) = (0, 1) for i = 1, ..., l. Then solve the (n - l)-th order polynomial equation q_n(t) = 0, and set (b_{j,K-1}, b_{j,K}) = (1, -t_j) for j = l + 1, ..., n from the n - l roots t_j of q_n(t).

4) If all the coefficients of q_n(t) are zero, just set (b_{i,K-1}, b_{i,K}) = (0, 0) for i = 1, ..., n.

5) After obtaining (b_{i,K-1}, b_{i,K}) for i = 1, ..., n, solve for the remaining entries b_{i,J}, i = 1, ..., n, for J = K - 2, ..., 1, by solving a linear system.

A practical PFA algorithm would also have to consider cases such as (1) subspaces of dimension smaller than K - 1 (k_i < K - 1); (2) degenerate cases in which the partial normal vectors obtained so far are not pairwise linearly independent; and (3) the presence of noise. However, these were not explored in this course project.

3. Probabilistic approach to mixtures of principal component subspaces: PPCA

3.1. Principles of PPCA

Conventional PCA seeks a q-dimensional (q < d) linear projection that best represents the data in a least-squares sense. For a given data set D = {t_i}, i = 1, ..., N, of observed d-dimensional vectors, the sample covariance matrix S is first calculated and used for Singular Value Decomposition (SVD) or eigen-analysis to find a set of eigenvalues and corresponding eigenvectors. Then the q dominant eigenvectors u_j can be used to faithfully represent the original data with minimal loss of information, and provide the q principal projection axes. The projected data are given by x_i = U_q^T (t_i - µ), where U_q = [u_1, u_2, ..., u_q]. This is a linear projection, and it maximizes the variance in the projected space.

Probabilistic PCA defines a probability model [6][7] in which the observation t is a linear transformation of a latent variable x with probability distribution p(x), plus additive noise e:

t = W x + µ + e,

where W is a d x q linear transformation matrix and µ is a d-dimensional vector that allows t to have a non-zero mean. In most studies, x and e are assumed to have Gaussian distributions p(x) ~ N(0, I_q) and p(e) ~ N(0, σ^2 I_d). Then the distribution of t is also Gaussian, with p(t) ~ N(µ, W W^T + σ^2 I_d). Given the above probabilistic model of the data, one can compute the maximum-likelihood estimators of the parameters µ, σ^2 and W from the data samples D, and the maximum-likelihood estimates of these parameters are:

µ_ML = (1/N) \sum_{i=1}^{N} t_i

σ^2_ML = 1/(d - q) \sum_{j=q+1}^{d} λ_j

W_ML = U_q (Λ_q - σ^2_ML I_q)^{1/2} R,

where λ_{q+1}, ..., λ_d are the smallest eigenvalues of the sample covariance matrix S, the q columns of the d x q orthogonal matrix U_q are the q dominant eigenvectors of S, the diagonal matrix Λ_q contains the corresponding q largest eigenvalues, and R is an arbitrary q x q orthogonal matrix. To simplify the problem, R can be chosen as the identity matrix I_q.
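
A minimal MATLAB sketch of these closed-form maximum-likelihood estimates (with R taken as the identity; variable names are illustrative, not the report's actual code):

    % T: N-by-d data matrix; q: latent dimension (q < d)
    [N, d] = size(T);
    mu = mean(T, 1);                            % mu_ML
    S  = cov(T);                                % sample covariance matrix
    [V, D] = eig(S);
    [lam, idx] = sort(diag(D), 'descend');      % eigenvalues in decreasing order
    Uq   = V(:, idx(1:q));                      % q dominant eigenvectors of S
    sig2 = mean(lam(q+1:d));                    % sigma^2_ML: average of discarded eigenvalues
    W    = Uq * sqrt(diag(lam(1:q)) - sig2 * eye(q));   % W_ML with R = I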

3.2. Mixture of PPCA

Usually, data can be generated from a mixture of components with different probability densities. In clustering using finite mixture models, each component density function represents a cluster. With the probabilistic model defined in PPCA, one can model each mixture component as a single PPCA. The observed data then have a probabilistic distribution, and the probability density of the observed data is modeled as a weighted sum of Gaussian distributions, expressed as

p(t) = \sum_{k=1}^{K_0} π_k p(t | µ_k, σ_k^2, W_k),

where p(t | µ_k, σ_k^2, W_k) denotes the PPCA density function of component k, K_0 is the total number of components, and π_k is the mixing proportion (weight) of the k-th mixture component, subject to the constraints π_k >= 0 and \sum_{k=1}^{K_0} π_k = 1. Therefore, the maximum-likelihood estimates of the model parameters should maximize the log-likelihood of the observed data, which is given by

L = \sum_{i=1}^{N} log p(t_i) = \sum_{i=1}^{N} log { \sum_{k=1}^{K_0} π_k p(t_i | µ_k, σ_k^2, W_k) }.

Using an expectation-maximization (EM) algorithm [7][8], we can compute the maximum-likelihood estimates of the parameters π_k, µ_k, σ_k^2 and W_k recursively. This then gives the mixture components and the mixing weight of each component in the mixture. Once the model parameters are determined, the linear relation between the observations and the model components, t = W_k x + µ_k + e, is completely defined. The observed data t_i can then be projected into the x-space of component k as x_ik = z_ik W_k^T (t_i - µ_k), which is a q-dimensional reduced representation of the k-th-cluster-focused vector t_i. Plotting the vectors x_ik creates a k-th-cluster-focused projection subspace, and z_ik (the posterior probability that t_i belongs to component k) gives the proportion of the contribution the point t_i has to the k-th subspace.

3.3. EM algorithm for mixture of PPCA

Expectation-maximization (EM) refers to an iterative optimization method for estimating unknown parameters T given measurement data U [8]. In the mixture-of-PPCA problem, we want to estimate the set {π_k, µ_k, σ_k^2, W_k}, k = 1, ..., K_0, using the observed data D, so EM is an ideal method to solve the problem. A schematic summary of the algorithm is as follows:

1) Initialization: the initial estimates of the parameters {π_k^0, µ_k^0, (σ_k^2)^0, W_k^0} are randomly selected.

2) Use EM to compute the parameter estimates that maximize the log-likelihood of the observed data D.

3) For each iteration i = 1, 2, ...:

E-step: Using the current parameter estimates, calculate the posterior probability R_ki of data point t_i belonging to the k-th component, given by

R_ki = π_k p(t_i | µ_k, σ_k^2, W_k) / p(t_i), k = 1, ..., K_0, i = 1, ..., N.

M-step: Using the posterior probabilities obtained in the E-step, calculate the new parameter estimates as follows:

π_k^new = (1/N) \sum_{i=1}^{N} R_ki

µ_k^new = \sum_{i=1}^{N} R_ki t_i / \sum_{i=1}^{N} R_ki

Then, using the new estimates µ_k^new, k = 1, ..., K_0, compute the weighted sample covariance matrices

S_k = \sum_{i=1}^{N} R_ki (t_i - µ_k^new)(t_i - µ_k^new)^T / \sum_{i=1}^{N} R_ki,

compute the eigenvalues λ_j and eigenvectors of S_k, and update the estimates of σ_k^2 and W_k as

(σ_k^2)^new = 1/(d - q) \sum_{j=q+1}^{d} λ_j

W_k^new = U_q (Λ_q - (σ_k^2)^new I_q)^{1/2}.

4) When the iterations complete, calculate the k-th-cluster-focused projection x_ik of each sample t_i:

x_ik = R_ki W_k^T (t_i - µ_k).
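
The following is a minimal MATLAB sketch of this EM loop for a mixture of PPCA. Convergence checking and numerical safeguards are omitted, mvnpdf requires the Statistics Toolbox, the initialization and settings are illustrative assumptions, and none of the names are taken from the report's actual code:

    % T: N-by-d data matrix of observations
    [N, d] = size(T);
    K0 = 3; q = 2; maxIter = 100;                  % illustrative settings
    pi_k = ones(1, K0) / K0;                       % equal initial mixing weights
    mu   = T(randperm(N, K0), :);                  % initial means: K0 random data points
    sig2 = ones(1, K0);                            % initial noise variances
    W    = randn(d, q, K0);                        % initial loading matrices
    for iter = 1:maxIter
        % E-step: responsibilities R(k,i) = pi_k * N(t_i | mu_k, C_k) / p(t_i)
        P = zeros(K0, N);
        for k = 1:K0
            Ck = W(:,:,k) * W(:,:,k)' + sig2(k) * eye(d);   % PPCA marginal covariance
            P(k,:) = pi_k(k) * mvnpdf(T, mu(k,:), Ck)';
        end
        R = P ./ sum(P, 1);                                  % normalize over the K0 components
        % M-step: update mixing weights, means, and per-component PPCA parameters
        for k = 1:K0
            Nk = sum(R(k,:));
            pi_k(k) = Nk / N;
            mu(k,:) = (R(k,:) * T) / Nk;
            Tc = T - mu(k,:);
            Sk = (Tc' * (Tc .* R(k,:)')) / Nk;               % responsibility-weighted covariance
            [V, D] = eig(Sk);
            [lam, idx] = sort(diag(D), 'descend');
            Uq = V(:, idx(1:q));
            sig2(k)  = mean(lam(q+1:d));
            W(:,:,k) = Uq * sqrt(diag(lam(1:q)) - sig2(k) * eye(q));
        end
    end
    % Cluster-focused projections: x_ik = R(k,i) * W_k' * (t_i - mu_k)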

4. Computational experiments

In the computational experiments, we 1) implemented the polynomial factorization algorithm (PFA) for GPCA and an expectation-maximization (EM) algorithm for PPCA in MATLAB code, and 2) validated the capability of these methods in discovering the clusters in the subspaces.

4.1. Synthetic data sets

The implemented algorithms were tested on simple synthetic data sets. Figure 1(a) shows the data set of 3-dimensional data points generated for the GPCA test (referred to as Set 1). The data were generated from a linear combination of 3 2-dimensional linear subspaces, each subspace represented by a randomly selected normal vector. In order to test whether the algorithm can identify the number of subspaces correctly, data were also generated from linear combinations of n = 2, 3, 4, 6 randomly selected subspaces. In all the cases tested in this study, no noise was added to the generated data. Figure 1(b) displays the data set generated for the PPCA test (referred to as Set 2). This data set consists of 240 data points generated from a mixture of three Gaussians in 3-dimensional space. Two of the clusters are closely spaced and the third is well separated from the first two.
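
A minimal MATLAB sketch of how such synthetic data sets could be generated (the exact parameters and random seeds used in the report are not reproduced; the means, spreads and point counts below are illustrative assumptions):

    % Set 1: noise-free points on n random planes through the origin in R^3
    n = 4; ptsPerPlane = 200;
    X = [];
    for i = 1:n
        b = randn(3, 1); b = b / norm(b);         % random unit normal vector
        B = null(b');                             % 3-by-2 basis of the plane b' * x = 0
        X = [X; (B * randn(2, ptsPerPlane))'];    % points lying in the plane
    end

    % Set 2: 240 points from a mixture of three Gaussians in R^3,
    % two clusters close together and one well separated
    mus = [0 0 0; 1.5 0 0; 8 8 8];                % illustrative cluster means
    T = [];
    for k = 1:3
        T = [T; mus(k,:) + 0.5 * randn(80, 3)];   % 80 points per cluster
    end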

Figure 1. (a) Synthetic data set for the GPCA test; data were generated from a combination of 4 linear subspaces. (b) Synthetic data set for the PPCA test; data were generated from a mixture of 3 Gaussians.

4.2. Applying GPCA to data Set 1

The implemented PFA algorithm for GPCA was applied to the synthetic data set. It showed that for all the cases with n = 2, 3, 4, 6, the algorithm can find the number of subspaces correctly. However, finding the normal vector of each subspace is not a trivial task. The difficulties may come from two facts: 1) the algorithm involves solving for the roots of polynomial equations, and it is likely that in some cases complex roots are obtained; and 2) the algorithm involves solving multivariate linear systems, i.e. solving Ax = b for x, so the successful estimation of the normal vector components depends on the condition number of the matrix A. When the matrix is ill-conditioned, we may not obtain a correct solution for x. If the randomly generated data do not impose these ill-conditioned problems on the GPCA procedure, the normal vector of each subspace can be estimated.

As an example, the 4 randomly selected normal vectors {b_i}, i = 1, 2, 3, 4, of the subspaces from which data Set 1 was generated were compared with the estimated normal vectors {b̂_i}. Note that the estimated normal vectors are not in the same order as the actual normal vectors, and each estimate can differ from the actual vector by a factor of (-1). Table 1 lists the correlations (corr) between the actual normal vectors {b_i} and the estimated normal vectors {b̂_i} in 5 successful estimations of subspaces, for the four cases with the total number of subspaces n = 2, 3, 4, 6, respectively. The averages and standard deviations of the absolute values of the correlations are also listed in the table. The correlation between an actual normal vector b_i and the estimated normal vector b̂_i is calculated as

corr = b_i^T b̂_i.

A minus sign indicates that the estimated normal vector is in the opposite direction (or symmetric about the origin) relative to the actual normal vector.
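
A minimal MATLAB sketch of this comparison, pairing each actual normal with the estimated normal of largest absolute inner product (a simple greedy matching; names are illustrative):

    % B: 3-by-n actual unit normals (columns); Bhat: 3-by-n estimated unit normals
    C = B' * Bhat;                                % C(i,j) = b_i' * bhat_j
    [absCorr, match] = max(abs(C), [], 2);        % best-matching estimate for each actual normal
    corrVals = C(sub2ind(size(C), (1:size(B, 2))', match));   % signed correlations, as in Table 1
    avgCorr = mean(absCorr);                      % AVG(|corr|)
    stdCorr = std(absCorr);                       % STD(|corr|)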

Table 1. Correlation (corr) between the actual normal vectors {b_i} and the estimated normal vectors {b̂_i} in 5 successful estimations of subspaces, for the four cases with the total number of subspaces n = 2, 3, 4, 6, respectively, together with the average AVG(|corr|) and standard deviation STD(|corr|) for each case.

The experiments on the synthetic data showed that the algorithm implemented here can successfully identify the number of subspaces in the mixture and, when successful, estimate the normal vectors of the subspaces with a relatively high correlation (about 0.7 in this study). Once the normal vectors of the subspaces are determined, the original data can be represented in the lower-dimensional subspaces, and further analysis can be carried out on each subspace separately. However, the implementation is not yet robust, since its success depends on the randomly generated data.

4.3. Applying PPCA to Set 2

The generated data Set 2 is a mixture of three Gaussians, with two clusters closely placed and one cluster placed separately. We applied the EM algorithm to this data set, first assuming there are only two clusters (subspaces), and then assuming there are three clusters (subspaces). Figure 2 shows the projected data in x-space for the cases of (a) assuming 2 clusters and (b) assuming 3 clusters. Different colors and markers are used to indicate the group association of each data point with the subspaces. It is shown that the probabilistic mixture model can find the clusters and assign the cluster association of each data point correctly. Also, the original data t can be reduced to a 2-dimensional data set x.

In this computational experiment, we have assumed the number of subspaces. However, this information is usually unknown and cannot be assumed arbitrarily. In a practical unsupervised cluster decomposition, it would be desirable to select the structural parameter K_0 of the model automatically and correctly. Wang et al. [1] proposed using two information-theoretic criteria, the Akaike information criterion (AIC) and the minimum description length (MDL) criterion, to guide the model selection. This allows an optimal model to be selected from several competing model candidates such that the selected model best fits the observed data D. This technique is not implemented in this project.

Figure 2. Projected data in x-space of the observations t for the cases of (a) assuming 2 clusters and (b) assuming 3 clusters.

5. Discussion

In this project, we explored the principles and features of the algebraic-geometric (GPCA) and probabilistic (PPCA) approaches for clustering and dimension reduction of mixtures of principal component subspaces, and implemented these two techniques in MATLAB code for hands-on experience.

In the absence of noise, GPCA can be cast in an algebraic-geometric framework in which the collection of subspaces is represented by a set of homogeneous polynomials whose degree corresponds to the number of subspaces and whose factors (roots) encode the subspace parameters [5]. The number of subspaces can be determined from the rank condition on the Veronese map matrix of the original data, and the estimation of the hyperplanes is equivalent to factoring the polynomial of degree n into a product of n linear factors. The polynomial factorization algorithm (PFA) proposed by Vidal et al. [4][5] was implemented in this project. There is another algorithm, also proposed by Vidal [5], called the polynomial differentiation algorithm (PDA). The PDA algorithm is designed for subspaces of arbitrary dimensions and obtains a basis for each subspace by evaluating the derivatives of the set of polynomials representing the subspaces at a collection of points, one in each of the subspaces. Vidal et al. have shown that the PDA algorithm gives about half the error of the PFA algorithm, and also improves the performance of iterative techniques, such as K-subspace and EM, by about 50% with respect to random initialization. However, this algorithm was not implemented in this study.

The experiments on the synthetic data show that the PFA algorithm implemented in this study can successfully identify the number of subspaces in the mixture and, when successful, estimate the normal vectors of the subspaces with a relatively high correlation (about 0.7 in this study). Once the normal vectors of the subspaces are determined, the original data can be represented in the lower-dimensional subspaces, and further analysis can be carried out on each subspace separately. However, the implementation is not yet robust, since it is data dependent. This may be due to two facts: 1) the algorithm involves solving for the roots of polynomial equations, and it is likely that in some cases complex roots are obtained; and 2) the algorithm involves solving multivariate linear systems, i.e. solving Ax = b for x, so the successful estimation of the normal vector components depends on the condition number of the matrix A. When the matrix is ill-conditioned, we may not obtain a correct solution for x.

In PPCA, principal component analysis is viewed as a maximum-likelihood procedure based on a probability density model of the observed data. The probability model is Gaussian, and the determination of the model parameters only requires computing the eigenvectors and eigenvalues of the sample covariance matrix. A mixture model of PPCA is considered when multiple clusters (subspaces) are present. In this case, an EM algorithm is used to find the principal subspaces by iteratively maximizing the likelihood function.

The EM algorithm was implemented and tested on a synthetic data set in this study. It was shown that the probabilistic mixture model can find the clusters and assign the cluster association of each data point correctly. The PPCA approach, however, has some disadvantages [5]: 1) it is hard to analyze the existence and uniqueness of a solution to the problem; 2) the approach is restricted to certain classes of distributions or independence assumptions; and 3) the convergence of EM is in general very sensitive to initialization, so there is no guarantee that it will converge to the optimal solution.

As a conceptual study, the PPCA decomposition implemented here is only carried out on a single level. Several groups [1][7] have extended the mixture of PPCA models to a hierarchical mixture model. In their methods, each PPCA component at the lower level can be extended to a group g_j, j = 1, ..., J, of PPCA components at the next higher level, and the EM algorithm can be applied again to the decomposition at the higher level. In this way, the multiple clusters can be separated recursively to generate a hierarchy of mixtures of PPCA with a number of levels. This hierarchical model allows the clusters to be visualized at different perceptual levels, and is thus very useful in multi-dimensional data visualization.

GPCA and PPCA are two different views of mixtures of principal components. It is not easy to compare the two methods directly, but both techniques have the capability of identifying clusters and subspaces, so that the original data can be represented in subspaces of lower dimensionality. They can be applied to a variety of estimation problems, such as 3-D motion segmentation in computer vision, and to dimension reduction problems such as data compression and feature extraction. In this project, the implemented algorithms were tested only on synthetic 3-dimensional data, not on higher-dimensional or real data, and they are far from comprehensive enough for practical use. However, the computational implementation helped greatly in understanding the two approaches to mixtures of principal component subspaces.

6. References

[1] Y. Wang, L. Luo, M.T. Freedman and S.Y. Kung, "Probabilistic Principal Component Subspaces: a Hierarchical Finite Mixture Model for Data Visualization," IEEE Transactions on Neural Networks, Vol. 11, No. 3, May 2000.

[2] M. Mizuta, "Dimension Reduction Methods."

[3] R.A. Johnson and D.W. Wichern, Applied Multivariate Statistical Analysis, Prentice-Hall, Englewood Cliffs, N.J., 1982.

[4] R. Vidal, Y. Ma and S. Sastry, "Generalized Principal Component Analysis (GPCA)," 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'03), Vol. 1, June 18-20, 2003, Madison, WI.

[5] R. Vidal, "Generalized Principal Component Analysis (GPCA): an Algebraic Geometric Approach to Subspace Clustering and Motion Segmentation," PhD thesis, University of California at Berkeley, 2003.

[6] M.E. Tipping and C.M. Bishop, "Probabilistic Principal Component Analysis," Technical Report NCRG/97/010, Neural Computing Research Group, Aston University, September 1997.

[7] T. Su and J. Dy, "Automated Hierarchical Mixtures of Probabilistic Principal Component Analyzers," Proceedings of the 21st International Conference on Machine Learning, Article No. 98, Banff, Canada, July 4-8, 2004.

[8] S. Roweis, "EM Algorithms for PCA and SPCA," Proceedings of the 1997 Conference on Advances in Neural Information Processing Systems, Denver, Colorado, 1998.
