Regularized Discriminant Analysis for High Dimensional, Low Sample Size Data


Jieping Ye
Arizona State University
Tempe, AZ

Tie Wang
Arizona State University
Tempe, AZ

ABSTRACT

Linear and Quadratic Discriminant Analysis have been used widely in many areas of data mining, machine learning, and bioinformatics. Friedman proposed a compromise between Linear and Quadratic Discriminant Analysis, called Regularized Discriminant Analysis (RDA), which has been shown to be more flexible in dealing with various class distributions. RDA applies regularization techniques by employing two regularization parameters, which are chosen to jointly maximize the classification performance. The optimal pair of parameters is commonly estimated via cross-validation from a set of candidate pairs. This estimation is computationally prohibitive for high dimensional data, especially when the candidate set is large, which limits the applications of RDA to low dimensional data. In this paper, a novel algorithm for RDA is presented for high dimensional data. It can estimate the optimal regularization parameters from a large set of parameter candidates efficiently. Experiments on a variety of datasets confirm the claimed theoretical estimate of the efficiency, and also show that, for a properly chosen pair of regularization parameters, RDA performs favorably in classification, in comparison with other existing classification methods.

Categories and Subject Descriptors: H.2.8 [Database Management]: Database Applications - Data Mining

General Terms: Algorithms

Keywords: Dimensionality reduction, Quadratic Discriminant Analysis, regularization, cross-validation.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. KDD'06, August 20-23, 2006, Philadelphia, Pennsylvania, USA. Copyright 2006 ACM ... $5.00.

1. INTRODUCTION

Statistical discriminant analysis is a frequently used and widely applicable tool in a variety of areas [6, 12, 26, 27]. The aim of discriminant analysis is to assign a data point to one of several classes (groups) on the basis of a number of feature variables. Numerous methods for discriminant analysis have been proposed and applied in the past. The most frequently used methods are parametric approaches, especially Linear and Quadratic Discriminant Analysis. Linear Discriminant Analysis (LDA) is based on the assumption that the variables are multivariate normally distributed in each class, with different mean vectors and a common covariance matrix. It has been used in various applications [1, 16, 19, 24]. In Quadratic Discriminant Analysis (QDA), the variables are assumed to be multivariate normally distributed in each class with different mean vectors and different covariance matrices [14]. QDA provides a less restrictive procedure by allowing different covariance matrices and may thus fit the data better than LDA. However, LDA involves a much smaller number of parameters to estimate than QDA, and is thus more robust and reliable than QDA in the parameter estimation. Friedman [8] proposed a compromise between LDA and QDA, called Regularized Discriminant Analysis (or RDA in short), which allows one to shrink the separate covariances of QDA toward a common covariance as in LDA.
The regularized covariance matrix of the i-th class has the following form:

Σ̂_i = β (α Σ_i + (1 − α) S_w) + (1 − β) I_d,

where Σ_i is the covariance of the i-th class; S_w, the so-called pooled covariance matrix as used in LDA, is also known as the within-class scatter matrix [9]; I_d is the identity matrix of size d by d; and d is the dimensionality of the data. Here α ∈ [0, 1] and β ∈ [0, 1] are two regularization parameters. The trace term in the original formulation in [8] is absorbed into the β parameter for simplicity.

RDA provides a fairly rich class of regularization alternatives. The four corners defining the extremes of the (α, β) plane represent well-known classification procedures. The upper right corner (α = 1, β = 1) represents QDA. The upper left corner (α = 0, β = 1) represents LDA. The line connecting the lower left and lower right corners, i.e., β = 0 with 0 ≤ α ≤ 1, corresponds to the nearest-centroid classifier well known in pattern recognition, where a test data point is assigned to the class with the closest (in Euclidean distance) centroid. Varying α with β fixed at 1 produces models between QDA and LDA.

For a given training dataset, α and β are commonly estimated via cross-validation. Selecting an optimal value for a parameter pair such as (α, β) is called model selection [14]. The computational cost of model selection for RDA is high, especially when the data dimensionality, d, is large, since it requires expensive matrix computations for each candidate pair. This restricts RDA to low dimensional data.
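To make the roles of the two parameters concrete, here is a minimal NumPy sketch (ours, not from the paper) that builds the regularized covariance matrix of one class from the formula above; the comments mark the four corners of the (α, β) plane.

import numpy as np

def regularized_cov(Sigma_i, S_w, alpha, beta):
    # beta * (alpha * Sigma_i + (1 - alpha) * S_w) + (1 - beta) * I_d
    # (alpha, beta) = (1, 1): QDA (individual class covariance)
    # (alpha, beta) = (0, 1): LDA (pooled covariance)
    # beta = 0, any alpha:    nearest-centroid classifier (identity covariance)
    d = Sigma_i.shape[0]
    return beta * (alpha * Sigma_i + (1.0 - alpha) * S_w) + (1.0 - beta) * np.eye(d)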

In this paper, we make the first attempt at extending the applicability of RDA to high dimensional, low sample size (HDLSS) data, as HDLSS data are emerging from various fields. In high throughput gene expression experiments, technologies have been designed to measure the gene expression levels of tens of thousands of genes in a single microarray chip. However, the sample size in each dataset is typically small, ranging from tens to low hundreds, due to the cost of the experiments. In image-based object or face recognition applications, two or three dimensional images are usually converted to column representations, resulting in high dimensional data, while the number of images is usually small. In text document classification, the number of features equals the number of distinct words in the documents, which is typically in the thousands, while the number of documents in the study may be much smaller. A common characteristic of all these datasets is that the dimensionality, d, of the data vector is much larger than the sample size. This leads to various statistical issues, known as the high dimensional, low sample size problem [13].

In this paper, we propose an efficient algorithm for RDA on HDLSS data. The primary contributions of this work include:

- We show that the classification rule in RDA can be decomposed into two components: the first component involves matrices of low dimensionality, while the second component involves matrices of high dimensionality. More importantly, we show that for a given test data point, the second component in the classification rule is constant for all classes, so it has no effect on classification and can simply be removed. We call this the decomposition property of RDA.

- We present an efficient algorithm for RDA by applying the decomposition property above to speed up the model selection process of RDA. The basic idea is to divide the computations in RDA into two successive stages. The first stage has a relatively high computational cost, but it is independent of α and β. The second stage has a relatively low computational cost. When searching for the optimal parameter pair from a set of candidates via cross-validation, we only need to repeat the second stage, thus dramatically reducing the computational cost of model selection, especially when the candidate set is large.

- We have conducted experimental studies on a variety of HDLSS data, including text documents, face images, and microarray gene expression data. Results confirm our theoretical estimate of the computational cost of the proposed algorithm in model selection. Experiments also demonstrate that, with properly chosen regularization parameters, RDA is effective in classification, in comparison with several other well-known classification algorithms.

The rest of the paper is organized as follows. An overview of QDA and RDA is given in Section 2. An efficient algorithm for RDA is presented in Section 3. Experimental results are given in Section 4. Conclusions are presented in Section 5.

2. AN OVERVIEW OF QDA AND RDA

For convenience, we present in Table 1 the important notations that will be used in the rest of the paper.

Notation   Description
n          sample size
d          number of features (dimensions)
k          number of classes
A          data matrix
A_i        data matrix of the i-th class
n_i        size of the i-th class
μ_i        centroid of the i-th class
Σ_i        covariance matrix of the i-th class
Σ̂_i        regularized covariance matrix of the i-th class
μ          global centroid of the training set
S_b        between-class scatter matrix
S_w        within-class scatter matrix
S_t        total scatter matrix
t          rank of the matrix S_t
α          the first regularization parameter
β          the second regularization parameter
g(x)       class label of the data point x

Table 1: Important notations used in the paper.
In this section, we briefly review Quadratic Discriminant Analysis (QDA), some issues related to the application of QDA to HDLSS data, and Regularized Discriminant Analysis (RDA). Note that LDA is a special case of QDA when all classes share a common class covariance.

We are given a training dataset of n data points {(x_i, y_i)}_{i=1}^n, where x_i ∈ R^d is the feature vector of the i-th data point, d is the data dimensionality, y_i = g(x_i) ∈ {1, 2, ..., k} is the class label of x_i, and k is the number of classes. Let A = [x_1, x_2, ..., x_n] ∈ R^{d×n} be the data matrix, which can be decomposed into k classes as A = [A_1, A_2, ..., A_k], where A_i contains all data points from the i-th class. Denote n_i = |A_i| as the size of the i-th class. We have Σ_{i=1}^k n_i = n. Assuming the class densities follow the normal distribution, we apply the following classification rule [8, 14]: a test point x ∈ R^d is classified as class C(x) defined by

C(x) = argmin_i { (x − μ_i)^T Σ_i^{−1} (x − μ_i) + ln |Σ_i| },   (1)

where the centroid μ_i of the i-th class is defined as

μ_i = (1/n_i) A_i e^{(i)},   (2)

e^{(i)} ∈ R^{n_i} is a vector of all ones, and the covariance matrix Σ_i of the i-th class is defined as

Σ_i = (1/n_i) Σ_{x∈A_i} (x − μ_i)(x − μ_i)^T.   (3)

Note that we have assumed an equal prior for all classes in Eq. (1) for simplicity. The decision boundary under the above classification rule is quadratic, and the algorithm is thus called Quadratic Discriminant Analysis (QDA). In the special case where all classes share a common covariance, that is, Σ_i = Σ_j for any classes i and j, QDA reduces to the well-known Linear Discriminant Analysis (LDA) [5, 7, 9, 14]. The traditional QDA formulation in Eq. (1) requires all class covariance matrices to be nonsingular.
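As an illustration, the rule of Eqs. (1)-(3) can be written directly in NumPy when every class covariance is nonsingular. The sketch below is ours, not code from the paper; slogdet is used to compute ln |Σ_i| stably.

import numpy as np

def qda_fit(A, y):
    # A: d x n data matrix (columns are data points), y: length-n label array.
    classes = np.unique(y)
    mus, Sigmas = [], []
    for c in classes:
        Ac = A[:, y == c]
        mu = Ac.mean(axis=1)                      # Eq. (2)
        Z = Ac - mu[:, None]
        mus.append(mu)
        Sigmas.append(Z @ Z.T / Ac.shape[1])      # Eq. (3), 1/n_i normalization
    return classes, mus, Sigmas

def qda_predict(x, classes, mus, Sigmas):
    # Eq. (1): argmin_i (x - mu_i)^T Sigma_i^{-1} (x - mu_i) + ln|Sigma_i|,
    # assuming equal priors and nonsingular Sigma_i.
    scores = []
    for mu, S in zip(mus, Sigmas):
        diff = x - mu
        _, logdet = np.linalg.slogdet(S)
        scores.append(diff @ np.linalg.solve(S, diff) + logdet)
    return classes[int(np.argmin(scores))]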

However, for many applications involving HDLSS data, such as text document classification, face recognition, and microarray gene expression data analysis, all class covariance matrices may be singular, since the data dimensionality may be much larger than the sample size of every class in the training dataset. Furthermore, the estimates of the class covariance matrices may be biased and unreliable. As pointed out in [8], this bias is more pronounced when their eigenvalues tend toward equality, and correspondingly less severe when their eigenvalues are highly disparate. In both cases, the phenomenon becomes more pronounced as the sample size decreases. Thus, HDLSS data presents a major challenge for the application of QDA.

In [8], Friedman proposed a compromise between LDA and QDA, called Regularized Discriminant Analysis (RDA), which allows one to shrink the separate covariances of QDA toward a common covariance as in LDA by employing regularization techniques. Regularization has been commonly used in the solution of ill-posed inverse problems [20], where the number of parameters exceeds the sample size. In such cases, the parameter estimates can be highly unstable, giving rise to high variance. By employing a method of regularization, one attempts to improve the estimates by regulating this bias-variance trade-off. Quadratic Discriminant Analysis is ill-posed if n_i < d for any class i. One method of regularization is to replace the individual class covariance matrix Σ_i by S_i(α) as follows:

S_i(α) = α Σ_i + (1 − α) S_w,   (4)

where S_w is the weighted average of the class covariance matrices, called the pooled covariance matrix, or within-class scatter matrix [9], which is defined as

S_w = (1/n) Σ_{i=1}^k Σ_{x∈A_i} (x − μ_i)(x − μ_i)^T = (1/n) Σ_{i=1}^k n_i Σ_i.   (5)

The regularization parameter α takes on values between 0 and 1. It controls the degree of shrinkage of the individual class covariance matrix estimates toward the pooled estimate. The value α = 1 gives rise to QDA, whereas α = 0 yields LDA. However, the regularization in Eq. (4) is still fairly limited. First, it might not provide enough regularization: if the total sample size, n, is less than the data dimensionality, d, then even LDA is ill-posed [17, 22]. Second, biasing the class covariance matrices toward commonality may not be the most effective way to shrink them. Recall that ridge regression regularizes ordinary linear least squares regression by shrinking toward a multiple of the identity matrix [14, 15]. To this end, a further regularization is given by

Σ̂_i = β S_i(α) + (1 − β) I_d,   (6)

where I_d is the identity matrix of size d by d and β is an additional regularization parameter, which controls shrinkage toward a multiple of the identity matrix. In this paper, we apply a variant of the regularized class covariance matrix in Eq. (6), given by

Σ̂_i = β (α Σ_i + (1 − α) S_t) + (1 − β) I_d,   (7)

where the total scatter matrix S_t is defined as

S_t = (1/n) Σ_{i=1}^n (x_i − μ)(x_i − μ)^T,   (8)

α ∈ [0, 1], and β ∈ [0, 1]. The minor difference here lies in the matrix S_t used in Eq. (7), where S_w is employed in Eq. (6). It is interesting to note that, when β → 1, the classification rule based on the regularized class covariance matrix in Eq. (6) may be numerically unstable, while the one based on the matrix in Eq. (7) is stable even for HDLSS data (see Section 3). The use of S_t instead of S_w has recently been explored in LDA for improving numerical stability [3, 22]. A test point x ∈ R^d is classified as class Ĉ(x) given by

Ĉ(x) = argmin_i { (x − μ_i)^T Σ̂_i^{−1} (x − μ_i) + ln |Σ̂_i| },   (9)

where Σ̂_i is defined in Eq. (7). The performance of RDA may depend critically on the values of the parameters α and β. Cross-validation is commonly used to estimate the optimal α and β from a finite set, Λ = {(α_i, β_j)}, where i = 1, ..., r and j = 1, ..., s. The number of candidate pairs (α, β) is |Λ| = rs.
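For reference, a direct implementation of the rule in Eq. (9) (again our sketch, not the authors' code) forms Σ̂_i in the full d-dimensional space and factors it; the centroids and covariances are assumed to be supplied by the caller, e.g., as in the QDA sketch above.

import numpy as np

def rda_predict_naive(x, mus, Sigmas, S_t, alpha, beta):
    # Direct RDA rule, Eq. (9), with the Eq. (7) covariance
    # Sigma_hat_i = beta*(alpha*Sigma_i + (1-alpha)*S_t) + (1-beta)*I_d.
    # Forms and factors a d x d matrix per class: O(d^3), prohibitive for HDLSS data.
    d = S_t.shape[0]
    best, best_score = None, np.inf
    for i, (mu, Sigma_i) in enumerate(zip(mus, Sigmas)):
        Sig_hat = beta * (alpha * Sigma_i + (1 - alpha) * S_t) + (1 - beta) * np.eye(d)
        diff = x - mu
        _, logdet = np.linalg.slogdet(Sig_hat)
        score = diff @ np.linalg.solve(Sig_hat, diff) + logdet
        if score < best_score:
            best, best_score = i, score
    return best

Every new (α, β) pair repeats all of the d × d work here, which is what makes a naive search over a large candidate set prohibitive.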
In practice, a large number, rs, of candidate pairs is often desirable to achieve good classification performance. However, with a large number of parameter pairs, the computational cost of model selection for RDA may be prohibitive for HDLSS data, since it requires expensive matrix computations for each candidate pair. A direct implementation of RDA such as the one used in [8] involves the formation of Σ̂_i and the inversion of Σ̂_i for all i. The computation of the inversion of all class covariance matrices takes O(d^3) time and is prohibitive for HDLSS data, where the data dimensionality d is large. This limits the applications of RDA to low dimensional data.

3. EFFICIENT MODEL SELECTION FOR RDA

In this section, we first establish a key property of RDA, which shows that the classification rule in RDA can be decomposed into two components. The first component involves matrices of low dimensionality, while the second component involves matrices of high dimensionality. More importantly, we show that for a given test data point, the second component of the classification rule is constant for all classes; it therefore has no effect on the classification and can simply be removed. Thus, the computational cost of RDA can be significantly reduced. We call this the decomposition property of RDA.

We show below that the essence of the decomposition property of RDA is that the first component of the classification rule lies in the orthogonal complement of the null space of S_t, which has low dimensionality for HDLSS data, while the second component lies in the null space of S_t, which has high dimensionality. Define the between-class scatter matrix S_b, used in discriminant analysis [9], as follows:

S_b = (1/n) Σ_{i=1}^k n_i (μ_i − μ)(μ_i − μ)^T.   (10)

It follows from the definitions that

S_t = S_b + S_w.   (11)

We have the following result concerning the relationship between the null space of S_t and the null spaces of Σ_i and S_b:

Lemma 3.1. Let Σ_i, S_b, S_t, and S_w be defined as above. The null space of S_t, denoted N(S_t), is a subset of the null space N(S_b) of S_b and a subset of the null space N(Σ_i) of Σ_i, for all i. That is, N(S_t) ⊆ N(S_b) and N(S_t) ⊆ N(Σ_i), for all i.

Proof. Consider any x ∈ N(S_t). That is, S_t x = 0 and x^T S_t x = 0. From Eqs. (5) and (11), we have

S_t = Σ_{i=1}^k (n_i/n) Σ_i + S_b.   (12)

It follows that

0 = x^T S_t x = Σ_{i=1}^k (n_i/n) x^T Σ_i x + x^T S_b x.

Since Σ_i, for all i, and S_b are positive semi-definite, we have x^T Σ_i x = 0, for all i, and x^T S_b x = 0. It follows that Σ_i x = 0 and S_b x = 0. Therefore, x also lies in the null spaces of Σ_i and S_b. Hence, N(S_t) ⊆ N(S_b) and N(S_t) ⊆ N(Σ_i).

Let S_t = U D U^T be the Singular Value Decomposition (SVD) [10] of S_t, where U is orthogonal, D = diag(D_t, 0), D_t ∈ R^{t×t} is diagonal, and t = rank(S_t). Note that t ≤ n. Partition U as U = [U_1, U_2], where U_1 ∈ R^{d×t} and U_2 ∈ R^{d×(d−t)}. Then U_2 lies in the null space of S_t, i.e., S_t U_2 = 0. We have the following result concerning the decomposition structure of Σ̂_i:

Lemma 3.2. Let U = [U_1, U_2] be defined as above and let Σ̂_i be defined as in Eq. (7). Then Σ̂_i can be expressed as

Σ̂_i = U diag(M_i, (1 − β) I_{d−t}) U^T,   (13)

where

M_i = β (α Σ̃_i + (1 − α) D_t) + (1 − β) I_t,   (14)

and Σ̃_i = U_1^T Σ_i U_1.

Proof. Recall from Eq. (7) that

Σ̂_i = β (α Σ_i + (1 − α) S_t) + (1 − β) I_d.

It follows that

U^T Σ̂_i U = β (α U^T Σ_i U + (1 − α) U^T S_t U) + (1 − β) I_d.

From Lemma 3.1, Σ_i U_2 = 0, for all i. It follows that

Σ̂_i = U ( β (α U^T Σ_i U + (1 − α) U^T S_t U) + (1 − β) I_d ) U^T = U diag(M_i, (1 − β) I_{d−t}) U^T.

Lemma 3.2 implies that all regularized class covariance matrices share a similar decomposition structure, which leads to the decomposition property of RDA as summarized in the following proposition:

Proposition 3.1. Let U_1, U_2, and M_i be defined as above. Then the classification rule in Eq. (9) is equivalent to:

Ĉ(x) = argmin_i { (x − μ_i)^T U_1 M_i^{−1} U_1^T (x − μ_i) + ln |M_i| + (1 − β)^{−1} (x − μ_i)^T U_2 U_2^T (x − μ_i) + (d − t) ln(1 − β) }.   (15)

Proof. Denote F_i = (x − μ_i)^T Σ̂_i^{−1} (x − μ_i). It follows from Lemma 3.2 that

F_i = (x − μ_i)^T U diag(M_i^{−1}, (1 − β)^{−1} I_{d−t}) U^T (x − μ_i) = (x − μ_i)^T U_1 M_i^{−1} U_1^T (x − μ_i) + (1 − β)^{−1} (x − μ_i)^T U_2 U_2^T (x − μ_i).   (16)

The result follows directly from Lemma 3.2 and Eq. (16), as ln |Σ̂_i| = ln |M_i| + ln |(1 − β) I_{d−t}| = ln |M_i| + (d − t) ln(1 − β).

Proposition 3.1 implies that the classification rule in RDA can be decomposed into two components, as in Eq. (15). The first component, i.e., (x − μ_i)^T U_1 M_i^{−1} U_1^T (x − μ_i), involves U_1, which lies in the orthogonal complement of the null space of S_t, while the second component, i.e., (1 − β)^{−1} (x − μ_i)^T U_2 U_2^T (x − μ_i) + (d − t) ln(1 − β), involves U_2, which lies in the null space of S_t. Note that the null space of S_t is of dimension d − t, which is much larger than the dimension, t, of the orthogonal complement of the null space of S_t, for HDLSS data. However, two issues need to be resolved before we apply the classification rule in Eq. (16). First, the computation may be numerically unstable as β → 1, due to the presence of (1 − β)^{−1} in the computation. Second, finding the best parameter pair (α, β) from a set, Λ, of candidate pairs may be expensive, since U_2 ∈ R^{d×(d−t)} is of large size for HDLSS data. Interestingly, both issues can be addressed simultaneously by simply removing the second term in Eq. (16), based on the lemma below:

Lemma 3.3. Let U_2, μ_i, and μ be defined as above. Then U_2^T (μ_i − μ) = 0. Thus, U_2^T (x − μ_i) = U_2^T (x − μ), for any x.

Proof. Note that U_2 lies in the null space of S_t. From Lemma 3.1, U_2 also lies in the null space of S_b. That is, U_2^T S_b = 0. S_b in Eq. (10) can be expressed as S_b = H_b H_b^T, where

H_b = (1/√n) [√n_1 (μ_1 − μ), √n_2 (μ_2 − μ), ..., √n_k (μ_k − μ)].
It follows from U_2^T S_b = 0 that U_2^T H_b = 0, i.e., U_2^T [√n_1 (μ_1 − μ), √n_2 (μ_2 − μ), ..., √n_k (μ_k − μ)] = 0, and U_2^T (μ_i − μ) = 0. Hence,

U_2^T (x − μ_i) = U_2^T (x − μ).   (17)

From Lemma 3.3, the classification rule in Eq. (15) can be further simplified by removing the second component, as

Ĉ(x) = argmin_i { (x − μ_i)^T U_1 M_i^{−1} U_1^T (x − μ_i) + ln |M_i| } = argmin_i { (x̃ − μ̃_i)^T M_i^{−1} (x̃ − μ̃_i) + ln |M_i| },   (18)

where x̃ = U_1^T x and μ̃_i = U_1^T μ_i.
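The decomposition property is easy to check numerically. The following self-contained sketch (ours; it uses a random HDLSS toy dataset) verifies Lemma 3.3: the projection U_2^T (x − μ_i) is the same vector for every class i, so the term dropped in going from Eq. (15) to Eq. (18) cannot change the argmin.

import numpy as np

rng = np.random.default_rng(0)
d, k, n_per = 50, 3, 5                     # HDLSS toy problem: d >> n
A = rng.standard_normal((d, k * n_per))    # columns are data points
y = np.repeat(np.arange(k), n_per)

mu = A.mean(axis=1, keepdims=True)
mus = [A[:, y == c].mean(axis=1) for c in range(k)]
H_t = (A - mu) / np.sqrt(A.shape[1])       # Eq. (28): S_t = H_t @ H_t.T
U, s, _ = np.linalg.svd(H_t, full_matrices=True)
t = int(np.sum(s > 1e-10))
U2 = U[:, t:]                              # basis of the null space of S_t

x = rng.standard_normal(d)                 # an arbitrary test point
# Lemma 3.3: U2^T (x - mu_i) is the same vector for every class i, ...
proj = [U2.T @ (x - mus[c]) for c in range(k)]
for c in range(1, k):
    assert np.allclose(proj[c], proj[0])
# ... so the second component of Eq. (15) is constant across classes and
# dropping it (Eq. (18)) cannot change the argmin.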

3.1 The computation of M_i^{−1} and |M_i|

The main computations in Eq. (18) are the inversion of M_i and the determinant of M_i, for all i, which take O(t^3) = O(n^3) time, as M_i ∈ R^{t×t} and t ≤ n. Recall from Section 2 that the direct implementation of RDA computes the inversion of Σ̂_i for all i directly, with a time complexity of O(d^3), which is significantly higher than O(n^3) for HDLSS data. In the following, we present an efficient way of computing the inversion of M_i and the determinant of M_i, for all i, with a time complexity of O(n^3/k), thus further reducing the complexity of the algorithm.

Define the matrix H_i ∈ R^{d×n_i} as follows:

H_i = (1/√n_i) [A_i − μ_i (e^{(i)})^T],   (19)

where A_i ∈ R^{d×n_i} is the data matrix of the i-th class, μ_i is the centroid of the i-th class, and e^{(i)} is the vector of all ones of length n_i. Then the class covariance matrix, Σ_i, of the i-th class can be expressed as

Σ_i = H_i H_i^T.   (20)

It follows that

Σ̃_i = U_1^T H_i H_i^T U_1 = H̃_i H̃_i^T,   (21)

where H̃_i = U_1^T H_i. Denote by D_αβ the diagonal matrix

D_αβ = (1 − α) β D_t + (1 − β) I_t.   (22)

From Eq. (14),

M_i = αβ H̃_i H̃_i^T + D_αβ = D_αβ^{0.5} ( (√(αβ) D_αβ^{−0.5} H̃_i)(√(αβ) D_αβ^{−0.5} H̃_i)^T + I_t ) D_αβ^{0.5} = D_αβ^{0.5} ( X_i X_i^T + I_t ) D_αβ^{0.5},   (23)

where X_i = √(αβ) D_αβ^{−0.5} H̃_i ∈ R^{t×n_i}. It follows from the Sherman-Morrison-Woodbury formula [10] that

M_i^{−1} = D_αβ^{−0.5} ( I_t − X_i (I_{n_i} + X_i^T X_i)^{−1} X_i^T ) D_αβ^{−0.5}.   (24)

Note that the matrix inversion in Eq. (24) is on

N_i = I_{n_i} + X_i^T X_i ∈ R^{n_i×n_i}.   (25)

However, the inverse M_i^{−1} will not be formed explicitly, as the multiplication (x̃ − μ̃_i)^T M_i^{−1} (x̃ − μ̃_i) would then take O(n^2) time for each x. Note from Eq. (24) that

(x̃ − μ̃_i)^T M_i^{−1} (x̃ − μ̃_i) = (x̃ − μ̃_i)^T D_αβ^{−1} (x̃ − μ̃_i) − (x̃ − μ̃_i)^T D_αβ^{−0.5} X_i N_i^{−1} X_i^T D_αβ^{−0.5} (x̃ − μ̃_i),   (26)

which takes O(n_i^3) time for computing N_i^{−1} and O(n n_i) time for all other computations, for each x. The total complexity is thus O(n n_i + n_i^3) time, for each x. Thus, the computation of (x̃ − μ̃_i)^T M_i^{−1} (x̃ − μ̃_i) for all i takes O(n^2 + Σ_{i=1}^k n_i^3) time. Assuming all classes are of approximately equal size, that is, n_i ≈ n/k, the time complexity of the computation is O(n^2 + n^3/k^2). One key observation here is that the computation of N_i^{−1} is independent of the test point x. Note that the total number of test points in v-fold cross-validation is n/v. In this case, the total computational cost for all test points is O(n^3 + n^3/k^2), instead of O(n^3 + n^4/k^2).

Next, we consider the computation of |M_i|, which is independent of the test point x. From Eq. (23),

|M_i| = |X_i X_i^T + I_t| · |D_αβ| = |X_i^T X_i + I_{n_i}| · |D_αβ|,   (27)

where the last equality follows from the following lemma:

Lemma 3.4. Let X ∈ R^{t×n} be any matrix of size t by n. Then the following equality always holds: |X X^T + I_t| = |X^T X + I_n|.

Proof. Let X = R D S^T be the SVD of X, where R and S are orthogonal and D = diag(Σ_m, 0) with Σ_m = diag(λ_1, ..., λ_m) and m = rank(X). It follows that

|X X^T + I_t| = |R (D D^T + I_t) R^T| = |D D^T + I_t| = |Σ_m^2 + I_m| · |I_{t−m}| = Π_{j=1}^m (λ_j^2 + 1),

|X^T X + I_n| = |S (D^T D + I_n) S^T| = |D^T D + I_n| = |Σ_m^2 + I_m| · |I_{n−m}| = Π_{j=1}^m (λ_j^2 + 1).

Thus |X X^T + I_t| = |X^T X + I_n|.

From Eq. (27), the time complexity of the computation of |M_i|, for all i, is

O( Σ_{i=1}^k (t n_i^2 + n_i^3) ) = O( Σ_{i=1}^k (n n_i^2 + n_i^3) ),

which is O(n^3/k), assuming all classes are of approximately equal size.
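A NumPy sketch of Eqs. (22)-(27) follows (ours, not from the paper; it assumes an interior parameter pair so that the diagonal D_αβ stays strictly positive): the only matrix ever inverted is the n_i × n_i matrix N_i, and the determinant uses Lemma 3.4, so no t × t inverse is formed.

import numpy as np

def class_model(H_tilde_i, D_t_diag, alpha, beta):
    # Precompute, for one class, everything Eq. (26) and Eq. (27) need.
    # H_tilde_i: t x n_i reduced class matrix; D_t_diag: diagonal of D_t.
    # Assumes (alpha, beta) != (1, 1) so that d_ab > 0 entrywise.
    n_i = H_tilde_i.shape[1]
    d_ab = (1 - alpha) * beta * D_t_diag + (1 - beta)            # Eq. (22)
    X_i = np.sqrt(alpha * beta) * (H_tilde_i / np.sqrt(d_ab)[:, None])  # Eq. (23)
    N_i = np.eye(n_i) + X_i.T @ X_i                              # Eq. (25)
    N_inv = np.linalg.inv(N_i)
    # Eq. (27) with Lemma 3.4: |M_i| = |N_i| * |D_alpha_beta|
    _, log_det_M = np.linalg.slogdet(N_i)
    log_det_M += np.sum(np.log(d_ab))
    return d_ab, X_i, N_inv, log_det_M

def quad_form(xt, mu_t, d_ab, X_i, N_inv):
    # Eq. (26): (x~ - mu~_i)^T M_i^{-1} (x~ - mu~_i), without forming M_i^{-1}.
    diff = (xt - mu_t) / np.sqrt(d_ab)    # D_ab^{-1/2} (x~ - mu~_i)
    z = X_i.T @ diff
    return diff @ diff - z @ (N_inv @ z)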
3.2 The computation of U_1

Recall that the key matrix in the decomposition property of RDA in Proposition 3.1 is U_1, which lies in the orthogonal complement of the null space of S_t. We have applied the SVD for computing U_1 as S_t = U D U^T, where U = [U_1, U_2] is a partition of U. When the data dimensionality d is large, the full SVD computation of S_t ∈ R^{d×d} is expensive. However, from Lemma 3.3, only the first component of the classification rule, which involves U_1, is effective in RDA, while U_2, the null space of S_t, can simply be omitted.

Thus, U_1 can be computed efficiently without the full SVD computation of S_t, as follows. Define the matrix H_t as

H_t = (1/√n) (A − μ e^T),   (28)

where μ is the global centroid and e is the vector of all ones. It follows from the definition that S_t = H_t H_t^T. Note that H_t ∈ R^{d×n}, which is much smaller than S_t for HDLSS data. Let H_t = Û Σ̂ V̂^T be the reduced SVD of H_t, where Û ∈ R^{d×t} and V̂ ∈ R^{n×t} have orthonormal columns and Σ̂ ∈ R^{t×t} is diagonal with t = rank(H_t). It follows that S_t = H_t H_t^T = Û Σ̂^2 Û^T. Thus, U_1 = Û and D_t = Σ̂^2. The time complexity of the reduced SVD computation of H_t is O(dn^2) [10], instead of O(d^3) for the full SVD computation.
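In NumPy, the reduced SVD step of Eq. (28) takes only a few lines (our sketch); for d ≫ n, the economy SVD of the d × n matrix H_t replaces the d × d eigendecomposition of S_t.

import numpy as np

def compute_U1(A, tol=1e-10):
    # A: d x n data matrix (columns are data points).
    # Returns U1 (d x t) and the diagonal of D_t, via the reduced SVD of
    # H_t, Eq. (28), at O(d n^2) cost instead of O(d^3).
    d, n = A.shape
    mu = A.mean(axis=1, keepdims=True)
    H_t = (A - mu) / np.sqrt(n)                               # S_t = H_t @ H_t.T
    U_hat, s, _ = np.linalg.svd(H_t, full_matrices=False)     # economy SVD
    t = int(np.sum(s > tol * s[0]))                           # numerical rank of H_t
    return U_hat[:, :t], s[:t] ** 2                           # U1 and diag(D_t)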

3.3 The main algorithm

Let Λ = {(α_i, β_j)}, where i = 1, ..., r and j = 1, ..., s, be the candidate set for the regularization parameters. In model selection, v-fold cross-validation is applied, where the data is divided into v subsets of (approximately) equal size. All subsets are mutually exclusive, and in the h-th fold, the h-th subset is held out for testing and all other subsets are used for training. For each (α_i, β_j), we compute the cross-validation accuracy, Accu(i, j), defined as the mean of the accuracies over all folds. The best regularization pair (α_{i*}, β_{j*}) is the one with (i*, j*) = argmax_{i,j} Accu(i, j). The pseudo-code of the proposed RDA algorithm is given below.

Algorithm RDA
Input: data matrix A; candidate parameters {α_i}_{i=1}^r and {β_j}_{j=1}^s
Output: the optimal parameter pair (α_{i*}, β_{j*})
1.  For h = 1 : v  /* v-fold cross-validation */
2.    Construct A^h and A^ĥ;  /* A^h = training set of the h-th fold; A^ĥ = the h-th subset, held out for testing */
3.    Construct H_t using A^h as in Eq. (28);
4.    Compute the reduced SVD of H_t: H_t = Û Σ̂ V̂^T;
5.    t ← rank(H_t); U_1 ← Û; D_t ← Σ̂^2;
6.    A^h_L ← U_1^T A^h; A^ĥ_L ← U_1^T A^ĥ;  /* the null space, U_2, of S_t is removed */
7.    Form {H̃_u}_{u=1}^k based on A^h_L as in Eq. (19);
8.    For i = 1 : r  /* α_1, α_2, ..., α_r */
9.      For j = 1 : s  /* β_1, β_2, ..., β_s */
10.       D_αβ ← (1 − α_i) β_j D_t + (1 − β_j) I_t;
11.       For u = 1 : k
12.         X_u ← √(α_i β_j) D_αβ^{−0.5} H̃_u;
13.         N_u^{−1} ← (I + X_u^T X_u)^{−1}, as in Eq. (25);
14.         Compute |M_u| as in Eq. (27);
15.       EndFor
16.       temp ← 0;  /* temp counts the number of test points correctly classified */
17.       For each x̃ ∈ A^ĥ_L  /* x̃ = U_1^T x and x ∈ A^ĥ */
18.         C(x) ← argmin_u { (x̃ − μ̃_u)^T M_u^{−1} (x̃ − μ̃_u)
19.                          + ln |M_u| };  /* the multiplication is done as in Eq. (26) */
20.         If (C(x) == g(x)) temp ← temp + 1;
21.       EndFor
22.       Accu(h, i, j) ← temp / |A^ĥ_L|;  /* |A^ĥ_L| denotes the number of test points */
23.     EndFor
24.   EndFor
25. EndFor
26. Accu(i, j) ← (1/v) Σ_{h=1}^v Accu(h, i, j);  /* Accu(i, j) denotes the cross-validation accuracy */
27. (i*, j*) ← argmax_{i,j} Accu(i, j);
28. Output (α_{i*}, β_{j*}) as the best parameter pair.

3.4 Time Complexity

Line 4 takes O(n^2 d) time for the reduced SVD computation [10]. Lines 5 and 6 take O(dn^2) time for the matrix multiplications. The For loop from Line 11 to Line 15 takes O(n^3/k) time. There are about n/v elements in A^ĥ_L, so Lines 18 to 20 within the For loop run about n/v times. Following the multiplication in Eq. (26), the computations from Line 18 to Line 20 take O(n^2) time, and the For loop from Line 17 to Line 21 takes O(n^3/v) time. Thus, the double For loops from Line 8 to Line 26 take O(n^3 rs (1/k + 1/v)) time. The total running time of the algorithm is thus

T(r, s) = O( v (n^2 d + n^3 rs (1/k + 1/v)) ) = O( v n^2 (d + n rs (1/k + 1/v)) ).

It follows that

T(r, s)/T(1, 1) = (d + n rs (1/k + 1/v)) / (d + n (1/k + 1/v)) ≤ 1 + n rs (1/k + 1/v)/d.

For HDLSS data, where the sample size n is much smaller than the data dimensionality d, i.e., n ≪ d, the overhead of estimating the optimal regularization pair over a large search space may be small. For example, with n = 400, d = 10000, k = 40, and v = 5 (roughly the ORL setting), searching rs = 900 candidate pairs increases the cost by a factor of at most 1 + 400 · 900 · (1/40 + 1/5)/10000 ≈ 9.1, far less than 900. Note that the first stage of RDA takes O(vn^2 d) time, which is expensive for HDLSS data. However, it is independent of the parameters. In the second stage of RDA, the most expensive steps are the computations of M_i^{−1} and |M_i|, which take O(v n^3 rs (1/k + 1/v)) time. This is independent of the data dimensionality d, which is the key reason why the proposed algorithm is applicable to HDLSS data with a large candidate set of parameters.
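Tying Sections 3.1-3.4 together, the following compact sketch (ours; it mirrors the pseudocode above but is not the authors' released code) runs one cross-validation fold: the SVD-based first stage is executed once, and only the cheap second stage is repeated for each (α, β) pair.

import numpy as np

def rda_fold(A_train, y_train, A_test, y_test, alphas, betas, tol=1e-10):
    # One cross-validation fold of the two-stage RDA procedure.
    # A_train: d x n training matrix (columns are data points).
    # Returns an accuracy grid of shape (len(alphas), len(betas)).
    d, n = A_train.shape
    classes = np.unique(y_train)

    # Stage 1 (parameter independent): reduced SVD of H_t, Eq. (28).
    mu = A_train.mean(axis=1, keepdims=True)
    U_hat, s, _ = np.linalg.svd((A_train - mu) / np.sqrt(n), full_matrices=False)
    t = int(np.sum(s > tol * s[0]))
    U1, D_t = U_hat[:, :t], s[:t] ** 2
    A_tr, A_te = U1.T @ A_train, U1.T @ A_test    # drop the null space of S_t
    mus = [A_tr[:, y_train == c].mean(axis=1) for c in classes]
    Hs = [(A_tr[:, y_train == c] - m[:, None]) / np.sqrt(np.sum(y_train == c))
          for c, m in zip(classes, mus)]          # Eq. (19) in the reduced space

    # Stage 2 (cheap, repeated for every candidate pair; assumes
    # (alpha, beta) != (1, 1) so that d_ab below stays positive).
    acc = np.zeros((len(alphas), len(betas)))
    for i, a in enumerate(alphas):
        for j, b in enumerate(betas):
            d_ab = (1 - a) * b * D_t + (1 - b)    # Eq. (22)
            scores = []
            for m, H in zip(mus, Hs):
                X = np.sqrt(a * b) * (H / np.sqrt(d_ab)[:, None])
                N = np.eye(H.shape[1]) + X.T @ X              # Eq. (25)
                N_inv = np.linalg.inv(N)
                _, logdet = np.linalg.slogdet(N)              # Lemma 3.4
                logdet += np.sum(np.log(d_ab))                # Eq. (27)
                diff = (A_te - m[:, None]) / np.sqrt(d_ab)[:, None]
                z = X.T @ diff
                q = np.sum(diff * diff, axis=0) - np.sum(z * (N_inv @ z), axis=0)
                scores.append(q + logdet)                     # Eq. (26) + ln|M_u|
            pred = classes[np.argmin(np.stack(scores), axis=0)]
            acc[i, j] = np.mean(pred == y_test)
    return acc

For v-fold model selection one would call rda_fold once per fold and average the returned accuracy grids, as in Lines 26-28 of the pseudocode.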
3.5 RDA versus ULDA

We conclude this section by showing an interesting relationship between RDA and Uncorrelated LDA (ULDA) [23]. ULDA is an extension of the original formulation in [16] to high dimensional, small sample size data. It follows the basic framework of LDA [5, 7, 9, 14], which computes the optimal transformation (projection) by minimizing the ratio of the within-class distance to the between-class distance, thus achieving maximum discrimination. One key property of ULDA is that the features in the transformed space are uncorrelated, thus ensuring minimum redundancy among the features in the reduced space. It has been applied successfully in microarray gene expression data analysis [24].

It was shown in [22] that the optimal transformation G of ULDA consists of the first q eigenvectors of S_t^+ S_b, where q = rank(S_b). With the computed G, a test point x is classified by ULDA as class h, where h = argmin_i ||G^T (x − μ_i)||^2. It has also been shown in [22] that

argmin_i (x − μ_i)^T S_t^+ (x − μ_i) = argmin_i ||G^T (x − μ_i)||^2.

Interestingly, we can show that the limit of RDA as α → 0 and β → 1 is equivalent to ULDA, as summarized in the following theorem:

Theorem 3.1. The classification rule in RDA approaches that of ULDA as α → 0 and β → 1. That is, if G is the transformation of ULDA, then

lim_{α→0, β→1} Ĉ(x) = argmin_i ||G^T (x − μ_i)||^2.

Proof. From Eq. (18), the classification rule in RDA is equivalent to

Ĉ(x) = argmin_i { (x̃ − μ̃_i)^T M_i^{−1} (x̃ − μ̃_i) + ln |M_i| }.   (29)

From Eq. (14), we have lim_{α→0, β→1} M_i = D_t. Thus

lim_{α→0, β→1} Ĉ(x) = argmin_i { (x̃ − μ̃_i)^T D_t^{−1} (x̃ − μ̃_i) + ln |D_t| } = argmin_i (x − μ_i)^T U_1 D_t^{−1} U_1^T (x − μ_i) = argmin_i (x − μ_i)^T S_t^+ (x − μ_i),

which is equivalent to argmin_i ||G^T (x − μ_i)||^2, the classification rule in ULDA.

Theorem 3.1 shows that ULDA is a special case of RDA with α = 0 and β = 1. With a properly chosen parameter pair (α, β) through cross-validation, RDA is expected to outperform ULDA, which is confirmed by the empirical results presented in the next section. Note that the limit of Σ̂_i^{−1} as α → 0 and β → 1 does not exist for HDLSS data, as the limit of Σ̂_i = β (α Σ_i + (1 − α) S_t) + (1 − β) I_d is singular when d > n. However, Theorem 3.1 shows that the limit below exists:

lim_{α→0, β→1} (x − μ_i)^T Σ̂_i^{−1} (x − μ_i) = (x − μ_i)^T S_t^+ (x − μ_i),

due to the decomposition property of RDA as in Eq. (18).

4. EXPERIMENTS

In this section, we experimentally evaluate the performance of the proposed RDA algorithm. v-fold cross-validation with v = 5 has been used in RDA for model selection. All of our experiments have been performed on a P4 3.00GHz Windows XP machine with 2GB memory.

4.1 Datasets

We have used three types of HDLSS data for the evaluation, including text documents, face images, and gene expression data. The important statistics of these datasets are summarized below (see also Table 2):

- re0 and re1 are two text document datasets derived from the Reuters-21578 text categorization test collection, Distribution 1.0 [18]. re0 includes 320 documents belonging to 4 different classes. re1 has 5 classes, each with 98 instances.

- ORL is a face image dataset, which contains 400 face images of 40 individuals. The image size is 92 × 112. The face images are perfectly centralized. The major challenge on this dataset is the variation of the face pose. There is no lighting variation, with minimal facial expression variation and no occlusion. We use the whole image as an instance (i.e., the dimension of an instance is 92 × 112 = 10304).

- PIX is a face image dataset, which contains 300 face images of 30 individuals. We subsample the images with a sample step of 5 × 5, and the dimension of each instance is reduced to 100 × 100 = 10000.

- ALL is a gene expression dataset consisting of six diagnostic groups [25]. The breakdown of the samples is: 15 samples for BCR, 27 samples for E2A, 64 samples for Hyperdip, 20 samples for MLL, 43 samples for T, and 79 samples for TEL.

- ALLAML4 is a gene expression dataset, which contains the gene expression profiles of two types of acute leukemia: acute lymphoblastic leukemia (ALL) and acute myeloblastic leukemia (AML). The ALL part of the dataset comes from two sample types, B-cell and T-cell, and the AML part is split into bone marrow samples and peripheral blood samples. This dataset was first studied in the seminal paper of Golub et al. [11], which addressed the binary classification problem between the AML samples and the ALL samples. ALLAML4 is a four-class dataset (B-cell, T-cell, AML-BM, and AML-PB).

[Table 2: Statistics for our test datasets: sample size (n), dimensionality (d), and number of classes (k) for re0, re1, ORL, PIX, ALL, and ALLAML4.]

4.2 Efficiency

In this experiment, we test the efficiency of the proposed RDA algorithm. Table 3 shows the computational time (in seconds) of RDA for different numbers of parameter pairs r × s. We set r = s for simplicity, with r taking values from 1 to 32; thus the size of the candidate set Λ ranges from 1 to 1024. It is clear from the table that the computational cost of RDA grows slowly when r × s is small. When r × s is large, the cost, T(r, s), of the proposed algorithm is still significantly smaller than rs·T(1, 1), the computational cost of RDA without applying the optimizations proposed in this paper. (Note that the cost of RDA would be even higher if the decomposition property of RDA from this paper were not applied.)

[Table 3: Computational time (in seconds) of RDA for different numbers of parameter pairs (r × s) on the six test datasets.]
For example, we can observe that T(16, 16)/T(1, 1) on different datasets is less than 7, which is significantly smaller than 16 × 16 = 256, while T(32, 32)/T(1, 1) is less than 25 in all cases, much smaller than 32 × 32 = 1024. Among all datasets, the document datasets have relatively larger rates of increase in running time than the others, while the gene expression datasets have the smallest rates. Note that the ratio of the sample size to the data dimensionality, i.e., n/d, is relatively large for both document datasets, while it is relatively small for both gene expression datasets. These results are consistent with the theoretical estimate of the efficiency in Section 3.4.

4.3 Classification performance

In this experiment, we evaluate RDA in classification and compare it with Uncorrelated LDA (ULDA) [23] and Support Vector Machines (SVM) [2, 4, 21]. For each dataset, we first set the percentage of the data for training to be either 1/2 or 1/3 (by a random partition). Then we apply the proposed RDA algorithm, as well as ULDA and SVM, on the training data to learn the model, which is further applied to the remaining

test data to get the classification accuracy. To give a better estimation of accuracy, the procedure is repeated 30 times and the resulting accuracies are averaged. Note that for RDA, we choose the optimal model from 900 parameter pairs, with r = 30 and s = 30. Because of the improved efficiency of the proposed RDA algorithm, it is practical to select the optimal model from such a large search space. The classification accuracies of the 30 different partitions for all six datasets are shown in Figures 1-3. In Table 4, we report the mean accuracy and standard deviation over the 30 different partitions for each dataset, where ratio denotes the percentage of the data for training and is either 1/3 or 1/2 in our experiments.

[Figure 1: Comparison of RDA, ULDA, and SVM in classification accuracy on re0 and re1, for ratio = 1/2 and ratio = 1/3. The x-axis denotes 30 different partitions into training and testing sets, where ratio is the percentage of the data used for training.]

[Figure 2: Comparison of RDA, ULDA, and SVM in classification accuracy on ORL and PIX, for ratio = 1/2 and ratio = 1/3. The x-axis denotes 30 different partitions into training and testing sets.]

[Figure 3: Comparison of RDA, ULDA, and SVM in classification accuracy on ALL and ALLAML4, for ratio = 1/2 and ratio = 1/3. The x-axis denotes 30 different partitions into training and testing sets.]

[Table 4: Comparison of classification accuracy (in percentage) of RDA, ULDA, and SVM: mean and standard deviation over 30 different partitions, with training ratios of 1/2 and 1/3, for each of the six datasets.]

For all the datasets, the performance using ratio = 1/2 is better than that using ratio = 1/3 in terms of classification accuracy. This conforms to our expectation that the classification performance may be improved with a larger amount of training data. We can observe from the accuracy curves in Figures 1-3 that ULDA and RDA often follow similar trends. For both document datasets re1 and re0, all three algorithms achieve comparable performance on re1 (all three accuracy curves in Fig. 1 are very close to each other), while RDA and ULDA outperform SVM on re0. For both face image datasets, RDA and ULDA outperform SVM by a large margin, while RDA achieves slightly higher accuracies than ULDA. As for the two gene expression datasets, the accuracy curves are similar for all three algorithms, while RDA achieves a smaller overall variance than ULDA and SVM. Overall, RDA is very competitive with ULDA and SVM in classification. Recall from Theorem 3.1 that ULDA is a special case of RDA with α = 0 and β = 1. With properly chosen parameters, RDA is expected to outperform ULDA, which is confirmed by our empirical results above.

5. CONCLUSIONS

We present in this paper a novel algorithm for RDA that is applicable to high dimensional, low sample size data. RDA is a compromise between LDA and QDA, regulated by two regularization parameters. A major advantage of the proposed algorithm is its low computational cost in selecting the optimal parameters from a large candidate set, in comparison with the traditional RDA formulation. Thus it facilitates efficient model selection for RDA. The key to the proposed efficient model selection procedure lies in the decomposition property of RDA established in this paper. We evaluate the proposed algorithm using document, image, and gene expression datasets. RDA is compared with ULDA and SVM in classification. Results confirm the high efficiency of the proposed RDA algorithm. Our experiments also demonstrate that with the proposed efficient model selection algorithm, RDA can be effectively applied to high dimensional, low sample size data. The relative performance of RDA over SVM varies considerably across different types of data. RDA outperforms SVM for both

face image datasets by a large margin, while they are comparable for both gene expression datasets. One direction of future work is to study the effect of the characteristics of the data on the performance of RDA. We also plan to apply RDA to other applications involving HDLSS data, such as gene expression pattern images, protein expression data, etc.

Acknowledgements

Research of JY is sponsored, in part, by the Center for Evolutionary Functional Genomics of the Biodesign Institute at Arizona State University.

6. REFERENCES

[1] P.N. Belhumeur, J.P. Hespanha, and D.J. Kriegman. Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection. IEEE Trans. Pattern Analysis and Machine Intelligence, 19(7):711-720, 1997.
[2] C.J.C. Burges. A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 2(2):121-167, 1998.
[3] L.F. Chen, H.Y.M. Liao, M.T. Ko, J.C. Lin, and G.J. Yu. A new LDA-based face recognition system which can solve the small sample size problem. Pattern Recognition, 33:1713-1726, 2000.
[4] N. Cristianini and J. Shawe-Taylor. Support Vector Machines and other Kernel-based Learning Methods. Cambridge University Press, 2000.
[5] R.O. Duda, P.E. Hart, and D. Stork. Pattern Classification. Wiley, 2000.
[6] S. Dudoit, J. Fridlyand, and T.P. Speed. Comparison of discrimination methods for the classification of tumors using gene expression data. Journal of the American Statistical Association, 97(457):77-87, 2002.
[7] R.A. Fisher. The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7:179-188, 1936.
[8] J.H. Friedman. Regularized discriminant analysis. Journal of the American Statistical Association, 84(405):165-175, 1989.
[9] K. Fukunaga. Introduction to Statistical Pattern Recognition. Academic Press, USA, 1990.
[10] G.H. Golub and C.F. Van Loan. Matrix Computations. The Johns Hopkins University Press, USA, third edition, 1996.
[11] T.R. Golub et al. Molecular classification of cancer: class discovery and class prediction by gene

expression monitoring. Science, 286:531-537, 1999.
[12] U. Grouven, F. Bergel, and A. Schultz. Implementation of linear and quadratic discriminant analysis incorporating costs of misclassification. Computer Methods and Programs in Biomedicine, 49(1):55-60, 1996.
[13] P. Hall, J.S. Marron, and A. Neeman. Geometric representation of high dimension, low sample size data. Journal of the Royal Statistical Society, Series B, 67:427-444, 2005.
[14] T. Hastie, R. Tibshirani, and J.H. Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, 2001.
[15] A. Hoerl and R. Kennard. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1):55-67, 1970.
[16] Z. Jin, J.Y. Yang, Z.S. Hu, and Z. Lou. Face recognition based on the uncorrelated discriminant transformation. Pattern Recognition, 34:1405-1416, 2001.
[17] W.J. Krzanowski, P. Jonathan, W.V. McCarthy, and M.R. Thomas. Discriminant analysis with singular covariance matrices: methods and applications to spectroscopic data. Applied Statistics, 44:101-115, 1995.
[18] D.D. Lewis. Reuters-21578 text categorization test collection, Distribution 1.0, 1999.
[19] D.L. Swets and J. Weng. Using discriminant eigenfeatures for image retrieval. IEEE Trans. Pattern Analysis and Machine Intelligence, 18(8):831-836, 1996.
[20] A.N. Tikhonov and V.Y. Arsenin. Solutions of Ill-posed Problems. John Wiley and Sons, Washington D.C., 1977.
[21] V.N. Vapnik. Statistical Learning Theory. Wiley, 1998.
[22] J. Ye. Characterization of a family of algorithms for generalized discriminant analysis on undersampled problems. Journal of Machine Learning Research, 6:483-502, 2005.
[23] J. Ye, R. Janardan, Q. Li, and H. Park. Feature extraction via generalized uncorrelated linear discriminant analysis. In ICML Conference Proceedings, 2004.
[24] J. Ye, T. Li, T. Xiong, and R. Janardan. Using uncorrelated discriminant analysis for tissue classification with gene expression data. IEEE/ACM Trans. Computational Biology and Bioinformatics, 1(4):181-190, 2004.
[25] E.J. Yeoh et al. Classification, subtype discovery, and prediction of outcome in pediatric acute lymphoblastic leukemia by gene expression profiling. Cancer Cell, 1(2):133-143, 2002.
[26] L. Zhang and L. Luo. Splice site prediction with quadratic discriminant analysis using diversity measure. Nucleic Acids Research, 31(21):6214-6220, 2003.
[27] M. Zhang. Identification of protein coding regions in the human genome by quadratic discriminant analysis. Proceedings of the National Academy of Sciences, USA, 94:565-568, 1997.


More information

See Book Chapter 11 2 nd Edition (Chapter 10 1 st Edition)

See Book Chapter 11 2 nd Edition (Chapter 10 1 st Edition) Count Data Models See Book Chapter 11 2 nd Edton (Chapter 10 1 st Edton) Count data consst of non-negatve nteger values Examples: number of drver route changes per week, the number of trp departure changes

More information

Chapter 12 Analysis of Covariance

Chapter 12 Analysis of Covariance Chapter Analyss of Covarance Any scentfc experment s performed to know somethng that s unknown about a group of treatments and to test certan hypothess about the correspondng treatment effect When varablty

More information

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers Psychology 282 Lecture #24 Outlne Regresson Dagnostcs: Outlers In an earler lecture we studed the statstcal assumptons underlyng the regresson model, ncludng the followng ponts: Formal statement of assumptons.

More information

Logistic Regression. CAP 5610: Machine Learning Instructor: Guo-Jun QI

Logistic Regression. CAP 5610: Machine Learning Instructor: Guo-Jun QI Logstc Regresson CAP 561: achne Learnng Instructor: Guo-Jun QI Bayes Classfer: A Generatve model odel the posteror dstrbuton P(Y X) Estmate class-condtonal dstrbuton P(X Y) for each Y Estmate pror dstrbuton

More information

Non-linear Canonical Correlation Analysis Using a RBF Network

Non-linear Canonical Correlation Analysis Using a RBF Network ESANN' proceedngs - European Smposum on Artfcal Neural Networks Bruges (Belgum), 4-6 Aprl, d-sde publ., ISBN -97--, pp. 57-5 Non-lnear Canoncal Correlaton Analss Usng a RBF Network Sukhbnder Kumar, Elane

More information

Supporting Information

Supporting Information Supportng Informaton The neural network f n Eq. 1 s gven by: f x l = ReLU W atom x l + b atom, 2 where ReLU s the element-wse rectfed lnear unt, 21.e., ReLUx = max0, x, W atom R d d s the weght matrx to

More information

Lecture Notes on Linear Regression

Lecture Notes on Linear Regression Lecture Notes on Lnear Regresson Feng L fl@sdueducn Shandong Unversty, Chna Lnear Regresson Problem In regresson problem, we am at predct a contnuous target value gven an nput feature vector We assume

More information

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA 4 Analyss of Varance (ANOVA) 5 ANOVA 51 Introducton ANOVA ANOVA s a way to estmate and test the means of multple populatons We wll start wth one-way ANOVA If the populatons ncluded n the study are selected

More information

INF 5860 Machine learning for image classification. Lecture 3 : Image classification and regression part II Anne Solberg January 31, 2018

INF 5860 Machine learning for image classification. Lecture 3 : Image classification and regression part II Anne Solberg January 31, 2018 INF 5860 Machne learnng for mage classfcaton Lecture 3 : Image classfcaton and regresson part II Anne Solberg January 3, 08 Today s topcs Multclass logstc regresson and softma Regularzaton Image classfcaton

More information

NUMERICAL DIFFERENTIATION

NUMERICAL DIFFERENTIATION NUMERICAL DIFFERENTIATION 1 Introducton Dfferentaton s a method to compute the rate at whch a dependent output y changes wth respect to the change n the ndependent nput x. Ths rate of change s called the

More information

Homework Assignment 3 Due in class, Thursday October 15

Homework Assignment 3 Due in class, Thursday October 15 Homework Assgnment 3 Due n class, Thursday October 15 SDS 383C Statstcal Modelng I 1 Rdge regresson and Lasso 1. Get the Prostrate cancer data from http://statweb.stanford.edu/~tbs/elemstatlearn/ datasets/prostate.data.

More information

8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS

8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS SECTION 8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS 493 8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS All the vector spaces you have studed thus far n the text are real vector spaces because the scalars

More information

Linear Regression Analysis: Terminology and Notation

Linear Regression Analysis: Terminology and Notation ECON 35* -- Secton : Basc Concepts of Regresson Analyss (Page ) Lnear Regresson Analyss: Termnology and Notaton Consder the generc verson of the smple (two-varable) lnear regresson model. It s represented

More information

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore Sesson Outlne Introducton to classfcaton problems and dscrete choce models. Introducton to Logstcs Regresson. Logstc functon and Logt functon. Maxmum Lkelhood Estmator (MLE) for estmaton of LR parameters.

More information

18.1 Introduction and Recap

18.1 Introduction and Recap CS787: Advanced Algorthms Scrbe: Pryananda Shenoy and Shjn Kong Lecturer: Shuch Chawla Topc: Streamng Algorthmscontnued) Date: 0/26/2007 We contnue talng about streamng algorthms n ths lecture, ncludng

More information

x i1 =1 for all i (the constant ).

x i1 =1 for all i (the constant ). Chapter 5 The Multple Regresson Model Consder an economc model where the dependent varable s a functon of K explanatory varables. The economc model has the form: y = f ( x,x,..., ) xk Approxmate ths by

More information

LECTURE 9 CANONICAL CORRELATION ANALYSIS

LECTURE 9 CANONICAL CORRELATION ANALYSIS LECURE 9 CANONICAL CORRELAION ANALYSIS Introducton he concept of canoncal correlaton arses when we want to quantfy the assocatons between two sets of varables. For example, suppose that the frst set of

More information

Chapter 13: Multiple Regression

Chapter 13: Multiple Regression Chapter 13: Multple Regresson 13.1 Developng the multple-regresson Model The general model can be descrbed as: It smplfes for two ndependent varables: The sample ft parameter b 0, b 1, and b are used to

More information

BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS. M. Krishna Reddy, B. Naveen Kumar and Y. Ramu

BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS. M. Krishna Reddy, B. Naveen Kumar and Y. Ramu BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS M. Krshna Reddy, B. Naveen Kumar and Y. Ramu Department of Statstcs, Osmana Unversty, Hyderabad -500 007, Inda. nanbyrozu@gmal.com, ramu0@gmal.com

More information

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models Computaton of Hgher Order Moments from Two Multnomal Overdsperson Lkelhood Models BY J. T. NEWCOMER, N. K. NEERCHAL Department of Mathematcs and Statstcs, Unversty of Maryland, Baltmore County, Baltmore,

More information

Comparison of Wiener Filter solution by SVD with decompositions QR and QLP

Comparison of Wiener Filter solution by SVD with decompositions QR and QLP Proceedngs of the 6th WSEAS Int Conf on Artfcal Intellgence, Knowledge Engneerng and Data Bases, Corfu Island, Greece, February 6-9, 007 7 Comparson of Wener Flter soluton by SVD wth decompostons QR and

More information

Why Bayesian? 3. Bayes and Normal Models. State of nature: class. Decision rule. Rev. Thomas Bayes ( ) Bayes Theorem (yes, the famous one)

Why Bayesian? 3. Bayes and Normal Models. State of nature: class. Decision rule. Rev. Thomas Bayes ( ) Bayes Theorem (yes, the famous one) Why Bayesan? 3. Bayes and Normal Models Alex M. Martnez alex@ece.osu.edu Handouts Handoutsfor forece ECE874 874Sp Sp007 If all our research (n PR was to dsappear and you could only save one theory, whch

More information

Speeding up Computation of Scalar Multiplication in Elliptic Curve Cryptosystem

Speeding up Computation of Scalar Multiplication in Elliptic Curve Cryptosystem H.K. Pathak et. al. / (IJCSE) Internatonal Journal on Computer Scence and Engneerng Speedng up Computaton of Scalar Multplcaton n Ellptc Curve Cryptosystem H. K. Pathak Manju Sangh S.o.S n Computer scence

More information

Lecture 6: Introduction to Linear Regression

Lecture 6: Introduction to Linear Regression Lecture 6: Introducton to Lnear Regresson An Manchakul amancha@jhsph.edu 24 Aprl 27 Lnear regresson: man dea Lnear regresson can be used to study an outcome as a lnear functon of a predctor Example: 6

More information

/ n ) are compared. The logic is: if the two

/ n ) are compared. The logic is: if the two STAT C141, Sprng 2005 Lecture 13 Two sample tests One sample tests: examples of goodness of ft tests, where we are testng whether our data supports predctons. Two sample tests: called as tests of ndependence

More information

The Study of Teaching-learning-based Optimization Algorithm

The Study of Teaching-learning-based Optimization Algorithm Advanced Scence and Technology Letters Vol. (AST 06), pp.05- http://dx.do.org/0.57/astl.06. The Study of Teachng-learnng-based Optmzaton Algorthm u Sun, Yan fu, Lele Kong, Haolang Q,, Helongang Insttute

More information

Economics 130. Lecture 4 Simple Linear Regression Continued

Economics 130. Lecture 4 Simple Linear Regression Continued Economcs 130 Lecture 4 Contnued Readngs for Week 4 Text, Chapter and 3. We contnue wth addressng our second ssue + add n how we evaluate these relatonshps: Where do we get data to do ths analyss? How do

More information

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U) Econ 413 Exam 13 H ANSWERS Settet er nndelt 9 deloppgaver, A,B,C, som alle anbefales å telle lkt for å gøre det ltt lettere å stå. Svar er gtt . Unfortunately, there s a prntng error n the hnt of

More information

Pattern Recognition 42 (2009) Contents lists available at ScienceDirect. Pattern Recognition. journal homepage:

Pattern Recognition 42 (2009) Contents lists available at ScienceDirect. Pattern Recognition. journal homepage: Pattern Recognton 4 (9) 764 -- 779 Contents lsts avalable at ScenceDrect Pattern Recognton ournal homepage: www.elsever.com/locate/pr Perturbaton LDA: Learnng the dfference between the class emprcal mean

More information

Comparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method

Comparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method Appled Mathematcal Scences, Vol. 7, 0, no. 47, 07-0 HIARI Ltd, www.m-hkar.com Comparson of the Populaton Varance Estmators of -Parameter Exponental Dstrbuton Based on Multple Crtera Decson Makng Method

More information

VQ widely used in coding speech, image, and video

VQ widely used in coding speech, image, and video at Scalar quantzers are specal cases of vector quantzers (VQ): they are constraned to look at one sample at a tme (memoryless) VQ does not have such constrant better RD perfomance expected Source codng

More information

Sparse Gaussian Processes Using Backward Elimination

Sparse Gaussian Processes Using Backward Elimination Sparse Gaussan Processes Usng Backward Elmnaton Lefeng Bo, Lng Wang, and Lcheng Jao Insttute of Intellgent Informaton Processng and Natonal Key Laboratory for Radar Sgnal Processng, Xdan Unversty, X an

More information

More metrics on cartesian products

More metrics on cartesian products More metrcs on cartesan products If (X, d ) are metrc spaces for 1 n, then n Secton II4 of the lecture notes we defned three metrcs on X whose underlyng topologes are the product topology The purpose of

More information

COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS

COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS Avalable onlne at http://sck.org J. Math. Comput. Sc. 3 (3), No., 6-3 ISSN: 97-537 COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS

More information

Lecture 4: Constant Time SVD Approximation

Lecture 4: Constant Time SVD Approximation Spectral Algorthms and Representatons eb. 17, Mar. 3 and 8, 005 Lecture 4: Constant Tme SVD Approxmaton Lecturer: Santosh Vempala Scrbe: Jangzhuo Chen Ths topc conssts of three lectures 0/17, 03/03, 03/08),

More information

Time-Varying Systems and Computations Lecture 6

Time-Varying Systems and Computations Lecture 6 Tme-Varyng Systems and Computatons Lecture 6 Klaus Depold 14. Januar 2014 The Kalman Flter The Kalman estmaton flter attempts to estmate the actual state of an unknown dscrete dynamcal system, gven nosy

More information

Using T.O.M to Estimate Parameter of distributions that have not Single Exponential Family

Using T.O.M to Estimate Parameter of distributions that have not Single Exponential Family IOSR Journal of Mathematcs IOSR-JM) ISSN: 2278-5728. Volume 3, Issue 3 Sep-Oct. 202), PP 44-48 www.osrjournals.org Usng T.O.M to Estmate Parameter of dstrbutons that have not Sngle Exponental Famly Jubran

More information

Feb 14: Spatial analysis of data fields

Feb 14: Spatial analysis of data fields Feb 4: Spatal analyss of data felds Mappng rregularly sampled data onto a regular grd Many analyss technques for geophyscal data requre the data be located at regular ntervals n space and/or tme. hs s

More information

Difference Equations

Difference Equations Dfference Equatons c Jan Vrbk 1 Bascs Suppose a sequence of numbers, say a 0,a 1,a,a 3,... s defned by a certan general relatonshp between, say, three consecutve values of the sequence, e.g. a + +3a +1

More information

CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE

CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE Analytcal soluton s usually not possble when exctaton vares arbtrarly wth tme or f the system s nonlnear. Such problems can be solved by numercal tmesteppng

More information

Classification as a Regression Problem

Classification as a Regression Problem Target varable y C C, C,, ; Classfcaton as a Regresson Problem { }, 3 L C K To treat classfcaton as a regresson problem we should transform the target y nto numercal values; The choce of numercal class

More information