Array Variate Random Variables with Multiway Kronecker Delta Covariance Matrix Structure
Deniz Akdemir
Department of Statistics, University of Central Florida, Orlando, FL

Arjun K. Gupta
Department of Mathematics and Statistics, Bowling Green State University, Bowling Green, Ohio

February 27, 2011

Abstract

Standard statistical methods applied to matrix random variables often fail to describe the underlying structure in multiway data sets. After a review of the essential background material, this paper introduces the notion of an array variate random variable. A normal array variate random variable is defined and a method for estimating the parameters of the array variate normal distribution is given. We introduce a technique called slicing for estimating the covariance matrix of high dimensional data. Finally, principal component analysis and classification techniques are developed for array variate observations and high dimensional data.

AMS 2000 Subject Classification: Primary 62H10, Secondary 62H05.

Keywords & Phrases: Normal Distribution, Multivariate Distribution, Matrix Variate Normal Distribution, Array Variate Random Variable, Array Variate Normal Distribution, Multilevel Data Analysis, Repeated Measures, Classification, Dimension Reduction

1 Introduction

A data structure is a particular way of storing and organizing data, and different kinds of data structures are suited to different kinds of applications. In this paper we study continuous probability distributions for data that has array structure. A one dimensional array is a scalar, a two dimensional array is a matrix, a three dimensional array is a stack of matrices, a four dimensional array is a stack
of three dimensional arrays, and so on. The type of data that statistical analysis entails is usually best organized in array form. To store one observation of an m_1-variate random variable, we use an m_1 × 1 matrix (a vector), a two dimensional array. If we have a random sample of m_2 such m_1-variate observations, we can organize it in an m_1 × m_2 matrix, again a two dimensional array. It is also very common that a data set describes the observation of m_1 variables on m_2 individuals at m_3 time points. This kind of data set is suitably organized in a three dimensional array of dimensions m_1 × m_2 × m_3. If both m_3 time points and m_4 different places are observed, then the data can be stored in a four dimensional array of dimensions m_1 × m_2 × m_3 × m_4. The associated random variable in each of these cases has array structure.

The array variate random variable up to 2 dimensions has been studied intensively in [Gupta and Nagar, 2000] and by many others. However, for array observations of 3, 4 or in general i dimensions, no appropriate probability models have been proposed. Figure 1 illustrates N observations of a 4-way array variable. Repeated measurements (across time, across space or both) can be assumed to have array structure. Most applications in spatial statistics involve modeling of complex spatial-temporal dependency structures, and many of the problems of space and time modeling and separation of effects can be solved by using the results in this paper. In the literature, this kind of data is usually referred to as multiway data [Kroonenberg, 2008], and we assume that multiway data is in array form.

Figure 1: The figure illustrates N observations of a 4-way array variate variable.

In Section 2, we first study the algebra of arrays, and also introduce the concept of an array variate random variable and a normal model for array variables. In Section 3, an estimation method for the normal model is described,
and a technique called slicing for nonsingular estimation of high dimensional covariance matrices when N < p is also described. Then, in Sections 4 and 5, we give applications of the model to principal component analysis and classification.

2 Array Algebra and Array Variate Random Variables

2.1 Array Algebra

In this paper we will only study arrays with real elements. We will write X̃ to say that X̃ is an array. When necessary, we write the dimensions of the array as subindices; e.g., if X̃ is an m_1 × m_2 × m_3 × m_4 dimensional array in R^{m_1 × m_2 × ... × m_i}, then we write X̃_{m_1 × m_2 × m_3 × m_4}. Arrays with the usual elementwise summation and scalar multiplication operations can be shown to form a vector space.

To refer to an element of an array X̃_{m_1 × m_2 × m_3 × m_4}, we write the position of the element as a subindex to the array name in parentheses, (X̃)_{r_1 r_2 r_3 r_4}. If we want to refer to a specific column vector obtained by keeping all but an indicated dimension constant, we indicate the constant dimensions as before but put ":" for the non-constant dimension; e.g., for X̃_{m_1 × m_2 × m_3 × m_4}, (X̃)_{r_1 r_2 : r_4} refers to the column vector ((X̃)_{r_1 r_2 1 r_4}, (X̃)_{r_1 r_2 2 r_4}, ..., (X̃)_{r_1 r_2 m_3 r_4})'.

We now review some basic principles and techniques of array algebra. These results and their proofs can be found in Rauhala [Rauhala, 1974], [Rauhala, 1980] and Blaha [Blaha, 1977].

Definition 2.1. The inverse Kronecker product of two matrices A and B of dimensions p × q and r × s correspondingly is written A ⊗_i B and is defined as A ⊗_i B = [A(B)_{jk}]_{pr × qs} = B ⊗ A, where ⊗ represents the ordinary Kronecker product.

The following properties of the inverse Kronecker product are useful:

- 0 ⊗_i A = A ⊗_i 0 = 0.
- (A_1 + A_2) ⊗_i B = A_1 ⊗_i B + A_2 ⊗_i B.
- A ⊗_i (B_1 + B_2) = A ⊗_i B_1 + A ⊗_i B_2.
- αA ⊗_i βB = αβ(A ⊗_i B).
- (A_1 ⊗_i B_1)(A_2 ⊗_i B_2) = A_1 A_2 ⊗_i B_1 B_2.
- (A ⊗_i B)^{-1} = A^{-1} ⊗_i B^{-1}.
- (A ⊗_i B)^+ = A^+ ⊗_i B^+, where A^+ is the Moore-Penrose inverse of A.
- (A ⊗_i B)^- = A^- ⊗_i B^-, where A^- is the l-inverse of A defined as A^- = (A'A)^{-1}A'.
- If {λ_i} and {µ_j} are the eigenvalues with corresponding eigenvectors {x_i} and {y_j} for matrices A and B respectively, then A ⊗_i B has eigenvalues {λ_i µ_j} with corresponding eigenvectors {x_i ⊗_i y_j}.
- Given two matrices A_{n × n} and B_{m × m}, |A ⊗_i B| = |A|^m |B|^n and tr(A ⊗_i B) = tr(A) tr(B).
- A ⊗_i B = B ⊗ A = U_1 (A ⊗ B) U_2 for some permutation matrices U_1 and U_2.

It is well known that a matrix equation AXB' = C can be rewritten in its monolinear form as

(A ⊗_i B) vec(X) = vec(C). (1)

Furthermore, the three-way equality obtained by stacking equations of the form (1) can be written in its monolinear form as (A ⊗_i B ⊗_i C) vec(X̃) = vec(Ẽ). This process of stacking equations can be continued, and the R-matrix multiplication operation introduced by Rauhala [Rauhala, 1974] provides a compact way of representing these equations in array form:

Definition 2.2. R-matrix multiplication is defined elementwise:

((A_1)^1 (A_2)^2 ... (A_i)^i X̃_{m_1 × m_2 × ... × m_i})_{q_1 q_2 ... q_i} = Σ_{r_1=1}^{m_1} Σ_{r_2=1}^{m_2} ... Σ_{r_i=1}^{m_i} (A_1)_{q_1 r_1} (A_2)_{q_2 r_2} ... (A_i)_{q_i r_i} (X̃)_{r_1 r_2 ... r_i}.

R-matrix multiplication generalizes matrix multiplication (array multiplication in two dimensions) to the case of k-dimensional arrays. The following useful properties of R-matrix multiplication are reviewed by Blaha [Blaha, 1977]:

- (A)^1 B = AB.
- (A_1)^1 (A_2)^2 C = A_1 C A_2'.
- Ỹ = (I)^1 (I)^2 ... (I)^i Ỹ.
- ((A_1)^1 (A_2)^2 ... (A_i)^i)((B_1)^1 (B_2)^2 ... (B_i)^i) Ỹ = (A_1 B_1)^1 (A_2 B_2)^2 ... (A_i B_i)^i Ỹ.

The operator rvec describes the relationship between X̃_{m_1 × m_2 × ... × m_i} and its monolinear form x_{m_1 m_2 ... m_i × 1}.
Definition 2.3. rvec(X̃_{m_1 × m_2 × ... × m_i}) = x_{m_1 m_2 ... m_i × 1}, where x is the column vector obtained by stacking the elements of the array X̃ in the order of its dimensions; i.e., (X̃)_{j_1 j_2 ... j_i} = (x)_j, where j = (j_i − 1) m_{i−1} m_{i−2} ... m_1 + (j_{i−1} − 1) m_{i−2} m_{i−3} ... m_1 + ... + (j_2 − 1) m_1 + j_1.

Let L̃_{m_1 × m_2 × ... × m_i} = (A_1)^1 (A_2)^2 ... (A_i)^i X̃, where (A_j)^j is an m_j × n_j matrix for j = 1, 2, ..., i and X̃ is an n_1 × n_2 × ... × n_i array. Write l = rvec(L̃) and x = rvec(X̃). Then l = (A_1 ⊗_i A_2 ⊗_i ... ⊗_i A_i) x. Therefore, there is an equivalent expression of the array equation in monolinear form.

Definition 2.4. The square norm of X̃_{m_1 × m_2 × ... × m_i} is defined as ||X̃||² = Σ_{j_1=1}^{m_1} Σ_{j_2=1}^{m_2} ... Σ_{j_i=1}^{m_i} ((X̃)_{j_1 j_2 ... j_i})².

Definition 2.5. The distance of X̃_1 from X̃_2, both m_1 × m_2 × ... × m_i arrays, is defined as ||X̃_1 − X̃_2||².

Example 2.1. Let Ỹ = (A_1)^1 (A_2)^2 ... (A_i)^i X̃ + Ẽ. Then ||Ẽ||² is minimized for X̃ = (A_1^-)^1 (A_2^-)^2 ... (A_i^-)^i Ỹ.

2.2 Array Variate Random Variables

Arrays can be constant arrays; i.e., if (X̃)_{r_1 r_2 ... r_i} ∈ R are constants for all r_j = 1, 2, ..., m_j and j = 1, 2, ..., i, then the array X̃ is a constant array. Array variate random variables are arrays whose elements (X̃)_{r_1 r_2 ... r_i} ∈ R are all random variables. If the sample space for the random outcome s is S, then (X̃)_{r_1 r_2 ... r_i} = (X̃(s))_{r_1 r_2 ... r_i}, where each (X̃(s))_{r_1 r_2 ... r_i} is a real valued function from S to R. If X̃ is an array variate random variable, its density (if it exists) is a scalar function f_X̃(X̃) such that: f_X̃(X̃) ≥ 0; ∫_X̃ f_X̃(X̃) dX̃ = 1; and P(X̃ ∈ A) = ∫_A f_X̃(X̃) dX̃, where A is a subset of the space of realizations for X̃. A scalar function f_{X̃,Ỹ}(X̃, Ỹ) of X̃ and Ỹ defines a joint (bi-array variate) probability density function if f_{X̃,Ỹ}(X̃, Ỹ) ≥ 0; ∫_Ỹ ∫_X̃ f_{X̃,Ỹ}(X̃, Ỹ) dX̃ dỸ = 1;
and P((X̃, Ỹ) ∈ A) = ∫_A f_{X̃,Ỹ}(X̃, Ỹ) dX̃ dỸ, where A is a subset of the space of realizations for (X̃, Ỹ).

The marginal probability density function of X̃ is defined by f_X̃(X̃) = ∫_Ỹ f_{X̃,Ỹ}(X̃, Ỹ) dỸ, and the conditional probability density function of X̃ given Ỹ is defined by f_{X̃|Ỹ}(X̃|Ỹ) = f_{X̃,Ỹ}(X̃, Ỹ) / f_Ỹ(Ỹ), where f_Ỹ(Ỹ) > 0. Two random arrays X̃ and Ỹ are independent if and only if f_{X̃,Ỹ}(X̃, Ỹ) = f_X̃(X̃) f_Ỹ(Ỹ).

Theorem 2.1. Let (A_1)^1, (A_2)^2, ..., (A_i)^i be m_1, m_2, ..., m_i dimensional positive definite matrices. The Jacobian J(X̃ → Z̃) of the transformation X̃ = (A_1)^1 (A_2)^2 ... (A_i)^i Z̃ + M̃ is (|A_1|^{∏_{j≠1} m_j} |A_2|^{∏_{j≠2} m_j} ... |A_i|^{∏_{j≠i} m_j})^{-1}.

Proof. The result is proven using the equivalence of the monolinear form obtained through rvec(X̃) and the array X̃. Let L̃_{m_1 × m_2 × ... × m_i} = (A_1)^1 (A_2)^2 ... (A_i)^i Z̃. Write l = rvec(L̃) and z = rvec(Z̃). Then l = (A_1 ⊗_i A_2 ⊗_i ... ⊗_i A_i) z. The result follows by noting that J(l → z) = |A_1 ⊗_i A_2 ⊗_i ... ⊗_i A_i|^{-1}, and using induction with the rule |A ⊗_i B| = |A|^m |B|^n for an n × n matrix A and an m × m matrix B to show that |A_1 ⊗_i A_2 ⊗_i ... ⊗_i A_i|^{-1} = (|A_1|^{∏_{j≠1} m_j} |A_2|^{∏_{j≠2} m_j} ... |A_i|^{∏_{j≠i} m_j})^{-1}.

Corollary 2.1. Let Z̃ ~ f_Z̃(Z̃). Define X̃ = (A_1)^1 (A_2)^2 ... (A_i)^i Z̃ + M̃, where (A_1)^1, (A_2)^2, ..., (A_i)^i are m_1, m_2, ..., m_i dimensional positive definite matrices. The pdf of X̃ is given by

f_X̃(X̃; (A_1)^1, (A_2)^2, ..., (A_i)^i, M̃) = f((A_1^{-1})^1 (A_2^{-1})^2 ... (A_i^{-1})^i (X̃ − M̃)) / (|A_1|^{∏_{j≠1} m_j} |A_2|^{∏_{j≠2} m_j} ... |A_i|^{∏_{j≠i} m_j}).

The main advantage in choosing a Kronecker structure is the decrease in the number of parameters. In Section 3, we use Kronecker delta covariance structures to provide a regularized estimator of the covariance structure of an array.
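The monolinear equivalence used in the proof of Theorem 2.1 can be checked numerically. The following is a minimal NumPy sketch; the helper names `rvec`, `inv_kron_chain` and `r_matmul` are our own, not the paper's:

```python
import numpy as np
from functools import reduce

def rvec(X):
    # Stack entries with the first index varying fastest (column-major
    # flatten), matching Definition 2.3.
    return X.flatten(order='F')

def inv_kron_chain(mats):
    # A_1 (x)_i A_2 (x)_i ... (x)_i A_i = kron(A_i, ..., kron(A_2, A_1)).
    return reduce(lambda acc, A: np.kron(A, acc), mats[1:], mats[0])

def r_matmul(mats, X):
    # R-matrix multiplication (Definition 2.2): apply mats[j] along axis j.
    for j, A in enumerate(mats):
        X = np.moveaxis(np.tensordot(A, X, axes=(1, j)), 0, j)
    return X

rng = np.random.default_rng(0)
A = [rng.standard_normal((m, n)) for m, n in [(3, 2), (4, 3), (2, 2)]]
X = rng.standard_normal((2, 3, 2))

# Two-dimensional special case: (A_1)^1 (A_2)^2 X = A_1 X A_2'
X2 = rng.standard_normal((2, 3))
assert np.allclose(r_matmul(A[:2], X2), A[0] @ X2 @ A[1].T)

# Monolinear form: rvec(L) = (A_1 (x)_i A_2 (x)_i A_3) rvec(X)
L = r_matmul(A, X)
assert np.allclose(rvec(L), inv_kron_chain(A) @ rvec(X))
```

Note the reversed order inside `inv_kron_chain`: since A ⊗_i B = B ⊗ A, the ordinary Kronecker products are accumulated right to left.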
2.2.1 Array Variate Normal Distribution

Using the results of the previous section on array algebra, mainly the relationship of arrays to their monolinear forms described by Definition 2.3, we can write the density of the standard normal array variable.

Definition 2.6. If Z̃ ~ N_{m_1 × m_2 × ... × m_i}(M̃ = 0̃, Λ = I_{m_1 m_2 ... m_i}), then Z̃ has the array variate standard normal distribution. The pdf of Z̃ is given by

f_Z̃(Z̃) = exp(−(1/2)||Z̃||²) / (2π)^{m_1 m_2 ... m_i / 2}. (2)

For the scalar case, the density of the standard normal variable z ∈ R is φ_1(z) = (2π)^{−1/2} exp(−z²/2). For the m_1 dimensional standard normal vector z ∈ R^{m_1}, the density is φ_{m_1}(z) = (2π)^{−m_1/2} exp(−(1/2) z'z). Finally, the m_1 × m_2 standard matrix variate variable Z ∈ R^{m_1 × m_2} has density φ_{m_1 × m_2}(Z) = (2π)^{−m_1 m_2/2} exp(−(1/2) trace(Z'Z)). With the above definition, we have generalized the notion of a normal random variable to the array variate case.

Definition 2.7. We write X̃ ~ N_{m_1 × m_2 × ... × m_i}(M̃, Λ_{m_1 m_2 ... m_i}) if rvec(X̃) ~ N_{m_1 m_2 ... m_i}(rvec(M̃), Λ_{m_1 m_2 ... m_i}). Here, M̃ is the expected value of X̃, and Λ_{m_1 m_2 ... m_i} is the covariance matrix of the m_1 m_2 ... m_i-variate random variable rvec(X̃).

The family of normal densities with Kronecker delta covariance structure is obtained by considering the densities arising from location-scale transformations of the standard normal variables. This kind of model is defined in the next theorem.

Theorem 2.2. Let Z̃ ~ N_{m_1 × m_2 × ... × m_i}(M̃ = 0̃, Λ = I_{m_1 m_2 ... m_i}). Define X̃ = (A_1)^1 (A_2)^2 ... (A_i)^i Z̃ + M̃, where A_1, A_2, ..., A_i are nonsingular matrices of orders m_1, m_2, ..., m_i. Then the pdf of X̃ is given by

φ(X̃; M̃, A_1, A_2, ..., A_i) = exp(−(1/2)||(A_1^{-1})^1 (A_2^{-1})^2 ... (A_i^{-1})^i (X̃ − M̃)||²) / ((2π)^{m_1 m_2 ... m_i/2} |A_1|^{∏_{j≠1} m_j} |A_2|^{∏_{j≠2} m_j} ... |A_i|^{∏_{j≠i} m_j}). (3)
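By Definition 2.7, formula (3) must coincide with the multivariate normal density of rvec(X̃) with covariance (A_1 ⊗_i ... ⊗_i A_i)(A_1 ⊗_i ... ⊗_i A_i)'. A numerical cross-check for the matrix (i = 2) case, assuming SciPy is available (`array_normal_logpdf` is our own name, not from the paper):

```python
import numpy as np
from functools import reduce
from scipy.stats import multivariate_normal

def array_normal_logpdf(X, M, mats):
    # Log of formula (3), computed through the monolinear form of
    # Definition 2.7: rvec(X) ~ N(rvec(M), B B') with B = A_1 (x)_i ... (x)_i A_i.
    B = reduce(lambda acc, A: np.kron(A, acc), mats[1:], mats[0])
    return multivariate_normal.logpdf(X.flatten(order='F'),
                                      mean=M.flatten(order='F'), cov=B @ B.T)

rng = np.random.default_rng(1)
m1, m2 = 3, 2
A1 = np.eye(m1) + 0.3 * rng.standard_normal((m1, m1))
A2 = np.eye(m2) + 0.3 * rng.standard_normal((m2, m2))
X, M = rng.standard_normal((m1, m2)), np.zeros((m1, m2))

# Explicit evaluation of formula (3) for the matrix case:
# residual (A_1^{-1})^1 (A_2^{-1})^2 (X - M) = A_1^{-1} (X - M) A_2^{-T}.
resid = np.linalg.inv(A1) @ (X - M) @ np.linalg.inv(A2).T
direct = (-0.5 * np.sum(resid**2) - 0.5 * m1 * m2 * np.log(2 * np.pi)
          - m2 * np.log(abs(np.linalg.det(A1)))
          - m1 * np.log(abs(np.linalg.det(A2))))
assert np.isclose(array_normal_logpdf(X, M, [A1, A2]), direct)
```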
Array variate densities in the elliptical family are easily constructed using Corollary 2.1. For example, the following definition provides a generalization of the Student's t distribution to the array variate case.

Definition 2.8. Let A_1, A_2, ..., A_i be nonsingular matrices of orders m_1, m_2, ..., m_i, and let M̃ be an m_1 × m_2 × ... × m_i constant array. Then the pdf of an m_1 × m_2 × ... × m_i array variate t random variable T̃ with degrees of freedom k is given by

f(T̃; M̃, A_1, A_2, ..., A_i) = c (1 + (1/k)||(A_1^{-1})^1 (A_2^{-1})^2 ... (A_i^{-1})^i (T̃ − M̃)||²)^{−(k + m_1 m_2 ... m_i)/2} / (|A_1|^{∏_{j≠1} m_j} ... |A_i|^{∏_{j≠i} m_j}),

where c = Γ((k + m_1 m_2 ... m_i)/2) / ((kπ)^{m_1 m_2 ... m_i/2} Γ(k/2)).

The distributional properties of an array normal variable with density of the form in Theorem 2.2 can be obtained by using the equivalent monolinear representation. The moments, the marginal and conditional distributions, and the independence of variates should be studied by considering the equivalent monolinear form of the array variable and the well known properties of the multivariate normal random variable.

3 Estimation

In this section we provide a heuristic method for estimating the model parameters. The optimality of these estimators is not proven but merely checked by simulation studies. Inference about the parameters of the model in Theorem 2.2 for the matrix variate case has been considered in the statistical literature (Roy and Khattree [2003], [Roy and Leiva, 2008], Lu and Zimmerman [2005], [Srivastava et al., 2008], etc.). In these papers, the unique maximum likelihood estimators of the parameters of the model in Theorem 2.2 for the matrix variate case are obtained under different assumptions on the covariance parameters. Some classification rules based on matrix variate observations with Kronecker delta covariance structures have been studied in [Roy and Leiva, 2009], and also in [Krzyśko and Skorzybut, 2009].

The model in Theorem 2.2, the way it is stated, is unidentifiable. However, this problem can easily be resolved by putting restrictions on the covariance parameters.
The approach we take is to assume that, for j ≠ 1, the last diagonal element of the matrix A_j A_j' is equal to 1 for j = 1, 2, ..., i. The Flip-Flop Algorithm is proven to attain the maximum likelihood estimators of the parameters of the two dimensional array variate normal distribution [Srivastava et al., 2008]. The following is similar to the flip-flop algorithm.

First, assume {X̃_1, X̃_2, ..., X̃_N} is a random sample from a N(M̃, A_1, A_2, ..., A_i) distribution with, for j ≠ 1, the last diagonal element of A_j A_j' equal to 1 for j = 1, 2, ..., i. Further, we assume that all the A_j's are square positive definite matrices. Finally, assume that we have

N ∏_{j=1}^{i} m_j > m_r² for all r = 1, 2, ..., i. (4)
Algorithm for estimation:

1. Estimate M̃ by M̂ = (1/N) Σ_{l=1}^{N} X̃_l, and obtain the centered array observations X̃_l^c = X̃_l − M̂ for l = 1, 2, ..., N.
2. Start with initial estimates of A_2, A_3, ..., A_i.
3. On the basis of the estimates of A_2, A_3, ..., A_i, calculate an estimate of A_1 by first scaling the array observations using Z̃_l = (I)^1 (A_2^{-1})^2 (A_3^{-1})^3 ... (A_i^{-1})^i X̃_l^c, and then calculating the square root of the covariance along the 1st dimension of the arrays Z̃_l, l = 1, 2, ..., N.
4. On the basis of the most recent estimates of the model parameters, estimate A_j, j = 2, ..., i, by first scaling the array observations using Z̃_l = (A_1^{-1})^1 (A_2^{-1})^2 ... (A_{j-1}^{-1})^{j-1} (I)^j (A_{j+1}^{-1})^{j+1} ... (A_i^{-1})^i X̃_l^c, and then calculating the square root of the covariance along the jth dimension of the arrays Z̃_l, for j = 2, ..., i. Scale the estimate of A_j A_j' so that its last diagonal element is equal to 1.
5. Repeat steps 3 and 4 until convergence is attained.

We will use the estimation algorithm to illustrate a technique, which we call slicing, for obtaining an approximate nonsingular estimate of the covariance matrix of a p dimensional vector variate random variable when N < p. A vector x of dimension p can be sliced into p/m_1 = m_2 pieces and organized into an m_1 × m_2 matrix, p = m_1 m_2, for some natural numbers m_1 and m_2. Or, in general, the same vector can be organized into an array of dimensions m_1 × m_2 × ... × m_i with p = m_1 m_2 ... m_i for some natural numbers m_1, m_2, ..., m_i. Once we slice the data and reorganize it in array form, we pretend that this array data was generated from the model in Theorem 2.2. We require that the additional assumptions stated before the estimation algorithm hold for the parameters of this model.
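The steps above can be sketched as follows. This is a heuristic illustration, not the authors' code: the identifiability scaling of step 4 is omitted, a fixed number of sweeps stands in for the convergence check of step 5, and `flip_flop`, `mode_unfold` and `r_matmul` are our own helper names:

```python
import numpy as np
from scipy.linalg import sqrtm

def mode_unfold(X, j):
    # Mode-j unfolding: an m_j x (product of the other dims) matrix.
    return np.moveaxis(X, j, 0).reshape(X.shape[j], -1)

def r_matmul(mats, X):
    # R-matrix multiplication: apply mats[k] along dimension k of X.
    for k, A in enumerate(mats):
        X = np.moveaxis(np.tensordot(A, X, axes=(1, k)), 0, k)
    return X

def flip_flop(Xs, n_sweeps=20):
    N, dims = len(Xs), Xs[0].shape
    M = sum(Xs) / N                              # step 1: estimate the mean
    Xc = [X - M for X in Xs]
    A = [np.eye(m) for m in dims]                # step 2: initial estimates
    for _ in range(n_sweeps):
        for j in range(len(dims)):               # steps 3 and 4
            inv = [np.linalg.inv(A[k]) if k != j else np.eye(dims[k])
                   for k in range(len(dims))]
            S = sum(mode_unfold(Z, j) @ mode_unfold(Z, j).T
                    for Z in (r_matmul(inv, X) for X in Xc))
            S /= N * np.prod(dims) / dims[j]     # average over N and other dims
            A[j] = np.real(sqrtm(S))             # square root of mode-j covariance
    return M, A

rng = np.random.default_rng(0)
Xs = [rng.standard_normal((3, 4, 2)) for _ in range(50)]
M, A = flip_flop(Xs)
assert M.shape == (3, 4, 2)
assert [Aj.shape for Aj in A] == [(3, 3), (4, 4), (2, 2)]
```

Without the scaling step the individual factors are only determined up to scale, but their Kronecker product, which is what the covariance estimate below uses, is unaffected.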
An approximate nonsingular estimate of the covariance matrix Λ of the p dimensional vector variate random variable can then be obtained from the estimators delivered by this algorithm as

Λ̂ = (Â_1 ⊗_i Â_2 ⊗_i ... ⊗_i Â_i)(Â_1 ⊗_i Â_2 ⊗_i ... ⊗_i Â_i)'.

That we do not have to assume that any covariance components are zero is the main difference and advantage of this regularization method over the usual shrinkage methods such as the lasso [Friedman et al., 2008].

Example 3.1. Let x ~ N_12(µ = 0, Λ), where Λ is the identity matrix. We illustrate slicing for i = 2, m_1 = 3 and m_2 = 4. Sets of N = 5, 10, 20, 50 observations were generated and Λ was estimated using Λ̂ = (Â_1 ⊗_i Â_2)(Â_1 ⊗_i Â_2)',
assuming the model in Theorem 2.2. We repeated the whole experiment 5 times. The results are summarized in Figure 2: the eigenvalues converge to the true eigenvalues as N gets larger.

Figure 2: As N gets larger, the estimators of the eigenvalues approach the true values (cumulative contribution of the PCs for N = 5, 10, 20, 50). The true covariance structure is represented by the black line.

Example 3.2. The Alon colon data set [Alon et al., 1999] has expression measurements on 2000 genes for n_1 = 40 tumor tissues and n_2 = 22 normal tissue samples. We will compare the means of the normal and tumor tissue samples. We assume first that normal and tumor tissues have the same covariance Λ, a positive definite matrix. We slice each of the n = 62 observations into a matrix and estimate Λ with Λ̂ = (Â_1 ⊗_i Â_2)(Â_1 ⊗_i Â_2)', assuming the model in Theorem 2.2 holds. For testing the equality of the means, we calculate the F+ statistic proposed in [Kubokawa and Srivastava, 2008], replacing their estimator
of the inverse of the covariance matrix Λ with the inverse of Λ̂; the resulting statistic is proportional to (x̄_1 − x̄_2)' Λ̂^{-1} (x̄_1 − x̄_2), with the constant determined by p = 2000 and n − 1 = 62 − 1. Using the sampling distribution F_{r, n−r} proposed in [Kubokawa and Srivastava, 2008], and assuming that the rank r of Λ̂ is 62 − 1, the computed p-value leads to rejection of the hypothesis of equality of the means. The estimate of the covariance matrix for the colon data set is available at the url akdemir.user.cs.ucf.edu/covcoloncancer.txt.

Example 3.3. N = 10 i.i.d. observations from a N_12(µ = 0, Λ) distribution were generated for a randomly generated unstructured nonsingular covariance matrix Λ. The right plot in Figure 3 compares the estimated eigenvalues obtained by slicing this data into an array with those obtained from the ordinary sample covariance.

Figure 3: The left plot compares the estimated eigenvalues to the true eigenvalues (Example 3.4). The right plot compares the estimated eigenvalues obtained under different assumptions to the true eigenvalues (Example 3.3). The red o's represent the true values, the black o's are the estimates under the Kronecker delta covariance assumption, and the blue +'s are the estimates under the unrestricted covariance assumption.

Example 3.4. Let A_1, A_2 and A_3 be given nonsingular matrices, with A_2 = A_3.
Also, let M̃ be the zero array of the corresponding dimensions. Estimates Â_1, Â_2 and Â_3 of A_1, A_2 and A_3 were obtained from a random sample of size 100 from N(A_1, A_2, A_3, M̃). The left plot in Figure 3 compares the estimated eigenvalues to the true eigenvalues for this example.

Example 3.5. In this example, we use heatmaps of the true and estimated covariance matrices under different scenarios to see that slicing gives a reasonable description of the variable variances and covariances. In Figure 4 the true covariance matrix is an identity matrix; we estimate this covariance matrix for N = 10, 50, and 100 independent sets of random samples by using 15 × 8 slicing. In Figure 5 the true covariance matrix is a block diagonal matrix with Kronecker delta structure. Finally, in Figure 6 the true covariance is a matrix with 4-way Kronecker structure. Convergence of the estimators is observed even when p >> N.

4 Principal Components Analysis and Dimension Reduction

Principal components analysis (PCA) is a useful statistical technique that has found applications in fields such as face recognition and image compression, and is a common technique for finding patterns in data of high dimension. The end product of PCA is a set of new uncorrelated variables, ordered in terms of their variances, obtained from linear combinations of the original variables.

Definition 4.1. For the m_1 × m_2 × ... × m_i dimensional array variate random variable X̃, the principal components are defined as the principal components of the d = m_1 m_2 ... m_i-dimensional random vector rvec(X̃).

The main statistical problem is the estimation of the covariance of rvec(X̃) and of its eigenvectors and eigenvalues for small sample sizes. Let X̃_l, l = 1, 2, ..., N be a random sample for the array variate random variable X̃, and let p = m_1 m_2 ... m_i. When N < p, it is well known that the usual covariance estimator for rvec(X̃) will be singular with probability one. Therefore, when N < p, there is no consistent estimator of the covariance of rvec(X̃) under the unstructured covariance assumption.
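This singularity, and the nonsingular alternative that slicing with a Kronecker structured estimate provides, can be seen numerically. In this sketch, one-pass mode covariances are used as crude stand-ins for the flip-flop factor estimates (the variable names are ours):

```python
import numpy as np

rng = np.random.default_rng(0)
N, m1, m2 = 10, 3, 5
p = m1 * m2                                  # p = 15 > N = 10
xs = rng.standard_normal((N, p))             # N observations of a p-vector

# Slice each p-vector into an m1 x m2 matrix (column-major, matching rvec).
Xs = np.stack([x.reshape(m1, m2, order='F') for x in xs])

# One-pass mode covariances as crude stand-ins for the flip-flop factors;
# the implied Kronecker structured estimate is Lam = kron(S2, S1).
S1 = sum(X @ X.T for X in Xs) / (N * m2)     # m1 x m1
S2 = sum(X.T @ X for X in Xs) / (N * m1)     # m2 x m2
Lam = np.kron(S2, S1)

# Nonsingular even though N < p, unlike the ordinary sample covariance,
# whose rank is at most N - 1.
assert np.linalg.matrix_rank(Lam) == p
assert np.linalg.matrix_rank(np.cov(xs.T)) <= N - 1
```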
Figure 4: The true covariance matrix is an identity matrix with Kronecker delta structure. We estimate this covariance matrix for N = 10, 50, and 100 independent sets of random samples by using 15 × 8 slicing.
Figure 5: The true covariance matrix is a block diagonal matrix with Kronecker delta structure. We estimate this covariance matrix for N = 10, 50, and 100 independent sets of random samples by using 15 × 8 slicing.
Figure 6: The true covariance matrix is a 4-way Kronecker delta structured matrix. We estimate this covariance matrix for N = 10, 50, and 100 independent sets of random samples by using 15 × 8 slicing.
On the other hand, if we assume that the covariance matrix has Kronecker delta structure, we can obtain a nonsingular estimate of the covariance structure with the methods developed in this paper, and the condition on the sample size is relaxed considerably. If we have pN > m_r² for all r = 1, 2, ..., i, and the assumptions stated before the estimation algorithm for the parameters of this model hold, then the estimator of the covariance matrix is nonsingular. When the covariance does not have Kronecker structure, the estimate obtained here can be used as a regularized nonsingular estimate of the covariance.

If {λ(A_r)_{r_j}} are the m_r eigenvalues of A_r A_r' with the corresponding eigenvectors {(x_r)_{r_j}} for r = 1, 2, ..., i and r_j = 1, 2, ..., m_r, then (A_1 ⊗_i A_2 ⊗_i ... ⊗_i A_i)(A_1 ⊗_i A_2 ⊗_i ... ⊗_i A_i)' has eigenvalues {λ(A_1)_{r_1} λ(A_2)_{r_2} ... λ(A_i)_{r_i}} with corresponding eigenvectors {(x_1)_{r_1} ⊗_i (x_2)_{r_2} ⊗_i ... ⊗_i (x_i)_{r_i}}. By replacing the A_r by their estimators, we estimate the eigenvalues and eigenvectors of the covariance of rvec(X̃) using this relationship. Since each eigenvector is a Kronecker product of smaller components, the reduction in dimension obtained by this approach is larger than the one that could be obtained by using ordinary principal components on the ordinary sample covariance matrix.

Example 4.1. The 3-way aerosol particles data on air quality was first analysed by multiway methods by [Stanimirova and Simeonov, 2005]. They wanted to characterize air quality on the basis of particle size, seasonality, and chemical composition. The 3-way data, consisting of the concentrations of 17 chemical compounds plus dust for each of 5 particle sizes in each of the 4 seasons, were measured at N = 2 locations in Kärnten (Carinthia), Austria. The data is given in [Kroonenberg, 2008]. For describing the covariance of the random variable using 2 observations, we could assume that the data comes from the array variate normal distribution with unstructured covariance matrix. However, the estimate obtained under the unstructured model is singular, with rank 1.
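The eigenvalue-eigenvector relationship stated above can be verified numerically. A sketch, where S_1 and S_2 play the roles of A_1 A_1' and A_2 A_2':

```python
import numpy as np

rng = np.random.default_rng(0)
B1 = rng.standard_normal((3, 3)); S1 = B1 @ B1.T   # plays the role of A_1 A_1'
B2 = rng.standard_normal((4, 4)); S2 = B2 @ B2.T   # plays the role of A_2 A_2'

lam1, V1 = np.linalg.eigh(S1)
lam2, V2 = np.linalg.eigh(S2)

# Eigenvalues of the full covariance kron(S2, S1) are the pairwise products.
full = np.sort(np.linalg.eigvalsh(np.kron(S2, S1)))
prods = np.sort(np.outer(lam1, lam2).ravel())
assert np.allclose(full, prods)

# Each eigenvector is an inverse Kronecker product of factor eigenvectors:
# x_1 (x)_i x_2 = kron(x_2, x_1).
v = np.kron(V2[:, 0], V1[:, 0])
assert np.allclose(np.kron(S2, S1) @ v, lam1[0] * lam2[0] * v)
```

This is why only i small eigendecompositions are needed rather than one of order p = m_1 m_2 ... m_i.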
On the other hand, the array variate model with Kronecker delta covariance structure provides a nonsingular covariance estimate. In Figure 7, the cumulative contributions of the eigenvalues of each component dimension are displayed in the first three plots, and the eigenvalues of the resulting estimator of the covariance of the random variable are shown in the last plot. In Figure 8, the aerosol data at the two locations is summarized by its projection on the first 3 estimated principal components.

5 Classification

Suppose the array variable is generated by a mixture of two densities, i.e., X̃ ~ π N(A_1, A_2, ..., A_i, M̃_1) + (1 − π) N(B_1, B_2, ..., B_i, M̃_2). Based on a set of training observations, the classification of a new observation to the component density N(A_1, A_2, ..., A_i, M̃_1) or N(B_1, B_2, ..., B_i, M̃_2) can be done using the Bayes rule together with the estimators obtained in Section 3. Let the training estimates of the parameters be π̂, Â_1, Â_2, ..., Â_i, M̂_1, B̂_1, B̂_2,
Figure 7: The cumulative contributions of the eigenvalues of each component dimension are displayed in the first three plots. The eigenvalues of the resulting estimate of the covariance of the random variable are shown in the last plot. The first 50 components out of the 340 explain more than 90% of the variation in the aerosol data.
Figure 8: Projection of the aerosol measurements at the two locations on the first 3 estimated principal components.

..., B̂_i, M̂_2. Then the posterior probability that an observation with value X̃ comes from the first component N(A_1, A_2, ..., A_i, M̃_1) is given by

P(X̃ ∈ N(A_1, A_2, ..., A_i, M̃_1)) = π̂ φ(X̃; M̂_1, Â_1, ..., Â_i) / (π̂ φ(X̃; M̂_1, Â_1, ..., Â_i) + (1 − π̂) φ(X̃; M̂_2, B̂_1, ..., B̂_i)). (5)

According to this, we classify the observation X̃ to the first component if the posterior probability is large (for example, if the probability is more than 0.5); otherwise we classify it to the second component. The extension to the case of more than two component densities is straightforward.

Example 5.1. We used Fisher's linear discriminant analysis (i.e., π = 0.5, A_j = B_j for j = 1, 2, ..., i) for the Alon colon data set [Alon et al., 1999]. The linear discriminant function was calculated using w = Λ̂^{-1}(x̄_1 − x̄_2), where Λ̂ is the covariance estimate from Example 3.2. An observation x was classified as normal if x'w > 0, and as tumor otherwise. Figure 9 summarizes our findings, including the misclassification rate.
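A minimal sketch of the posterior rule (5) in the vectorized (monolinear) form, with toy parameters that are our own assumptions (`posterior_component1` is our own name; SciPy supplies the component densities):

```python
import numpy as np
from scipy.stats import multivariate_normal

def posterior_component1(x, pi1, mean1, cov1, mean2, cov2):
    # Posterior probability (equation (5)) that x comes from the first component.
    p1 = pi1 * multivariate_normal.pdf(x, mean=mean1, cov=cov1)
    p2 = (1 - pi1) * multivariate_normal.pdf(x, mean=mean2, cov=cov2)
    return p1 / (p1 + p2)

# Toy two-dimensional illustration with assumed parameters.
mu1, mu2 = np.array([0.0, 0.0]), np.array([3.0, 3.0])
cov = np.eye(2)

p = posterior_component1(np.array([0.1, -0.2]), 0.5, mu1, cov, mu2, cov)
assert p > 0.5          # classified to the first component
```

With equal priors and equal covariances, thresholding this posterior at 0.5 reduces to the linear discriminant rule x'w > 0 used in Example 5.1.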
Figure 9: Linear discriminant analysis for the Alon colon data set [Alon et al., 1999]: x'w plotted against observation number for the normal and tumor samples. An observation x was classified as normal if x'w > 0, otherwise as tumor.
References

U. Alon, N. Barkai, D.A. Notterman, K. Gish, S. Ybarra, D. Mack, and A.J. Levine. Broad Patterns of Gene Expression Revealed by Clustering Analysis of Tumor and Normal Colon Tissues Probed by Oligonucleotide Arrays. Proceedings of the National Academy of Sciences of the United States of America, 96(12):6745-6750, 1999.

G. Blaha. A Few Basic Principles and Techniques of Array Algebra. Journal of Geodesy, 51(3), 1977.

J. Friedman, T. Hastie, and R. Tibshirani. Sparse Inverse Covariance Estimation with the Graphical Lasso. Biostatistics, 9(3):432-441, 2008.

A.K. Gupta and D.K. Nagar. Matrix Variate Distributions. Chapman & Hall/CRC Monographs and Surveys in Pure and Applied Mathematics. Chapman & Hall, 2000.

P.M. Kroonenberg. Applied Multiway Data Analysis. Wiley Online Library, 2008.

M. Krzyśko and M. Skorzybut. Discriminant Analysis of Multivariate Repeated Measures Data with a Kronecker Product Structured Covariance Matrices. Statistical Papers, 50(4), 2009.

T. Kubokawa and M.S. Srivastava. Estimation of the Precision Matrix of a Singular Wishart Distribution and Its Application in High-Dimensional Data. Journal of Multivariate Analysis, 99(9), 2008.

N. Lu and D.L. Zimmerman. The Likelihood Ratio Test for a Separable Covariance Matrix. Statistics & Probability Letters, 73(4), 2005.

U.A. Rauhala. Array Algebra with Applications in Photogrammetry and Geodesy. Division of Photogrammetry, Royal Institute of Technology, 1974.

U.A. Rauhala. Introduction to Array Algebra. Photogrammetric Engineering and Remote Sensing, 46(2), 1980.

A. Roy and R. Khattree. Tests for Mean and Covariance Structures Relevant in Repeated Measures Based Discriminant Analysis. Journal of Applied Statistical Science, 12(2):91-104, 2003.

A. Roy and R. Leiva. Likelihood Ratio Tests for Triply Multivariate Data with Structured Correlation on Spatial Repeated Measurements. Statistics & Probability Letters, 78(13), 2008.

A. Roy and R. Leiva. Classification Rules for Multivariate Repeated Measures Data with Equicorrelated Correlation Structure on both Time and Spatial Repeated Measurements. UTSA, College of Business, 2009.

M.S. Srivastava, T. von Rosen, and D. von Rosen. Models with a Kronecker Product Covariance Structure: Estimation and Testing. Mathematical Methods of Statistics, 17(4), 2008.

I. Stanimirova and V. Simeonov. Modeling of Environmental Four-Way Data from Air Quality Control. Chemometrics and Intelligent Laboratory Systems, 77(1-2), 2005.
More informationComputational modeling techniques
Cmputatinal mdeling techniques Lecture 4: Mdel checing fr ODE mdels In Petre Department f IT, Åb Aademi http://www.users.ab.fi/ipetre/cmpmd/ Cntent Stichimetric matrix Calculating the mass cnservatin relatins
More information3.4 Shrinkage Methods Prostate Cancer Data Example (Continued) Ridge Regression
3.3.4 Prstate Cancer Data Example (Cntinued) 3.4 Shrinkage Methds 61 Table 3.3 shws the cefficients frm a number f different selectin and shrinkage methds. They are best-subset selectin using an all-subsets
More information4th Indian Institute of Astrophysics - PennState Astrostatistics School July, 2013 Vainu Bappu Observatory, Kavalur. Correlation and Regression
4th Indian Institute f Astrphysics - PennState Astrstatistics Schl July, 2013 Vainu Bappu Observatry, Kavalur Crrelatin and Regressin Rahul Ry Indian Statistical Institute, Delhi. Crrelatin Cnsider a tw
More informationMidwest Big Data Summer School: Machine Learning I: Introduction. Kris De Brabanter
Midwest Big Data Summer Schl: Machine Learning I: Intrductin Kris De Brabanter kbrabant@iastate.edu Iwa State University Department f Statistics Department f Cmputer Science June 24, 2016 1/24 Outline
More informationSections 15.1 to 15.12, 16.1 and 16.2 of the textbook (Robbins-Miller) cover the materials required for this topic.
Tpic : AC Fundamentals, Sinusidal Wavefrm, and Phasrs Sectins 5. t 5., 6. and 6. f the textbk (Rbbins-Miller) cver the materials required fr this tpic.. Wavefrms in electrical systems are current r vltage
More informationOn Huntsberger Type Shrinkage Estimator for the Mean of Normal Distribution ABSTRACT INTRODUCTION
Malaysian Jurnal f Mathematical Sciences 4(): 7-4 () On Huntsberger Type Shrinkage Estimatr fr the Mean f Nrmal Distributin Department f Mathematical and Physical Sciences, University f Nizwa, Sultanate
More informationChapter 9 Vector Differential Calculus, Grad, Div, Curl
Chapter 9 Vectr Differential Calculus, Grad, Div, Curl 9.1 Vectrs in 2-Space and 3-Space 9.2 Inner Prduct (Dt Prduct) 9.3 Vectr Prduct (Crss Prduct, Outer Prduct) 9.4 Vectr and Scalar Functins and Fields
More informationMATHEMATICS SYLLABUS SECONDARY 5th YEAR
Eurpean Schls Office f the Secretary-General Pedaggical Develpment Unit Ref. : 011-01-D-8-en- Orig. : EN MATHEMATICS SYLLABUS SECONDARY 5th YEAR 6 perid/week curse APPROVED BY THE JOINT TEACHING COMMITTEE
More informationResampling Methods. Chapter 5. Chapter 5 1 / 52
Resampling Methds Chapter 5 Chapter 5 1 / 52 1 51 Validatin set apprach 2 52 Crss validatin 3 53 Btstrap Chapter 5 2 / 52 Abut Resampling An imprtant statistical tl Pretending the data as ppulatin and
More informationInterference is when two (or more) sets of waves meet and combine to produce a new pattern.
Interference Interference is when tw (r mre) sets f waves meet and cmbine t prduce a new pattern. This pattern can vary depending n the riginal wave directin, wavelength, amplitude, etc. The tw mst extreme
More informationTree Structured Classifier
Tree Structured Classifier Reference: Classificatin and Regressin Trees by L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stne, Chapman & Hall, 98. A Medical Eample (CART): Predict high risk patients
More informationA NOTE ON THE EQUIVAImCE OF SOME TEST CRITERIA. v. P. Bhapkar. University of Horth Carolina. and
~ A NOTE ON THE EQUVAmCE OF SOME TEST CRTERA by v. P. Bhapkar University f Hrth Carlina University f Pna nstitute f Statistics Mime Series N. 421 February 1965 This research was supprted by the Mathematics
More informationMATCHING TECHNIQUES Technical Track Session VI Céline Ferré The World Bank
MATCHING TECHNIQUES Technical Track Sessin VI Céline Ferré The Wrld Bank When can we use matching? What if the assignment t the treatment is nt dne randmly r based n an eligibility index, but n the basis
More informationDetermining the Accuracy of Modal Parameter Estimation Methods
Determining the Accuracy f Mdal Parameter Estimatin Methds by Michael Lee Ph.D., P.E. & Mar Richardsn Ph.D. Structural Measurement Systems Milpitas, CA Abstract The mst cmmn type f mdal testing system
More informationMATCHING TECHNIQUES. Technical Track Session VI. Emanuela Galasso. The World Bank
MATCHING TECHNIQUES Technical Track Sessin VI Emanuela Galass The Wrld Bank These slides were develped by Christel Vermeersch and mdified by Emanuela Galass fr the purpse f this wrkshp When can we use
More informationEquilibrium of Stress
Equilibrium f Stress Cnsider tw perpendicular planes passing thrugh a pint p. The stress cmpnents acting n these planes are as shwn in ig. 3.4.1a. These stresses are usuall shwn tgether acting n a small
More informationLecture 2: Supervised vs. unsupervised learning, bias-variance tradeoff
Lecture 2: Supervised vs. unsupervised learning, bias-variance tradeff Reading: Chapter 2 STATS 202: Data mining and analysis September 27, 2017 1 / 20 Supervised vs. unsupervised learning In unsupervised
More informationSupport-Vector Machines
Supprt-Vectr Machines Intrductin Supprt vectr machine is a linear machine with sme very nice prperties. Haykin chapter 6. See Alpaydin chapter 13 fr similar cntent. Nte: Part f this lecture drew material
More informationMath 302 Learning Objectives
Multivariable Calculus (Part I) 13.1 Vectrs in Three-Dimensinal Space Math 302 Learning Objectives Plt pints in three-dimensinal space. Find the distance between tw pints in three-dimensinal space. Write
More informationEmphases in Common Core Standards for Mathematical Content Kindergarten High School
Emphases in Cmmn Cre Standards fr Mathematical Cntent Kindergarten High Schl Cntent Emphases by Cluster March 12, 2012 Describes cntent emphases in the standards at the cluster level fr each grade. These
More informationChE 471: LECTURE 4 Fall 2003
ChE 47: LECTURE 4 Fall 003 IDEL RECTORS One f the key gals f chemical reactin engineering is t quantify the relatinship between prductin rate, reactr size, reactin kinetics and selected perating cnditins.
More informationThe blessing of dimensionality for kernel methods
fr kernel methds Building classifiers in high dimensinal space Pierre Dupnt Pierre.Dupnt@ucluvain.be Classifiers define decisin surfaces in sme feature space where the data is either initially represented
More informationLecture 2: Supervised vs. unsupervised learning, bias-variance tradeoff
Lecture 2: Supervised vs. unsupervised learning, bias-variance tradeff Reading: Chapter 2 STATS 202: Data mining and analysis September 27, 2017 1 / 20 Supervised vs. unsupervised learning In unsupervised
More informationECE 2100 Circuit Analysis
ECE 2100 Circuit Analysis Lessn 25 Chapter 9 & App B: Passive circuit elements in the phasr representatin Daniel M. Litynski, Ph.D. http://hmepages.wmich.edu/~dlitynsk/ ECE 2100 Circuit Analysis Lessn
More informationAP Statistics Notes Unit Two: The Normal Distributions
AP Statistics Ntes Unit Tw: The Nrmal Distributins Syllabus Objectives: 1.5 The student will summarize distributins f data measuring the psitin using quartiles, percentiles, and standardized scres (z-scres).
More informationSurface and Contact Stress
Surface and Cntact Stress The cncept f the frce is fundamental t mechanics and many imprtant prblems can be cast in terms f frces nly, fr example the prblems cnsidered in Chapter. Hwever, mre sphisticated
More informationCN700 Additive Models and Trees Chapter 9: Hastie et al. (2001)
CN700 Additive Mdels and Trees Chapter 9: Hastie et al. (2001) Madhusudana Shashanka Department f Cgnitive and Neural Systems Bstn University CN700 - Additive Mdels and Trees March 02, 2004 p.1/34 Overview
More informationChapter 3 Kinematics in Two Dimensions; Vectors
Chapter 3 Kinematics in Tw Dimensins; Vectrs Vectrs and Scalars Additin f Vectrs Graphical Methds (One and Tw- Dimensin) Multiplicatin f a Vectr b a Scalar Subtractin f Vectrs Graphical Methds Adding Vectrs
More informationPSU GISPOPSCI June 2011 Ordinary Least Squares & Spatial Linear Regression in GeoDa
There are tw parts t this lab. The first is intended t demnstrate hw t request and interpret the spatial diagnstics f a standard OLS regressin mdel using GeDa. The diagnstics prvide infrmatin abut the
More information5 th grade Common Core Standards
5 th grade Cmmn Cre Standards In Grade 5, instructinal time shuld fcus n three critical areas: (1) develping fluency with additin and subtractin f fractins, and develping understanding f the multiplicatin
More informationMethods for Determination of Mean Speckle Size in Simulated Speckle Pattern
0.478/msr-04-004 MEASUREMENT SCENCE REVEW, Vlume 4, N. 3, 04 Methds fr Determinatin f Mean Speckle Size in Simulated Speckle Pattern. Hamarvá, P. Šmíd, P. Hrváth, M. Hrabvský nstitute f Physics f the Academy
More informationSource Coding and Compression
Surce Cding and Cmpressin Heik Schwarz Cntact: Dr.-Ing. Heik Schwarz heik.schwarz@hhi.fraunhfer.de Heik Schwarz Surce Cding and Cmpressin September 22, 2013 1 / 60 PartI: Surce Cding Fundamentals Heik
More information^YawataR&D Laboratory, Nippon Steel Corporation, Tobata, Kitakyushu, Japan
Detectin f fatigue crack initiatin frm a ntch under a randm lad C. Makabe," S. Nishida^C. Urashima,' H. Kaneshir* "Department f Mechanical Systems Engineering, University f the Ryukyus, Nishihara, kinawa,
More informationDead-beat controller design
J. Hetthéssy, A. Barta, R. Bars: Dead beat cntrller design Nvember, 4 Dead-beat cntrller design In sampled data cntrl systems the cntrller is realised by an intelligent device, typically by a PLC (Prgrammable
More informationthe results to larger systems due to prop'erties of the projection algorithm. First, the number of hidden nodes must
M.E. Aggune, M.J. Dambrg, M.A. El-Sharkawi, R.J. Marks II and L.E. Atlas, "Dynamic and static security assessment f pwer systems using artificial neural netwrks", Prceedings f the NSF Wrkshp n Applicatins
More information22.54 Neutron Interactions and Applications (Spring 2004) Chapter 11 (3/11/04) Neutron Diffusion
.54 Neutrn Interactins and Applicatins (Spring 004) Chapter (3//04) Neutrn Diffusin References -- J. R. Lamarsh, Intrductin t Nuclear Reactr Thery (Addisn-Wesley, Reading, 966) T study neutrn diffusin
More informationResampling Methods. Cross-validation, Bootstrapping. Marek Petrik 2/21/2017
Resampling Methds Crss-validatin, Btstrapping Marek Petrik 2/21/2017 Sme f the figures in this presentatin are taken frm An Intrductin t Statistical Learning, with applicatins in R (Springer, 2013) with
More informationLeast Squares Optimal Filtering with Multirate Observations
Prc. 36th Asilmar Cnf. n Signals, Systems, and Cmputers, Pacific Grve, CA, Nvember 2002 Least Squares Optimal Filtering with Multirate Observatins Charles W. herrien and Anthny H. Hawes Department f Electrical
More informationMODULE FOUR. This module addresses functions. SC Academic Elementary Algebra Standards:
MODULE FOUR This mdule addresses functins SC Academic Standards: EA-3.1 Classify a relatinship as being either a functin r nt a functin when given data as a table, set f rdered pairs, r graph. EA-3.2 Use
More informationFebruary 28, 2013 COMMENTS ON DIFFUSION, DIFFUSIVITY AND DERIVATION OF HYPERBOLIC EQUATIONS DESCRIBING THE DIFFUSION PHENOMENA
February 28, 2013 COMMENTS ON DIFFUSION, DIFFUSIVITY AND DERIVATION OF HYPERBOLIC EQUATIONS DESCRIBING THE DIFFUSION PHENOMENA Mental Experiment regarding 1D randm walk Cnsider a cntainer f gas in thermal
More informationInternal vs. external validity. External validity. This section is based on Stock and Watson s Chapter 9.
Sectin 7 Mdel Assessment This sectin is based n Stck and Watsn s Chapter 9. Internal vs. external validity Internal validity refers t whether the analysis is valid fr the ppulatin and sample being studied.
More informationAerodynamic Separability in Tip Speed Ratio and Separability in Wind Speed- a Comparison
Jurnal f Physics: Cnference Series OPEN ACCESS Aerdynamic Separability in Tip Speed Rati and Separability in Wind Speed- a Cmparisn T cite this article: M L Gala Sants et al 14 J. Phys.: Cnf. Ser. 555
More informationCAUSAL INFERENCE. Technical Track Session I. Phillippe Leite. The World Bank
CAUSAL INFERENCE Technical Track Sessin I Phillippe Leite The Wrld Bank These slides were develped by Christel Vermeersch and mdified by Phillippe Leite fr the purpse f this wrkshp Plicy questins are causal
More informationUNIV1"'RSITY OF NORTH CAROLINA Department of Statistics Chapel Hill, N. C. CUMULATIVE SUM CONTROL CHARTS FOR THE FOLDED NORMAL DISTRIBUTION
UNIV1"'RSITY OF NORTH CAROLINA Department f Statistics Chapel Hill, N. C. CUMULATIVE SUM CONTROL CHARTS FOR THE FOLDED NORMAL DISTRIBUTION by N. L. Jlmsn December 1962 Grant N. AFOSR -62..148 Methds f
More informationEric Klein and Ning Sa
Week 12. Statistical Appraches t Netwrks: p1 and p* Wasserman and Faust Chapter 15: Statistical Analysis f Single Relatinal Netwrks There are fur tasks in psitinal analysis: 1) Define Equivalence 2) Measure
More informationIntroduction: A Generalized approach for computing the trajectories associated with the Newtonian N Body Problem
A Generalized apprach fr cmputing the trajectries assciated with the Newtnian N Bdy Prblem AbuBar Mehmd, Syed Umer Abbas Shah and Ghulam Shabbir Faculty f Engineering Sciences, GIK Institute f Engineering
More informationx 1 Outline IAML: Logistic Regression Decision Boundaries Example Data
Outline IAML: Lgistic Regressin Charles Suttn and Victr Lavrenk Schl f Infrmatics Semester Lgistic functin Lgistic regressin Learning lgistic regressin Optimizatin The pwer f nn-linear basis functins Least-squares
More informationThe multivariate skew-slash distribution
Jurnal f Statistical Planning and Inference 136 (6) 9 wwwelseviercm/lcate/jspi The multivariate skew-slash distributin Jing Wang, Marc G Gentn Department f Statistics, Nrth Carlina State University, Bx
More informationB. Definition of an exponential
Expnents and Lgarithms Chapter IV - Expnents and Lgarithms A. Intrductin Starting with additin and defining the ntatins fr subtractin, multiplicatin and divisin, we discvered negative numbers and fractins.
More informationHomology groups of disks with holes
Hmlgy grups f disks with hles THEOREM. Let p 1,, p k } be a sequence f distinct pints in the interir unit disk D n where n 2, and suppse that fr all j the sets E j Int D n are clsed, pairwise disjint subdisks.
More informationGRAPH EFFECTIVE RESISTANCE AND DISTRIBUTED CONTROL: SPECTRAL PROPERTIES AND APPLICATIONS
GRAPH EFFECTIVE RESISTANCE AND DISTRIBUTED CONTROL: SPECTRAL PROPERTIES AND APPLICATIONS Prabir Barah Jã P. Hespanha Abstract We intrduce the cncept f matrix-valued effective resistance fr undirected matrix-weighted
More informationPart 3 Introduction to statistical classification techniques
Part 3 Intrductin t statistical classificatin techniques Machine Learning, Part 3, March 07 Fabi Rli Preamble ØIn Part we have seen that if we knw: Psterir prbabilities P(ω i / ) Or the equivalent terms
More informationCurriculum Development Overview Unit Planning for 8 th Grade Mathematics MA10-GR.8-S.1-GLE.1 MA10-GR.8-S.4-GLE.2
Unit Title It s All Greek t Me Length f Unit 5 weeks Fcusing Lens(es) Cnnectins Standards and Grade Level Expectatins Addressed in this Unit MA10-GR.8-S.1-GLE.1 MA10-GR.8-S.4-GLE.2 Inquiry Questins (Engaging-
More informationSIZE BIAS IN LINE TRANSECT SAMPLING: A FIELD TEST. Mark C. Otto Statistics Research Division, Bureau of the Census Washington, D.C , U.S.A.
SIZE BIAS IN LINE TRANSECT SAMPLING: A FIELD TEST Mark C. Ott Statistics Research Divisin, Bureau f the Census Washingtn, D.C. 20233, U.S.A. and Kenneth H. Pllck Department f Statistics, Nrth Carlina State
More informationPerturbation approach applied to the asymptotic study of random operators.
Perturbatin apprach applied t the asympttic study f rm peratrs. André MAS, udvic MENNETEAU y Abstract We prve that, fr the main mdes f stchastic cnvergence (law f large numbers, CT, deviatins principles,
More informationPattern Recognition 2014 Support Vector Machines
Pattern Recgnitin 2014 Supprt Vectr Machines Ad Feelders Universiteit Utrecht Ad Feelders ( Universiteit Utrecht ) Pattern Recgnitin 1 / 55 Overview 1 Separable Case 2 Kernel Functins 3 Allwing Errrs (Sft
More informationSAMPLING DYNAMICAL SYSTEMS
SAMPLING DYNAMICAL SYSTEMS Melvin J. Hinich Applied Research Labratries The University f Texas at Austin Austin, TX 78713-8029, USA (512) 835-3278 (Vice) 835-3259 (Fax) hinich@mail.la.utexas.edu ABSTRACT
More informationThe standards are taught in the following sequence.
B L U E V A L L E Y D I S T R I C T C U R R I C U L U M MATHEMATICS Third Grade In grade 3, instructinal time shuld fcus n fur critical areas: (1) develping understanding f multiplicatin and divisin and
More information1 The limitations of Hartree Fock approximation
Chapter: Pst-Hartree Fck Methds - I The limitatins f Hartree Fck apprximatin The n electrn single determinant Hartree Fck wave functin is the variatinal best amng all pssible n electrn single determinants
More informationKinetic Model Completeness
5.68J/10.652J Spring 2003 Lecture Ntes Tuesday April 15, 2003 Kinetic Mdel Cmpleteness We say a chemical kinetic mdel is cmplete fr a particular reactin cnditin when it cntains all the species and reactins
More informationA New Evaluation Measure. J. Joiner and L. Werner. The problems of evaluation and the needed criteria of evaluation
III-l III. A New Evaluatin Measure J. Jiner and L. Werner Abstract The prblems f evaluatin and the needed criteria f evaluatin measures in the SMART system f infrmatin retrieval are reviewed and discussed.
More informationBayesian nonparametric modeling approaches for quantile regression
Bayesian nnparametric mdeling appraches fr quantile regressin Athanasis Kttas Department f Applied Mathematics and Statistics University f Califrnia, Santa Cruz Department f Statistics Athens University
More informationMODULAR DECOMPOSITION OF THE NOR-TSUM MULTIPLE-VALUED PLA
MODUAR DECOMPOSITION OF THE NOR-TSUM MUTIPE-AUED PA T. KAGANOA, N. IPNITSKAYA, G. HOOWINSKI k Belarusian State University f Infrmatics and Radielectrnics, abratry f Image Prcessing and Pattern Recgnitin.
More informationMarginal Conceptual Predictive Statistic for Mixed Model Selection
Open Jurnal f Statistics, 06, 6, 39-53 Published Online April 06 in Sci http://wwwscirprg/jurnal/js http://dxdirg/0436/js0660 Marginal Cnceptual Predictive Statistic fr Mixed Mdel Selectin Cheng Wenren,
More informationPerfrmance f Sensitizing Rules n Shewhart Cntrl Charts with Autcrrelated Data Key Wrds: Autregressive, Mving Average, Runs Tests, Shewhart Cntrl Chart
Perfrmance f Sensitizing Rules n Shewhart Cntrl Charts with Autcrrelated Data Sandy D. Balkin Dennis K. J. Lin y Pennsylvania State University, University Park, PA 16802 Sandy Balkin is a graduate student
More informationCHAPTER 2 Algebraic Expressions and Fundamental Operations
CHAPTER Algebraic Expressins and Fundamental Operatins OBJECTIVES: 1. Algebraic Expressins. Terms. Degree. Gruping 5. Additin 6. Subtractin 7. Multiplicatin 8. Divisin Algebraic Expressin An algebraic
More informationNOTE ON THE ANALYSIS OF A RANDOMIZED BLOCK DESIGN. Junjiro Ogawa University of North Carolina
NOTE ON THE ANALYSIS OF A RANDOMIZED BLOCK DESIGN by Junjir Ogawa University f Nrth Carlina This research was supprted by the Office f Naval Research under Cntract N. Nnr-855(06) fr research in prbability
More informationChapter 2 GAUSS LAW Recommended Problems:
Chapter GAUSS LAW Recmmended Prblems: 1,4,5,6,7,9,11,13,15,18,19,1,7,9,31,35,37,39,41,43,45,47,49,51,55,57,61,6,69. LCTRIC FLUX lectric flux is a measure f the number f electric filed lines penetrating
More informationCOMP 551 Applied Machine Learning Lecture 11: Support Vector Machines
COMP 551 Applied Machine Learning Lecture 11: Supprt Vectr Machines Instructr: (jpineau@cs.mcgill.ca) Class web page: www.cs.mcgill.ca/~jpineau/cmp551 Unless therwise nted, all material psted fr this curse
More informationMargin Distribution and Learning Algorithms
ICML 03 Margin Distributin and Learning Algrithms Ashutsh Garg IBM Almaden Research Center, San Jse, CA 9513 USA Dan Rth Department f Cmputer Science, University f Illinis, Urbana, IL 61801 USA ASHUTOSH@US.IBM.COM
More informationSmoothing, penalized least squares and splines
Smthing, penalized least squares and splines Duglas Nychka, www.image.ucar.edu/~nychka Lcally weighted averages Penalized least squares smthers Prperties f smthers Splines and Reprducing Kernels The interplatin
More informationDifferentiation Applications 1: Related Rates
Differentiatin Applicatins 1: Related Rates 151 Differentiatin Applicatins 1: Related Rates Mdel 1: Sliding Ladder 10 ladder y 10 ladder 10 ladder A 10 ft ladder is leaning against a wall when the bttm
More informationParticle Size Distributions from SANS Data Using the Maximum Entropy Method. By J. A. POTTON, G. J. DANIELL AND B. D. RAINFORD
3 J. Appl. Cryst. (1988). 21,3-8 Particle Size Distributins frm SANS Data Using the Maximum Entrpy Methd By J. A. PTTN, G. J. DANIELL AND B. D. RAINFRD Physics Department, The University, Suthamptn S9
More informationKinematic transformation of mechanical behavior Neville Hogan
inematic transfrmatin f mechanical behavir Neville Hgan Generalized crdinates are fundamental If we assume that a linkage may accurately be described as a cllectin f linked rigid bdies, their generalized
More informationModelling of Clock Behaviour. Don Percival. Applied Physics Laboratory University of Washington Seattle, Washington, USA
Mdelling f Clck Behaviur Dn Percival Applied Physics Labratry University f Washingtn Seattle, Washingtn, USA verheads and paper fr talk available at http://faculty.washingtn.edu/dbp/talks.html 1 Overview
More informationCHAPTER 4 DIAGNOSTICS FOR INFLUENTIAL OBSERVATIONS
CHAPTER 4 DIAGNOSTICS FOR INFLUENTIAL OBSERVATIONS 1 Influential bservatins are bservatins whse presence in the data can have a distrting effect n the parameter estimates and pssibly the entire analysis,
More informationMultiple Source Multiple. using Network Coding
Multiple Surce Multiple Destinatin Tplgy Inference using Netwrk Cding Pegah Sattari EECS, UC Irvine Jint wrk with Athina Markpulu, at UCI, Christina Fraguli, at EPFL, Lausanne Outline Netwrk Tmgraphy Gal,
More informationMODULE 1. e x + c. [You can t separate a demominator, but you can divide a single denominator into each numerator term] a + b a(a + b)+1 = a + b
. REVIEW OF SOME BASIC ALGEBRA MODULE () Slving Equatins Yu shuld be able t slve fr x: a + b = c a d + e x + c and get x = e(ba +) b(c a) d(ba +) c Cmmn mistakes and strategies:. a b + c a b + a c, but
More informationIN a recent article, Geary [1972] discussed the merit of taking first differences
The Efficiency f Taking First Differences in Regressin Analysis: A Nte J. A. TILLMAN IN a recent article, Geary [1972] discussed the merit f taking first differences t deal with the prblems that trends
More informationDavid HORN and Irit OPHER. School of Physics and Astronomy. Raymond and Beverly Sackler Faculty of Exact Sciences
Cmplex Dynamics f Neurnal Threshlds David HORN and Irit OPHER Schl f Physics and Astrnmy Raymnd and Beverly Sackler Faculty f Exact Sciences Tel Aviv University, Tel Aviv 69978, Israel hrn@neurn.tau.ac.il
More informationLim f (x) e. Find the largest possible domain and its discontinuity points. Why is it discontinuous at those points (if any)?
THESE ARE SAMPLE QUESTIONS FOR EACH OF THE STUDENT LEARNING OUTCOMES (SLO) SET FOR THIS COURSE. SLO 1: Understand and use the cncept f the limit f a functin i. Use prperties f limits and ther techniques,
More informationLHS Mathematics Department Honors Pre-Calculus Final Exam 2002 Answers
LHS Mathematics Department Hnrs Pre-alculus Final Eam nswers Part Shrt Prblems The table at the right gives the ppulatin f Massachusetts ver the past several decades Using an epnential mdel, predict the
More information