On Independent Component Analysis


1 On Independent Component Analysis. Université libre de Bruxelles, European Centre for Advanced Research in Economics and Statistics (ECARES), Solvay Brussels School of Economics and Management.

2 Outline

3 IC Model

4 IC Model. In the independent component (IC) model it is assumed that the p-variate random vector
x = Ωz + µ,    (1)
where µ is a location vector, Ω is a full-rank p × p mixing matrix, and z is a p-variate vector with mutually independent components with common median zero.
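
To make the model concrete, here is a minimal simulation sketch in Python/NumPy; the particular source densities (uniform, Laplace, Student t, all with median zero) and the mixing matrix are arbitrary illustrative choices, not something prescribed by the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

n, p = 1000, 3
# Independent sources with common median zero (illustrative choices).
z = np.column_stack([
    rng.uniform(-1, 1, n),
    rng.laplace(0, 1, n),
    rng.standard_t(5, n),
])

Omega = rng.normal(size=(p, p))   # full-rank p x p mixing matrix (full rank a.s.)
mu = np.array([1.0, -2.0, 0.5])   # location vector

x = z @ Omega.T + mu              # observations x_i = Omega z_i + mu, stored as rows
```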

5 Independent Component Analysis. In independent component analysis (ICA), the aim is to find an estimate of an unmixing matrix Γ such that Γx has independent components.


7 ICA is an important and timely research area. The field of applications of ICA is wide and constantly expanding, ranging from biomedical image data and signal processing to economics.

8 Standardization

9 The mixing matrix Ω in Model (1) is clearly not uniquely defined: for any p × p permutation matrix P and any full-rank diagonal matrix D, one can always write
x = [ΩPD][(PD)^{-1}z] + µ = Ω*z* + µ,    (2)
where z* still has independent components with median zero. Solving this identifiability problem requires standardizing either z or the mixing matrix Ω.

10 Location and Scatter Functionals

11 Let x denote a p-variate random vector with cumulative distribution function F_x, and let X = [x_1 ... x_n], where x_1, ..., x_n is a random sample from the distribution F_x.

12 Location and Scatter Functionals. A p × 1 vector-valued functional T(F_x), which is affine equivariant in the sense that T(F_{Ax+b}) = A T(F_x) + b for all nonsingular p × p matrices A and all p-vectors b, is called a location functional.

13 Location and Scatter Functionals. A p × p matrix-valued functional S(F_x), which is positive definite and affine equivariant in the sense that S(F_{Ax+b}) = A S(F_x) A^T for all nonsingular p × p matrices A and all p-vectors b, is called a scatter functional.

14 Location and Scatter Functionals. The corresponding sample statistics are obtained when the functionals are applied to the empirical cumulative distribution function F_n based on a sample x_1, x_2, ..., x_n. The notation T(F_n) and S(F_n), or T(X) and S(X), is used for the sample statistics. The location and scatter sample statistics then also satisfy
T(AX + b1_n^T) = A T(X) + b   and   S(AX + b1_n^T) = A S(X) A^T
for all nonsingular p × p matrices A and all p-vectors b. Scatter matrix functionals are usually standardized so that S(F_x) = I_p in the case of the standard multivariate normal distribution.

15 Location and Scatter Functionals. The first examples of location and scatter functionals are the mean vector and the regular covariance matrix:
T_1(F_x) = E(x)   and   S_1(F_x) = Cov(F_x) = E((x − E(x))(x − E(x))^T).
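
As a quick numerical sanity check of the affine equivariance properties above, here is a small sketch using the sample mean and covariance; the data, the matrix A, and the vector b are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))     # rows are observations x_i^T
A = rng.normal(size=(3, 3))       # nonsingular with probability one
b = np.array([1.0, 2.0, 3.0])

Y = X @ A.T + b                   # transformed sample AX + b 1_n^T (row-wise)

# T(AX + b 1_n^T) = A T(X) + b   and   S(AX + b 1_n^T) = A S(X) A^T
assert np.allclose(Y.mean(axis=0), A @ X.mean(axis=0) + b)
assert np.allclose(np.cov(Y, rowvar=False), A @ np.cov(X, rowvar=False) @ A.T)
```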

16 Location and Scatter Functionals. Location and scatter functionals can also be based on third and fourth moments. A location functional based on third moments is
T_2(F_x) = (1/p) E( (x − E(x))^T Cov(F_x)^{-1} (x − E(x)) x ),
and a scatter matrix functional based on fourth moments is
S_2(F_x) = (1/(p+2)) E( (x − E(x))(x − E(x))^T Cov(F_x)^{-1} (x − E(x))(x − E(x))^T ).
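
A sketch of the corresponding sample statistics, i.e., plain moment estimators of T_2 and S_2 (the function and variable names are mine, and the 1/n convention for the sample covariance is an arbitrary choice):

```python
import numpy as np

def fourth_moment_functionals(X):
    """Sample versions of T_2 (third moments) and S_2 (fourth moments)."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)                        # centred observations
    cov = Xc.T @ Xc / n                            # sample covariance (1/n version)
    # squared Mahalanobis distances r2_i = (x_i - xbar)' cov^{-1} (x_i - xbar)
    r2 = np.einsum('ij,jk,ik->i', Xc, np.linalg.inv(cov), Xc)

    T2 = (r2[:, None] * X).mean(axis=0) / p        # (1/p) E[(x-Ex)' Cov^{-1} (x-Ex) x]
    S2 = (Xc * r2[:, None]).T @ Xc / (n * (p + 2)) # (1/(p+2)) E[r^2 (x-Ex)(x-Ex)']
    return T2, S2
```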

17 Location and Scatter Functionals. There are several other location and scatter functionals, even families of them, with different desirable properties (robustness, efficiency, limiting multivariate normality, fast computation, etc.).

18 Location and Scatter Functionals. If a scatter matrix functional S(F_x) is a diagonal matrix for all x having independent components, it is said to possess the independence property.

19 Location and Scatter Functionals. The regular covariance matrix is a scatter matrix with the independence property. Another example of a scatter matrix with the independence property is the scatter matrix based on fourth moments.

20 Location and Scatter Functionals. Most scatter functionals possess the independence property only if all the components (or all the components except one) are symmetric. However, every scatter/shape matrix functional S(F_x) can be symmetrized by setting S_sym(F_x) = S(F_{x_1 − x_2}), where x_1 and x_2 are independent random vectors having the same cumulative distribution function F_x. The resulting symmetrized scatter matrix always has the independence property.
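
Given a sample, a symmetrized scatter statistic can be computed by applying the original statistic to pairwise differences, as in the following sketch; using all O(n²) pairs is the simplest but not the cheapest option, and `scatter` stands for any scatter statistic (e.g., the fourth-moment one sketched above).

```python
import numpy as np

def symmetrize_scatter(X, scatter):
    """Apply a scatter statistic to all pairwise differences x_i - x_j, i < j."""
    n, p = X.shape
    i, j = np.triu_indices(n, k=1)
    D = X[i] - X[j]        # pairwise differences; their distribution is symmetric
    return scatter(D)

# Example: symmetrized covariance (approximately 2 * Cov of the original data).
# S_sym = symmetrize_scatter(X, lambda D: np.cov(D, rowvar=False))
```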

21 Back to the standardization of the IC model

22 The vector z in Model (1) can be standardized using two different location functionals and two different scatter matrix functionals.

23 The marginal distributions of z in Model (1) can be standardized using two different location functionals T_1 and T_2 and two different scatter functionals S_1 and S_2 possessing the independence property, by setting
T_1(F_z) = 0,   S_1(F_z) = I_p,   T_2(F_z) = δ   and   S_2(F_z) = D,
where δ is a p-vector with components δ_i ≥ 0, i = 1, ..., p, and D is a diagonal matrix with diagonal elements d_1 ≥ ... ≥ d_p > 0. If now δ_i > 0, i = 1, ..., p, and the diagonal elements of D are distinct, then the mixing matrix Ω is uniquely defined.

24 Standardizing the Mixing Matrix. The mixing matrix Ω in Model (1) can be standardized by fixing the order, signs, and scales of the column vectors of Ω.

25 Standardizing the Mixing Matrix. The IC model (1) can also be standardized by standardizing the mixing matrix using the mapping
Ω ↦ L = Ω D_1^+ P D_2,
where D_1^+ is the positive definite diagonal matrix that makes each column of Ω D_1^+ have Euclidean norm one, P is the permutation matrix for which the matrix B = (b_ij) = Ω D_1^+ P satisfies b_ii > b_ij for all i < j, and D_2 is the diagonal matrix that makes all the diagonal entries of L = Ω D_1^+ P D_2 equal to one. Ties may be handled, e.g., by basing the ordering on subsequent rows of B above, but they may prevent the mapping from being continuous. It is thus often convenient to restrict to the collection of mixing matrices Ω for which no ties occur in the permutation step.
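
A sketch of this standardization map in code. The ordering rule for P is stated only loosely above, so the sketch simply chooses the permutation that maximizes the absolute diagonal entries (via a linear assignment) and then lets D_2 make the diagonal of L equal to one; this tie-breaking convention is my assumption, not necessarily the exact rule on the slide.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def standardize_mixing(Omega):
    """Map Omega -> L = Omega D1 P D2: unit-norm columns, a fixed column order,
    and unit diagonal (which also fixes signs and scales)."""
    B = Omega / np.linalg.norm(Omega, axis=0)   # D1: unit Euclidean norm columns
    # P: reorder columns so that the total absolute diagonal is maximized
    _, col = linear_sum_assignment(-np.abs(B))
    B = B[:, col]
    # D2: rescale each column so that the diagonal entries of L all equal one
    return B / np.diag(B)
```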

26 Both standardization approaches have their advantages and disadvantages, but the key point is that each of the methods presented above fixes Model (1) uniquely.


28 The lack of uniqueness of Model (1) causes some ambiguity about what is meant by an IC functional.

29 Let M denote the set of all full-rank p × p matrices. (Then naturally all unmixing matrices Γ ∈ M.) Let P denote a permutation matrix, J a sign-change matrix, and D a scaling matrix, and let
C = {C ∈ M : C = PJD for some P, J, and D}.
Two matrices Γ_1 and Γ_2 are said to be equivalent if Γ_1 = CΓ_2 for some C ∈ C. We then write Γ_1 ∼ Γ_2.

30 A functional Γ(F_x) ∈ M is an IC functional in the IC model (1) if Γ(F_x)Ω ∼ I_p and if it is affine equivariant in the sense that
Γ(F_{Ax}) = Γ(F_x) A^{-1}
for all A ∈ M.

31 Approach Based on Two Scatter Matrices

32 Approach based on the use of two scatter matrices. Let S_1(F_x) and S_2(F_x) denote two different scatter functionals with the independence property. The IC functional Γ(F_x) based on the scatter matrix functionals S_1(F_x) and S_2(F_x) is defined as a solution of the equations
Γ S_1(F_x) Γ^T = I_p   and   Γ S_2(F_x) Γ^T = Λ,
where Λ = Λ(F_x) is a diagonal matrix with diagonal elements λ_1 ≥ ... ≥ λ_p > 0.

33 Approach Based on Two Scatter Matrices. One of the first solutions to the ICA problem, the fourth-order blind identification (FOBI) functional, is obtained when the scatter functionals S_1(F_x) and S_2(F_x) are the scatter matrices based on the second and fourth moments, respectively.
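
A sketch of the resulting FOBI estimate, i.e., the two-scatter solution with S_1 the covariance matrix and S_2 the fourth-moment scatter, computed via whitening followed by an eigendecomposition; this is one standard way of solving the pair of estimating equations, written from scratch rather than taken from any particular package.

```python
import numpy as np

def fobi_unmixing(X):
    """Two-scatter IC estimate with S1 = covariance, S2 = fourth-moment scatter."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S1 = Xc.T @ Xc / n

    # Whiten with S1 (symmetric inverse square root): W S1 W^T = I_p
    vals, vecs = np.linalg.eigh(S1)
    W = vecs @ np.diag(vals ** -0.5) @ vecs.T
    Y = Xc @ W.T

    # Fourth-moment scatter of the whitened data (its sample covariance is I_p)
    r2 = np.sum(Y ** 2, axis=1)
    S2 = (Y * r2[:, None]).T @ Y / (n * (p + 2))

    # Rotation diagonalizing S2: Gamma S1(X) Gamma^T = I, Gamma S2(X) Gamma^T = Lambda
    lam, U = np.linalg.eigh(S2)
    order = np.argsort(lam)[::-1]          # eigenvalues in decreasing order
    Gamma = U[:, order].T @ W
    return Gamma, lam[order]
```

For data from the IC model sketch above, `fobi_unmixing(x)[0] @ Omega` should then be close to a matrix of the form PJD, provided the kurtoses of the independent components are distinct.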

34 Approach Based on Two Scatter Matrices. The functionals and the corresponding sample statistics G(X) and L(X) are affine equivariant and invariant in the sense that
G(AX + b1_n^T) = G(X) A^{-1}   and   L(AX + b1_n^T) = L(X)
for all A ∈ M and b ∈ R^p. For the asymptotics it is therefore not a restriction to assume that X is a random sample from a distribution F_x with S_1(F_x) = I_p and S_2(F_x) = Λ, where the diagonal elements of Λ are λ_1 ≥ ... ≥ λ_p > 0.

35 Approach Based on Two Scatter Matrices. Assume that √n(S_1(X) − I_p) = O_p(1) and √n(S_2(X) − Λ) = O_p(1), with λ_1 > ... > λ_p > 0, and assume that the diagonal elements of G(X) are set to be positive. Then
√n(G(X)_{ii} − 1) = −(1/2) √n(S_1(X)_{ii} − 1) + o_p(1),
(λ_i − λ_j) √n G(X)_{ij} = √n S_2(X)_{ij} − λ_i √n S_1(X)_{ij} + o_p(1),   i ≠ j,
and
√n(L(X)_{ii} − λ_i) = √n(S_2(X)_{ii} − λ_i) − λ_i √n(S_1(X)_{ii} − 1) + o_p(1).

36 Approach Based on Two Scatter Matrices. It is interesting to note that the asymptotic behavior of the diagonal elements of G(X) does not depend on S_2(X) at all. The three equations above in fact hold whenever λ_i is distinct from all the other eigenvalues λ_j, j ≠ i. The limiting joint distribution of the sample eigenvectors and sample eigenvalues corresponding to a subset of distinct population eigenvalues can then be derived from the limiting distributions of S_1(X) and S_2(X).

37 Signed Ranks

38 Symmetric IC Model. In the symmetric IC model it is assumed that the p-variate vector
x = Ωz + µ,    (3)
where Ω is a full-rank p × p mixing matrix, µ is a location vector, and z is a p-variate vector with mutually independent and symmetrically distributed components.

39 Signed Ranks. The parametrization of the symmetric IC model (3) based on standardizing the mixing matrix leads to considering the model associated with
x = Lz + µ,    (4)
where µ ∈ R^p, L ∈ M, and z has independent and symmetrically distributed marginals with common median zero. The resulting collection of densities (of the form h(z) = ∏_{r=1}^p h_r(z_r), where h_r is the symmetric density of z_r) will be denoted by F.

40 Signed Ranks. The hypothesis under which n mutually independent observations x_i, i = 1, ..., n, are obtained from (4), where z has density h, will be denoted P^{(n)}_{ϑ,h}, with ϑ = (µ^T, (vecd L)^T)^T ∈ Θ = R^p × vecd(M), or alternatively P^{(n)}_{µ,L,h}. This leads to the semiparametric model
P^{(n)} = ∪_h P^{(n)}_h = ∪_h ∪_{ϑ∈Θ} {P^{(n)}_{ϑ,h}}.

41 Assumptions. As usual, ULAN at some specific g = f requires technical assumptions: in the present context, we need that f belongs to the collection F_ulan of densities in F for which each f_r, r = 1, ..., p, is absolutely continuous, with a derivative f'_r that satisfies (below we let ϕ_{f_r} = −f'_r / f_r)
σ²_{f_r} = ∫ y² f_r(y) dy < ∞,   I_{f_r} = ∫ ϕ²_{f_r}(y) f_r(y) dy < ∞,   and   J_{f_r} = ∫ y² ϕ²_{f_r}(y) f_r(y) dy < ∞.

42 For any f ∈ F_ulan, we let γ_rs(f) = I_{f_r} σ²_{f_s}, we define the optimal p-variate location score function ϕ_f : R^p → R^p through z = (z_1, ..., z_p)^T ↦ ϕ_f(z) = (ϕ_{f_1}(z_1), ..., ϕ_{f_p}(z_p))^T, and we denote by I_f the diagonal matrix with diagonal entries I_{f_r}, r = 1, ..., p. Further, we write I_l for the l-dimensional identity matrix and we define
C = Σ_{r=1}^p Σ_{s=1}^{p−1} (e_r e_r^T) ⊗ (u_s e_{s+δ_{s≥r}}^T),
where e_r and u_s stand for the rth and sth vectors of the canonical bases of R^p and R^{p−1}, respectively, and δ_{s≥r} is equal to one if s ≥ r and to zero otherwise.

43 ULAN of the symmetric IC model. Then the parametric model P^{(n)}_f is ULAN for any fixed f ∈ F_ulan, with central sequence Δ^{(n)}_{ϑ,f} = (Δ^{(n)}_{ϑ,f;1}, Δ^{(n)}_{ϑ,f;2}), where
Δ^{(n)}_{ϑ,f;1} = n^{-1/2} (L^{-1})^T Σ_{i=1}^n ϕ_f(Z_i)   and   Δ^{(n)}_{ϑ,f;2} = n^{-1/2} C (I_p ⊗ L^{-1})^T Σ_{i=1}^n vec(ϕ_f(Z_i) Z_i^T − I_p),
with Z_i = Z_i(ϑ) = L^{-1}(X_i − µ), and full-rank block-diagonal information matrix Γ_{L,f} = diag(Γ_{L,f;1}, Γ_{L,f;2}), where Γ_{L,f;1} = (L^{-1})^T I_f L^{-1} and
Γ_{L,f;2} = C (I_p ⊗ L^{-1})^T [ Σ_{r=1}^p (J_{f_r} − 1)(e_r e_r^T ⊗ e_r e_r^T) + Σ_{r,s=1, r≠s}^p ( γ_{sr}(f)(e_r e_r^T ⊗ e_s e_s^T) + (e_r e_s^T ⊗ e_s e_r^T) ) ] (I_p ⊗ L^{-1}) C^T.

44 Efficient inference. The ULAN property allows one to derive parametric efficiency bounds at f and to construct the corresponding parametrically optimal inference procedures for the parameter. In the present context, when testing H_0: L = L_0 against H_a: L ≠ L_0, parametrically optimal tests reject the null hypothesis at asymptotic level α whenever
(Δ_{ϑ,f;2})^T Γ_{L_0,f;2}^{-1} Δ_{ϑ,f;2} > χ²_{p(p−1),1−α},
where χ²_{k,1−α} denotes the α-upper quantile of the χ²_k distribution.

45 Under local alternatives. Under local alternatives of the form H_a: L = L_0 + n^{-1/2}H, where H is an arbitrary p × p matrix with zero diagonal entries, these tests have asymptotic power
1 − Ψ_{p(p−1)}( χ²_{p(p−1),1−α} ; (vecd H)^T Γ_{L_0,f;2} (vecd H) ),
where χ²_{k,1−α} stands for the α-upper quantile of the χ²_k distribution and Ψ_k(·; δ) denotes the cumulative distribution function of the noncentral χ²_k distribution with noncentrality parameter δ. This settles the parametrically optimal (at f) performance for hypothesis testing.

46 Semiparametrically efficient inference. The underlying density f is often unspecified in practice, which leads to considering the semiparametric model. Semiparametrically efficient (at f) inference procedures on L may then be based on the so-called efficient central sequence Δ*_{ϑ,f;2}, resulting from Δ_{ϑ,f;2} by performing adequate tangent space projections.

47 Under local alternatives. The performance of semiparametrically efficient tests on L can be characterized in terms of Γ*_{L,f;2}: a test of H_0: L = L_0 is semiparametrically efficient at f (at asymptotic level α) if its asymptotic power under local alternatives of the form H_a: L = L_0 + n^{-1/2}H is given by
1 − Ψ_{p(p−1)}( χ²_{p(p−1),1−α} ; (vecd H)^T Γ*_{L_0,f;2} (vecd H) ).

48 Testing. We first consider the problem of testing H_0: L = L_0 against H_a: L ≠ L_0, where L_0 is fixed. Semiparametrically optimal procedures are based on the efficient central sequence Δ*_{ϑ,f}. Classically, Δ*_{ϑ,f} is obtained by performing tangent space computations. When, however, the semiparametric model at hand enjoys a strong invariance structure, the efficient central sequence Δ*_{ϑ,f} can alternatively be obtained by conditioning the original central sequence Δ_{ϑ,f} with respect to the corresponding maximal invariant.

49 Signed ranks. In the present setup, this maximal invariant is given by (S_1(ϑ), ..., S_n(ϑ), R⁺_1(ϑ), ..., R⁺_n(ϑ)), with S_i(ϑ) = (S_{i1}(ϑ), ..., S_{ip}(ϑ))^T and R⁺_i(ϑ) = (R⁺_{i1}(ϑ), ..., R⁺_{ip}(ϑ))^T, where S_{ir}(ϑ) is the sign of Z_{ir}(ϑ) = (L^{-1}(X_i − µ))_r and R⁺_{ir}(ϑ) is the rank of |Z_{ir}(ϑ)| among |Z_{1r}(ϑ)|, ..., |Z_{nr}(ϑ)|. This is what leads to considering signed-rank procedures when performing inference on L in the present context.
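
Computing this maximal invariant from the residuals Z_i(ϑ) is straightforward; a small sketch (the helper name is mine, and the ranks are taken componentwise among the absolute values within each coordinate):

```python
import numpy as np
from scipy.stats import rankdata

def signs_and_ranks(X, mu, L):
    """Componentwise signs and ranks of |Z_ir| for Z_i = L^{-1}(X_i - mu)."""
    Z = (X - mu) @ np.linalg.inv(L).T   # rows are Z_i(theta)^T
    S = np.sign(Z)                      # S_ir = sign of Z_ir
    R = rankdata(np.abs(Z), axis=0)     # R+_ir = rank of |Z_ir| among |Z_1r|,...,|Z_nr|
    return S, R
```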

50 Signed-rank testing in symmetric IC models. Let ϑ̂_0 = (µ̂^T, (vecd L_0)^T)^T, where µ̂ is an estimator that is locally and asymptotically discrete and root-n consistent under H_0. Then one can show that the nonparametric (signed-rank) counterpart of the test statistic is given by
Q_f = (Δ̃_{ϑ̂_0,f;2})^T (Γ̃_{L_0,f;2})^{-1} Δ̃_{ϑ̂_0,f;2},
where
Δ̃_{ϑ,f;2} = n^{-1/2} C (I_p ⊗ L^{-1})^T Σ_{i=1}^n vec( odiag( (S_i(ϑ) ⊙ ϕ_f(F_+^{-1}(R⁺_i(ϑ)/(n+1)))) (S_i(ϑ) ⊙ F_+^{-1}(R⁺_i(ϑ)/(n+1)))^T ) )
and
Γ̃_{L,f;2} = C (I_p ⊗ L^{-1})^T [ Σ_{r,s=1, r≠s}^p ( γ_{sr}(f)(e_r e_r^T ⊗ e_s e_s^T) + (e_r e_s^T ⊗ e_s e_r^T) ) ] (I_p ⊗ L^{-1}) C^T;
here ⊙ denotes the componentwise product, F_{+r} is the cumulative distribution function of |Z_r| under f_r, and ϕ_f and F_+^{-1} act componentwise.

51 Linear hypothesis. Assume that Ω is a p(p−1) × l matrix with full rank l. Let V(Ω) denote the vector space spanned by the columns of Ω. We consider testing
H_0: vecd L ∈ {vecd L_0 + v : v ∈ V(Ω)}   against   H_a: vecd L ∉ {vecd L_0 + v : v ∈ V(Ω)}.

52 Test statistic. Let
Q_{ϑ,f}(L_0, Ω) = (Δ̃_{ϑ,f;2})^T P_{ϑ,Ω} Δ̃_{ϑ,f;2},
where
P_{ϑ,Ω} = (Γ̃_{L,f;2})^{-1} − Ω (Ω^T Γ̃_{L,f;2} Ω)^{-1} Ω^T.

53 One-Step Estimation Based on Signed Ranks. Let ϑ̃ = (µ̃^T, (vecd L̃)^T)^T denote a root-n consistent and locally asymptotically discrete preliminary estimator. Let
G_{L,f,h;2} = C (I_p ⊗ L^{-1})^T [ Σ_{r,s=1, r≠s}^p ( γ_{sr}(f,h)(e_r e_r^T ⊗ e_s e_s^T) + ρ_{rs}(f,h)(e_r e_s^T ⊗ e_s e_r^T) ) ] (I_p ⊗ L^{-1}) C^T,
where
γ_{rs}(f,h) = ∫ ϕ_{f_r}(F_r^{-1}(u)) ϕ_{h_r}(H_r^{-1}(u)) du · ∫ F_s^{-1}(u) H_s^{-1}(u) du
and
ρ_{rs}(f,h) = ∫ F_r^{-1}(u) ϕ_{h_r}(H_r^{-1}(u)) du · ∫ ϕ_{f_s}(F_s^{-1}(u)) H_s^{-1}(u) du.

54 and let Ĝ_{L,f;2} denote an estimate of G_{L,f,h;2} formed by plugging in the preliminary estimator ϑ̃ and estimators γ̂_{rs}(f) and ρ̂_{rs}(f) that (i) are locally asymptotically discrete and (ii) satisfy γ̂_{rs}(f) = γ_{rs}(f,h) + o_P(1) and ρ̂_{rs}(f) = ρ_{rs}(f,h) + o_P(1) as n → ∞, under ∪_{ϑ∈Θ} ∪_{h∈F_ulan} {P^{(n)}_{ϑ,h}}.

55 Signed Ranks. Let
vecd L̂_f = vecd L̃ + n^{-1/2} (Ĝ_{L,f;2})^{-1} Δ̃_{ϑ̃,f;2},
where Ĝ_{L,f;2} is the consistent estimate of G_{L,f,h;2} just defined. Then
√n vecd(L̂_f − L) →_d N_{p(p−1)}(0, (Γ*_{L,f;2})^{-1})
as n → ∞, under ∪_{µ∈R^p} {P^{(n)}_{µ,L,f}}.


57 Due to the vast number of different ICA estimates and algorithms, asymptotic as well as finite-sample criteria are needed for their comparison. While asymptotic results (convergence, asymptotic normality, etc.) are often missing, several finite-sample performance indices have been proposed in the literature for comparing different estimates in simulation studies.

58 First, one can compare the true sources z (which are of course known in simulations) with the estimated sources ẑ = Γ̂x. Second, one can measure the closeness of the true unmixing matrix Ω^{-1} (used in the simulations) and the estimated unmixing matrix Γ̂. In both cases the problem is that the order, signs, and scales of the rows of the estimated unmixing matrix may not match, as Γ̂ is typically not an estimate of Ω^{-1} itself. For a good estimate, the gain matrix Ĝ = Γ̂Ω is close to a matrix PJD, where P is a permutation matrix, J is a sign-change matrix, and D is a scaling matrix.

59 Let A denote a p × p matrix. The shortest squared distance (divided by p − 1) between the set {CA : C ∈ C} of matrices equivalent to A and the identity matrix I_p is given by
D²(A) = (1/(p−1)) inf_{C∈C} ||CA − I_p||²,
where ||·|| is the matrix (Frobenius) norm.

60 Let A be any p × p matrix having at least one nonzero element in each row. The shortest squared distance D²(A) fulfils the following four conditions:
1. 1 ≥ D²(A) ≥ 0,
2. D²(A) = 0 if and only if A ∼ I_p,
3. D²(A) = 1 if and only if A ∼ 1_p a^T for some p-vector a, and
4. the function c ↦ D²(I_p + c · odiag(A)) is increasing in c ∈ [0, 1] for all matrices A such that A²_{ij} ≤ 1, i ≠ j.

61 The shortest distance between the identity matrix and the set of matrices {CΓ̂Ω : C ∈ C} equivalent to the gain matrix Ĝ = Γ̂Ω is as given in the following. The minimum distance index for Γ̂ is
D̂ = D(Γ̂Ω) = (1/√(p−1)) inf_{C∈C} ||CΓ̂Ω − I_p||.
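
The infimum can be computed exactly: assigning row j of the gain matrix to position i and minimizing over that row's scale gives the cost 1 − G²_{ji} / ||G_{j·}||², and the optimal permutation is then a linear assignment problem. A sketch of this reduction (my own implementation, not taken from any package):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def md_index(Gamma_hat, Omega):
    """Minimum distance index D(Gamma_hat @ Omega)."""
    G = Gamma_hat @ Omega                    # gain matrix, should be close to PJD
    p = G.shape[0]
    # cost[j, i] = min over scale c of ||c * G[j, :] - e_i||^2 = 1 - G[j, i]^2 / ||G[j, :]||^2
    cost = 1.0 - G ** 2 / np.sum(G ** 2, axis=1, keepdims=True)
    rows, cols = linear_sum_assignment(cost)  # optimal permutation (Hungarian algorithm)
    d2 = cost[rows, cols].sum() / (p - 1)
    return float(np.sqrt(max(d2, 0.0)))
```

Combined with the FOBI sketch above, `md_index(fobi_unmixing(x)[0], Omega)` returns a value in [0, 1] that is small when Ω^{-1} is recovered up to order, signs, and scales.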

62 It follows directly that 1 ≥ D̂ ≥ 0, and D̂ = 0 if and only if Γ̂ ∼ Ω^{-1}. The worst case, D̂ = 1, is obtained if all the row vectors of Γ̂Ω point in the same direction. Thus the value of the minimum distance index is easy to interpret. Note that D(Γ̂Ω) = D(CΓ̂Ω) for all C ∈ C. Also, if x_i = Ωz_i and x*_i = (AΩ)z_i = Ω*z_i, and Γ̂* is calculated from X* = [x*_1, ..., x*_n], then D(Γ̂*Ω*) = D(Γ̂Ω). Thus the minimum distance index provides a fair comparison of different estimates.

63 Assume that the model is fixed such that Γ(F_x) = Ω = I_p and that √n vec(Γ̂ − I_p) →_d N_{p²}(0, Σ). Then
n D̂² = (n/(p−1)) ||odiag(Γ̂)||² + o_P(1),
and the limiting distribution of n D̂² is that of (p−1)^{-1} Σ_{i=1}^k δ_i χ²_i, where χ²_1, ..., χ²_k are independent chi-squared variables with one degree of freedom and δ_1, ..., δ_k are the k nonzero eigenvalues (including all algebraic multiplicities) of
ASCOV(√n vec(odiag(Γ̂))) = (I_{p²} − D_{p,p}) Σ (I_{p²} − D_{p,p}),
with D_{p,p} = Σ_{i=1}^p (e_i e_i^T) ⊗ (e_i e_i^T).
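
Given the asymptotic covariance matrix Σ of √n vec(Γ̂ − I_p), the weights δ_i of this limiting distribution can be computed directly from the formula above; a small sketch (the function name and the numerical tolerance are mine):

```python
import numpy as np

def md_limit_weights(Sigma, p, tol=1e-10):
    """Nonzero eigenvalues delta_1, ..., delta_k of (I - D_pp) Sigma (I - D_pp);
    the limit of n * D_hat^2 is then sum_i delta_i * chi2_1 / (p - 1)."""
    D_pp = np.zeros((p * p, p * p))
    for i in range(p):
        E = np.zeros((p, p))
        E[i, i] = 1.0
        D_pp += np.kron(E, E)                  # sum_i (e_i e_i') kron (e_i e_i')
    M = np.eye(p * p) - D_pp                   # keeps the off-diagonal entries of vec
    ascov = M @ Sigma @ M                      # ASCOV(sqrt(n) vec(odiag(Gamma_hat)))
    eig = np.linalg.eigvalsh((ascov + ascov.T) / 2)
    return eig[eig > tol]
```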


65 Asymptotics for different scatter matrices, complex-valued ICA, time series, ...

66 References I
P. Ilmonen, On asymptotical properties of the scatter matrix based estimates for complex valued independent component analysis, submitted.
P. Ilmonen, J. Nevalainen and H. Oja, Characteristics of multivariate distributions and the invariant coordinate system, Statistics and Probability Letters 80(23-24) (2010).
P. Ilmonen, K. Nordhausen, H. Oja and E. Ollila, On asymptotics of ICA estimators and their performance indices, submitted.
P. Ilmonen and D. Paindaveine, Semiparametrically efficient inference based on signed ranks in symmetric independent component models, The Annals of Statistics 39(5) (2011).
P. Ilmonen and D. Paindaveine, Signed rank tests in symmetric IC models, manuscript.

67 References II
P. J. Bickel, C. A. J. Klaassen, Y. Ritov and J. A. Wellner, Efficient and Adaptive Statistical Inference for Semiparametric Models, Johns Hopkins University Press, Baltimore (1993).
L. Le Cam, Asymptotic Methods in Statistical Decision Theory, Springer-Verlag, New York (1986).
M. Hallin and B. J. M. Werker, Semiparametric efficiency, distribution-freeness, and invariance, Bernoulli 9 (2003).
H. Oja, D. Paindaveine and S. Taskinen, Parametric and nonparametric tests for multivariate independence in IC models, submitted.
E. Ollila and H.-J. Kim, On testing hypotheses of mixing vectors in the ICA model using FastICA, Proceedings of the IEEE International Symposium on Biomedical Imaging (ISBI'11) (2011).

68 References III
A. Hyvärinen, J. Karhunen and E. Oja, Independent Component Analysis, John Wiley & Sons, New York (2001).
H. Oja, Multivariate Nonparametric Methods With R, Springer-Verlag, New York (2010).

69 Thank You!

70 An example. The first specific testing problem we consider is testing whether element k of vecd L equals some fixed c_0. Let o_i and v_i stand for the ith vectors of the canonical bases of R^{p(p−1)} and R^{p(p−1)−1}, respectively. Now Ω can be chosen to be the p(p−1) × (p(p−1)−1) matrix having the canonical basis vectors of R^{p(p−1)}, excluding the kth basis vector, as its column vectors, i.e., Ω_i = o_i for i < k and Ω_i = o_{i+1} for i ≥ k, and vecd L_0 can be chosen to have c_0 as its kth element and all other elements equal to 0. Here trace(P_{ϑ,Ω} Γ̃_{L,f;2}) = 1.

71 An example. Testing whether the rth column vector of L equals some fixed c_0 is equivalent to testing whether elements ((r−1)(p−1)+1) to r(p−1) of vecd L are fixed. Let o_i and w_i stand for the ith vectors of the canonical bases of R^{p(p−1)} and R^{(p−1)²}, respectively. Now Ω can be chosen to be the p(p−1) × (p−1)² matrix having the canonical basis vectors of R^{p(p−1)}, excluding the basis vectors ((r−1)(p−1)+1) to r(p−1), as its column vectors, i.e., Ω_i = o_i for i < (r−1)(p−1)+1 and Ω_i = o_{i+(p−1)} for i ≥ (r−1)(p−1)+1, and vecd L_0 can be chosen to contain the elements of c_0 (except the diagonal element) in these positions and all other elements equal to 0. Here trace(P_{ϑ,Ω} Γ̃_{L,f;2}) = p − 1.

72 ULAN, ULAN, ULAN...

73 ULAN. A sequence of statistical models P^{(n)}_f = {P^{(n)}_{ϑ,f} : ϑ ∈ Θ ⊂ R^k}, f ∈ F, is uniformly locally asymptotically normal (ULAN) if for any ϑ_n = ϑ + O(n^{-1/2}) and any bounded sequence (τ_n), there exists a symmetric positive definite matrix G_{ϑ,f} such that, under P^{(n)}_{ϑ,f} as n → ∞,
log( dP^{(n)}_{ϑ_n + n^{-1/2}τ_n, f} / dP^{(n)}_{ϑ_n, f} ) = τ_n^T Δ^{(n)}_{ϑ_n,f} − (1/2) τ_n^T G_{ϑ,f} τ_n + o_P(1),
and, still under P^{(n)}_{ϑ,f}, the central sequence Δ^{(n)}_{ϑ_n,f} is asymptotically normal with mean zero and covariance matrix G_{ϑ,f}.

74 ULAN. The ULAN property allows one to derive parametric efficiency bounds at f and to construct the corresponding parametrically optimal inference procedures for ϑ. When testing H_0: ϑ = ϑ_0 against H_a: ϑ ≠ ϑ_0, parametrically optimal tests reject the null hypothesis at asymptotic level α whenever
(Δ^{(n)}_{ϑ_0,f})^T G_{ϑ_0,f}^{-1} Δ^{(n)}_{ϑ_0,f} > χ²_{k,1−α},
where χ²_{k,1−α} denotes the α-upper quantile of the χ²_k distribution. Under sequences of local alternatives of the form P^{(n)}_{ϑ_0 + n^{-1/2}τ, f}, these tests have asymptotic power
1 − Ψ_k(χ²_{k,1−α}; τ^T G_{ϑ_0,f} τ),
where Ψ_k(·; δ) stands for the cumulative distribution function of the noncentral χ²_k distribution with noncentrality parameter δ. This settles the parametrically optimal (at f) performance for hypothesis testing.

75 ULAN. As for point estimation, an estimator ϑ̂ is parametrically efficient at f if and only if
√n(ϑ̂ − ϑ) →_d N_k(0, G_{ϑ,f}^{-1}).

76 ULAN. The underlying density f is often unspecified in practice, which leads to considering the semiparametric model P^{(n)} = ∪_h ∪_{ϑ∈Θ} {P^{(n)}_{ϑ,h}}. In P^{(n)}, semiparametrically optimal (still at f) inference procedures are based on the efficient central sequence Δ*^{(n)}_{ϑ,f}, resulting from the original central sequence Δ^{(n)}_{ϑ,f} by performing adequate tangent space projections. Under P^{(n)}_{ϑ,f}, the efficient central sequence Δ*^{(n)}_{ϑ,f} typically is still asymptotically normal with mean zero, but now with covariance matrix G*_{ϑ,f} (the efficient information matrix at f). Semiparametrically optimal tests (at f) reject the null hypothesis at asymptotic level α whenever
(Δ*^{(n)}_{ϑ_0,f})^T (G*_{ϑ_0,f})^{-1} Δ*^{(n)}_{ϑ_0,f} > χ²_{k,1−α}.
They have asymptotic power 1 − Ψ_k(χ²_{k,1−α}; τ^T G*_{ϑ_0,f} τ) under the sequences of local alternatives considered above.

77 ULAN. An estimator ϑ̂ is semiparametrically efficient at f if and only if
√n(ϑ̂ − ϑ) →_d N_k(0, (G*_{ϑ,f})^{-1}).
