On Independent Component Analysis


Université libre de Bruxelles, European Centre for Advanced Research in Economics and Statistics (ECARES), Solvay Brussels School of Economics and Management

IC Model

In the independent component (IC) model, it is assumed that the $p$-variate random vector $x$ satisfies

$$x = \Omega z + \mu, \qquad (1)$$

where $\mu$ is a location vector, $\Omega$ is a full-rank $p \times p$ mixing matrix, and $z$ is a $p$-variate vector with mutually independent components with common median zero.

Independent Component Analysis

In independent component analysis (ICA), the aim is to find an estimate of an unmixing matrix $\Gamma$ such that $\Gamma x$ has independent components.

ICA is an important and timely research area. Its field of applications is wide and constantly expanding, ranging from biomedical image data to signal processing and economics.

Standardization

The mixing matrix $\Omega$ in Model (1) is clearly not uniquely defined: for any $p \times p$ permutation matrix $P$ and any full-rank diagonal matrix $D$, one can always write

$$x = [\Omega P D]\,[(PD)^{-1} z] + \mu = \Omega^{*} z^{*} + \mu, \qquad (2)$$

where $z^{*}$ still has independent components with median zero. Solving this identifiability problem requires standardizing either $z$ or the mixing matrix $\Omega$.
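The ambiguity in (2) can be checked numerically. The following is a minimal sketch, assuming NumPy; the particular mixing matrix, location vector, permutation, and diagonal matrix are arbitrary illustrative choices, not values from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 3, 5

Omega = rng.normal(size=(p, p))      # hypothetical mixing matrix (full rank with probability one)
mu = rng.normal(size=(p, 1))         # hypothetical location vector
z = rng.laplace(size=(p, n))         # independent sources with median zero

P = np.eye(p)[:, [2, 0, 1]]          # an arbitrary permutation matrix
D = np.diag([2.0, -0.5, 3.0])        # an arbitrary full-rank diagonal matrix

Omega_star = Omega @ P @ D           # alternative mixing matrix
z_star = np.linalg.inv(P @ D) @ z    # permuted and rescaled sources, still independent

x = Omega @ z + mu
x_star = Omega_star @ z_star + mu

# Both parametrizations generate exactly the same observations.
same_data = np.allclose(x, x_star)
different_mixing = not np.allclose(Omega, Omega_star)
```

Two different mixing matrices produce identical data, which is why some standardization is needed before $\Omega$ can be estimated.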

Location and Scatter Functionals

Let $x$ denote a $p$-variate random vector with cumulative distribution function $F_x$, and let $X = [x_1 \cdots x_n]$, where $x_1, \ldots, x_n$ is a random sample from the distribution $F_x$.

A $p \times 1$ vector-valued functional $T(F_x)$ that is affine equivariant in the sense that

$$T(F_{Ax+b}) = A\,T(F_x) + b$$

for all nonsingular $p \times p$ matrices $A$ and all $p$-vectors $b$ is called a location functional.

A $p \times p$ matrix-valued functional $S(F_x)$ that is positive definite and affine equivariant in the sense that

$$S(F_{Ax+b}) = A\,S(F_x)\,A^{T}$$

for all nonsingular $p \times p$ matrices $A$ and all $p$-vectors $b$ is called a scatter functional.

The corresponding sample statistics are obtained by applying the functionals to the empirical cumulative distribution function $F_n$ based on a sample $x_1, x_2, \ldots, x_n$. The notation $T(F_n)$ and $S(F_n)$, or $T(X)$ and $S(X)$, is used for the sample statistics. The location and scatter sample statistics then also satisfy

$$T(AX + b 1_n^{T}) = A\,T(X) + b \quad\text{and}\quad S(AX + b 1_n^{T}) = A\,S(X)\,A^{T}$$

for all nonsingular $p \times p$ matrices $A$ and all $p$-vectors $b$. Scatter matrix functionals are usually standardized so that $S(F_x) = I$ for the standard multivariate normal distribution.
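As an illustration of sample-level affine equivariance, here is a small sketch, assuming NumPy and using the sample mean and the sample covariance matrix (with divisor $n$) as $T(X)$ and $S(X)$. The function names `T1` and `S1` are illustrative only.

```python
import numpy as np

def T1(X):
    """Sample mean vector of a p x n data matrix (columns are observations)."""
    return X.mean(axis=1, keepdims=True)

def S1(X):
    """Sample covariance matrix with divisor n."""
    Xc = X - T1(X)
    return Xc @ Xc.T / X.shape[1]

rng = np.random.default_rng(1)
p, n = 3, 200
X = rng.normal(size=(p, n))
A = rng.normal(size=(p, p))      # nonsingular with probability one
b = rng.normal(size=(p, 1))

Y = A @ X + b                    # broadcasting b over columns plays the role of b 1_n^T

location_equivariant = np.allclose(T1(Y), A @ T1(X) + b)
scatter_equivariant = np.allclose(S1(Y), A @ S1(X) @ A.T)
```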

The first examples of location and scatter functionals are the mean vector and the regular covariance matrix:

$$T_1(F_x) = E(x) \quad\text{and}\quad S_1(F_x) = \operatorname{Cov}(F_x) = E\big((x - E(x))(x - E(x))^{T}\big).$$

Location and scatter functionals can be based on the third and fourth moments as well. A location functional based on third moments is

$$T_2(F_x) = \frac{1}{p}\, E\big((x - E(x))^{T} \operatorname{Cov}(F_x)^{-1} (x - E(x))\, x\big),$$

and a scatter matrix functional based on fourth moments is

$$S_2(F_x) = \frac{1}{p+2}\, E\big((x - E(x))(x - E(x))^{T} \operatorname{Cov}(F_x)^{-1} (x - E(x))(x - E(x))^{T}\big).$$
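The sample versions of $T_2$ and $S_2$ can be sketched as follows, assuming NumPy and a $p \times n$ data matrix whose columns are observations. The function names are illustrative, and the covariance matrix uses divisor $n$ so that the equivariance identities hold exactly in the sample.

```python
import numpy as np

def _centered_and_r2(X):
    """Centered data and squared Mahalanobis distances based on the sample covariance."""
    n = X.shape[1]
    Xc = X - X.mean(axis=1, keepdims=True)
    cov = Xc @ Xc.T / n
    r2 = np.einsum('in,ij,jn->n', Xc, np.linalg.inv(cov), Xc)
    return Xc, r2

def T2(X):
    """Location statistic based on third moments."""
    p = X.shape[0]
    _, r2 = _centered_and_r2(X)
    return (X * r2).mean(axis=1, keepdims=True) / p

def S2(X):
    """Scatter statistic based on fourth moments."""
    p, n = X.shape
    Xc, r2 = _centered_and_r2(X)
    return (Xc * r2) @ Xc.T / (n * (p + 2))

rng = np.random.default_rng(3)
p, n = 3, 500
X = rng.exponential(size=(p, n))
A = rng.normal(size=(p, p))
b = rng.normal(size=(p, 1))
Y = A @ X + b

t2_equivariant = np.allclose(T2(Y), A @ T2(X) + b)
s2_equivariant = np.allclose(S2(Y), A @ S2(X) @ A.T)
```

The squared Mahalanobis distances are affine invariant, which is what makes both statistics affine equivariant.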

There are several other location and scatter functionals, even families of them, having different desirable properties (robustness, efficiency, limiting multivariate normality, fast computation, etc.).

If a scatter matrix functional $S(F_x)$ is a diagonal matrix for all $x$ having independent components, it is said to possess the independence property.

The regular covariance matrix is a scatter matrix with the independence property. Another example of a scatter matrix with the independence property is the matrix based on fourth moments.

Most scatter functionals possess the independence property only if all the components (or all the components except one) are symmetric. However, every scatter/shape matrix functional $S(F_x)$ can be symmetrized by setting

$$S_{\mathrm{sym}}(F_x) = S(F_{x_1 - x_2}),$$

where $x_1$ and $x_2$ are independent random vectors having the same cumulative distribution function $F_x$. The resulting symmetrized scatter matrix always has the independence property.

Back to the standardization of the IC model

The vector $z$ in Model (1) can be standardized using two different location functionals and two different scatter matrix functionals.

The marginal distributions of $z$ in Model (1) can be standardized using two different location functionals $T_1$ and $T_2$ and two different scatter functionals $S_1$ and $S_2$, possessing the independence property, by setting

$$T_1(F_z) = 0, \quad S_1(F_z) = I_p, \quad T_2(F_z) = \delta \quad\text{and}\quad S_2(F_z) = D,$$

where $\delta$ is a $p$-vector with all components $\delta_i \geq 0$, $i = 1, \ldots, p$, and $D$ is a diagonal matrix with diagonal elements $d_1 \geq \cdots \geq d_p > 0$. If now $\delta_i > 0$, $i = 1, \ldots, p$, and the diagonal elements of $D$ are distinct, then the mixing matrix $\Omega$ is uniquely defined.

Standardizing the Mixing Matrix

The mixing matrix $\Omega$ in Model (1) can be standardized by fixing the order, signs, and scales of its column vectors.

The IC model (1) can also be standardized by standardizing the mixing matrix using the mapping

$$\Omega \mapsto L = \Omega D_1^{+} P D_2,$$

where $D_1^{+}$ is the positive definite diagonal matrix that makes each column of $\Omega D_1^{+}$ have Euclidean norm one, $P$ is the permutation matrix for which the matrix $B = (b_{ij}) = \Omega D_1^{+} P$ satisfies $|b_{ii}| > |b_{ij}|$ for all $i < j$, and $D_2$ is the diagonal matrix that makes all the diagonal entries of $L = \Omega D_1^{+} P D_2$ equal to one. Ties may be handled, e.g., by basing the ordering on subsequent rows of $B$, but they may prevent the mapping from being continuous. It is therefore often convenient to restrict attention to the collection of mixing matrices $\Omega$ for which no ties occur in the permutation step.
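The three steps of this mapping can be sketched in code. This is a minimal sketch, assuming NumPy, no ties in the permutation step, and nonzero diagonal entries after permutation; the function name `standardize_mixing` and the greedy row-by-row column selection are one illustrative way to realize the permutation $P$.

```python
import numpy as np

def standardize_mixing(Omega):
    """Map a full-rank mixing matrix to L = Omega D1+ P D2 (no ties assumed)."""
    p = Omega.shape[0]
    # D1+: scale every column to Euclidean norm one.
    B = Omega / np.linalg.norm(Omega, axis=0)
    # P: for row i, pick among the unused columns the one with the largest
    # absolute entry, so that |b_ii| > |b_ij| for all i < j.
    remaining = list(range(p))
    order = []
    for i in range(p):
        j = max(remaining, key=lambda c: abs(B[i, c]))
        order.append(j)
        remaining.remove(j)
    B = B[:, order]
    # D2: rescale each column so that the diagonal entries equal one.
    return B / np.diag(B)

rng = np.random.default_rng(4)
Omega = rng.normal(size=(4, 4))
L = standardize_mixing(Omega)

# The result should not change when columns are permuted, rescaled, or sign-changed.
P = np.eye(4)[:, [3, 0, 2, 1]]
D = np.diag([2.0, -0.5, 3.0, -1.0])
L_alt = standardize_mixing(Omega @ P @ D)
```

Because the output is the same for every matrix of the form $\Omega P D$, the mapping selects one representative of each equivalence class of mixing matrices.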

Both standardization approaches have advantages and drawbacks, but the key point is that each of them makes Model (1) uniquely defined.

Lack of uniqueness of Model (1) causes some ambiguity about what is meant by an IC functional.

Let $\mathcal{M}$ denote the set of all full-rank $p \times p$ matrices. (Then naturally all unmixing matrices $\Gamma \in \mathcal{M}$.) Let $P$ denote a permutation matrix, $J$ a sign-change matrix, and $D$ a scaling matrix. Let

$$\mathcal{C} = \{C \in \mathcal{M} : C = PJD \text{ for some } P, J, \text{ and } D\}.$$

Two matrices $\Gamma_1$ and $\Gamma_2$ are said to be equivalent if $\Gamma_1 = C\Gamma_2$ for some $C \in \mathcal{C}$. We then write $\Gamma_1 \sim \Gamma_2$.

A functional $\Gamma(F_x) \in \mathcal{M}$ is an IC functional in the IC model (1) if $\Gamma(F_x)\Omega \sim I_p$ and if it is affine equivariant in the sense that

$$\Gamma(F_{Ax}) = \Gamma(F_x) A^{-1}$$

for all $A \in \mathcal{M}$.

Approach based on the use of two scatter matrices

Let $S_1(F_x)$ and $S_2(F_x)$ denote two different scatter functionals with the independence property. The IC functional $\Gamma(F_x)$ based on the scatter matrix functionals $S_1(F_x)$ and $S_2(F_x)$ is defined as a solution of the equations

$$\Gamma S_1(F_x) \Gamma^{T} = I_p \quad\text{and}\quad \Gamma S_2(F_x) \Gamma^{T} = \Lambda,$$

where $\Lambda = \Lambda(F_x)$ is a diagonal matrix with diagonal elements $\lambda_1 \geq \cdots \geq \lambda_p > 0$.

One of the first solutions to the ICA problem, the fourth order blind identification (FOBI) functional, is obtained if the scatter functionals $S_1(F_x)$ and $S_2(F_x)$ are the scatter matrices based on the second and fourth moments, respectively.
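A sample version of the two-scatter approach with the covariance matrix and the fourth-moment scatter matrix (that is, FOBI) might be sketched as below. This assumes NumPy; the function names, the simulated sources, and the mixing matrix are illustrative choices, and the construction (whiten with $S_1$, then rotate with the eigenvectors of $S_2$ of the whitened data) is one standard way to solve the two equations.

```python
import numpy as np

def cov(X):
    """Sample covariance (divisor n) of a p x n data matrix."""
    Xc = X - X.mean(axis=1, keepdims=True)
    return Xc @ Xc.T / X.shape[1]

def cov4(X):
    """Scatter matrix based on fourth moments."""
    p, n = X.shape
    Xc = X - X.mean(axis=1, keepdims=True)
    r2 = np.einsum('in,ij,jn->n', Xc, np.linalg.inv(cov(X)), Xc)
    return (Xc * r2) @ Xc.T / (n * (p + 2))

def fobi(X):
    """Return (Gamma, lam) with Gamma S1 Gamma^T = I and Gamma S2 Gamma^T = diag(lam)."""
    w, V = np.linalg.eigh(cov(X))
    S1_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T   # symmetric inverse square root of S1
    Y = S1_inv_sqrt @ X                          # whitened data
    lam, U = np.linalg.eigh(cov4(Y))
    idx = np.argsort(lam)[::-1]                  # decreasing eigenvalues
    return U[:, idx].T @ S1_inv_sqrt, lam[idx]

rng = np.random.default_rng(5)
n = 2000
Z = np.vstack([rng.uniform(-1, 1, n), rng.laplace(size=n), rng.normal(size=n)])
Omega = np.array([[1.0, 0.5, 0.2], [0.3, 1.0, 0.4], [0.1, 0.6, 1.0]])  # hypothetical mixing matrix
X = Omega @ Z

Gamma, lam = fobi(X)
```

The returned `Gamma` satisfies both defining equations exactly in the sample, up to floating-point error; how well it separates the sources depends on the sources having distinct kurtosis values.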

The functionals and the corresponding sample statistics $G(X) = \Gamma(F_n)$ and $L(X) = \Lambda(F_n)$ are affine equivariant and affine invariant, respectively, in the sense that

$$G(AX + b 1_n^{T}) = G(X) A^{-1} \quad\text{and}\quad L(AX + b 1_n^{T}) = L(X)$$

for all $A \in \mathcal{M}$ and $b \in \mathbb{R}^p$. For the asymptotics, it is therefore not a restriction to assume that $X$ is a random sample from a distribution $F_x$ with $S_1(F_x) = I$ and $S_2(F_x) = \Lambda$, where the diagonal elements of $\Lambda$ are $\lambda_1 \geq \cdots \geq \lambda_p > 0$.

Assume that $\sqrt{n}(S_1(X) - I) = O_p(1)$ and $\sqrt{n}(S_2(X) - \Lambda) = O_p(1)$, with $\lambda_1 > \cdots > \lambda_p > 0$, and assume that the diagonal elements of $G(X)$ are set to be positive. Then

$$\sqrt{n}\,(G(X)_{ii} - 1) = -\tfrac{1}{2}\sqrt{n}\,(S_1(X)_{ii} - 1) + o_p(1),$$

$$(\lambda_i - \lambda_j)\sqrt{n}\,G(X)_{ij} = \sqrt{n}\,S_2(X)_{ij} - \lambda_i \sqrt{n}\,S_1(X)_{ij} + o_p(1), \quad i \neq j,$$

and

$$\sqrt{n}\,(L(X)_{ii} - \lambda_i) = \sqrt{n}\,(S_2(X)_{ii} - \lambda_i) - \lambda_i \sqrt{n}\,(S_1(X)_{ii} - 1) + o_p(1).$$

It is interesting to note that the asymptotic behavior of the diagonal elements of $G(X)$ does not depend on $S_2(X)$ at all. The three equations above in fact hold for a given $i$ whenever $\lambda_i$ is distinct from all the other eigenvalues $\lambda_j$, $j \neq i$. The limiting joint distributions of the sample eigenvectors and sample eigenvalues for a subset with distinct population eigenvalues can then be derived from the limiting distributions of $S_1(X)$ and $S_2(X)$.

Signed Ranks

Symmetric IC model

In the symmetric IC model it is assumed that the $p$-variate vector $x$ satisfies

$$x = \Omega z + \mu, \qquad (3)$$

where $\Omega$ is a full-rank $p \times p$ mixing matrix, $\mu$ is a location vector, and $z$ is a $p$-variate vector with mutually independent and symmetrically distributed components.

The parametrization of the IC model (3) based on standardizing the mixing matrix leads to considering the model

$$x = Lz + \mu, \qquad (4)$$

where $\mu \in \mathbb{R}^p$, $L \in \mathcal{M}$, and $z$ has independent and symmetrically distributed marginals with common median zero. The resulting collection of densities (of the form $h(z) = \prod_{r=1}^{p} h_r(z_r)$, where $h_r$ is the symmetric density of $z_r$) will be denoted by $\mathcal{F}$.

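The hypothesis under which $n$ mutually independent observations $x_i$, $i = 1, \ldots, n$, are obtained from (4), where $z$ has density $h$, will be denoted by $P^{(n)}_{\vartheta,h}$, with $\vartheta = (\mu^{T}, (\operatorname{vecd} L)^{T})^{T} \in \Theta = \mathbb{R}^p \times \operatorname{vecd}(\mathcal{M})$, or alternatively by $P^{(n)}_{\mu,L,h}$. Here $\operatorname{vecd} L$ is the $p(p-1)$-vector of off-diagonal entries of $L$, whose diagonal entries are fixed to one by the standardization. This leads to the semiparametric model

$$\mathcal{P}^{(n)} = \bigcup_{h} \mathcal{P}^{(n)}_{h} = \bigcup_{h} \bigcup_{\vartheta \in \Theta} \{P^{(n)}_{\vartheta,h}\}.$$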

Assumptions

As usual, ULAN at some specific density $f$ requires technical assumptions: in the present context, we need $f$ to belong to the collection $\mathcal{F}_{\mathrm{ulan}}$ of densities in $\mathcal{F}$ for which each $f_r$, $r = 1, \ldots, p$, is absolutely continuous, with a derivative $f_r'$ that satisfies (below we let $\varphi_{f_r} = -f_r'/f_r$)

$$\sigma^2_{f_r} = \int y^2 f_r(y)\,dy < \infty, \quad \mathcal{I}_{f_r} = \int \varphi^2_{f_r}(y) f_r(y)\,dy < \infty,$$

and

$$\mathcal{J}_{f_r} = \int y^2 \varphi^2_{f_r}(y) f_r(y)\,dy < \infty.$$
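As a concrete check of these three quantities, the sketch below evaluates them numerically for the standard normal density, for which $\varphi_f(y) = y$ and hence $\sigma^2_f = 1$, $\mathcal{I}_f = 1$, and $\mathcal{J}_f = E(y^4) = 3$. It assumes NumPy; the grid limits and step size are arbitrary choices, and the score is computed with the sign convention $\varphi_f = -f'/f$.

```python
import numpy as np

# Fine grid on which the standard normal density is numerically negligible at the ends.
y = np.linspace(-12.0, 12.0, 240001)
dy = y[1] - y[0]

f = np.exp(-0.5 * y**2) / np.sqrt(2.0 * np.pi)   # standard normal density
f_prime = np.gradient(f, dy)                     # numerical derivative of f
phi = -f_prime / f                               # location score, approximately y

# Riemann sums for the three integrals.
sigma2 = np.sum(y**2 * f) * dy
fisher_I = np.sum(phi**2 * f) * dy
fisher_J = np.sum(y**2 * phi**2 * f) * dy
```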

For any $f \in \mathcal{F}_{\mathrm{ulan}}$, we let $\gamma_{rs}(f) = \mathcal{I}_{f_r} \sigma^2_{f_s}$, we define the optimal $p$-variate location score function $\varphi_f : \mathbb{R}^p \to \mathbb{R}^p$ through $z = (z_1, \ldots, z_p)^{T} \mapsto \varphi_f(z) = (\varphi_{f_1}(z_1), \ldots, \varphi_{f_p}(z_p))^{T}$, and we denote by $\mathcal{I}_f$ the diagonal matrix with diagonal entries $\mathcal{I}_{f_r}$, $r = 1, \ldots, p$. Further, we write $I_l$ for the $l$-dimensional identity matrix and we define

$$C = \sum_{r=1}^{p} \sum_{s=1}^{p-1} \big(e_r e_r^{T} \otimes u_s e_{s+\delta_{s \geq r}}^{T}\big),$$

where $e_r$ and $u_r$ stand for the $r$th vectors of the canonical basis of $\mathbb{R}^p$ and $\mathbb{R}^{p-1}$, respectively, and $\delta_{s \geq r}$ is equal to one if $s \geq r$ and to zero otherwise.
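The role of the matrix $C$ is easier to see in code. The sketch below, assuming NumPy and column-stacking for vec, builds $C$ from its definition and checks that it extracts the off-diagonal entries column by column; the function name `make_C` and the test matrix `H` are illustrative.

```python
import numpy as np

def make_C(p):
    """Build the p(p-1) x p^2 matrix C from its Kronecker-sum definition."""
    C = np.zeros((p * (p - 1), p * p))
    I_p, I_pm1 = np.eye(p), np.eye(p - 1)
    for r in range(p):                 # 0-based version of r = 1, ..., p
        for s in range(p - 1):         # 0-based version of s = 1, ..., p-1
            t = s + (1 if s >= r else 0)          # skip the diagonal position
            e_r, u_s, e_t = I_p[:, [r]], I_pm1[:, [s]], I_p[:, [t]]
            C += np.kron(e_r @ e_r.T, u_s @ e_t.T)
    return C

p = 3
C = make_C(p)

rng = np.random.default_rng(6)
H = rng.normal(size=(p, p))
np.fill_diagonal(H, 0.0)                          # zero-diagonal matrix

vec_H = H.flatten(order='F')                      # column-stacked vec(H)
vecd_H = H.T[~np.eye(p, dtype=bool)]              # off-diagonal entries, column by column
```

For a zero-diagonal matrix $H$, `C @ vec_H` gives the off-diagonal entries and `C.T @ vecd_H` puts them back into vec form.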

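ULAN of the symmetric IC model

Then the parametric model $\mathcal{P}^{(n)}_f$ is ULAN for any fixed $f \in \mathcal{F}_{\mathrm{ulan}}$, with central sequence

$$\Delta^{(n)}_{\vartheta,f} = \begin{pmatrix} \Delta^{(n)}_{\vartheta,f;1} \\ \Delta^{(n)}_{\vartheta,f;2} \end{pmatrix} = \begin{pmatrix} n^{-1/2} (L^{-1})^{T} \sum_{i=1}^{n} \varphi_f(Z_i) \\ n^{-1/2}\, C (I_p \otimes L^{-1})^{T} \sum_{i=1}^{n} \operatorname{vec}\big(\varphi_f(Z_i) Z_i^{T} - I_p\big) \end{pmatrix},$$

where $Z_i = Z_i(\vartheta) = L^{-1}(X_i - \mu)$, and full-rank information matrix

$$\Gamma_{L,f} = \begin{pmatrix} \Gamma_{L,f;1} & 0 \\ 0 & \Gamma_{L,f;2} \end{pmatrix},$$

where $\Gamma_{L,f;1} = (L^{-1})^{T} \mathcal{I}_f L^{-1}$ and

$$\Gamma_{L,f;2} = C (I_p \otimes L^{-1})^{T} \Big[ \sum_{r=1}^{p} (\mathcal{J}_{f_r} - 1)(e_r e_r^{T} \otimes e_r e_r^{T}) + \sum_{r,s=1,\, r \neq s}^{p} \big( \gamma_{sr}(f)(e_r e_r^{T} \otimes e_s e_s^{T}) + (e_r e_s^{T} \otimes e_s e_r^{T}) \big) \Big] (I_p \otimes L^{-1})\, C^{T}.$$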

Efficient inference

The ULAN property makes it possible to derive parametric efficiency bounds at $f$ and to construct the corresponding parametrically optimal inference procedures for the parameter. In the present context, when testing $H_0 : L = L_0$ against $H_a : L \neq L_0$, parametrically optimal tests reject the null at asymptotic level $\alpha$ whenever

$$(\Delta_{\vartheta,f;2})^{T}\, \Gamma^{-1}_{L_0,f;2}\, \Delta_{\vartheta,f;2} > \chi^2_{p(p-1),1-\alpha},$$

where $\chi^2_{k,1-\alpha}$ denotes the $\alpha$-upper quantile of the $\chi^2_k$ distribution.

Under local alternatives

Under local alternatives of the form $H_a : L = L_0 + n^{-1/2} H$, where $H$ is an arbitrary $p \times p$ matrix with zero diagonal entries, these tests have asymptotic power

$$1 - \Psi_{p(p-1)}\big( \chi^2_{p(p-1),1-\alpha}\, ;\, (\operatorname{vecd} H)^{T}\, \Gamma_{L_0,f;2}\, (\operatorname{vecd} H) \big),$$

where $\chi^2_{k,1-\alpha}$ stands for the $\alpha$-upper quantile of the $\chi^2_k$ distribution, and $\Psi_k(\cdot\,;\delta)$ denotes the cumulative distribution function of the non-central $\chi^2_k$ distribution with non-centrality parameter $\delta$. This settles the parametrically optimal (at $f$) performance for hypothesis testing.

Semiparametrically efficient inference

The underlying density $f$ is often unspecified in practice, which leads to considering the semiparametric model. Semiparametrically efficient (at $f$) inference procedures on $L$ may then be based on the so-called efficient central sequence $\Delta^{*}_{\vartheta,f;2}$, which results from $\Delta_{\vartheta,f;2}$ by performing adequate tangent space projections.

Under local alternatives

The performance of semiparametrically efficient tests on $L$ can be characterized in terms of the efficient information matrix $\Gamma^{*}_{L,f;2}$: a test of $H_0 : L = L_0$ is semiparametrically efficient at $f$ (at asymptotic level $\alpha$) if its asymptotic powers under local alternatives of the form $H_a : L = L_0 + n^{-1/2} H$ are given by

$$1 - \Psi_{p(p-1)}\big( \chi^2_{p(p-1),1-\alpha}\, ;\, (\operatorname{vecd} H)^{T}\, \Gamma^{*}_{L_0,f;2}\, (\operatorname{vecd} H) \big).$$

Testing

We first consider the problem of testing $H_0 : L = L_0$ against $H_a : L \neq L_0$, where $L_0$ is fixed. Semiparametrically optimal procedures are based on the efficient central sequence $\Delta^{*}_{\vartheta,f}$. Classically, $\Delta^{*}_{\vartheta,f}$ is obtained by performing tangent space computations. When, however, the semiparametric model at hand enjoys a strong invariance structure, the efficient central sequence $\Delta^{*}_{\vartheta,f}$ can alternatively be obtained by conditioning the original central sequence $\Delta_{\vartheta,f}$ with respect to the corresponding maximal invariant.

Signed-ranks

In the present setup, this maximal invariant is given by $(S_1(\vartheta), \ldots, S_n(\vartheta), R^{+}_1(\vartheta), \ldots, R^{+}_n(\vartheta))$, with $S_i(\vartheta) = (S_{i1}(\vartheta), \ldots, S_{ip}(\vartheta))^{T}$ and $R^{+}_i(\vartheta) = (R^{+}_{i1}(\vartheta), \ldots, R^{+}_{ip}(\vartheta))^{T}$, where $S_{ir}(\vartheta)$ is the sign of $Z_{ir}(\vartheta) = (L^{-1}(X_i - \mu))_r$ and $R^{+}_{ir}(\vartheta)$ is the rank of $|Z_{ir}(\vartheta)|$ among $|Z_{1r}(\vartheta)|, \ldots, |Z_{nr}(\vartheta)|$. This is what leads to considering signed-rank procedures when performing inference on $L$ in the present context.
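Computing the signs and ranks is straightforward. The following is a minimal sketch, assuming NumPy, a $p \times n$ data matrix with observations in columns, and no ties among the absolute residuals; the function name and the toy data are illustrative.

```python
import numpy as np

def signs_and_ranks(X, L, mu):
    """Residuals Z_i = L^{-1}(X_i - mu), their signs, and the ranks of |Z_ir| per component."""
    Z = np.linalg.solve(L, X - np.reshape(mu, (-1, 1)))
    S = np.sign(Z)
    # Double argsort yields ranks (1 = smallest) within each row when there are no ties.
    R_plus = np.abs(Z).argsort(axis=1).argsort(axis=1) + 1
    return Z, S, R_plus

# Toy data: p = 2 components, n = 3 observations, with L = I and mu = 0.
X = np.array([[0.5, -2.0, 1.0],
              [-1.0, 3.0, 0.2]])
Z, S, R_plus = signs_and_ranks(X, np.eye(2), np.zeros(2))
```

Here the first component has absolute residuals 0.5, 2.0, 1.0 and therefore ranks 1, 3, 2.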

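Signed-rank testing in symmetric IC models

Let $\hat\vartheta_0 = (\hat\mu^{T}, (\operatorname{vecd} L_0)^{T})^{T}$, where $\hat\mu$ is an estimator that is locally asymptotically discrete and root-$n$ consistent under $H_0$. Then one can show that the nonparametric counterpart of the test statistic is given by

$$Q_f = (\Delta^{*}_{\hat\vartheta_0,f;2})^{T}\, (\Gamma^{*}_{L_0,f;2})^{-1}\, \Delta^{*}_{\hat\vartheta_0,f;2},$$

where

$$\Delta^{*}_{\vartheta,f;2} = C (I_p \otimes L^{-1})^{T} \operatorname{vec}\Big[ \operatorname{odiag}\Big( \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \Big( S_i(\vartheta) \odot \varphi_f\Big( F_{+}^{-1}\Big( \frac{R^{+}_i(\vartheta)}{n+1} \Big) \Big) \Big) \Big( S_i(\vartheta) \odot F_{+}^{-1}\Big( \frac{R^{+}_i(\vartheta)}{n+1} \Big) \Big)^{T} \Big) \Big]$$

and

$$\Gamma^{*}_{L,f;2} = C (I_p \otimes L^{-1})^{T} \Big[ \sum_{r,s=1,\, r \neq s}^{p} \big( \gamma_{sr}(f)(e_r e_r^{T} \otimes e_s e_s^{T}) + (e_r e_s^{T} \otimes e_s e_r^{T}) \big) \Big] (I_p \otimes L^{-1})\, C^{T}.$$

Here $\odot$ denotes the componentwise product, and $F_{+}^{-1}$ is applied componentwise, with $F_{+r}$ the distribution function of $|Z_r|$ under $f$.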

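Linear hypothesis

Assume that $\Omega$ is a $p(p-1) \times l$ matrix with full rank $l$ (here $\Omega$ specifies the hypothesis and is not the mixing matrix of Model (1)). Let $V(\Omega)$ denote the vector space spanned by the columns of $\Omega$. We consider testing

$$H_0 : \operatorname{vecd} L \in \{\operatorname{vecd} L_0 + v : v \in V(\Omega)\} \quad\text{against}\quad H_a : \operatorname{vecd} L \notin \{\operatorname{vecd} L_0 + v : v \in V(\Omega)\}.$$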

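Test statistic

Let

$$Q_{\vartheta,f}(L_0, \Omega) = (\Delta^{*}_{\vartheta,f;2})^{T}\, P_{\vartheta,\Omega}\, \Delta^{*}_{\vartheta,f;2},$$

where

$$P_{\vartheta,\Omega} = (\Gamma^{*}_{L,f;2})^{-1} - \Omega\,(\Omega^{T} \Gamma^{*}_{L,f;2}\, \Omega)^{-1}\, \Omega^{T}.$$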

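One Step Estimation Based on Signed Ranks

Let $\tilde\vartheta = (\tilde\mu^{T}, (\operatorname{vecd} \tilde L)^{T})^{T}$ denote a root-$n$ consistent and locally asymptotically discrete preliminary estimator. Let

$$G^{*}_{L,f,h;2} = C (I_p \otimes L^{-1})^{T} \Big[ \sum_{r,s=1,\, r \neq s}^{p} \big( \gamma_{sr}(f,h)(e_r e_r^{T} \otimes e_s e_s^{T}) + \rho_{rs}(f,h)(e_r e_s^{T} \otimes e_s e_r^{T}) \big) \Big] (I_p \otimes L^{-1})\, C^{T},$$

where

$$\gamma_{rs}(f,h) = \int_0^1 \varphi_{f_r}(F_r^{-1}(u))\, \varphi_{h_r}(H_r^{-1}(u))\, du \times \int_0^1 F_s^{-1}(u)\, H_s^{-1}(u)\, du$$

and

$$\rho_{rs}(f,h) = \int_0^1 F_r^{-1}(u)\, \varphi_{h_r}(H_r^{-1}(u))\, du \times \int_0^1 \varphi_{f_s}(F_s^{-1}(u))\, H_s^{-1}(u)\, du,$$

and let $\hat G_{\tilde L,f;2}$ denote an estimate of $G^{*}_{L,f,h;2}$ formed by plugging in a preliminary estimator $\tilde\vartheta$ and estimators $\hat\gamma_{rs}(f)$ and $\hat\rho_{rs}(f)$ that (i) are locally asymptotically discrete and (ii) satisfy $\hat\gamma_{rs}(f) = \gamma_{rs}(f,h) + o_P(1)$ and $\hat\rho_{rs}(f) = \rho_{rs}(f,h) + o_P(1)$ as $n \to \infty$, under $\bigcup_{\vartheta \in \Theta} \bigcup_{h \in \mathcal{F}_{\mathrm{ulan}}} \{P^{(n)}_{\vartheta,h}\}$.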

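Let

$$\operatorname{vecd} \hat L_f = (\operatorname{vecd} \tilde L) + n^{-1/2}\, (\hat G_{\tilde L,f;2})^{-1}\, \Delta^{*}_{\tilde\vartheta,f;2},$$

where $\hat G_{\tilde L,f;2}$ is the consistent estimate of $G^{*}_{L,f,h;2}$ just defined. Then

$$\sqrt{n}\, \operatorname{vecd}(\hat L_f - L) \xrightarrow{d} N_{p(p-1)}\big(0, (\Gamma^{*}_{L,f;2})^{-1}\big)$$

as $n \to \infty$, under $\bigcup_{\mu \in \mathbb{R}^p} \{P^{(n)}_{\mu,L,f}\}$.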

Due to the vast number of different ICA estimates and algorithms, asymptotic as well as finite-sample criteria are needed for comparing them. While asymptotic results (convergence, asymptotic normality, etc.) are often missing, several finite-sample performance indices have been proposed in the literature to compare different estimates in simulation studies.

First, one can compare the true sources $z$ (which are of course known in simulations) and the estimated sources $\hat z = \hat\Gamma x$. Second, one can measure the closeness of the true unmixing matrix $\Omega^{-1}$ (used in the simulations) and the estimated unmixing matrix $\hat\Gamma$. In both cases the problem is that the order, signs, and scales of the rows of the estimated unmixing matrix may not match, because $\hat\Gamma$ is typically not an estimate of $\Omega^{-1}$ itself. For a good estimate, the gain matrix $\hat G = \hat\Gamma\Omega$ is close to a matrix $PJD$, where $P$ is a permutation matrix, $J$ is a sign-change matrix, and $D$ is a scaling matrix.

Let $A$ denote a $p \times p$ matrix. The shortest squared distance (divided by $p - 1$) between the set $\{CA : C \in \mathcal{C}\}$ of matrices equivalent to $A$ and $I_p$ is given by

$$D^2(A) = \frac{1}{p-1} \inf_{C \in \mathcal{C}} \|CA - I_p\|^2,$$

where $\|\cdot\|$ is the matrix (Frobenius) norm.

Let $A$ be any $p \times p$ matrix having at least one nonzero element in each row. The shortest squared distance $D^2(A)$ fulfils the following four conditions:
1. $1 \geq D^2(A) \geq 0$,
2. $D^2(A) = 0$ if and only if $A \sim I_p$,
3. $D^2(A) = 1$ if and only if $A \sim 1_p a^{T}$ for some $p$-vector $a$, and
4. the function $c \mapsto D^2(I_p + c \operatorname{odiag}(A))$ is increasing in $c \in [0, 1]$ for all matrices $A$ such that $A_{ij}^2 \leq 1$, $i \neq j$.

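The shortest distance between the identity matrix and the set $\{C\hat\Gamma\Omega : C \in \mathcal{C}\}$ of matrices equivalent to the gain matrix $\hat G = \hat\Gamma\Omega$ defines the following index. The minimum distance index for $\hat\Gamma$ is

$$\hat D = D(\hat\Gamma\Omega) = \frac{1}{\sqrt{p-1}} \inf_{C \in \mathcal{C}} \|C\hat\Gamma\Omega - I_p\|.$$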
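The infimum over $\mathcal{C}$ can be computed explicitly. The sketch below assumes NumPy and a small dimension $p$, since it enumerates all permutations. It relies on the elementary fact that the best rescaling of a row $g_i$ towards the basis vector $e_j$ leaves squared error $1 - g_{ij}^2 / \|g_i\|^2$, so that only the optimal matching of rows to basis vectors remains to be found. The function name `md_index` is illustrative.

```python
import numpy as np
from itertools import permutations

def md_index(G):
    """Minimum distance index of a gain matrix G with no zero rows (brute force over permutations)."""
    G = np.asarray(G, dtype=float)
    p = G.shape[0]
    # W[i, j]: squared error reduction when row i is optimally rescaled towards e_j.
    W = G**2 / np.sum(G**2, axis=1, keepdims=True)
    best = max(sum(W[i, perm[i]] for i in range(p))
               for perm in permutations(range(p)))
    return float(np.sqrt(max(p - best, 0.0) / (p - 1)))

d_identity = md_index(np.eye(3))
# A matrix of the form PJD: one nonzero entry per row and column.
d_equivalent = md_index(np.array([[0.0, -2.0, 0.0],
                                  [0.0, 0.0, 3.0],
                                  [0.5, 0.0, 0.0]]))
# All rows point in the same direction: the worst case.
d_worst = md_index(np.outer(np.ones(3), [1.0, 2.0, 3.0]))
```

For larger $p$ the enumeration becomes expensive, and the matching step could instead be solved as a linear assignment problem.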

It follows directly that $1 \geq \hat D \geq 0$, and $\hat D = 0$ if and only if $\hat\Gamma \sim \Omega^{-1}$. The worst case, $\hat D = 1$, is obtained if all the row vectors of $\hat\Gamma\Omega$ point in the same direction. Thus the value of the minimum distance index is easy to interpret. Note that $D(\hat\Gamma\Omega) = D(C\hat\Gamma\Omega)$ for all $C \in \mathcal{C}$. Also, if $x_i = \Omega z_i$ and $x_i^{*} = (A\Omega) z_i = \Omega^{*} z_i$, and the affine equivariant estimate $\hat\Gamma^{*}$ is calculated from $X^{*} = [x_1^{*}, \ldots, x_n^{*}]$, then $D(\hat\Gamma^{*}\Omega^{*}) = D(\hat\Gamma\Omega)$. Thus the minimum distance index provides a fair comparison for different IC models.

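Assume that the model is fixed such that $\Gamma(F_x) = \Omega = I_p$ and that $\sqrt{n}\, \operatorname{vec}(\hat\Gamma - I_p) \xrightarrow{d} N_{p^2}(0, \Sigma)$. Then

$$n \hat D^2 = \frac{n}{p-1} \|\operatorname{odiag}(\hat\Gamma)\|^2 + o_P(1),$$

and the limiting distribution of $n \hat D^2$ is that of $(p-1)^{-1} \sum_{i=1}^{k} \delta_i \chi^2_i$, where $\chi^2_1, \ldots, \chi^2_k$ are independent chi-squared variables with one degree of freedom, and $\delta_1, \ldots, \delta_k$ are the $k$ nonzero eigenvalues (including all algebraic multiplicities) of

$$\operatorname{ASCOV}\big(\sqrt{n}\, \operatorname{vec}(\operatorname{odiag}(\hat\Gamma))\big) = (I_{p^2} - D_{p,p})\, \Sigma\, (I_{p^2} - D_{p,p}),$$

with $D_{p,p} = \sum_{i=1}^{p} (e_i e_i^{T}) \otimes (e_i e_i^{T})$.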

Asymptotics for different scatter matrices, complex-valued ICA, time series, ...


Thank You!

An example

The first specific testing problem we consider is testing whether element $k$ of $\operatorname{vecd} L$ equals some fixed $c_0$. Let $o_i$ and $v_i$ stand for the $i$th vectors of the canonical basis of $\mathbb{R}^{p(p-1)}$ and $\mathbb{R}^{p(p-1)-1}$, respectively. Now $\Omega$ can be chosen to be a $p(p-1) \times (p(p-1)-1)$ matrix having the canonical basis vectors of $\mathbb{R}^{p(p-1)}$, excluding the $k$th basis vector, as its column vectors, i.e., $\Omega_i = o_i$ for $i < k$ and $\Omega_i = o_{i+1}$ for $i \geq k$, and $\operatorname{vecd} L_0$ can be chosen to have $c_0$ as its element $k$ and 0 as all its other elements. Here $\operatorname{trace}(P_{\vartheta,\Omega}\, \Gamma^{*}_{L,f;2}) = 1$.

Testing whether the $r$th column vector of $L$ equals some fixed $c_0$ amounts to testing whether elements $(r-1)(p-1)+1, \ldots, r(p-1)$ of $\operatorname{vecd} L$ are fixed. Let $o_i$ and $w_i$ stand for the $i$th vectors of the canonical basis of $\mathbb{R}^{p(p-1)}$ and $\mathbb{R}^{(p-1)^2}$, respectively. Now $\Omega$ can be chosen to be a $p(p-1) \times (p-1)^2$ matrix having the canonical basis vectors of $\mathbb{R}^{p(p-1)}$, excluding the basis vectors $(r-1)(p-1)+1, \ldots, r(p-1)$, as its column vectors, i.e., $\Omega_i = o_i$ for $i < (r-1)(p-1)+1$ and $\Omega_i = o_{i+(p-1)}$ for $i \geq (r-1)(p-1)+1$, and $\operatorname{vecd} L_0$ can be chosen to have the elements of $c_0$ (except the diagonal element) in those positions and 0 as all its other elements. Here $\operatorname{trace}(P_{\vartheta,\Omega}\, \Gamma^{*}_{L,f;2}) = p - 1$.
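The two trace values can be verified numerically. The sketch below assumes NumPy and takes the projection-type matrix to be $P = \Gamma^{-1} - \Omega(\Omega^{T}\Gamma\Omega)^{-1}\Omega^{T}$, which is the form consistent with the stated traces. The matrix `Gamma` is an arbitrary symmetric positive definite stand-in for the efficient information matrix, not a quantity computed from a density.

```python
import numpy as np

def projection(Gamma, Omega):
    """P = Gamma^{-1} - Omega (Omega^T Gamma Omega)^{-1} Omega^T."""
    middle = np.linalg.inv(Omega.T @ Gamma @ Omega)
    return np.linalg.inv(Gamma) - Omega @ middle @ Omega.T

p = 3
m = p * (p - 1)                                   # dimension of vecd L

rng = np.random.default_rng(2)
M = rng.normal(size=(m, m))
Gamma = M @ M.T + m * np.eye(m)                   # hypothetical SPD information matrix

# Example 1: fix element k of vecd L, so drop the k-th canonical basis vector.
k = 2
Omega1 = np.delete(np.eye(m), k - 1, axis=1)

# Example 2: fix column r of L, so drop basis vectors (r-1)(p-1)+1, ..., r(p-1).
r = 2
dropped = np.arange((r - 1) * (p - 1), r * (p - 1))   # 0-based indices
Omega2 = np.delete(np.eye(m), dropped, axis=1)

trace1 = np.trace(projection(Gamma, Omega1) @ Gamma)
trace2 = np.trace(projection(Gamma, Omega2) @ Gamma)
```

In the second example, removing $p - 1$ of the $p(p-1)$ basis vectors leaves $(p-1)^2$ columns, and the trace equals the number of fixed elements.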

ULAN

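A sequence of statistical models $\mathcal{P}^{(n)}_f = \{P^{(n)}_{\vartheta,f} : \vartheta \in \Theta \subset \mathbb{R}^k\}$, $f \in \mathcal{F}$, is uniformly locally asymptotically normal (ULAN) if for any $\vartheta_n = \vartheta + O(n^{-1/2})$ and any bounded sequence $(\tau_n)$, there exists a symmetric positive definite matrix $G_{\vartheta,f}$ such that, under $P^{(n)}_{\vartheta,f}$ as $n \to \infty$,

$$\log\big(dP^{(n)}_{\vartheta_n + n^{-1/2}\tau_n,f} / dP^{(n)}_{\vartheta_n,f}\big) = \tau_n^{T} \Delta^{(n)}_{\vartheta_n,f} - \tfrac{1}{2}\, \tau_n^{T} G_{\vartheta,f}\, \tau_n + o_P(1),$$

and such that, still under $P^{(n)}_{\vartheta,f}$, $\Delta^{(n)}_{\vartheta_n,f}$ is asymptotically normal with mean zero and covariance matrix $G_{\vartheta,f}$.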

The ULAN property makes it possible to derive parametric efficiency bounds at $f$ and to construct the corresponding parametrically optimal inference procedures for $\vartheta$. When testing $H_0 : \vartheta = \vartheta_0$ against $H_a : \vartheta \neq \vartheta_0$, parametrically optimal tests reject the null at asymptotic level $\alpha$ whenever

$$(\Delta^{(n)}_{\vartheta_0,f})^{T}\, G^{-1}_{\vartheta_0,f}\, \Delta^{(n)}_{\vartheta_0,f} > \chi^2_{k,1-\alpha},$$

where $\chi^2_{k,1-\alpha}$ denotes the $\alpha$-upper quantile of the $\chi^2_k$ distribution. Under sequences of alternatives of the form $P^{(n)}_{\vartheta_0 + n^{-1/2}\tau,f}$, these tests have asymptotic power

$$1 - \Psi_k\big(\chi^2_{k,1-\alpha}\, ;\, \tau^{T} G_{\vartheta_0,f}\, \tau\big),$$

where $\Psi_k(\cdot\,;\delta)$ stands for the cumulative distribution function of the non-central $\chi^2_k$ distribution with non-centrality parameter $\delta$. This settles the parametrically optimal (at $f$) performance for hypothesis testing.

As for point estimation, an estimator $\hat\vartheta$ is parametrically efficient at $f$ if and only if

$$\sqrt{n}\,(\hat\vartheta - \vartheta) \xrightarrow{d} N_k\big(0, G^{-1}_{\vartheta,f}\big).$$

The underlying density $f$ is often unspecified in practice, which leads to considering the semiparametric model $\mathcal{P}^{(n)} = \bigcup_{h} \bigcup_{\vartheta \in \Theta} \{P^{(n)}_{\vartheta,h}\}$. In $\mathcal{P}^{(n)}$, semiparametrically optimal (still at $f$) inference procedures are based on the efficient central sequence $\Delta^{*(n)}_{\vartheta,f}$, which results from the original central sequence $\Delta^{(n)}_{\vartheta,f}$ by performing adequate tangent space projections. Under $P^{(n)}_{\vartheta,f}$, the efficient central sequence $\Delta^{*(n)}_{\vartheta,f}$ typically is still asymptotically normal with mean zero, but now with covariance matrix $G^{*}_{\vartheta,f}$ (the efficient information matrix at $f$). Semiparametrically optimal tests (at $f$) reject the null at asymptotic level $\alpha$ whenever

$$(\Delta^{*(n)}_{\vartheta_0,f})^{T}\, (G^{*}_{\vartheta_0,f})^{-1}\, \Delta^{*(n)}_{\vartheta_0,f} > \chi^2_{k,1-\alpha}.$$

They have asymptotic powers $1 - \Psi_k\big(\chi^2_{k,1-\alpha}\, ;\, \tau^{T} G^{*}_{\vartheta_0,f}\, \tau\big)$ under the sequences of alternatives considered above.

An estimator $\hat\vartheta$ is semiparametrically efficient at $f$ if and only if

$$\sqrt{n}\,(\hat\vartheta - \vartheta) \xrightarrow{d} N_k\big(0, (G^{*}_{\vartheta,f})^{-1}\big).$$