HIGHER ORDER CUMULANTS OF RANDOM VECTORS, DIFFERENTIAL OPERATORS, AND APPLICATIONS TO STATISTICAL INFERENCE AND TIME SERIES


S. RAO JAMMALAMADAKA, T. SUBBA RAO, AND GYÖRGY TERDIK

Abstract. This paper provides a unified and comprehensive approach that is useful in deriving expressions for higher order cumulants of random vectors. The use of this methodology is then illustrated in three diverse and novel contexts, namely: (i) in obtaining a lower bound (Bhattacharya bound) for the variance-covariance matrix of a vector of unbiased estimators where the density depends on several parameters, (ii) in studying the asymptotic theory of multivariate statistics when the population is not necessarily Gaussian, and finally (iii) in the study of multivariate nonlinear time series models and in obtaining higher order cumulant spectra. The approach depends on expanding the characteristic functions and cumulant generating functions in terms of Kronecker products of differential operators. Using tensor calculus, McCullagh and Speed have obtained similar results for cumulants of random vectors, but our objective here is to derive such expressions using only elementary calculus of several variables and also to highlight some important applications in statistics.

1991 Mathematics Subject Classification. Primary 62E17, 62E20; Secondary 62H10, 60E05.
Key words and phrases. Cumulants for multivariate variables, cumulants for likelihood functions, Bhattacharya lower bound, Taylor series expansion, multivariate time series.
The research of Gy. Terdik is partially supported by the Hungarian NSF OTKA No. T 032658 and the NATO fellowship 3008/02. This paper is in final form and no version of it will be submitted for publication elsewhere.

1. Introduction and Review

It is well known that cumulants of order greater than two are zero for random variables which are Gaussian. In view of this, higher order cumulants are often used in testing for Gaussianity and multivariate Gaussianity, as well as to prove classical limit theorems. They are also used in the asymptotic theory of statistics, such as in Edgeworth series expansions. Consider a scalar random variable $X$ and assume that all its moments $\mu_j = E X^j$, $j = 1, 2, \ldots$, exist. Let the characteristic function of $X$ be denoted by $\varphi_X(\lambda)$; it then has the series expansion

$$\varphi_X(\lambda) = E e^{i\lambda X} = 1 + \sum_{j=1}^{\infty} \mu_j \frac{(i\lambda)^j}{j!}, \qquad \lambda \in \mathbb{R}. \qquad (1.1)$$

From (1.1) we observe that $i^{-j}\left[ d^j \varphi_X(\lambda)/d\lambda^j \right]_{\lambda=0} = \mu_j$; in other words, the $j$th derivative of the Taylor series expansion of $\varphi_X(\lambda)$ evaluated at $\lambda = 0$ gives the $j$th moment. The cumulant generating function $\psi_X(\lambda)$ is defined as (see, e.g., Leonov and Shiryaev, 1959)

$$\psi_X(\lambda) = \ln \varphi_X(\lambda) = \sum_{j=1}^{\infty} \frac{(i\lambda)^j}{j!}\,\kappa_j, \qquad (1.2)$$

where $\kappa_j$ is called the $j$th cumulant of the random variable $X$. As before, we see that $\kappa_j = i^{-j}\left[ d^j \psi_X(\lambda)/d\lambda^j \right]_{\lambda=0}$. Comparing (1.1) and (1.2), one can write the cumulants in terms of moments and vice versa; for example, $\kappa_1 = \mu_1$, $\kappa_2 = \mu_2 - \mu_1^2$, etc. Now suppose the random variable $X$ is normal with mean $\mu$ and variance $\sigma^2$; then $\varphi_X(\lambda) = \exp\left( i\lambda\mu - \lambda^2\sigma^2/2 \right)$, which implies $\kappa_j = 0$ for all $j \geq 3$.
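As a purely illustrative aside (not part of the original text), the following NumPy sketch checks these scalar relations numerically: it estimates the raw moments $\mu_1, \ldots, \mu_4$ from a Gaussian sample and converts them to cumulants by inverting (1.1)-(1.2), so that $\kappa_3$ and $\kappa_4$ come out near zero. The sample size and random seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=2.0, size=200_000)   # Gaussian sample: kappa_3, kappa_4 ~ 0

# raw moments mu_j = E X^j, j = 1..4
mu1, mu2, mu3, mu4 = [np.mean(x ** j) for j in range(1, 5)]

# cumulants in terms of raw moments (inversion of the expansions (1.1)-(1.2))
k1 = mu1
k2 = mu2 - mu1 ** 2
k3 = mu3 - 3 * mu2 * mu1 + 2 * mu1 ** 3
k4 = mu4 - 4 * mu3 * mu1 - 3 * mu2 ** 2 + 12 * mu2 * mu1 ** 2 - 6 * mu1 ** 4

print(k1, k2)   # ~ 1.0 and ~ 4.0 (mean and variance)
print(k3, k4)   # both ~ 0 for a Gaussian sample
```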

We now consider generalizing the above results to the case when $X$ is a $d$-dimensional random vector. The definition of the joint moments and cumulants of the random vector $X$ requires a Taylor series expansion of a function of several variables and of its partial derivatives in these variables, and these are similar to (1.1) and (1.2). Though such expansions may be considered straightforward generalizations, the methodology and the mathematical notation get quite cumbersome when dealing with the derivatives of characteristic functions and cumulant generating functions of random vectors. However, we need such expansions in studying the asymptotic theory in classical multivariate analysis as well as in multivariate nonlinear time series (see Subba Rao and Wong, 1999). A unified and streamlined methodology for obtaining such expressions is desirable, and that is what we attempt to provide here.

As an example, consider a random sample $X_1, X_2, \ldots, X_n$ from a multivariate normal distribution with mean vector $\mu$ and variance-covariance matrix $\Sigma$. We know that the sample mean vector has a multivariate normal distribution, that the sample variance-covariance matrix has a Wishart distribution, and that they are independent. However, when the random sample is not from a multivariate normal distribution, one approach to obtaining such distributions is through the multivariate Edgeworth expansion, whose evaluation requires expressions for higher order cumulants of random vectors. Further applications of these in the time series context can be found in the books of Brillinger (2001) and Terdik (1999) and the recent papers of Subba Rao and Wong (1999) and Wong (1997). Similar results can be found in the works of McCullagh (1987) and Speed (1990), but the techniques we use here are quite different and require only a knowledge of the calculus of several variables instead of tensor calculus. Also, we believe this to be a more transparent and streamlined approach. Finally, we derive several new results of interest in statistical inference and time series using these methods.

We derive Yule-Walker type difference equations in terms of higher order cumulants for stationary multivariate linear processes. Also derived are expressions for higher order cumulant spectra of such processes, which turn out to be useful in constructing statistical tests for linearity and Gaussianity of multivariate time series. The information inequality, or the Cramér-Rao lower bound, for the variance of an unbiased estimator is well known for both the single-parameter and multiple-parameter cases. A more accurate series of bounds for the single-parameter case was given by Bhattacharya (1946); these depend on all higher order derivatives of the log-likelihood function. Here we give a generalization of this bound for the multiparameter case based on partial derivatives of various orders. We illustrate this with an example where we find a lower bound for the variance of an unbiased estimator of a nonlinear function of the parameters.

In Section 2 we derive the properties of differential operators which are useful in obtaining expressions for the partial derivatives of functions of several vectors. In Section 3 we express the Taylor series of such functions in terms of differential operators. The methods of Sections 2 and 3 are used in Section 4 to define the cumulants of several random vectors. In Section 5 we consider applications of the above methods to statistical inference. We also define multivariate measures of skewness and kurtosis and consider multivariate time series. We further obtain properties of the cumulants of the partial derivatives of the log-likelihood function of a random sample $X_1, X_2, \ldots, X_n$ drawn from a distribution $F(x, \theta)$, $\theta \in \Omega$; in this section we use the expressions derived for the partial derivatives to obtain a Bhattacharya-type bound and illustrate it with an example. Section 6, the Appendix, contains proofs of some results used in the earlier sections.

2. Differential operators

First we introduce the Jacobian matrix and higher order derivatives. Let $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_d)' \in \mathbb{R}^d$ and let $\varphi(\lambda) = [\varphi_1(\lambda), \varphi_2(\lambda), \ldots, \varphi_m(\lambda)]'$ be a vector-valued function which is differentiable in all its arguments (here and elsewhere $'$ denotes the transpose).

The Jacobian matrix of $\varphi$ is defined by

$$D_{\lambda}\varphi' = \frac{\partial \varphi'(\lambda)}{\partial \lambda} = \left[ \frac{\partial \varphi_j(\lambda)}{\partial \lambda_i} \right]_{i=1,\ldots,d;\; j=1,\ldots,m},$$

where, here and later on, the differential operator $\partial/\partial\lambda_j$ acts from right to left, keeping the matrix calculus valid. We can write this in vector form as follows.

Definition 1. The operator $D_{\lambda}$, which produces a column vector of order $md$, is defined by

$$D_{\lambda}\varphi = \operatorname{Vec}\left( \frac{\partial \varphi'(\lambda)}{\partial \lambda} \right).$$

We refer to $D_{\lambda}$ as the K-derivative, and we can also write $D_{\lambda}\varphi$ as a Kronecker product:

$$D_{\lambda}\varphi = \operatorname{Vec}\left( \frac{\partial \varphi'(\lambda)}{\partial \lambda} \right) = [\varphi_1(\lambda), \varphi_2(\lambda), \ldots, \varphi_m(\lambda)]' \otimes \left[ \frac{\partial}{\partial \lambda_1}, \frac{\partial}{\partial \lambda_2}, \ldots, \frac{\partial}{\partial \lambda_d} \right]'.$$

If we repeat the differentiation $D_{\lambda}$ twice, we obtain

$$D_{\lambda}^2\varphi = D_{\lambda}\left( D_{\lambda}\varphi \right) = [\varphi_1(\lambda), \ldots, \varphi_m(\lambda)]' \otimes \left[ \frac{\partial}{\partial \lambda_1}, \ldots, \frac{\partial}{\partial \lambda_d} \right]'^{\otimes 2},$$

and in general, assuming differentiability $k$ times, the $k$th K-derivative is given by

$$D_{\lambda}^k\varphi = D_{\lambda}\left( D_{\lambda}^{k-1}\varphi \right) = [\varphi_1(\lambda), \varphi_2(\lambda), \ldots, \varphi_m(\lambda)]' \otimes \left[ \frac{\partial}{\partial \lambda_1}, \frac{\partial}{\partial \lambda_2}, \ldots, \frac{\partial}{\partial \lambda_d} \right]'^{\otimes k},$$

which is a column vector of order $md^k$ containing all possible partial derivatives of the entries of $\varphi$, arranged according to the Kronecker powers of $[\partial/\partial\lambda_1, \ldots, \partial/\partial\lambda_d]'$.

In the following we give some additional properties of this operator $D_{\lambda}$ when applied to products of several functions. Let $K_{(3\leftrightarrow 2)}(m_1, m_2, d)$ denote the commutation matrix of size $m_1 m_2 d \times m_1 m_2 d$ which changes the order in a Kronecker product of three vectors of dimensions $m_1$, $m_2$ and $d$ (see Subsection 6.1 in the Appendix for details), interchanging the second and third factors. For example, if $a_1, a_2, a_3$ are vectors of dimensions $m_1, m_2, d$ respectively, then $K_{(3\leftrightarrow 2)}(m_1, m_2, d)$ is the matrix defined by

$$K_{(3\leftrightarrow 2)}(m_1, m_2, d)\,(a_1 \otimes a_2 \otimes a_3) = a_1 \otimes a_3 \otimes a_2.$$

Property 1 (Chain Rule). If $\lambda \in \mathbb{R}^d$, $\varphi_1 \in \mathbb{R}^{m_1}$ and $\varphi_2 \in \mathbb{R}^{m_2}$, then

$$D_{\lambda}\left( \varphi_1 \otimes \varphi_2 \right) = K_{(3\leftrightarrow 2)}(m_1, m_2, d)\left( D_{\lambda}\varphi_1 \otimes \varphi_2 \right) + \varphi_1 \otimes D_{\lambda}\varphi_2, \qquad (2.1)$$

where $K_{(3\leftrightarrow 2)}(m_1, m_2, d)$ is the commutation matrix above.
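To make the action of $K_{(3\leftrightarrow 2)}(m_1, m_2, d)$ concrete, here is a small NumPy sketch (an editorial illustration, not the authors' code) that builds this commutation matrix explicitly as $I_{m_1} \otimes C$, where $C$ swaps the last two factors, and verifies the defining identity $K_{(3\leftrightarrow 2)}(m_1, m_2, d)(a_1\otimes a_2\otimes a_3) = a_1\otimes a_3\otimes a_2$. The helper names are of course hypothetical.

```python
import numpy as np

def swap_last_two(m2, d):
    """Permutation matrix C with C (a2 kron a3) = a3 kron a2, a2 in R^m2, a3 in R^d."""
    C = np.zeros((m2 * d, m2 * d))
    for i in range(m2):
        for j in range(d):
            C[j * m2 + i, i * d + j] = 1.0
    return C

def K_3_2(m1, m2, d):
    """Commutation matrix K_{(3<->2)}(m1, m2, d): interchanges the 2nd and 3rd factors."""
    return np.kron(np.eye(m1), swap_last_two(m2, d))

rng = np.random.default_rng(1)
m1, m2, d = 2, 3, 4
a1, a2, a3 = rng.normal(size=m1), rng.normal(size=m2), rng.normal(size=d)

lhs = K_3_2(m1, m2, d) @ np.kron(np.kron(a1, a2), a3)
rhs = np.kron(np.kron(a1, a3), a2)
print(np.allclose(lhs, rhs))   # True
```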

This Chain Rule (2.1) can be extended to products of several functions. If $\varphi_k \in \mathbb{R}^{m_k}$, $k = 1, 2, \ldots, M$, then

$$D_{\lambda}\bigotimes_{k=1:M} \varphi_k = \sum_{j=1}^{M} K_{p_{M+1\to j}}\!\left( m_{1:M}, d \right)\left[ \left( \bigotimes_{k=1:j-1}\varphi_k \right) \otimes D_{\lambda}\varphi_j \otimes \left( \bigotimes_{k=j+1:M}\varphi_k \right) \right],$$

where the commutation matrix $K_{p_{M+1\to j}}(m_{1:M}, d)$ permutes the vectors of dimensions $m_{1:M}, d$ in the Kronecker product according to the permutation $p_{M+1\to j}$ of the integers $1:M+1 = 1, 2, \ldots, M+1$.

Consider the special case $\varphi(\lambda) = \lambda^{\otimes k}$. Differentiating according to the definition gives

$$D_{\lambda}\lambda^{\otimes k} = \operatorname{Vec}\left( \frac{\partial \left( \lambda^{\otimes k} \right)'}{\partial \lambda} \right) = \sum_{j=0}^{k-1} K_{(j+1\leftrightarrow k+1)}\!\left( d^{[k+1]} \right)\left( \lambda^{\otimes (k-1)} \otimes \operatorname{Vec} I_d \right), \qquad (2.2)$$

where $d^{[k]} = [d, d, \ldots, d]$ ($k$ entries). Now suppose $\varphi(\lambda) = (x'\lambda)^k$, where $x$ is a vector of constants; $\varphi$ is then a scalar-valued function. Using Property 1 and differentiating $r$ times, we obtain

$$D_{\lambda}^r\left( x'\lambda \right)^k = k(k-1)\cdots(k-r+1)\left( x'\lambda \right)^{k-r} x^{\otimes r}. \qquad (2.3)$$

The reason for (2.3) is that the Kronecker product $x^{\otimes k}$ is invariant under permutations of its component vectors $x$, i.e., for any $l$ and $j$,

$$K_{(j+1\leftrightarrow l)}\!\left( d^{[l]} \right)x^{\otimes l} = x^{\otimes l}, \qquad \text{so that} \qquad \sum_{j=0}^{k-1}K_{(j+1\leftrightarrow k)}\!\left( d^{[k]} \right)x^{\otimes k} = k\,x^{\otimes k},$$

and thus we obtain (2.3). In particular, if $r = k$,

$$D_{\lambda}^k\left( x'\lambda \right)^k = k!\,x^{\otimes k}.$$
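The identity (2.3) is easy to check numerically. The sketch below (illustrative only; the finite-difference helper and its step size are ad hoc choices) approximates $D^2_{\lambda}(x'\lambda)^k$ as the vectorized Hessian, the $r = 2$ case of (2.3); since the Hessian is symmetric, the ordering convention of the vectorization is immaterial here.

```python
import numpy as np

def num_hessian(f, lam, h=1e-4):
    """Central finite-difference Hessian of a scalar function f at the point lam."""
    d = lam.size
    H = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            e_i, e_j = np.eye(d)[i], np.eye(d)[j]
            H[i, j] = (f(lam + h*e_i + h*e_j) - f(lam + h*e_i - h*e_j)
                       - f(lam - h*e_i + h*e_j) + f(lam - h*e_i - h*e_j)) / (4 * h * h)
    return H

rng = np.random.default_rng(2)
d, k = 3, 4
x = rng.normal(size=d)
lam = rng.normal(size=d)

f = lambda l: float(np.dot(x, l) ** k)              # phi(lambda) = (x'lambda)^k

numeric = num_hessian(f, lam).reshape(-1)           # D^2_lambda (x'lambda)^k, numerically
analytic = k * (k - 1) * np.dot(x, lam) ** (k - 2) * np.kron(x, x)   # formula (2.3), r = 2
print(np.allclose(numeric, analytic, rtol=1e-4, atol=1e-6))          # True
```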

3. Taylor series expansion of functions of several variables

Let $\varphi(\lambda) = \varphi(\lambda_1, \lambda_2, \ldots, \lambda_d)$ and assume $\varphi$ is differentiable several times in each variable. Our object here is to expand $\varphi(\lambda)$ in a Taylor series whose coefficients are expressed in terms of the differential operators given above; we use this expansion later to define the characteristic function and the cumulant functions in terms of differential operators. Let $\lambda = \lambda_{1:d} = (\lambda_1, \lambda_2, \ldots, \lambda_d)' \in \mathbb{R}^d$. It is well known that the Taylor series of $\varphi(\lambda)$ is

$$\varphi(\lambda) = \sum_{k_1, k_2, \ldots, k_d = 0}^{\infty} \frac{1}{k!}\,c_k\,\lambda^k, \qquad (3.1)$$

where the coefficients are

$$c_k = \left. \frac{\partial^{\Sigma k}\varphi(\lambda)}{\partial\lambda^k} \right|_{\lambda=0};$$

here we use the notation $k = (k_1, k_2, \ldots, k_d)$, $k! = k_1!\,k_2!\cdots k_d!$ and

$$\lambda^k = \prod_{j=1}^{d}\lambda_j^{k_j} = \lambda_1^{k_1}\lambda_2^{k_2}\cdots\lambda_d^{k_d}.$$

The Taylor series (3.1) can be written in a more informative form for our purposes, namely

$$\varphi(\lambda) = \sum_{m=0}^{\infty}\frac{1}{m!}\,c_{m,d}'\,\lambda^{\otimes m},$$

where $c_{m,d}$ is a column vector which is the derivative (K-derivative) of the function $\varphi$,

$$c_{m,d} = \left. D_{\lambda}^m\varphi(\lambda) \right|_{\lambda=0};$$

see Subsection 6.2 for details.

3.1. Characteristic function and moments of random vectors. Let $X$ be a $d$-dimensional random vector and write $X = [X_1', X_2']'$, where $X_1$ is of dimension $d_1$ and $X_2$ is of dimension $d_2$, so that $d = d_1 + d_2$. Let $\lambda = [\lambda_1', \lambda_2']'$. The characteristic function of $X$ is

$$\varphi\left( \lambda_1, \lambda_2 \right) = E\exp\left[ i\left( X_1'\lambda_1 + X_2'\lambda_2 \right) \right] = \sum_{k,l=0}^{\infty}\frac{i^{k+l}}{k!\,l!}\,E\left( X_1'\lambda_1 \right)^k\left( X_2'\lambda_2 \right)^l = \sum_{k,l=0}^{\infty}\frac{i^{k+l}}{k!\,l!}\,E\left( X_1^{\otimes k} \otimes X_2^{\otimes l} \right)'\left( \lambda_1^{\otimes k} \otimes \lambda_2^{\otimes l} \right).$$

Here the coefficients of $\lambda_1^{\otimes k} \otimes \lambda_2^{\otimes l}$ can be obtained by using the K-derivative and the formula (2.3). Consider the second K-derivative of $\varphi$:

$$D^{(1,1)}_{\lambda_1\otimes\lambda_2}\varphi\left( \lambda_1, \lambda_2 \right) = D_{\lambda_2}\!\left( D_{\lambda_1}\varphi\left( \lambda_1, \lambda_2 \right) \right) = \frac{\partial^2\varphi\left( \lambda_1, \lambda_2 \right)}{\partial\lambda_1 \otimes \partial\lambda_2} = \sum_{k,l=1}^{\infty}\frac{i^{k+l}}{(k-1)!\,(l-1)!}\,E\left( X_1'\lambda_1 \right)^{k-1}\left( X_2'\lambda_2 \right)^{l-1}\left( X_1 \otimes X_2 \right).$$

Now, by evaluating the derivative $D^{(1,1)}_{\lambda_1\otimes\lambda_2}\varphi(\lambda_1,\lambda_2)$ at $\lambda_1 = \lambda_2 = 0$, we obtain $i^2\,E(X_1 \otimes X_2)$; other moments can similarly be obtained from higher order derivatives. Therefore the Taylor series expansion of $\varphi(\lambda_1,\lambda_2)$ can be written in terms of derivatives as

$$\varphi\left( \lambda_1, \lambda_2 \right) = \sum_{k,l=0}^{\infty}\frac{1}{k!\,l!}\left[ \left. D^{(k,l)}_{\lambda_1\otimes\lambda_2}\varphi\left( \lambda_1, \lambda_2 \right) \right|_{\lambda_1=\lambda_2=0} \right]'\left( \lambda_1^{\otimes k} \otimes \lambda_2^{\otimes l} \right).$$

We note that, in general,

$$\left. D^{(k,l)}_{\lambda_1\otimes\lambda_2}\varphi\left( \lambda_1, \lambda_2 \right) \right|_{\lambda_1=\lambda_2=0} = i^{k+l}\,E\left( X_1^{\otimes k} \otimes X_2^{\otimes l} \right) \neq i^{k+l}\,E\left( X_2^{\otimes l} \otimes X_1^{\otimes k} \right) = \left. D^{(l,k)}_{\lambda_2\otimes\lambda_1}\varphi\left( \lambda_1, \lambda_2 \right) \right|_{\lambda_1=\lambda_2=0},$$

which shows that the partial derivatives in this case are not symmetric.

Consider a set of vectors $\lambda_{1:n} = [\lambda_1', \lambda_2', \ldots, \lambda_n']'$ with dimensions $[d_1, d_2, \ldots, d_n]$. We can define the operator $D_{\lambda_1\otimes\lambda_2}$ of the previous section for this partitioned set of vectors $\lambda_{1:n}$; this is achieved recursively. Recall that the K-derivative with respect to $\lambda_j$ is $D_{\lambda_j}\varphi = \operatorname{Vec}\left( \partial\varphi'/\partial\lambda_j \right)$.

Definition 2. The $n$th derivative $D^n_{\lambda_{1:n}}$ is defined recursively by

$$D^n_{\lambda_{1:n}}\varphi = D_{\lambda_n}\!\left( D^{n-1}_{\lambda_{1:n-1}}\varphi \right),$$

where $D^n_{\lambda_{1:n}}\varphi$ is a column vector of partial derivatives of order $n$; we see that this is the first order derivative of a function which is already an $(n-1)$th order partial derivative. The dimension of $D^n_{\lambda_{1:n}}$ is $d^{[n]}_{1:n} = \prod_{j=1}^{n}d_j$, where $[n]$ denotes a row vector having all ones as its entries, i.e., $[n] = [1, 1, \ldots, 1]$, of dimension $n$. The order of the vectors in $\lambda_{1:n}$ is important.

The following definition generalizes to the multivariate case a similar well-known result for scalar-valued random variables. Here we assume the partial derivatives exist.

Definition 3. Suppose $X_{1:n} = (X_1, X_2, \ldots, X_n)$ is a collection of random column vectors with dimensions $[d_1, d_2, \ldots, d_n]$. The Kronecker moment is defined by the following K-derivative:

$$E\left( X_1 \otimes X_2 \otimes \cdots \otimes X_n \right) = E\bigotimes_{j=1}^{n}X_j = i^{-n}\left. D^n_{\lambda_1\otimes\lambda_2\otimes\cdots\otimes\lambda_n}\varphi_{X_1\otimes X_2\otimes\cdots\otimes X_n}\left( \lambda_1, \lambda_2, \ldots, \lambda_n \right) \right|_{\lambda_{1:n}=0}.$$

We note that the order of the product in the expectations and in the derivatives is important, since the Kronecker moment is not symmetric when the variables $X_1, X_2, \ldots, X_n$ are different.

4. Cumulant function and cumulants of random vectors

We obtain the cumulant $\operatorname{Cum}_n(X)$ as the derivative of the logarithm of the characteristic function $\varphi_X(\lambda)$ of $X = (X_1, X_2, \ldots, X_n)$, evaluated at zero:

$$i^{-n}\left. \frac{\partial^n\ln\varphi_X\left( \lambda \right)}{\partial\lambda^{[n]}} \right|_{\lambda_{1:n}=0} = \operatorname{Cum}_n\left( X \right) = \operatorname{Cum}_n\left( X_1, X_2, \ldots, X_n \right),$$

where $\partial\lambda^{[n]} = \partial\lambda_1\,\partial\lambda_2\cdots\partial\lambda_n$; see Terdik (1999) for details. Now consider the collection of random vectors $X_{1:n} = (X_1, X_2, \ldots, X_n)$, where each $X_j$ is of dimension $d_j$. The corresponding characteristic function of $\operatorname{Vec}X_{1:n}$ is

$$\varphi_{X_{1:n}}\left( \lambda_{1:n} \right) = \varphi_{\operatorname{Vec}X_{1:n}}\left( \operatorname{Vec}\lambda_{1:n} \right) = E\exp\left( i\left( \operatorname{Vec}\lambda_{1:n} \right)'\operatorname{Vec}X_{1:n} \right),$$

where $\lambda_{1:n} = [\lambda_1', \lambda_2', \ldots, \lambda_n']'$ and $d_{1:n} = [d_1, d_2, \ldots, d_n]$. We call the logarithm of the characteristic function $\varphi_{\operatorname{Vec}X_{1:n}}(\operatorname{Vec}\lambda_{1:n})$ the cumulant function and denote it by

$$\psi_{\operatorname{Vec}X_{1:n}}\left( \operatorname{Vec}\lambda_{1:n} \right) = \ln\varphi_{\operatorname{Vec}X_{1:n}}\left( \operatorname{Vec}\lambda_{1:n} \right);$$

we write $\psi_{X_{1:n}}(\lambda_{1:n})$ for $\psi_{\operatorname{Vec}X_{1:n}}(\operatorname{Vec}\lambda_{1:n})$. The K-derivative of the cumulant function $\psi_{X_{1:n}}(\lambda_{1:n})$, of first order in each $\lambda_j$, defines the cumulant of $X_{1:n}$. We use the operator $D^n_{\lambda_{1:n}}\psi = D_{\lambda_n}\!\left( D^{n-1}_{\lambda_{1:n-1}}\psi \right)$ recursively; the result is a column vector of partial derivatives of order $n$ which is of first order in each variable $\lambda_j$, and the dimension of $D^n_{\lambda_{1:n}}$ is $d^{[n]}_{1:n} = \prod_{j=1}^{n}d_j$.

Now we define the $n$th order cumulant of the vectors $X_{1:n}$.

Definition 4.

$$\operatorname{Cum}_n\left( X_{1:n} \right) = i^{-n}\left. D^n_{\lambda_{1:n}}\psi_{X_{1:n}}\left( \lambda_{1:n} \right) \right|_{\lambda_{1:n}=0}. \qquad (4.1)$$

Therefore $\operatorname{Cum}_n(X_{1:n})$ is a vector of dimension $d^{[n]}_{1:n}$ containing all possible cumulants of the elements formed from the vectors $X_1, X_2, \ldots, X_n$, with the order defined by the Kronecker products introduced earlier (see also Terdik, 2002). This definition also covers the evaluation of cumulants where the random vectors $X_1, X_2, \ldots, X_n$ are not all distinct; in that case the characteristic function depends on the sum of the corresponding components of $\lambda_{1:n}$, and we still use the definition (4.1) to obtain the cumulant. For example, when $n = 1$ we have

$$\operatorname{Cum}_1\left( X_1 \right) = E X_1,$$

and when $n = 2$

$$\operatorname{Cum}_2\left( X_1, X_2 \right) = E\left[ \left( X_1 - E X_1 \right) \otimes \left( X_2 - E X_2 \right) \right] = \operatorname{Vec}\operatorname{Cov}\left( X_1, X_2 \right), \qquad (4.2)$$

where $\operatorname{Cov}(X_1, X_2)$ denotes the covariance matrix of the vectors $X_1$ and $X_2$. To illustrate the above formulae, let us consider an example.

Example 1. Let $X_{1:2} = [X_1', X_2']'$ and assume $X_{1:2}$ has a joint normal distribution with mean vector $[\mu_1', \mu_2']'$ and variance-covariance matrix $C(X_1, X_2)$ given by

$$C\left( X_1, X_2 \right) = \begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix}.$$

Then the characteristic function of $X_{1:2}$ is

$$\varphi_{X_{1:2}}\left( \lambda_1, \lambda_2 \right) = \exp\left\{ i\left( \mu_1'\lambda_1 + \mu_2'\lambda_2 \right) - \tfrac{1}{2}\left( \lambda_1'C_{11}\lambda_1 + \lambda_1'C_{12}\lambda_2 + \lambda_2'C_{21}\lambda_1 + \lambda_2'C_{22}\lambda_2 \right) \right\},$$

and the cumulant function of $X_{1:2}$ is

$$\psi_{X_{1:2}}\left( \lambda_1, \lambda_2 \right) = \ln\varphi_{X_{1:2}}\left( \lambda_1, \lambda_2 \right) = i\left( \mu_1'\lambda_1 + \mu_2'\lambda_2 \right) - \tfrac{1}{2}\left( \lambda_1'C_{11}\lambda_1 + \lambda_1'C_{12}\lambda_2 + \lambda_2'C_{21}\lambda_1 + \lambda_2'C_{22}\lambda_2 \right).$$

Now the first order cumulant is

$$\operatorname{Cum}_1\left( X_j \right) = i^{-1}\left. D_{\lambda_j}\psi_{X_{1:2}}\left( \lambda_1, \lambda_2 \right) \right|_{\lambda_1=\lambda_2=0} = \mu_j,$$

and it is clear that any cumulant of order higher than two is zero. One can easily show that the second order cumulants are the vectors of the covariance matrices, i.e.,

$$\operatorname{Cum}_2\left( X_j, X_k \right) = \operatorname{Vec}C_{kj}, \qquad j, k = 1, 2.$$

For instance, if $j = 2$ and $k = 1$,

$$\operatorname{Cum}_2\left( X_2, X_1 \right) = D^2_{\lambda_2,\lambda_1}\left( \lambda_2'C_{21}\lambda_1 \right) = D_{\lambda_1}\!\left( D_{\lambda_2}\,\lambda_2'C_{21}\lambda_1 \right) = \operatorname{Vec}C_{12}.$$
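The following Monte Carlo sketch (added for illustration; it is not part of Example 1) checks the relation (4.2): the sample average of $(X_1 - \bar X_1)\otimes(X_2 - \bar X_2)$, arranged as a $d_1\times d_2$ matrix, reproduces the cross-covariance block of a simulated Gaussian vector, up to the row/column ordering implied by the chosen vec convention. All names and constants here are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(3)
d1, d2, n = 2, 3, 500_000

# jointly Gaussian (X1, X2) with a known covariance matrix
A = rng.normal(size=(d1 + d2, d1 + d2))
Sigma = A @ A.T + np.eye(d1 + d2)
Z = rng.multivariate_normal(np.zeros(d1 + d2), Sigma, size=n)
X1, X2 = Z[:, :d1], Z[:, d1:]

X1c = X1 - X1.mean(axis=0)
X2c = X2 - X2.mean(axis=0)

# Cum_2(X1, X2) = E[(X1 - EX1) kron (X2 - EX2)]; the einsum below equals the
# average of np.kron(x1c, x2c) over observations, reshaped to a d1 x d2 matrix.
cum2 = np.einsum('ni,nj->ij', X1c, X2c) / n

# Reshaped, this is the cross-covariance block of Sigma (up to the vec convention).
print(np.allclose(cum2, Sigma[:d1, d1:], atol=0.05))   # True
print(cum2.reshape(-1))                                 # the cumulant as a vector, cf. (4.2)
```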

If $j = k = 1$, then

$$D_{\lambda_1}\left( \lambda_1'C_{11}\lambda_1 \right) = 2\operatorname{Vec}\left( \lambda_1'C_{11} \right) = 2\,C_{11}\lambda_1,$$

and by applying $D_{\lambda_1}$ repeatedly we obtain

$$\operatorname{Cum}_2\left( X_1, X_1 \right) = D_{\lambda_1}\!\left( D_{\lambda_1}\,\lambda_1'C_{11}\lambda_1 \right)/2 = \operatorname{Vec}C_{11}.$$

4.1. Basic properties of the cumulants. For convenience of notation we set the dimensions of $X_1, X_2, \ldots, X_n$ all equal to $d$. The cumulants are symmetric functions in the scalar-valued case, but not for vectors; for example, $\operatorname{Cum}_2(X_1, X_2) \neq \operatorname{Cum}_2(X_2, X_1)$ in general. Here we have to use permutation matrices (see the Appendix for details), as will be shown below.

Proposition 1. Let $p$ be a permutation of the integers $1:n$ and let the function $f(\lambda_{1:n}) \in \mathbb{R}^d$ be continuously differentiable $n$ times in all its arguments; then

$$D^n_{\lambda_{p(1:n)}}f = \left( I_d \otimes K_{p(1:n)}\left( d_{1:n} \right) \right)D^n_{\lambda_{1:n}}f.$$

1. Symmetry. If $d > 1$ then the cumulants are not symmetric, but they satisfy the relation

$$\operatorname{Cum}_n\left( X_{1:n} \right) = K_{p(1:n)}\left( d^{[n]} \right)\operatorname{Cum}_n\left( X_{p(1:n)} \right),$$

where $p(1:n) = (p(1), p(2), \ldots, p(n))$ belongs to the set $\mathcal{P}_n$ of all possible permutations of the numbers $1:n$, $d^{[n]} = [d, d, \ldots, d]$ ($n$ entries), and $K_{p(1:n)}(d^{[n]})$ is the permutation matrix of the Appendix, equation (6.1).

For constant matrices $A$ and $B$ and random vectors $Y_1, Y_2$,

$$\operatorname{Cum}_{n+1}\left( AY_1 + BY_2, X_{1:n} \right) = \left( A \otimes I_{d^n} \right)\operatorname{Cum}_{n+1}\left( Y_1, X_{1:n} \right) + \left( B \otimes I_{d^n} \right)\operatorname{Cum}_{n+1}\left( Y_2, X_{1:n} \right);$$

also

$$\operatorname{Cum}_{n+1}\left( AY_1 \otimes BY_2, X_{1:n} \right) = \left( A \otimes B \otimes I_{d^n} \right)\operatorname{Cum}_{n+1}\left( Y_1 \otimes Y_2, X_{1:n} \right),$$

assuming that the appropriate matrix operations are valid. For any constant vectors $a$ and $b$,

$$\operatorname{Cum}_{n+1}\left( a'Y_1 + b'Y_2, X_{1:n} \right) = \left( a' \otimes I_{d^n} \right)\operatorname{Cum}_{n+1}\left( Y_1, X_{1:n} \right) + \left( b' \otimes I_{d^n} \right)\operatorname{Cum}_{n+1}\left( Y_2, X_{1:n} \right).$$

2. Independence. If $X_{1:n}$ is independent of $Y_{1:m}$, where $n, m > 0$, then

$$\operatorname{Cum}_{n+m}\left( X_{1:n}, Y_{1:m} \right) = 0.$$

In particular, if the dimensions are the same, then

$$\operatorname{Cum}_n\left( X_{1:n} + Y_{1:n} \right) = \operatorname{Cum}_n\left( X_{1:n} \right) + \operatorname{Cum}_n\left( Y_{1:n} \right).$$

3. Gaussianity. The random vector $X_{1:n}$ is Gaussian if and only if $\operatorname{Cum}_m(X_{k_{1:m}}) = 0$ for every $m > 2$ and every subset $k_{1:m}$ of $1:n$.

For further properties of the cumulants we need the following lemma, which makes it easier to understand the relations between the moments and the cumulants (see Barndorff-Nielsen and Cox, 1989, p. 40).

Remark 1. Let $\mathcal{P}_n$ denote the set of all partitions $\mathcal{K}$ of the integers $1:n$. If $\mathcal{K} = \{b_1, b_2, \ldots, b_m\}$, where each $b_j \subset 1:n$, then $|\mathcal{K}| = m$ denotes the size of $\mathcal{K}$. We introduce an ordering among the blocks $b_j \in \mathcal{K}$: $b_j \preceq b_k$ if

$$\sum_{l \in b_j} 2^{-l} \leq \sum_{l \in b_k} 2^{-l}, \qquad (4.3)$$

and equality in (4.3) is possible if and only if $j = k$. The partition $\mathcal{K}$ will be considered ordered if both the elements within each block are ordered and the blocks themselves are ordered by the above relation $b_j \preceq b_k$. We suppose that all partitions $\mathcal{K}$ of $\mathcal{P}_n$ are ordered. Denote $\lambda = \lambda_{1:M} = [\lambda_1', \lambda_2', \ldots, \lambda_M']' \in \mathbb{R}^N$, where $\lambda_j \in \mathbb{R}^{d_j}$ and $N = d_1 + d_2 + \cdots + d_M$. In this case the differential operator $D^{|b|}_{\lambda_b}$ is well defined, because the vector $\lambda_b = [\lambda_j', j \in b]'$ denotes an ordered subset of the vectors $[\lambda_1', \lambda_2', \ldots, \lambda_M']'$ corresponding to the order in $b$. The permutation $p_{\mathcal{K}}$ of the numbers $1:n$ corresponds to the ordered partition $\mathcal{K}$. See Andrews (1976) for more details on partitions.

We can rewrite the formula of Faà di Bruno, given for implicit functions (see Lukács, 1955), as follows.

Lemma 1. Consider the composite function $f(g(\lambda))$, where $f$ and $g$ are scalar-valued functions which are differentiable $M$ times. Suppose that $\lambda = \lambda_{1:M} = [\lambda_1', \lambda_2', \ldots, \lambda_M']'$ with dimensions $[d_1, d_2, \ldots, d_M]$. Then, for $n \leq M$,

$$D^n_{\lambda_{1:n}}f\left( g\left( \lambda \right) \right) = \sum_{r=1}^{n}f^{(r)}\left( g\left( \lambda \right) \right)\sum_{\substack{\mathcal{K}\in\mathcal{P}_n \\ |\mathcal{K}|=r}}K_{p_{\mathcal{K}}}\left( d_{1:n} \right)\bigotimes_{b\in\mathcal{K}}D^{|b|}_{\lambda_b}g\left( \lambda \right), \qquad (4.4)$$

where $p_{\mathcal{K}}$ is the permutation of $1:n$ defined by the partition $\mathcal{K}$ (see Remark 1).

We now consider particular cases of Equation (4.4) which are useful for proving some properties of cumulants.

4.2. Cumulants in terms of moments and vice versa.

4.2.1. Cumulants in terms of moments. The results obtained here are generalizations of the well-known results given for scalar random variables by Leonov and Shiryaev (1959); see also Brillinger and Rosenblatt (1967) and Terdik (1999). To obtain the cumulants in terms of moments, consider the function $f(x) = \ln x$ and $g(\lambda) = \varphi_{X_{1:n}}(\lambda_{1:n})$. The $r$th derivative of $f(x) = \ln x$ is $f^{(r)}(x) = (-1)^{r-1}(r-1)!\,x^{-r}$, so the left-hand side of Equation (4.4), evaluated at zero, is the cumulant of $X_{1:n}$. Hence we obtain

$$\operatorname{Cum}_n\left( X_{1:n} \right) = \sum_{m=1}^{n}(-1)^{m-1}(m-1)!\sum_{\substack{\mathcal{L}\in\mathcal{P}_{1:n} \\ |\mathcal{L}|=m}}K_{p_{\mathcal{L}}}\left( d_{1:n} \right)E^{\otimes}\bigotimes_{k=1:m}\left( \bigotimes_{j\in b_k}X_j \right), \qquad (4.5)$$

where the second summation is taken over all possible ordered partitions $\mathcal{L} \in \mathcal{P}_{1:n}$ with $|\mathcal{L}| = m$ (see Remark 1 for details), and the expectation operator $E^{\otimes}$ acts blockwise over the blocks of $\mathcal{L}$, e.g., $E^{\otimes}(X_1 \otimes X_2) = E X_1 \otimes E X_2$.

4.2.2. Moments in terms of cumulants. Let $f(x) = \exp x$ and $g(\lambda) = \psi_{X_{1:n}}(\lambda_{1:n})$. All derivatives of $f(x) = \exp x$ equal $\exp x$, and therefore

$$\frac{\partial^n\exp g\left( \lambda \right)}{\partial\lambda_1\,\partial\lambda_2\cdots\partial\lambda_n} = \exp\left( g\left( \lambda \right) \right)\sum_{\mathcal{K}\in\mathcal{P}_n}K_{p_{\mathcal{K}}}\left( d_{1:n} \right)\bigotimes_{b\in\mathcal{K}}D^{|b|}_{\lambda_b}g\left( \lambda \right). \qquad (4.6)$$

The expression below for the moment $E X^{\otimes[n]}_{1:n} = E\left( X_1\otimes X_2\otimes\cdots\otimes X_n \right)$ is quite general; for example, the moment $E Y^{\otimes k_{1:m}}_{1:m}$ can be obtained from $E X^{\otimes[n]}_{1:n}$ by setting

$$\underbrace{Y_1\otimes\cdots\otimes Y_1}_{k_1}\otimes\cdots\otimes\underbrace{Y_m\otimes\cdots\otimes Y_m}_{k_m} = X_{1:n}, \text{ say},$$

i.e., the elements in the product $Y^{\otimes k_{1:m}}_{1:m}$ are treated as if they were distinct. We obtain

$$E X^{\otimes[n]}_{1:n} = \sum_{\mathcal{L}\in\mathcal{P}_{1:n}}K_{p_{\mathcal{L}}}\left( d_{1:n} \right)\bigotimes_{b\in\mathcal{L}}\operatorname{Cum}_{|b|}\left( X_b \right), \qquad (4.7)$$

where the summation is over all ordered partitions $\mathcal{L} = \{b_1, b_2, \ldots, b_k\}$ of $1:n$.
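For scalar random variables, (4.5) reduces to the classical Leonov-Shiryaev expression $\operatorname{cum}(X_1,\ldots,X_n) = \sum_{\mathcal{L}}(-1)^{|\mathcal{L}|-1}(|\mathcal{L}|-1)!\prod_{b\in\mathcal{L}}E\prod_{j\in b}X_j$, with no commutation matrices needed. The sketch below (an editorial illustration with hypothetical helper names, not code from the paper) implements exactly this scalar case using sample product moments and checks it on the third cumulant, which must equal the third central moment.

```python
import numpy as np
from math import factorial

def set_partitions(items):
    """Generate all partitions of a list into non-empty blocks."""
    if len(items) == 1:
        yield [items]
        return
    first, rest = items[0], items[1:]
    for smaller in set_partitions(rest):
        for i, block in enumerate(smaller):          # put `first` into an existing block
            yield smaller[:i] + [[first] + block] + smaller[i + 1:]
        yield [[first]] + smaller                    # or into a block of its own

def joint_cumulant(cols):
    """Scalar specialization of (4.5): cum(X_1,...,X_n) from sample product moments."""
    n = len(cols)
    total = 0.0
    for part in set_partitions(list(range(n))):
        m = len(part)
        prod = 1.0
        for block in part:
            prod *= np.mean(np.prod([cols[i] for i in block], axis=0))
        total += (-1) ** (m - 1) * factorial(m - 1) * prod
    return total

rng = np.random.default_rng(4)
x = rng.exponential(size=200_000)                    # a skewed variable, so kappa_3 != 0

k3 = joint_cumulant([x, x, x])
print(np.isclose(k3, np.mean((x - x.mean()) ** 3), rtol=1e-6))   # third cumulant = third central moment
```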

4.2.3. Cumulants of products via products of cumulants. Let $X_{\mathcal{K}}$ denote the vector whose entries are obtained from the partition $\mathcal{K}$, i.e., if $\mathcal{K} = \{b_1, b_2, \ldots, b_m\}$ then

$$X_{\mathcal{K}} = X_{b_1}\otimes X_{b_2}\otimes\cdots\otimes X_{b_m},$$

where the order of the elements within the subsets $b \in \mathcal{K}$ and the order of the subsets in $\mathcal{K}$ are fixed. Now the cumulant of the products can be expressed through the cumulants of the individual sets of variables $X_b = (X_j, j \in b)$, $b \in \mathcal{L}$, such that $\mathcal{K}\vee\mathcal{L} = \mathcal{O}$, where $\mathcal{O}$ denotes the coarsest partition, consisting of the single block $\{1:n\}$ only; such pairs of partitions $\mathcal{L}$ and $\mathcal{K}$ are called indecomposable (see Brillinger, 2001; Terdik, 1999). We have

$$\operatorname{Cum}_m\left( X_{b_1}, X_{b_2}, \ldots, X_{b_m} \right) = \sum_{\mathcal{L}:\,\mathcal{K}\vee\mathcal{L}=\mathcal{O}}K_{p_{\mathcal{L}}}\left( d_{1:n} \right)\bigotimes_{b\in\mathcal{L}}\operatorname{Cum}_{|b|}\left( X_b \right), \qquad (4.8)$$

where $X_b$ denotes the set of vectors from $X_s$, $s \in b$.

Example 2. Let $X$ be a Gaussian random vector with $E X = 0$ and $\operatorname{Cov}(X, X) = \Sigma$, and let $A$ and $B$ be matrices of appropriate dimensions. Then

$$\operatorname{Cum}\left( X'AX, X'BX \right) = 2\operatorname{Tr}\left( A\Sigma B\Sigma \right); \qquad (4.9)$$

see Taniguchi (1991). We can use (4.8) to obtain (4.9) as follows:

$$\operatorname{Cum}\left( X'AX, X'BX \right) = \operatorname{Cum}_2\left( \left( \operatorname{Vec}A \right)'\left( X\otimes X \right), \left( \operatorname{Vec}B \right)'\left( X\otimes X \right) \right) = \left[ \left( \operatorname{Vec}A \right)'\otimes\left( \operatorname{Vec}B \right)' \right]\operatorname{Cum}_2\left( X\otimes X, X\otimes X \right)$$
$$= \left[ \left( \operatorname{Vec}A \right)'\otimes\left( \operatorname{Vec}B \right)' \right]\left[ K_{p(2\leftrightarrow 3)}\left( d^{[4]} \right) + K_{p(2\leftrightarrow 4)}\left( d^{[4]} \right) \right]\left[ \operatorname{Vec}\Sigma\otimes\operatorname{Vec}\Sigma \right] = 2\left( \operatorname{Vec}\Sigma \right)'\left( A\otimes B \right)\operatorname{Vec}\Sigma = 2\operatorname{Tr}\left( A\Sigma B\Sigma \right).$$

5. Applications to statistical inference

5.1. Cumulants of the log-likelihood function. The above results can be used to obtain the cumulants of the partial derivatives of the log-likelihood function (see Skovgaard, 1986); these expressions are useful in the study of the asymptotic theory of statistics. Consider a random sample $(X_1, X_2, \ldots, X_N) = X \in \mathbb{R}^N$ with likelihood function $L(X, \theta)$, and let $l$ denote the log-likelihood function, i.e.,

$$l\left( \theta \right) = \ln L\left( X, \theta \right), \qquad \theta \in \mathbb{R}^d.$$

It is well known that, under the regularity conditions,

$$E\left( \frac{\partial l}{\partial\theta_1}\frac{\partial l}{\partial\theta_2} \right) = -E\left( \frac{\partial^2 l}{\partial\theta_1\,\partial\theta_2} \right). \qquad (5.1)$$

The result (5.1) can be extended to products of several partial derivatives; see McCullagh and Cox (1986) for $d = 4$, who use these expressions in the evaluation of Bartlett's correction. We can arrive at the result (5.1) from (4.4) by observing that $L(X,\theta) = e^{l(\theta)}$. Suppose $d = 2$; we have

$$\frac{\partial^2 e^{l(\theta)}}{\partial\theta_1\,\partial\theta_2} = \frac{\partial^2 L\left( X, \theta \right)}{\partial\theta_1\,\partial\theta_2},$$

and from (4.4) we have

$$\frac{\partial^2 e^{l(\theta)}}{\partial\theta_1\,\partial\theta_2} = L\left( X, \theta \right)\left[ \frac{\partial^2 l}{\partial\theta_1\,\partial\theta_2} + \frac{\partial l}{\partial\theta_1}\frac{\partial l}{\partial\theta_2} \right];$$

equating the above two expressions, we get

$$\frac{1}{L\left( X, \theta \right)}\frac{\partial^2 L\left( X, \theta \right)}{\partial\theta_1\,\partial\theta_2} = \frac{\partial^2 l}{\partial\theta_1\,\partial\theta_2} + \frac{\partial l}{\partial\theta_1}\frac{\partial l}{\partial\theta_2}.$$

The expected value of the left-hand side of the above expression is zero, since we are allowed to interchange the order of differentiation and integration, and this gives the result (5.1). The same argument leads, more generally, to identities involving several partial derivatives:

$$\sum_{r=1}^{d}\sum_{\substack{\mathcal{K}\in\mathcal{P}_d \\ |\mathcal{K}|=r}}E\left[ \prod_{b\in\mathcal{K}}\frac{\partial^{|b|}l}{\prod_{j\in b}\partial\theta_j} \right] = 0, \qquad (5.2)$$

which is a consequence of (4.6). Proceeding in a similar fashion, assuming the regularity conditions for higher orders and using (4.7), we obtain the cumulant analogue of the above:

$$\sum_{r=1}^{d}\sum_{\substack{\mathcal{K}\in\mathcal{P}_d \\ |\mathcal{K}|=r}}\operatorname{Cum}_r\left( \frac{\partial^{|b|}l}{\prod_{j\in b}\partial\theta_j},\; b\in\mathcal{K} \right) = 0. \qquad (5.3)$$

Equation (5.2) is in terms of the expected values of the derivatives of the log-likelihood function, whereas (5.3) is in terms of the cumulants. For example, suppose we have a single parameter $\theta$ and denote

$$\mu_4\left( m_1, m_2, m_3, m_4 \right) = E\left[ \left( \frac{\partial l}{\partial\theta} \right)^{m_1}\left( \frac{\partial^2 l}{\partial\theta^2} \right)^{m_2}\left( \frac{\partial^3 l}{\partial\theta^3} \right)^{m_3}\left( \frac{\partial^4 l}{\partial\theta^4} \right)^{m_4} \right];$$

then from the formula (5.2) we obtain

$$\mu_4\left( 0, 0, 0, 1 \right) + 4\mu_4\left( 1, 0, 1, 0 \right) + 6\mu_4\left( 2, 1, 0, 0 \right) + 3\mu_4\left( 0, 2, 0, 0 \right) + \mu_4\left( 4, 0, 0, 0 \right) = 0. \qquad (5.4)$$

To obtain (5.4) we proceed as follows. Consider the partitions $\mathcal{K}\in\mathcal{P}_4$: if $|\mathcal{K}| = 1$ we have only one partition, $\{1, 2, 3, 4\}$; if $|\mathcal{K}| = 2$ we have 4 partitions of the type $\{\{1\},\{2,3,4\}\}$ and 3 partitions of the type $\{\{1,2\},\{3,4\}\}$; if $|\mathcal{K}| = 3$ we have 6 partitions of the type $\{\{1\},\{2\},\{3,4\}\}$; and if $|\mathcal{K}| = 4$ we have the single partition $\{\{1\},\{2\},\{3\},\{4\}\}$. Now, since $\theta_1 = \theta_2 = \theta_3 = \theta_4 = \theta$, the vector $(m_1, m_2, m_3, m_4)$ records how many blocks of each size occur in a partition; for instance, $(m_1, m_2, m_3, m_4) = (2, 1, 0, 0)$ corresponds to the partitions of the type $\{\{1\},\{2\},\{3,4\}\}$, and so on. Hence the result (5.4). McCullagh and Cox (1986, equation (10)) obtained a similar result for cumulants:

$$\operatorname{Cum}_1\left( \frac{\partial^4 l}{\partial\theta^4} \right) + 4\operatorname{Cum}_2\left( \frac{\partial l}{\partial\theta}, \frac{\partial^3 l}{\partial\theta^3} \right) + 3\operatorname{Cum}_2\left( \frac{\partial^2 l}{\partial\theta^2}, \frac{\partial^2 l}{\partial\theta^2} \right) + 6\operatorname{Cum}_3\left( \frac{\partial l}{\partial\theta}, \frac{\partial l}{\partial\theta}, \frac{\partial^2 l}{\partial\theta^2} \right) + \operatorname{Cum}_4\left( \frac{\partial l}{\partial\theta}, \frac{\partial l}{\partial\theta}, \frac{\partial l}{\partial\theta}, \frac{\partial l}{\partial\theta} \right) = 0, \qquad (5.5)$$

which is a special case of (5.3).

5.1.1. Cumulants of the log-likelihood function: the multiple parameter case. The multivariate extension of the formula (5.2), in which the elements of the parameter vector are themselves vectors, can easily be obtained using Lemma 1. If we partition the vector of parameters into $n$ subsets, $\theta = \theta_{1:n} = [\theta_1', \theta_2', \ldots, \theta_n']'$, with dimensions $[d_1, d_2, \ldots, d_n]$ respectively, then it follows that

$$\sum_{r=1}^{n}\sum_{\substack{\mathcal{K}\in\mathcal{P}_n \\ |\mathcal{K}|=r}}K_{p_{\mathcal{K}}}\left( d_{1:n} \right)E\left[ \bigotimes_{b\in\mathcal{K}}D^{|b|}_{\theta_b}l \right] = 0, \qquad (5.6)$$

where $\theta_b$ denotes the subset of vectors $[\theta_j, j \in b]$. Now, in particular, if $n = 2$ and $\theta_1 = \theta_2 = \theta$, then (5.6) gives the well-known result

$$\operatorname{Cov}\left( D_{\theta}l, D_{\theta}l \right) = -E\left( D_{\theta}D_{\theta}'\,l \right),$$

or, in vectorized form,

$$E\left( D_{\theta}l \otimes D_{\theta}l \right) + E\left( D^2_{\theta}l \right) = 0.$$
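As an illustration of the vectorized identity just displayed (this code is an editorial addition, not from the paper), take a single observation from $N(\theta, \Sigma_0)$ with $\Sigma_0$ known: the score is $\Sigma_0^{-1}(x-\theta)$ and $D^2_{\theta}l = -\operatorname{Vec}\Sigma_0^{-1}$ is non-random, so $E(D_{\theta}l\otimes D_{\theta}l) + E(D^2_{\theta}l) = 0$ can be checked by simulation. The model, seed and sample size are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(5)
d, n = 3, 400_000
theta = rng.normal(size=d)
A = rng.normal(size=(d, d))
Sigma0 = A @ A.T + d * np.eye(d)            # known covariance of the observation
P = np.linalg.inv(Sigma0)

X = rng.multivariate_normal(theta, Sigma0, size=n)

# score D_theta l = Sigma0^{-1}(x - theta); Hessian D^2_theta l = -vec(Sigma0^{-1}) (non-random)
scores = (X - theta) @ P                                                  # row i is P (x_i - theta)
E_score_kron = np.einsum('ni,nj->ij', scores, scores).reshape(-1) / n     # E[D l kron D l]
E_hessian = -P.reshape(-1)                                                # E[D^2 l]

print(np.allclose(E_score_kron + E_hessian, 0.0, atol=0.01))   # Bartlett identity, up to MC error
```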

In the case $n = 4$, say, with $\theta_1 = \theta_2 = \theta_3 = \theta_4 = \theta$, we have

$$\mu_4\left( 0, 0, 0, 1 \right) + 4\mu_4\left( 1, 0, 1, 0 \right) + 6\mu_4\left( 2, 1, 0, 0 \right) + 3\mu_4\left( 0, 2, 0, 0 \right) + \mu_4\left( 4, 0, 0, 0 \right) = 0,$$

where now

$$\mu_4\left( m_1, m_2, m_3, m_4 \right) = E\left[ \left( D_{\theta}l \right)^{\otimes m_1}\otimes\left( D^2_{\theta}l \right)^{\otimes m_2}\otimes\left( D^3_{\theta}l \right)^{\otimes m_3}\otimes\left( D^4_{\theta}l \right)^{\otimes m_4} \right].$$

We can obtain a similar expression for the cumulants, and it is given by

$$\sum_{r=1}^{n}\sum_{\substack{\mathcal{K}\in\mathcal{P}_n \\ |\mathcal{K}|=r}}K_{p_{\mathcal{K}}}\left( d_{1:n} \right)\operatorname{Cum}_r\left( D^{|b|}_{\theta_b}l,\; b\in\mathcal{K} \right) = 0.$$

5.2. Multivariate measures of skewness and kurtosis for random vectors. In this section we define what we consider to be natural measures of multivariate skewness and kurtosis and show their relation to the measures defined by Mardia (1970). Let $X$ be a $d$-dimensional random vector whose first four moments exist, and let $\Sigma$ denote its positive-definite variance-covariance matrix. The skewness vector of $X$ is defined by

$$\zeta_X = \operatorname{Cum}_3\left( \Sigma^{-1/2}X, \Sigma^{-1/2}X, \Sigma^{-1/2}X \right) = \left( \Sigma^{-1/2} \right)^{\otimes 3}\operatorname{Cum}_3\left( X, X, X \right),$$

and the total skewness is $\bar{\zeta}_X = \left\| \zeta_X \right\|^2$. The kurtosis vector of $X$ is defined by

$$\kappa_X = \operatorname{Cum}_4\left( \Sigma^{-1/2}X, \Sigma^{-1/2}X, \Sigma^{-1/2}X, \Sigma^{-1/2}X \right) = \left( \Sigma^{-1/2} \right)^{\otimes 4}\operatorname{Cum}_4\left( X, X, X, X \right),$$

and the total kurtosis is $\bar{\kappa}_X = \operatorname{Tr}\left( \operatorname{Vec}^{-1}\kappa_X \right)$, where $\operatorname{Vec}^{-1}\kappa_X$ is the matrix $M$ such that $\operatorname{Vec}M = \kappa_X$. The skewness and kurtosis vectors of a multivariate Gaussian vector $X$ are zero; $\zeta_X$ is also zero for any distribution which is symmetric. The skewness and kurtosis can be expressed in terms of moments. Suppose $E X = 0$; then

$$\zeta_X = \left( \Sigma^{-1/2} \right)^{\otimes 3}E X^{\otimes 3}. \qquad (5.7)$$

The total skewness $\bar{\zeta}_X$, which is just the squared norm of the skewness vector $\zeta_X$, coincides with the measure of skewness $\beta_{1,d}$ defined by Mardia (1970). For any set of zero-mean random vectors we have

$$\operatorname{Cum}_4\left( X_{1:4} \right) = E\left( X_1\otimes X_2\otimes X_3\otimes X_4 \right) - \operatorname{Cum}_2\left( X_1, X_2 \right)\otimes\operatorname{Cum}_2\left( X_3, X_4 \right)$$
$$\qquad - K_{p(2\leftrightarrow 3)}\left( d^{[4]} \right)\left( \operatorname{Cum}_2\left( X_1, X_3 \right)\otimes\operatorname{Cum}_2\left( X_2, X_4 \right) \right) - K_{p(4\to 2)}\left( d^{[4]} \right)\left( \operatorname{Cum}_2\left( X_1, X_4 \right)\otimes\operatorname{Cum}_2\left( X_2, X_3 \right) \right). \qquad (5.8)$$
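The sketch below (illustrative, not the authors' computation) estimates the skewness vector $\zeta_X$ and Mardia's measures for simulated Gaussian data: the total skewness $\|\zeta_X\|^2 = \beta_{1,d}$ should be near zero and $\beta_{2,d} = E(X'\Sigma^{-1}X)^2$ near $d(d+2)$, in line with the remarks above and the relation derived after (5.9) below. The standardization uses the sample mean and covariance, an arbitrary but common choice.

```python
import numpy as np

rng = np.random.default_rng(6)
d, n = 3, 200_000
A = rng.normal(size=(d, d))
X = rng.multivariate_normal(rng.normal(size=d), A @ A.T + np.eye(d), size=n)   # Gaussian data

Xc = X - X.mean(axis=0)
S = Xc.T @ Xc / n
w, V = np.linalg.eigh(S)
S_inv_half = V @ np.diag(w ** -0.5) @ V.T     # symmetric inverse square root of S
Y = Xc @ S_inv_half                           # standardized observations

# skewness vector zeta_X ~ E[Y kron Y kron Y]; total skewness = ||zeta_X||^2 (Mardia's beta_{1,d})
zeta = np.einsum('ni,nj,nk->ijk', Y, Y, Y).reshape(-1) / n
beta1 = zeta @ zeta

# Mardia's beta_{2,d} = E[(X - mu)' S^{-1} (X - mu)]^2; for Gaussian data it equals d(d+2)
beta2 = np.mean(np.sum(Y * Y, axis=1) ** 2)

print(round(beta1, 3), round(beta2, 3), d * (d + 2))   # beta1 ~ 0, beta2 ~ 15
```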

Therefore the kurtosis vector of $X$ can be expressed in terms of the fourth order moments by setting $X_1 = X_2 = X_3 = X_4 = X$ in (5.8):

$$\kappa_X = \left( \Sigma^{-1/2} \right)^{\otimes 4}\operatorname{Cum}_4\left( X, X, X, X \right)$$
$$= \left( \Sigma^{-1/2} \right)^{\otimes 4}E X^{\otimes 4} - \left( I + K_{p(2\leftrightarrow 3)}\left( d^{[4]} \right) + K_{p(4\to 2)}\left( d^{[4]} \right) \right)\left( \Sigma^{-1/2} \right)^{\otimes 4}\left( \operatorname{Cum}_2\left( X, X \right)\otimes\operatorname{Cum}_2\left( X, X \right) \right)$$
$$= \left( \Sigma^{-1/2} \right)^{\otimes 4}E X^{\otimes 4} - \left( I + K_{p(2\leftrightarrow 3)}\left( d^{[4]} \right) + K_{p(4\to 2)}\left( d^{[4]} \right) \right)\left[ \operatorname{Vec}I_d\otimes\operatorname{Vec}I_d \right]. \qquad (5.9)$$

Mardia (1970) defined the measure of kurtosis as $\beta_{2,d} = E\left( X'\Sigma^{-1}X \right)^2$, and this is related to our total kurtosis measure $\bar{\kappa}_X$ as follows:

$$\beta_{2,d} = \bar{\kappa}_X + d\left( d + 2 \right) = \operatorname{Tr}\left( \operatorname{Vec}^{-1}\kappa_X \right) + d\left( d + 2 \right).$$

Indeed,

$$\operatorname{Tr}\left( \operatorname{Vec}^{-1}\left[ \left( \Sigma^{-1/2} \right)^{\otimes 4}E X^{\otimes 4} \right] \right) = E\operatorname{Tr}\left[ \left( \left( \Sigma^{-1/2}X \right)\left( \Sigma^{-1/2}X \right)' \right)^2 \right] = E\left( X'\Sigma^{-1}X \right)^2.$$

We note that if $X$ is Gaussian then $\kappa_X = 0$, and hence $\beta_{2,d} = d(d+2)$.

5.3. Multiple linear time series. Let $X_t$ be a $d$-dimensional discrete time stationary time series. Let $X_t$ satisfy the linear representation (see Hannan, 1970, p. 208)

$$X_t = \sum_{k=0}^{\infty}H\left( k \right)e_{t-k}, \qquad (5.10)$$

where $H(0)$ is the identity, $\sum_{k}\left\| H(k) \right\| < \infty$, and the $e_t$ are independent and identically distributed random vectors with $E e_t = 0$ and $E e_te_t' = \Sigma$. Let $\kappa_{m+1}(e) = \operatorname{Cum}_{m+1}(e_t, e_t, \ldots, e_t)$ be the corresponding cumulant vector of dimension $d^{m+1}$; we note that $\kappa_2(e) = \operatorname{Vec}\Sigma$. The cumulant of $X_t$ is

$$\operatorname{Cum}_{m+1}\left( X_t, X_{t+\tau_1}, X_{t+\tau_2}, \ldots, X_{t+\tau_m} \right) = \sum_{k=0}^{\infty}\left( H\left( k \right)\otimes H\left( k+\tau_1 \right)\otimes\cdots\otimes H\left( k+\tau_m \right) \right)\kappa_{m+1}\left( e \right) = C_{m+1}\left( \tau_1, \tau_2, \ldots, \tau_m \right). \qquad (5.11)$$

Let $X_t$ satisfy the autoregressive model of order $p$ given by

$$X_t + A_1X_{t-1} + A_2X_{t-2} + \cdots + A_pX_{t-p} = e_t,$$

which can be written as

$$\left( I + A_1B + A_2B^2 + \cdots + A_pB^p \right)X_t = e_t.$$

We assume the coefficients $\{A_j\}$ satisfy the usual stationarity condition (see Hannan, 1970) and proceed. Then

$$X_t = \left( I + A_1B + A_2B^2 + \cdots + A_pB^p \right)^{-1}e_t = \sum_{k=0}^{\infty}H\left( k \right)B^ke_t, \qquad (5.12)$$

where $B$ is the backshift operator. From (5.10) and (5.12) we have

$$\left( I + A_1B + A_2B^2 + \cdots + A_pB^p \right)\sum_{k=0}^{\infty}H\left( k \right)B^k = I, \qquad (5.13)$$

that is,

$$H\left( 0 \right) + \left( H\left( 1 \right) + A_1H\left( 0 \right) \right)B + \left( H\left( 2 \right) + A_1H\left( 1 \right) + A_2H\left( 0 \right) \right)B^2 + \cdots + \left( H\left( p \right) + A_1H\left( p-1 \right) + A_2H\left( p-2 \right) + \cdots + A_pH\left( 0 \right) \right)B^p + \cdots = I.$$

Equating the coefficients of $B^j$, $j \geq 1$, we get

$$H\left( j \right) + A_1H\left( j-1 \right) + A_2H\left( j-2 \right) + \cdots + A_pH\left( j-p \right) = 0, \qquad j \geq 1, \qquad (5.14)$$

where we use the convention $H(j) = 0$ if $j < 0$. Let $\tau_1 \geq 1$; substituting for $H(k+\tau_1)$ from (5.14) into (5.11),

$$C_{m+1}\left( \tau_1, \tau_2, \ldots, \tau_m \right) = -\sum_{k=0}^{\infty}H\left( k \right)\otimes\left[ A_1H\left( k+\tau_1-1 \right) + A_2H\left( k+\tau_1-2 \right) + \cdots + A_pH\left( k+\tau_1-p \right) \right]\otimes H\left( k+\tau_2 \right)\otimes\cdots\otimes H\left( k+\tau_m \right)\kappa_{m+1}\left( e \right)$$
$$= -\sum_{j=1}^{p}\sum_{k=0}^{\infty}\left[ I_d\otimes A_j\otimes I_{d^{m-1}} \right]\left[ H\left( k \right)\otimes H\left( k+\tau_1-j \right)\otimes H\left( k+\tau_2 \right)\otimes\cdots\otimes H\left( k+\tau_m \right) \right]\kappa_{m+1}\left( e \right)$$
$$= -\sum_{j=1}^{p}\left( I_d\otimes A_j\otimes I_{d^{m-1}} \right)C_{m+1}\left( \tau_1-j, \tau_2, \ldots, \tau_m \right).$$

Thus we obtain

$$C_{m+1}\left( \tau_1, \tau_2, \ldots, \tau_m \right) = -\sum_{j=1}^{p}\left( I_d\otimes A_j\otimes I_{d^{m-1}} \right)C_{m+1}\left( \tau_1-j, \tau_2, \ldots, \tau_m \right). \qquad (5.15)$$

If we put $m = 1$ in (5.15) we get

$$C_2\left( \tau \right) = -\sum_{j=1}^{p}\left( I_d\otimes A_j \right)C_2\left( \tau-j \right),$$

which can be written in matrix form as

$$C_2\left( \tau \right) = -\sum_{j=1}^{p}A_jC_2\left( \tau-j \right),$$

the well-known Yule-Walker equation in terms of second order covariances.
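As a quick numerical sanity check of this recursion (the sketch is editorial and uses an arbitrary stable coefficient matrix), take a VAR(1) model $X_t + A_1X_{t-1} = e_t$, for which $H(k) = (-A_1)^k$; building $C_2(\tau)$ from a truncated version of (5.11) and comparing it with $-(I_d\otimes A_1)C_2(\tau-1)$ reproduces the $m = 1$ case of (5.15) to numerical precision. Only the internal consistency of (5.11) and (5.15) is tested, so the particular vec convention does not matter.

```python
import numpy as np

d = 2
A1 = np.array([[0.5, 0.1],
               [-0.2, 0.3]])                 # X_t + A1 X_{t-1} = e_t, eigenvalues well inside the unit circle
Sigma = np.eye(d)
kappa2_e = Sigma.reshape(-1)                 # kappa_2(e) = Vec(Sigma)

H = [np.linalg.matrix_power(-A1, k) for k in range(200)]   # H(k) = (-A1)^k for this VAR(1)

def C2(tau):
    """Second-order cumulant vector C_2(tau) = sum_k (H(k) kron H(k+tau)) kappa_2(e), cf. (5.11)."""
    return sum(np.kron(H[k], H[k + tau]) @ kappa2_e for k in range(len(H) - tau))

lhs = C2(3)
rhs = -np.kron(np.eye(d), A1) @ C2(2)        # (5.15) with m = 1, p = 1: C2(tau) = -(I kron A1) C2(tau-1)
print(np.allclose(lhs, rhs, atol=1e-10))     # True
```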

Therefore we can consider (5.15) as an extension of the Yule-Walker equations, in terms of higher order cumulants, for multivariate autoregressive models.

The definition of higher order cumulant spectra for stationary time series now comes in a natural way. Consider the time series $X_t$ with $(m+1)$th order cumulant function

$$\operatorname{Cum}_{m+1}\left( X_t, X_{t+\tau_1}, X_{t+\tau_2}, \ldots, X_{t+\tau_m} \right) = C_{m+1}\left( \tau_1, \tau_2, \ldots, \tau_m \right),$$

and define the $(m+1)$th order cumulant spectrum as the Fourier transform of the cumulants,

$$S_{m+1}\left( \omega_1, \omega_2, \ldots, \omega_m \right) = \sum_{\tau_1, \tau_2, \ldots, \tau_m = -\infty}^{\infty}C_{m+1}\left( \tau_1, \tau_2, \ldots, \tau_m \right)\exp\left( -i\sum_{j=1}^{m}\tau_j\omega_j \right),$$

provided that the infinite sum converges. We note that the connection with the usual matrix notation for the second order spectrum $S_2(\omega)$ is, in view of (4.2), that the vector $S_2(\omega)$ defined above is the Vec of the usual second order spectral matrix, $S_2(\omega) = \operatorname{Vec}[S_2(\omega)]$.

5.4. Bhattacharya-type lower bound for the multiparameter case. In this section we obtain a lower bound for the variance-covariance matrix of an unbiased vector of statistics; the bound is based on the first $k$ partial derivatives of the likelihood. This corresponds, in the multiparameter case, to the well-known Bhattacharya bound (see Bhattacharya, 1946; Linnik, 1970), which does not seem to have been considered elsewhere in the literature. Consider a random sample $(X_1, X_2, \ldots, X_n) = X \in \mathbb{R}^{nd_0}$ with likelihood function $L(X, \theta)$, $\theta \in \mathbb{R}^d$. Suppose we have a vector of unbiased estimators, say $\hat{g}(X)$, of $g(\theta) \in \mathbb{R}^{d_1}$. Define the random vectors

$$\Upsilon_{Df}' = \left[ \frac{1}{L\left( X, \theta \right)}\left( D_{\theta}L\left( X, \theta \right) \right)', \frac{1}{L\left( X, \theta \right)}\left( D^2_{\theta}L\left( X, \theta \right) \right)', \ldots, \frac{1}{L\left( X, \theta \right)}\left( D^k_{\theta}L\left( X, \theta \right) \right)' \right], \qquad \Upsilon = \left[ \hat{g}\left( X \right)', \Upsilon_{Df}' \right]',$$

where the dimension of $\Upsilon$ is $d_1 + d + d^2 + \cdots + d^k$. The second order cumulant between $\hat{g}(X)$ and the derivatives $\frac{1}{L(X,\theta)}D^j_{\theta}L(X,\theta)$, $j = 1, 2, \ldots, k$, is

$$\operatorname{Cum}\left( \hat{g}\left( X \right), \frac{1}{L\left( X, \theta \right)}D^j_{\theta}L\left( X, \theta \right) \right) = \int\hat{g}\left( x \right)\otimes D^j_{\theta}L\left( x, \theta \right)\,dx = D^j_{\theta}\,g\left( \theta \right), \qquad j = 1, 2, \ldots, k,$$

the covariance between $\hat{g}(X)$ and $\frac{1}{L(X,\theta)}D^j_{\theta}L(X,\theta)$ being calculated using (4.2). The variance matrix $\operatorname{Var}(\Upsilon_{Df})$ is singular, because the elements of the derivatives $D^j_{\theta}L(X,\theta)$ are not all distinct; therefore we reduce the vector of derivatives to its distinct elements only. To make this precise, we first consider second order derivatives. We define the duplication matrix $D_{2,d}$, which reduces the symmetric matrix $V_d$ to the vector $\nu_2(V_d)$ of its lower triangular elements; $D_{2,d}$ is defined by the relation

$$D_{2,d}\,\nu_2\left( V_d \right) = \operatorname{Vec}V_d.$$

The dimension of $\nu_2(V_d)$ is $d(d+1)/2$, and $D_{2,d}$ is of dimension $d^2 \times d(d+1)/2$. It is easy to see that $D_{2,d}'D_{2,d}$ is non-singular (the columns of $D_{2,d}$ are linearly independent and each row has exactly one nonzero element); therefore the Moore-Penrose inverse $D^+_{2,d}$ of $D_{2,d}$ is

$$D^+_{2,d} = \left( D_{2,d}'D_{2,d} \right)^{-1}D_{2,d}',$$

such that $\nu_2(V_d) = D^+_{2,d}\operatorname{Vec}V_d$ (see Magnus and Neudecker, 1999, Ch. 3, Sec. 8 for details).
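The duplication matrix is easy to construct directly from its definition. The sketch below (an added illustration; the ordering chosen for $\nu_2$ is one arbitrary but common convention) builds $D_{2,d}$ column by column, forms the Moore-Penrose inverse $D^+_{2,d} = (D_{2,d}'D_{2,d})^{-1}D_{2,d}'$, and verifies the round trip $D_{2,d}\nu_2(V) = \operatorname{Vec}V$ for a symmetric $V$.

```python
import numpy as np

def duplication_matrix(d):
    """D_{2,d} with D_{2,d} nu_2(V) = vec(V) for symmetric V (nu_2 = lower triangle, stacked by columns)."""
    pairs = [(i, j) for j in range(d) for i in range(j, d)]   # (row, col) indices of the lower triangle
    D = np.zeros((d * d, len(pairs)))
    for col, (i, j) in enumerate(pairs):
        D[j * d + i, col] = 1.0       # position of V[i, j] in vec(V) (column-major)
        D[i * d + j, col] = 1.0       # position of V[j, i]; same entry when i == j
    return D

d = 3
D = duplication_matrix(d)
D_plus = np.linalg.solve(D.T @ D, D.T)          # Moore-Penrose inverse (D'D)^{-1} D'

rng = np.random.default_rng(8)
A = rng.normal(size=(d, d))
V = A + A.T                                     # a symmetric test matrix
vecV = V.reshape(-1, order='F')                 # column-major vec

nu2 = D_plus @ vecV                             # nu_2(V): the d(d+1)/2 distinct elements
print(np.allclose(D @ nu2, vecV))               # D_{2,d} nu_2(V) = vec(V): True
```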

The operator $D^2_{\lambda}$ is actually $\operatorname{Vec}\left( \partial^2/\partial\lambda\,\partial\lambda' \right)$; the matrix $\partial^2/\partial\lambda\,\partial\lambda'$ is symmetric, and therefore we can use the Moore-Penrose inverse $D^+_{2,d}$ of the duplication matrix $D_{2,d}$ to extract the necessary (distinct) elements of the derivatives, $\nu_2(D^2_{\lambda}) = D^+_{2,d}D^2_{\lambda}$. We can extend this procedure to higher order derivatives by defining $D^+_{k,d}$ through

$$D^+_{k,d}D^k_{\lambda} = \nu_k\left( D^k_{\lambda} \right),$$

where $\nu_k(D^k_{\lambda})$ is the vector of the distinct elements of $D^k_{\lambda}$, listed in their original order in $D^k_{\lambda}$. Now let

$$C_{gj} = \operatorname{Cov}\left( \hat{g}\left( X \right), D^+_{j,d}\frac{1}{L\left( X, \theta \right)}D^j_{\theta}L\left( X, \theta \right) \right),$$

where the entries of $C_{gj}$ are those of the cumulant matrix $\operatorname{Cum}\left( \hat{g}(X), D^+_{j,d}\frac{1}{L(X,\theta)}D^j_{\theta}L(X,\theta) \right)$. Considering now the vector of all distinct and nonzero derivatives,

$$\Upsilon_{Df}' = \left[ \frac{1}{L\left( X, \theta \right)}\left( D_{\theta}L\left( X, \theta \right) \right)', \frac{1}{L\left( X, \theta \right)}\left( D^+_{2,d}D^2_{\theta}L\left( X, \theta \right) \right)', \ldots, \frac{1}{L\left( X, \theta \right)}\left( D^+_{k,d}D^k_{\theta}L\left( X, \theta \right) \right)' \right], \qquad \Upsilon = \left[ \hat{g}\left( X \right)', \Upsilon_{Df}' \right]',$$

we obtain the generalized Bhattacharya lower bound in the case of multiple parameters. It follows from the fact that the variance matrix of $\Upsilon$ is positive semi-definite, which implies

$$\operatorname{Var}\left( \hat{g}\left( X \right) \right) - C_{gDf}\left[ \operatorname{Var}\left( \Upsilon_{Df} \right) \right]^{-1}C_{gDf}' \geq 0, \qquad (5.16)$$

where $C_{gDf} = [C_{g1}, C_{g2}, \ldots, C_{gk}]$. The Cramér-Rao inequality is obtained by setting $k = 1$, i.e., by considering only the first derivative vector. Let us now consider an example to illustrate the Bhattacharya bound given by (5.16).

Example 3. Let $(X_1, X_2, \ldots, X_n) = X \in \mathbb{R}^{nd_0}$ be a sequence of independent Gaussian random vectors with mean vector $\theta \in \mathbb{R}^{d_0}$ and variance matrix $I_{d_0}$. Suppose we want to estimate the function $g(\theta) = \theta'\theta = \sum_k\theta_k^2 \in \mathbb{R}$; here $d = d_0$ and $d_1 = 1$. An unbiased estimator for $g(\theta)$ is

$$\hat{g}\left( X \right) = \sum_{k=1}^{d}\left( \bar{X}_k^2 - \frac{1}{n} \right),$$

where $\bar{X}_k$ is the sample mean of the $n$ observations on the $k$th coordinate of the random vector. The variance of the estimator $\hat{g}(X)$ is

$$\operatorname{Var}\left( \hat{g}\left( X \right) \right) = \sum_{k=1}^{d}\left( \frac{4\theta_k^2}{n} + \frac{2}{n^2} \right) = \frac{4\,\theta'\theta}{n} + \frac{2d}{n^2}. \qquad (5.17)$$
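A short simulation (illustrative only) confirms (5.17) and anticipates the comparison made below: since $\bar X\sim N(\theta, I_d/n)$, one can draw $\bar X$ directly, compute $\hat g(X)$, and compare its Monte Carlo variance with $4\theta'\theta/n + 2d/n^2$ and with the smaller Cramér-Rao value $4\theta'\theta/n$. The particular $\theta$, $n$ and replication count are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(9)
d, n, reps = 3, 50, 200_000
theta = np.array([0.7, -0.3, 1.2])

# ghat = sum_k (Xbar_k^2 - 1/n), an unbiased estimator of g(theta) = theta'theta
Xbar = rng.normal(loc=theta, scale=1.0 / np.sqrt(n), size=(reps, d))   # Xbar ~ N(theta, I/n)
ghat = np.sum(Xbar ** 2 - 1.0 / n, axis=1)

mc_var = ghat.var()
bhattacharya = 4 * theta @ theta / n + 2 * d / n ** 2    # bound (5.16), equal to the variance (5.17)
cramer_rao = 4 * theta @ theta / n                        # k = 1 bound, strictly smaller

print(mc_var, bhattacharya, cramer_rao)
```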

The Cramér-Rao bound for this estimator is $4\theta'\theta/n$, which is strictly less than the actual variance. The covariances $C_{gj}$ are zero for $j > 2$ (since $D^j_{\theta}g(\theta) = 0$ for $j > 2$), so it suffices to take $k = 2$. For $j \leq 2$ we have

$$\frac{1}{L\left( X, \theta \right)}D_{\theta}L\left( X, \theta \right) = n\left( \bar{X} - \theta \right), \qquad \frac{1}{L\left( X, \theta \right)}D^2_{\theta}L\left( X, \theta \right) = n^2\left( \bar{X} - \theta \right)^{\otimes 2} - n\operatorname{Vec}I_d;$$

therefore, using all the elements of the second partial derivative matrix,

$$\Upsilon_{Df} = \begin{bmatrix} n\left( \bar{X} - \theta \right) \\ n^2\left( \bar{X} - \theta \right)^{\otimes 2} - n\operatorname{Vec}I_d \end{bmatrix}.$$

Note that if we consider only the vector of first derivatives, then the second block of the above vector is not included, which produces the smaller Cramér-Rao bound. If we use the reduced number of elements for $\Upsilon_{Df}$, we have

$$\Upsilon_{Df} = \begin{bmatrix} n\left( \bar{X} - \theta \right) \\ n^2D^+_{2,d}\left( \bar{X} - \theta \right)^{\otimes 2} - nD^+_{2,d}\operatorname{Vec}I_d \end{bmatrix},$$

and, writing $Z = \bar{X} - \theta \sim N(0, I_d/n)$, the variance matrix of $\Upsilon_{Df}$ contains the block

$$C_{22} = n^4D^+_{2,d}\operatorname{Var}\left( Z^{\otimes 2} \right)D^{+\prime}_{2,d}, \qquad \operatorname{Var}\left( Z^{\otimes 2} \right) = \frac{1}{n^2}\left[ I_{d^2} + K_{p(1\leftrightarrow 2)}\left( d, d \right) \right],$$

the second expression following from (5.8). Denote

$$N_d = \frac{1}{2}\left[ I_{d^2} + K_{p(1\leftrightarrow 2)}\left( d, d \right) \right];$$

then the matrices satisfy $N_d = N_d' = N_d^2$ and $N_d = D_{2,d}D^+_{2,d}$ (see Magnus and Neudecker, 1999, Ch. 3, Secs. 7-8). We obtain

$$n^{-2}C_{22} = 2\,D^+_{2,d}N_dD^{+\prime}_{2,d} = 2\,D^+_{2,d}D^{+\prime}_{2,d} = 2\left( D_{2,d}'D_{2,d} \right)^{-1},$$

which is invertible. The inverse of the variance matrix of $\Upsilon_{Df}$ is therefore

$$\left[ \operatorname{Var}\left( \Upsilon_{Df} \right) \right]^{-1} = \begin{bmatrix} n^{-1}I_d & 0 \\ 0 & \left( 2n^2 \right)^{-1}D_{2,d}'D_{2,d} \end{bmatrix}.$$

Now, to obtain the matrix $C_{gDf} = [C_{g1}, C_{g2}]$, we need

$$\operatorname{Cum}\left( \hat{g}\left( X \right), \frac{1}{L\left( X, \theta \right)}D_{\theta}L\left( X, \theta \right) \right) = D_{\theta}\,g\left( \theta \right) = D_{\theta}\left( \theta'\theta \right) = 2\theta, \qquad \text{so that} \qquad C_{g1} = 2\theta',$$

and

$$\operatorname{Cum}\left( \hat{g}\left( X \right), \frac{1}{L\left( X, \theta \right)}\nu_2\left( D^2_{\theta}L\left( X, \theta \right) \right) \right) = 2\,D^+_{2,d}\operatorname{Vec}I_d, \qquad \text{so that} \qquad C_{g2} = 2\left( D^+_{2,d}\operatorname{Vec}I_d \right)'.$$

Finally we obtain

$$C_{gDf}\left[ \operatorname{Var}\left( \Upsilon_{Df} \right) \right]^{-1}C_{gDf}' = \frac{4\,\theta'\theta}{n} + \frac{2}{n^2}\left( \operatorname{Vec}I_d \right)'D^{+\prime}_{2,d}D_{2,d}'D_{2,d}D^+_{2,d}\operatorname{Vec}I_d = \frac{4\,\theta'\theta}{n} + \frac{2}{n^2}\left( \operatorname{Vec}I_d \right)'N_d\operatorname{Vec}I_d = \frac{4\,\theta'\theta}{n} + \frac{2d}{n^2},$$

which is the Bhattacharya bound and is the same as the variance of the statistic $\hat{g}(X)$ given by (5.17).

6. Appendix

6.1. Commutation matrices. Kronecker products have the advantage that we can commute the elements of the products using linear operators called commutation matrices (see Magnus and Neudecker, 1999, Ch. 3, Sec. 7 for details). We use these operators here in the case of vectors. Let $A$ be a matrix of order $m \times n$; the vector $\operatorname{Vec}(A')$ is a permutation of the vector $\operatorname{Vec}(A)$, and therefore there exists a permutation matrix $K_{m\cdot n}$ of order $mn \times mn$, called the commutation matrix, defined by the relation

$$K_{m\cdot n}\operatorname{Vec}\left( A \right) = \operatorname{Vec}\left( A' \right).$$

Now, if $a$ is $m \times 1$ and $b$ is $n \times 1$, then

$$K_{m\cdot n}\left( b\otimes a \right) = K_{m\cdot n}\operatorname{Vec}\left( ab' \right) = \operatorname{Vec}\left( ba' \right) = a\otimes b.$$

From now on we shall use the more convenient notation $K_{(2\leftrightarrow 1)}(n, m) = K_{m\cdot n}$, which emphasizes that we are changing the order in a Kronecker product $b\otimes a$ of vectors $b \in \mathbb{R}^n$ and $a \in \mathbb{R}^m$. Now consider a set of vectors $a_1, a_2, \ldots, a_n$ with dimensions $d_{1:n} = (d_1, d_2, \ldots, d_n)$ respectively. Define the matrix

$$K_{(j+1\leftrightarrow j)}\left( d_{1:n} \right) = \left( \bigotimes_{i=1:j-1}I_{d_i} \right)\otimes K_{d_j\cdot d_{j+1}}\otimes\left( \bigotimes_{i=j+2:n}I_{d_i} \right),$$

where $\bigotimes_{i=1:j-1}$ stands for the Kronecker product of the matrices indexed by $1:j-1 = 1, 2, \ldots, j-1$. Clearly

$$K_{(j+1\leftrightarrow j)}\left( d_{1:n} \right)\bigotimes_{i=1:n}a_i = \left( \bigotimes_{i=1:j-1}a_i \right)\otimes K_{d_j\cdot d_{j+1}}\left( a_j\otimes a_{j+1} \right)\otimes\left( \bigotimes_{i=j+2:n}a_i \right) = \left( \bigotimes_{i=1:j-1}a_i \right)\otimes a_{j+1}\otimes a_j\otimes\left( \bigotimes_{i=j+2:n}a_i \right).$$

Therefore one is able to transpose (interchange) the elements $a_j$ and $a_{j+1}$ in a Kronecker product of vectors with the help of the matrix $K_{(j\leftrightarrow j+1)}(d_{1:n})$. In general $K_{(j\leftrightarrow j+1)}(d_{1:n})^{-1} = K_{(j+1\leftrightarrow j)}(d_{1:n})$, but $K_{(j+1\leftrightarrow j)} \neq K_{(j\leftrightarrow j+1)}$ because the dimensions $d_{j+1}$ and $d_j$ are not necessarily equal. If they are equal, then

$$K_{(j+1\leftrightarrow j)} = K_{(j\leftrightarrow j+1)} = K_{(j\leftrightarrow j+1)}' = K_{(j\leftrightarrow j+1)}^{-1}.$$

Recall that $\mathcal{P}_n$ denotes the set of all permutations of the numbers $1:n = 1, 2, \ldots, n$; if $p \in \mathcal{P}_n$ then $p(1:n) = (p(1), p(2), \ldots, p(n))$. From this it follows that for each permutation $p(1:n)$, $p \in \mathcal{P}_n$, there exists a matrix $K_{p(1:n)}(d_{1:n})$ such that

$$K_{p(1:n)}\left( d_{1:n} \right)\bigotimes_{i=1:n}a_i = \bigotimes_{i=1:n}a_{p(i)}, \qquad (6.1)$$

simply because any permutation $p(1:n)$ can be obtained as a product of transpositions of neighbouring elements. Since the permutation $p(1:n)$ has an inverse, there also exists an inverse $K_{p(1:n)}(d_{1:n})^{-1}$ of $K_{p(1:n)}(d_{1:n})$.
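The general commutation matrix of (6.1) can also be built directly by permuting multi-indices rather than by multiplying adjacent transpositions; the two constructions agree because, as noted above, $K_{p(1:n)}(d_{1:n})$ is uniquely determined by $p(1:n)$ and $d_{1:n}$. The sketch below (editorial; the function name is hypothetical) constructs it this way and checks (6.1) on random vectors of unequal dimensions.

```python
import numpy as np
from itertools import product

def K_perm(p, dims):
    """Permutation matrix K with K (a_1 kron ... kron a_n) = a_{p(1)} kron ... kron a_{p(n)}, cf. (6.1).
    p is a tuple of 0-based indices; dims[i] is the dimension of a_i."""
    n, N = len(dims), int(np.prod(dims))
    out_dims = [dims[p[i]] for i in range(n)]
    K = np.zeros((N, N))
    for idx in product(*[range(m) for m in dims]):             # multi-index into a_1 kron ... kron a_n
        src = np.ravel_multi_index(idx, dims)
        dst = np.ravel_multi_index([idx[p[i]] for i in range(n)], out_dims)
        K[dst, src] = 1.0
    return K

rng = np.random.default_rng(10)
dims = (2, 3, 2, 4)
a = [rng.normal(size=m) for m in dims]
p = (1, 3, 0, 2)                                               # the permutation p(1:4)

lhs = K_perm(p, dims) @ np.kron(np.kron(np.kron(a[0], a[1]), a[2]), a[3])
rhs = np.kron(np.kron(np.kron(a[p[0]], a[p[1]]), a[p[2]]), a[p[3]])
print(np.allclose(lhs, rhs))   # True
```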

Note that the entries of $d_{1:n}$ are not necessarily equal; they are the dimensions of the vectors $a_i$, $i = 1, 2, \ldots, n$, which are fixed. The following example shows that $K_{p(1:n)}(d_{1:n})$ is uniquely defined by the permutation $p(1:n)$ and the set $d_{1:n}$. The permutation $p(2\leftrightarrow 4)$ is the product of the two interchanges $p(2\leftrightarrow 3)$ and $p(3\leftrightarrow 4)$, i.e.,

$$K_{p(2\leftrightarrow 4)}\left( d_{1:4} \right) = K_{p(3\leftrightarrow 4)}\left( d_1, d_3, d_2, d_4 \right)K_{p(2\leftrightarrow 3)}\left( d_{1:4} \right) = \left( I_{d_1}\otimes I_{d_3}\otimes K_{d_4\cdot d_2} \right)\left( I_{d_1}\otimes K_{d_3\cdot d_2}\otimes I_{d_4} \right).$$

This process can be followed for any permutation $p(1:n)$ and for any set $d_{1:n}$ of dimensions. In particular, transposing only two elements, $j$ and $k$, in the product will be denoted by $K_{p(j\leftrightarrow k)}(d_{1:n})$; it will not be confusing to use both notations $K_{(j\leftrightarrow k)}$ and $K_{p(j\leftrightarrow k)}$ for the same operator. It can be seen that

$$K_{(j\leftrightarrow k)}^{-1} = K_{(j\leftrightarrow k)}' = K_{(k\leftrightarrow j)}. \qquad (6.2)$$

Let $A$ be an $m \times n$ and $B$ a $p \times q$ matrix; it is well known that

$$K_{(2\leftrightarrow 1)}\left( m, p \right)\left( A\otimes B \right)K_{(2\leftrightarrow 1)}\left( q, n \right) = B\otimes A.$$

The same argument as in the case of Kronecker products of vectors leads to the technique of permuting matrices in a Kronecker product with the help of the commutation matrix $K_p$. Using the above notation we can write

$$\operatorname{Vec}\left( A\otimes B \right) = \left( I_n\otimes K_{(2\leftrightarrow 1)}\left( m, q \right)\otimes I_p \right)\left( \operatorname{Vec}A\otimes\operatorname{Vec}B \right) = K_{p(2\leftrightarrow 3)}\left( n, m, q, p \right)\left( \operatorname{Vec}A\otimes\operatorname{Vec}B \right). \qquad (6.3)$$

6.2. Taylor series in terms of differential operators. We have

$$\psi\left( \lambda \right) = \sum_{k_1, k_2, \ldots, k_d = 0}^{\infty}\frac{1}{k!}\,c_k\,\lambda^k,$$

which can be rewritten in the form

$$\psi\left( \lambda \right) = \sum_{m=0}^{\infty}\frac{1}{m!}\,c_{m,d}'\,\lambda^{\otimes m},$$

where $c_{m,d}$ is a column vector with appropriate entries taken from the vectors $\{c_k : \Sigma k_j = m\}$; the dimension of $c_{m,d}$ is the same as that of $\lambda^{\otimes m}$, i.e., $d^m$. To obtain the above expansion we proceed as follows. Let $x \in \mathbb{R}^d$ be a real vector and consider

$$\left( x'\lambda \right)^m = \left( \sum_{j=1}^{d}x_j\lambda_j \right)^m = \sum_{\substack{k_1, k_2, \ldots, k_d \geq 0 \\ \Sigma k_j = m}}\frac{m!}{k!}\,x^k\lambda^k;$$

we can also write

$$\left( x'\lambda \right)^m = \left( x^{\otimes m} \right)'\lambda^{\otimes m}.$$

Therefore

$$\sum_{\substack{k_1, k_2, \ldots, k_d \geq 0 \\ \Sigma k_j = m}}\frac{m!}{k!}\,x^k\lambda^k = \left( x^{\otimes m} \right)'\lambda^{\otimes m}.$$

The entries of the vector $c_{m,d}$ correspond to the operators $\partial^{\Sigma k}/\partial\lambda^k$ and have the same symmetry as $x^k$; therefore, if $x^{\otimes m}$ is invariant under some permutation of its factors, then $c_{m,d}$ is invariant as well. From Equation (2.3) we obtain that

$$c_{m,d} = \left. D^m_{\lambda}\psi\left( \lambda \right) \right|_{\lambda=0}.$$

References

[1] G. E. Andrews (1976). The Theory of Partitions. Encyclopedia of Mathematics and its Applications, Vol. 2. Addison-Wesley, Reading, Mass.-London-Amsterdam.
[2] O. E. Barndorff-Nielsen and D. R. Cox (1989). Asymptotic Techniques for Use in Statistics. Monographs on Statistics and Applied Probability. Chapman & Hall, London.
[3] A. Bhattacharya (1946). On some analogues of the amount of information and their uses in statistical estimation. Sankhya 8, 1-14.
[4] D. R. Brillinger (2001). Time Series: Data Analysis and Theory. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA. Reprint of the 1981 edition.
[5] D. R. Brillinger and M. Rosenblatt (1967). Asymptotic theory of k-th order spectra. In Spectral Analysis of Time Series (ed. B. Harris), 153-188. Wiley, New York.
[6] E. J. Hannan (1970). Multiple Time Series. John Wiley and Sons, New York-London-Sydney.
[7] V. P. Leonov and A. N. Shiryaev (1959). On a method of calculation of semi-invariants. Theor. Prob. Appl. 4, 319-329.
[8] Yu. V. Linnik (1970). A note on Rao-Cramer and Bhattacharya inequalities. Sankhya Ser. A 32, 449-452.
[9] E. Lukacs (1955). Applications of Faa di Bruno's formula in statistics. Amer. Math. Monthly 62, 340-348.
[10] K. V. Mardia (1970). Measures of multivariate skewness and kurtosis with applications. Biometrika 57, 519-530.
[11] P. McCullagh (1987). Tensor Methods in Statistics. Monographs on Statistics and Applied Probability. Chapman & Hall, London.
[12] P. McCullagh and D. R. Cox (1986). Invariants and likelihood ratio statistics. Ann. Statist. 14, no. 4, 1419-1430.
[13] J. R. Magnus and H. Neudecker (1999). Matrix Differential Calculus with Applications in Statistics and Econometrics. John Wiley & Sons, Chichester. Revised reprint of the 1988 original.
[14] Ib Skovgaard (1986). A note on the differentiation of cumulants of log likelihood derivatives. Internat. Statist. Rev. 54, 29-32.
[15] T. P. Speed (1990). Invariant moments and cumulants. In Coding Theory and Design Theory, Part II, IMA Vol. Math. Appl. 21, 319-335. Springer-Verlag, New York.
[16] T. Subba Rao and W. K. Wong (1999). Some contributions to multivariate nonlinear time series and to bilinear models. In Asymptotics, Nonparametrics and Time Series (ed. S. Ghosh), 1-42. Marcel Dekker, New York.
[17] M. Taniguchi (1991). Higher Order Asymptotic Theory for Time Series Analysis. Lecture Notes in Statistics, Vol. 68. Springer-Verlag, New York.
[18] Gy. Terdik (1999). Bilinear Stochastic Models and Related Problems of Nonlinear Time Series Analysis: A Frequency Domain Approach. Lecture Notes in Statistics, Vol. 142. Springer-Verlag, New York.
[19] Gy. Terdik (2002). Higher order statistics and multivariate vector Hermite polynomials for nonlinear analysis of multidimensional time series. Teor. Imovirnost. ta Matem. Statyst. 66, 147-168.
[20] W. K. Wong (1997). Frequency domain tests of multivariate Gaussianity and linearity. J. Time Ser. Anal. 18(2), 181-194.

Department of Statistics and Applied Probability, University of California, Santa Barbara, CA 93106-3110, USA
E-mail address: rao@pstat.ucsb.edu
URL: http://www.pstat.ucsb.edu/faculty/jammalam/

University of Manchester Institute of Science and Technology, P.O. Box 88, Manchester M60 1QD, UK
E-mail address: Tata.SubbaRao@umist.ac.uk

Department of Informatics, University of Debrecen, 4010 Debrecen, Pf. 12, Hungary
E-mail address: terdik@delfin.unideb.hu
URL: http://dragon.unideb.hu/~terdik/