Effects of Ignoring Correlations When Computing Sample Chi-Square
John W. Fowler
February 26, 2012

It can happen that chi-square must be computed for a sample whose elements are correlated to an unknown extent. We denote the sample elements as $x_i$, $i = 1$ to $N$. In general, when there are correlations, the form of chi-square is

$$\chi^2 = \sum_{i=1}^{N} \sum_{j=1}^{N} w_{ij}\,(x_i - \bar{x}_i)(x_j - \bar{x}_j) \qquad (1)$$

where the $w_{ij}$ are the elements of the inverse of the covariance matrix $\Omega$,

$$\Omega = \begin{pmatrix}
\sigma_1^2 & \rho_{12}\sigma_1\sigma_2 & \cdots & \rho_{1N}\sigma_1\sigma_N \\
\rho_{12}\sigma_1\sigma_2 & \sigma_2^2 & \cdots & \rho_{2N}\sigma_2\sigma_N \\
\vdots & \vdots & \ddots & \vdots \\
\rho_{1N}\sigma_1\sigma_N & \rho_{2N}\sigma_2\sigma_N & \cdots & \sigma_N^2
\end{pmatrix} \qquad (2)$$

If the covariance matrix is not known numerically, the only formula available for computing chi-square is that for the case of zero off-diagonal correlation. Since this uncorrelated form is erroneous when non-zero off-diagonal correlations are being ignored, we attach a subscript $e$:

$$\chi_e^2 = \sum_{i=1}^{N} \frac{(x_i - \bar{x}_i)^2}{\sigma_i^2} \qquad (3)$$

As shown below, both forms (Equations 1 and 3) have the expectation value $N$, but the erroneous form has a larger variance. The mean and variance of the correct form are well known to be $N$ and $2N$, respectively, whereas the variance of the erroneous form is

$$\operatorname{var}\chi_e^2 = 2N + 4\sum_{i=1}^{N-1}\sum_{j=i+1}^{N} \rho_{ij}^2 \qquad (4)$$

This is found by integrating the moments of the erroneous chi-square over the joint density function for the correlated Gaussian random variables involved. In general, for $N$ degrees of freedom, the joint density is

$$p(x) = \frac{e^{-\chi^2/2}}{(2\pi)^{N/2}\,|\Omega|^{1/2}} \qquad (5)$$
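As a concrete illustration (my addition, not part of the original note), the following Monte Carlo sketch compares the two forms of chi-square; NumPy, the common off-diagonal correlation $r$, and the sample count are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup: N = 5 variables with a common off-diagonal correlation r.
N, r = 5, 0.4
C = np.full((N, N), r) + (1.0 - r) * np.eye(N)    # correlation matrix
sigma = np.ones(N)                                 # unit variances for simplicity
Omega = C * np.outer(sigma, sigma)                 # covariance matrix, Eq. (2)
W = np.linalg.inv(Omega)                           # weights w_ij in Eq. (1)

# Draw zero-mean samples with covariance Omega (so the means xbar_i are 0 here).
x = rng.multivariate_normal(np.zeros(N), Omega, size=1_000_000)

chi2 = np.einsum('ki,ij,kj->k', x, W, x)           # correct form, Eq. (1)
chi2_e = np.sum((x / sigma) ** 2, axis=1)          # erroneous form, Eq. (3)

print(chi2.mean(), chi2.var())                     # ~ N and ~ 2N
print(chi2_e.mean(), chi2_e.var())                 # ~ N and ~ Eq. (4)
print(2 * N + 4 * np.sum(np.triu(C, 1) ** 2))      # Eq. (4) prediction
```

The printed variances should show the correct form near $2N$ and the erroneous form near the larger value predicted by Equation 4.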

where $\chi^2$ and $\Omega$ are used formally as defined in Equations 1 and 2, and $|\Omega|$ is the determinant of the covariance matrix.

The variance of the erroneous chi-square in Equation 4 can be written more simply by using the fact that the diagonal correlations are always unity and are picked up once in the summation below, whereas the others are picked up twice:

$$\operatorname{var}\chi_e^2 = 2\sum_{i=1}^{N}\sum_{j=1}^{N} \rho_{ij}^2 \qquad (6)$$

Note that this is just twice the square of the Frobenius norm of the correlation matrix. If there are no off-diagonal correlations, Equation 4 shows (and, less obviously, Equation 6) that the variance will reduce to $2N$, and the chi-square will not be erroneous. Otherwise, any non-zero off-diagonal correlation can only increase the variance above $2N$.

The fact that the expectation value of the erroneous chi-square is $N$, the same as for the correct chi-square, can be seen as obvious from the fact that the former is simply the sum of $N$ terms, each of which has the expectation value 1, since each term has the expectation value of the numerator in the denominator. The variance of the erroneous chi-square is not as obvious, since the expectation value of the square depends on the underlying density function. For notational brevity, define

$$\chi_e^2 = \sum_{i=1}^{N} z_i^2 \qquad (7)$$

where the $z_i$ are generally correlated zero-mean unit-variance Gaussian random variables. Then obviously

$$\langle \chi_e^2 \rangle = \sum_{i=1}^{N} \langle z_i^2 \rangle = N \qquad (8)$$

The variance is

$$\sigma_{\chi_e^2}^2 = \langle \chi_e^4 \rangle - \langle \chi_e^2 \rangle^2 \qquad (9)$$

where

$$\langle \chi_e^4 \rangle = \Big\langle \sum_{i=1}^{N} z_i^2 \sum_{j=1}^{N} z_j^2 \Big\rangle = \sum_{i=1}^{N}\sum_{j=1}^{N} \langle z_i^2 z_j^2 \rangle \qquad (10)$$

For any combination of $i$ and $j$, the form of the expectation value $\langle z_i^2 z_j^2 \rangle$ depends only on the properties of $z_i$ and $z_j$, namely their correlation $\rho_{ij}$ and the fact that both are zero-mean unit-variance Gaussian random variables, so we can compute the expectation value of this product using a bivariate joint Gaussian density function, i.e., Equation 5 with all means zero, all variances unity, and all the random variables integrated out except the $i$th and $j$th:

$$\langle z_i^2 z_j^2 \rangle = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} z_i^2\, z_j^2\; p(z_i, z_j)\, dz_i\, dz_j = 1 + 2\rho_{ij}^2 \qquad (11)$$
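Equation 11 also agrees with Isserlis' theorem for zero-mean Gaussians, $\langle z_i^2 z_j^2\rangle = \langle z_i^2\rangle\langle z_j^2\rangle + 2\langle z_i z_j\rangle^2 = 1 + 2\rho_{ij}^2$. A quick Monte Carlo sketch of the same identity (my addition; assumes NumPy, with illustrative correlation values):

```python
import numpy as np

rng = np.random.default_rng(1)

# Verify <z_i^2 z_j^2> = 1 + 2*rho^2 for a few correlations.
for rho in (0.0, 0.3, 0.8):
    cov = [[1.0, rho], [rho, 1.0]]
    z = rng.multivariate_normal([0.0, 0.0], cov, size=2_000_000)
    mc = np.mean(z[:, 0] ** 2 * z[:, 1] ** 2)
    print(f"rho = {rho}: Monte Carlo = {mc:.4f}, Eq. (11) = {1 + 2 * rho**2:.4f}")
```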

Note that for $i = j$ this reduces to $\langle z_i^4 \rangle = 3$, the well-known value for the 4th moment of a zero-mean unit-variance Gaussian random variable. Using Equation 11 in Equation 10,

$$\langle \chi_e^4 \rangle = \sum_{i=1}^{N}\sum_{j=1}^{N} \big(1 + 2\rho_{ij}^2\big) = N^2 + 2\sum_{i=1}^{N}\sum_{j=1}^{N} \rho_{ij}^2 \qquad (12)$$

and Equation 9 becomes

$$\sigma_{\chi_e^2}^2 = \langle \chi_e^4 \rangle - \langle \chi_e^2 \rangle^2 = 2\sum_{i=1}^{N}\sum_{j=1}^{N} \rho_{ij}^2 \qquad (13)$$

which is just Equation 6, hence equivalent to Equation 4.

Similar computations can be used to show that the third raw moment is (all sums from here on run from 1 to $N$)

$$\langle \chi_e^6 \rangle = \sum_{i,j,k} \langle z_i^2 z_j^2 z_k^2 \rangle = \sum_{i,j,k} \Big[1 + 2\big(\rho_{ij}^2 + \rho_{ik}^2 + \rho_{jk}^2\big) + 8\,\rho_{ij}\rho_{ik}\rho_{jk}\Big] = N^3 + 6N\sum_{i,j}\rho_{ij}^2 + 8\sum_{i,j,k}\rho_{ij}\rho_{ik}\rho_{jk} \qquad (14)$$

Denoting the $n$th raw moment $m_n$ and the $n$th central moment $\mu_n$, the third central moment is

$$\mu_3 = m_3 - 3 m_2 m_1 + 2 m_1^3 \qquad (15)$$

Substituting Equations 8, 12, and 14 and canceling, the third central moment of the erroneous chi-square is

$$\mu_3 = 8\sum_{i,j,k} \rho_{ij}\rho_{ik}\rho_{jk} \qquad (16)$$
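Equation 16 is easy to check numerically (my sketch, not the note's): for a symmetric correlation matrix the triple sum $\sum_{i,j,k}\rho_{ij}\rho_{ik}\rho_{jk}$ equals $\operatorname{tr}(\rho^3)$, which the code below exploits; the matrix is an arbitrary valid example.

```python
import numpy as np

rng = np.random.default_rng(2)

# Arbitrary valid correlation matrix for N = 4.
N = 4
rho = np.full((N, N), 0.3) + 0.7 * np.eye(N)

z = rng.multivariate_normal(np.zeros(N), rho, size=4_000_000)
chi2_e = np.sum(z ** 2, axis=1)

mu3_mc = np.mean((chi2_e - chi2_e.mean()) ** 3)          # sample third central moment
mu3_16 = 8 * np.trace(np.linalg.matrix_power(rho, 3))    # Eq. (16) as 8 tr(rho^3)
print(mu3_mc, mu3_16)
```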

The skewness is Equation 16 divided by the right-hand side of either Equation 4 or 6 raised to the 3/2 power:

$$\operatorname{skew}_e = \frac{8\sum_{i,j,k} \rho_{ij}\rho_{ik}\rho_{jk}}{\Big(2\sum_{i,j}\rho_{ij}^2\Big)^{3/2}} \qquad (17)$$

If all the off-diagonal correlations are set to zero, then

$$\operatorname{skew}_e = \frac{8N}{(2N)^{3/2}} = \sqrt{\frac{8}{N}} = \operatorname{skew}(\chi^2) \qquad (18)$$

Continuing to the fourth raw moment, we integrate $z_i^2 z_j^2 z_k^2 z_m^2$ over the joint density function to obtain

$$\begin{aligned}
\langle z_i^2 z_j^2 z_k^2 z_m^2 \rangle = 1 &+ 2\big(\rho_{ij}^2 + \rho_{ik}^2 + \rho_{im}^2 + \rho_{jk}^2 + \rho_{jm}^2 + \rho_{km}^2\big) \\
&+ 4\big(\rho_{ij}^2\rho_{km}^2 + \rho_{ik}^2\rho_{jm}^2 + \rho_{im}^2\rho_{jk}^2\big) \\
&+ 8\big(\rho_{ij}\rho_{ik}\rho_{jk} + \rho_{ij}\rho_{im}\rho_{jm} + \rho_{ik}\rho_{im}\rho_{km} + \rho_{jk}\rho_{jm}\rho_{km}\big) \\
&+ 16\big(\rho_{ij}\rho_{ik}\rho_{jm}\rho_{km} + \rho_{ij}\rho_{im}\rho_{jk}\rho_{km} + \rho_{ik}\rho_{im}\rho_{jk}\rho_{jm}\big)
\end{aligned}$$

which is then summed over all four indexes to give us

$$m_4 = \sum_{i,j,k,m} \langle z_i^2 z_j^2 z_k^2 z_m^2 \rangle = N^4 + 12N^2\sum_{i,j}\rho_{ij}^2 + 12\Big(\sum_{i,j}\rho_{ij}^2\Big)^2 + 32N\sum_{i,j,k}\rho_{ij}\rho_{ik}\rho_{jk} + 48\sum_{i,j,k,m}\rho_{ij}\rho_{jk}\rho_{km}\rho_{mi} \qquad (19)$$
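Equation 19 can be spot-checked the same way (again my sketch, not from the note): for symmetric $\rho$, $\sum_{i,j}\rho_{ij}^2 = \operatorname{tr}(\rho^2)$, $\sum_{i,j,k}\rho_{ij}\rho_{ik}\rho_{jk} = \operatorname{tr}(\rho^3)$, and $\sum_{i,j,k,m}\rho_{ij}\rho_{jk}\rho_{km}\rho_{mi} = \operatorname{tr}(\rho^4)$.

```python
import numpy as np

rng = np.random.default_rng(3)

N = 4
rho = np.full((N, N), 0.25) + 0.75 * np.eye(N)    # illustrative correlation matrix
t2, t3, t4 = (np.trace(np.linalg.matrix_power(rho, k)) for k in (2, 3, 4))

z = rng.multivariate_normal(np.zeros(N), rho, size=4_000_000)
m4_mc = np.mean(np.sum(z ** 2, axis=1) ** 4)      # sample fourth raw moment

m4_19 = N**4 + 12 * N**2 * t2 + 12 * t2**2 + 32 * N * t3 + 48 * t4   # Eq. (19)
print(m4_mc, m4_19)
```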

The fourth central moment is

$$\mu_4 = m_4 - 4 m_3 m_1 + 6 m_2 m_1^2 - 3 m_1^4 \qquad (20)$$

The fourth central moment of the erroneous chi-square is therefore

$$\mu_4 = m_4 - 4N\Big(N^3 + 6N\sum_{i,j}\rho_{ij}^2 + 8\sum_{i,j,k}\rho_{ij}\rho_{ik}\rho_{jk}\Big) + 6N^2\Big(N^2 + 2\sum_{i,j}\rho_{ij}^2\Big) - 3N^4 \qquad (21)$$

Expanding and canceling,

$$\mu_4 = 12\Big(\sum_{i,j}\rho_{ij}^2\Big)^2 + 48\sum_{i,j,k,m}\rho_{ij}\rho_{jk}\rho_{km}\rho_{mi} \qquad (22)$$

So the kurtosis is

$$\operatorname{kurt}_e = \frac{\mu_4}{\mu_2^2} = \frac{12\Big(\sum_{i,j}\rho_{ij}^2\Big)^2 + 48\sum_{i,j,k,m}\rho_{ij}\rho_{jk}\rho_{km}\rho_{mi}}{\Big(2\sum_{i,j}\rho_{ij}^2\Big)^2} = 3 + \frac{12\sum_{i,j,k,m}\rho_{ij}\rho_{jk}\rho_{km}\rho_{mi}}{\Big(\sum_{i,j}\rho_{ij}^2\Big)^2} \qquad (23)$$

If all the off-diagonal correlations are set to zero, then $\mu_4 = 12N^2 + 48N$ and

$$\operatorname{kurt}_e = \frac{12N^2 + 48N}{(2N)^2} = 3 + \frac{12}{N} = \operatorname{kurt}(\chi^2) \qquad (24)$$
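A final Monte Carlo check of Equations 23 and 24 (my sketch; the correlation matrix is again an arbitrary valid choice, using the same trace shorthand as above):

```python
import numpy as np

rng = np.random.default_rng(4)

N = 6
rho = np.full((N, N), 0.2) + 0.8 * np.eye(N)      # illustrative correlation matrix
t2 = np.trace(rho @ rho)                          # sum of rho_ij^2
t4 = np.trace(np.linalg.matrix_power(rho, 4))     # sum of rho_ij rho_jk rho_km rho_mi

z = rng.multivariate_normal(np.zeros(N), rho, size=4_000_000)
chi2_e = np.sum(z ** 2, axis=1)
d = chi2_e - chi2_e.mean()
kurt_mc = np.mean(d ** 4) / np.mean(d ** 2) ** 2  # sample kurtosis

print(kurt_mc, 3 + 12 * t4 / t2**2)               # vs. Eq. (23)
print(3 + 12 / N)                                 # Eq. (24), uncorrelated limit
```

With the off-diagonal entries set to zero, both printed values approach $3 + 12/N$, recovering the standard chi-square kurtosis.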