V. Nollau Institute of Mathematical Stochastics, Technical University of Dresden, Germany


PROBABILITY AND STATISTICS - Vol. III - Correlation Analysis - V. Nollau
Encyclopedia of Life Support Systems (EOLSS)

CORRELATION ANALYSIS

V. Nollau
Institute of Mathematical Stochastics, Technical University of Dresden, Germany

Keywords: Random vector, multivariate normal distribution, simple correlation, partial correlation, multiple correlation, canonical correlation.

Contents

1. Correlation Between Two Random Variables (Simple Correlation)
2. Partial Correlation
3. Multiple Correlation
4. Canonical Correlation
Acknowledgements
Glossary
Bibliography
Biographical Sketch

Summary

Correlation analysis is one of the most important aspects of multivariate statistical theory. Based on the different definitions of correlation coefficients (ordinary, partial, multiple and canonical), which (generally) measure the linear association between random variables or groups of random variables, a statistical analysis makes it possible to explore the joint behavior of the variables and to determine the effect of each of these variables in the presence of the others.

1. Correlation Between Two Random Variables (Simple Correlation)

Let $\mathbf{X} = (X_1, X_2)^T$ be a 2-dimensional random vector with the expectation

$$E\mathbf{X} = \boldsymbol{\mu} = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix} \quad (\text{that means } E X_i = \mu_i,\ i = 1, 2)$$

and the covariance matrix

$$\Gamma_{\mathbf{X}} = \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{pmatrix}.$$

Then the (simple or ordinary) correlation coefficient of $X_1$ and $X_2$ is defined by

$$\rho = \rho_{X_1, X_2} = \frac{\operatorname{cov}(X_1, X_2)}{\sqrt{\operatorname{var}(X_1)\,\operatorname{var}(X_2)}}. \tag{1}$$

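Here

$$\operatorname{cov}(X_1, X_2) = E\big[(X_1 - E X_1)(X_2 - E X_2)\big] \tag{2}$$

and

$$\operatorname{var}(X_i) = E (X_i - E X_i)^2 > 0, \quad i = 1, 2. \tag{3}$$

The correlation coefficient $\rho$ is a quantitative measure for the (linear) association, called correlation, between the random variables $X_1$ and $X_2$. It has the following properties:

(1) $-1 \le \rho \le 1$. The case $\rho > 0$ ($\rho < 0$) is called positive (negative) correlation, and $|\rho| = 1$ is called maximal correlation.

(2) $|\rho| = 1$ (maximal correlation) holds if and only if there exist real constants $a_1$, $a_2$ (not both zero) and $b$ such that $a_1 X_1 + a_2 X_2 + b = 0$ with probability one.

(3) If one relabels the random variables by $Y_1 = a X_1 + b$ ($a > 0$, $b$ real) and $Y_2 = c X_2 + d$ ($c > 0$, $d$ real), then the correlation coefficient between $Y_1$ and $Y_2$ is the same as the correlation coefficient between $X_1$ and $X_2$: $\rho_{Y_1, Y_2} = \rho_{X_1, X_2}$. (This property especially shows that the correlation coefficient is a quantitative measure for the linear association between two random variables.)

For a random $d$-dimensional vector $\mathbf{X} = (X_1, \dots, X_d)^T$ the covariance matrix is written as

$$\Gamma_{\mathbf{X}} = \Sigma = (\sigma_{jk})_{j = 1, \dots, d;\ k = 1, \dots, d} \tag{4}$$

with

$$\operatorname{var}(X_j) = \sigma_{jj} > 0 \quad (j = 1, \dots, d), \qquad \operatorname{cov}(X_j, X_k) = \sigma_{jk} \quad (j, k = 1, \dots, d). \tag{5}$$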
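The definitions (1) to (3) and the properties listed above can be checked on a small example. The following is a minimal sketch, assuming a finite joint distribution given as a dictionary of probabilities; the function names `correlation` and `relabel` and all numbers are hypothetical and chosen only for illustration.

```python
import math


def correlation(pmf):
    """Correlation coefficient, Eq. (1), of a finite joint distribution.

    pmf maps pairs (x1, x2) to probabilities that sum to one.
    """
    m1 = sum(p * x1 for (x1, x2), p in pmf.items())
    m2 = sum(p * x2 for (x1, x2), p in pmf.items())
    cov = sum(p * (x1 - m1) * (x2 - m2) for (x1, x2), p in pmf.items())  # Eq. (2)
    var1 = sum(p * (x1 - m1) ** 2 for (x1, x2), p in pmf.items())        # Eq. (3)
    var2 = sum(p * (x2 - m2) ** 2 for (x1, x2), p in pmf.items())
    return cov / math.sqrt(var1 * var2)


def relabel(pmf, a, b, c, d):
    """Joint distribution of (a*X1 + b, c*X2 + d); assumes a > 0 and c > 0."""
    return {(a * x1 + b, c * x2 + d): p for (x1, x2), p in pmf.items()}


# Hypothetical joint distribution of (X1, X2) on four points.
pmf = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
rho = correlation(pmf)

# Property (3): positive affine relabelling leaves rho unchanged.
rho_relabelled = correlation(relabel(pmf, a=2.0, b=-1.0, c=0.5, d=3.0))

# Property (2): an exact linear relation X2 = 3*X1 + 1 gives maximal correlation.
linear = {(x, 3 * x + 1): 0.25 for x in range(4)}
rho_linear = correlation(linear)
```

In this sketch `rho` and `rho_relabelled` agree up to rounding, and `rho_linear` equals one up to rounding, in line with properties (2) and (3).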

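For a $d$-dimensional random vector $\mathbf{X}$ with covariance matrix $\Sigma = (\sigma_{jk})$,

$$\rho_{X_j, X_k} = \frac{\operatorname{cov}(X_j, X_k)}{\sqrt{\operatorname{var}(X_j)\,\operatorname{var}(X_k)}} = \frac{\sigma_{jk}}{\sqrt{\sigma_{jj}\,\sigma_{kk}}} \tag{6}$$

is the correlation coefficient between two components of $\mathbf{X}$, say $X_j$ and $X_k$.

Given a (mathematical) sample $\mathbf{X}_1, \dots, \mathbf{X}_n$ with $\mathbf{X}_i = (X_{1i}, X_{2i})^T$, $i = 1, \dots, n$ (independent observations of $\mathbf{X}$), the correlation coefficient $\rho = \rho_{X_1, X_2}$ is estimated by the (ordinary) sample correlation coefficient

$$\hat\rho = \frac{\sum_{i=1}^{n} (X_{1i} - \bar X_1)(X_{2i} - \bar X_2)}{\sqrt{\sum_{i=1}^{n} (X_{1i} - \bar X_1)^2 \, \sum_{i=1}^{n} (X_{2i} - \bar X_2)^2}}$$

with

$$\bar X_1 = \frac{1}{n} \sum_{i=1}^{n} X_{1i} \quad \text{and} \quad \bar X_2 = \frac{1}{n} \sum_{i=1}^{n} X_{2i}.$$

If $\mathbf{X} = (X_1, X_2)^T$ is normally distributed with the expectation $\boldsymbol{\mu}$ and the covariance matrix

$$\Gamma_{\mathbf{X}} = \Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{pmatrix}, \tag{7}$$

then $\mathbf{X}$ has the density

$$f_{\mathbf{X}}(\mathbf{x}) = \frac{1}{2\pi} (\det \Sigma)^{-1/2} \exp\!\Big( -\tfrac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu}) \Big), \quad \mathbf{x} = (x_1, x_2)^T,\ -\infty < x_1, x_2 < \infty. \tag{8}$$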
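The sample correlation coefficient defined above can be computed directly from its formula. The following is a minimal sketch using only the Python standard library; the function name `sample_correlation` and the data are hypothetical.

```python
import math


def sample_correlation(x1, x2):
    """Ordinary sample correlation coefficient of paired observations."""
    n = len(x1)
    if n != len(x2) or n < 2:
        raise ValueError("need two samples of equal length n >= 2")
    m1 = sum(x1) / n  # sample mean of the first component
    m2 = sum(x2) / n  # sample mean of the second component
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    return s12 / math.sqrt(s11 * s22)


# Hypothetical paired observations (X_1i, X_2i), i = 1, ..., 5.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
r = sample_correlation(x1, x2)
```

For these data the centered sums are 8 for the cross products and 10 for each sum of squares, so `r` is 0.8 up to rounding. The factors $1/n$ cancel between numerator and denominator, which is why the sketch works with plain sums.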

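Written out in components, the density of the two-dimensional normal distribution has the following form:

$$f_{\mathbf{X}}(x_1, x_2) = \frac{1}{2\pi \sqrt{\sigma_{11}\sigma_{22} - \sigma_{12}^2}} \exp\!\left( - \frac{\sigma_{22}(x_1 - \mu_1)^2 - 2\sigma_{12}(x_1 - \mu_1)(x_2 - \mu_2) + \sigma_{11}(x_2 - \mu_2)^2}{2(\sigma_{11}\sigma_{22} - \sigma_{12}^2)} \right) \tag{9}$$

or

$$f_{\mathbf{X}}(x_1, x_2) = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1 - \rho^2}} \exp\!\left( - \frac{1}{2(1 - \rho^2)} \left[ \frac{(x_1 - \mu_1)^2}{\sigma_1^2} - 2\rho \frac{(x_1 - \mu_1)(x_2 - \mu_2)}{\sigma_1 \sigma_2} + \frac{(x_2 - \mu_2)^2}{\sigma_2^2} \right] \right) \tag{10}$$

with $\mu_i = E(X_i)$,

$$\sigma_1 = \sqrt{\sigma_{11}} = \sqrt{\operatorname{var}(X_1)} > 0, \qquad \sigma_2 = \sqrt{\sigma_{22}} = \sqrt{\operatorname{var}(X_2)} > 0, \tag{11}$$

$$\rho = \frac{\sigma_{12}}{\sigma_1 \sigma_2} = \frac{\operatorname{cov}(X_1, X_2)}{\sqrt{\operatorname{var}(X_1)\,\operatorname{var}(X_2)}}. \tag{12}$$

In this case

$$\hat\mu_1 = \bar X_1 = \frac{1}{n} \sum_{i=1}^{n} X_{1i}, \tag{13}$$

$$\hat\mu_2 = \bar X_2 = \frac{1}{n} \sum_{i=1}^{n} X_{2i}, \tag{14}$$

$$\hat\sigma_1^2 = \frac{1}{n} \sum_{i=1}^{n} (X_{1i} - \bar X_1)^2, \tag{15}$$

$$\hat\sigma_2^2 = \frac{1}{n} \sum_{i=1}^{n} (X_{2i} - \bar X_2)^2 \tag{16}$$

and

$$\hat\sigma_{12} = \frac{1}{n} \sum_{i=1}^{n} (X_{1i} - \bar X_1)(X_{2i} - \bar X_2) \tag{17}$$

are the so-called maximum likelihood estimators of $\mu_1$, $\mu_2$, $\sigma_1^2$, $\sigma_2^2$ and $\sigma_{12}$, respectively (compare Statistical Inference).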
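The estimators (13) to (17) translate into a few lines of code. The following is a minimal sketch, assuming the divisor $n$ of the maximum likelihood estimators (not $n - 1$); the function name `ml_estimates` and the data are hypothetical.

```python
import math


def ml_estimates(x1, x2):
    """Estimates (13)-(17) for paired observations, all with divisor n."""
    n = len(x1)
    if n != len(x2) or n < 2:
        raise ValueError("need two samples of equal length n >= 2")
    mu1 = sum(x1) / n                                             # Eq. (13)
    mu2 = sum(x2) / n                                             # Eq. (14)
    s11 = sum((a - mu1) ** 2 for a in x1) / n                     # Eq. (15)
    s22 = sum((b - mu2) ** 2 for b in x2) / n                     # Eq. (16)
    s12 = sum((a - mu1) * (b - mu2) for a, b in zip(x1, x2)) / n  # Eq. (17)
    return mu1, mu2, s11, s22, s12


# Hypothetical paired observations.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
mu1, mu2, s11, s22, s12 = ml_estimates(x1, x2)

# Plugging the estimates into Eq. (12) reproduces the sample correlation coefficient.
rho_hat = s12 / math.sqrt(s11 * s22)
```

Because the common factor $1/n$ cancels, `rho_hat` coincides with the ordinary sample correlation coefficient of the same data.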

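That the estimators $\hat\mu_1, \hat\mu_2, \hat\sigma_1^2, \hat\sigma_2^2, \hat\sigma_{12}$ are maximum likelihood estimators means

$$L(\mathbf{X}_1, \dots, \mathbf{X}_n; \hat\mu_1, \hat\mu_2, \hat\sigma_1^2, \hat\sigma_2^2, \hat\sigma_{12}) = \max_{(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \sigma_{12})} L(\mathbf{X}_1, \dots, \mathbf{X}_n; \mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \sigma_{12}) \tag{18}$$

with the likelihood function $L$:

$$L(\mathbf{X}_1, \dots, \mathbf{X}_n; \mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \sigma_{12}) = \prod_{i=1}^{n} f_{\mathbf{X}}(X_{1i}, X_{2i})$$

$$= (2\pi)^{-n} (\sigma_1^2 \sigma_2^2 - \sigma_{12}^2)^{-n/2} \exp\!\left( - \frac{1}{2(\sigma_1^2 \sigma_2^2 - \sigma_{12}^2)} \sum_{i=1}^{n} \Big[ \sigma_2^2 (X_{1i} - \mu_1)^2 - 2\sigma_{12} (X_{1i} - \mu_1)(X_{2i} - \mu_2) + \sigma_1^2 (X_{2i} - \mu_2)^2 \Big] \right). \tag{19}$$

Here $L = L(\mathbf{x}_1, \dots, \mathbf{x}_n; \mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \sigma_{12})$, regarded as a function of $(\mathbf{x}_1, \dots, \mathbf{x}_n)$, is the density function of the $2n$-dimensional random vector $(\mathbf{X}_1^T, \mathbf{X}_2^T, \dots, \mathbf{X}_n^T)^T$.

Furthermore, it holds that

$$E\hat{\boldsymbol{\mu}} = E \begin{pmatrix} \hat\mu_1 \\ \hat\mu_2 \end{pmatrix} = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix} = \boldsymbol{\mu}, \tag{20}$$

$$\Gamma_{\hat{\boldsymbol{\mu}}} = E (\hat{\boldsymbol{\mu}} - \boldsymbol{\mu})(\hat{\boldsymbol{\mu}} - \boldsymbol{\mu})^T = \begin{pmatrix} \operatorname{var}(\hat\mu_1) & \operatorname{cov}(\hat\mu_1, \hat\mu_2) \\ \operatorname{cov}(\hat\mu_1, \hat\mu_2) & \operatorname{var}(\hat\mu_2) \end{pmatrix}. \tag{21}$$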

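The covariance matrix of $\hat{\boldsymbol{\mu}}$ equals

$$\Gamma_{\hat{\boldsymbol{\mu}}} = \frac{1}{n} \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix},$$

and the sample covariance matrix

$$\hat\Gamma = \begin{pmatrix} \hat\sigma_1^2 & \hat\sigma_{12} \\ \hat\sigma_{12} & \hat\sigma_2^2 \end{pmatrix}$$

has the (probability) density

$$f_{\hat\Gamma}(s_{11}, s_{12}, s_{22}) = \begin{cases} \dfrac{n^{\,n-1} (s_{11} s_{22} - s_{12}^2)^{(n-4)/2}}{4\pi\, \Gamma(n-2)\, (\sigma_1^2 \sigma_2^2 - \sigma_{12}^2)^{(n-1)/2}} \exp\!\left( - \dfrac{n (\sigma_2^2 s_{11} - 2\sigma_{12} s_{12} + \sigma_1^2 s_{22})}{2(\sigma_1^2 \sigma_2^2 - \sigma_{12}^2)} \right) & \text{if } s_{11} > 0,\ s_{22} > 0 \text{ and } s_{12}^2 < s_{11} s_{22}, \\[2ex] 0 & \text{elsewhere} \end{cases} \tag{22}$$

with the Gamma function $\Gamma$:

$$\Gamma(p) = \int_0^\infty t^{\,p-1} e^{-t} \, dt \quad (p > 0).$$

This implies the (probability) density $f_{\hat\rho}$ of the sample correlation coefficient $\hat\rho$:

$$f_{\hat\rho}(r) = \begin{cases} \dfrac{n-2}{\pi} (1 - \rho^2)^{(n-1)/2} (1 - r^2)^{(n-4)/2} \displaystyle\int_0^1 \frac{x^{\,n-2}}{(1 - \rho r x)^{n-1} \sqrt{1 - x^2}} \, dx & \text{if } |r| < 1, \\[2ex] 0 & \text{elsewhere.} \end{cases} \tag{23}$$

In the case $\rho = 0$ the sample function (statistic)

$$T = \frac{\hat\rho \sqrt{n-2}}{\sqrt{1 - \hat\rho^2}} \tag{25}$$

is $t$-distributed with $n - 2$ degrees of freedom. Thus, to test the hypothesis $H_0: \rho = 0$ (versus the alternative $H_A: \rho \neq 0$) one uses the statistic (25).

The problem is somewhat more difficult if one wishes to test the hypothesis $H_0: \rho = \rho_0$, where $\rho_0$ ($-1 < \rho_0 < 1$) is specified, versus the alternative (hypothesis) $H_A: \rho \neq \rho_0$.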

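Here the hypothesis $H_0$ means that the correlation coefficient is assumed equal to a given value $\rho_0$. In this case R.A. Fisher (1921) (cf. Nollau (1979) and Srivastava and Carter (1983)) suggested a transformation (Fisher's Z-transformation, cf. Eq. (74)):

$$Z = \frac{1}{2} \ln \frac{1 + \hat\rho}{1 - \hat\rho} \tag{26}$$

with

$$E Z \approx \frac{1}{2} \ln \frac{1 + \rho}{1 - \rho} \quad \text{and} \quad \operatorname{var}(Z) \approx \frac{1}{n - 3}. \tag{27}$$

With

$$\zeta = \frac{1}{2} \ln \frac{1 + \rho}{1 - \rho} \quad (-1 < \rho < 1) \tag{28}$$

Fisher's Z-transformation asymptotically has a normal distribution $N\big(\zeta, \tfrac{1}{n-3}\big)$ as the sample size $n$ tends to infinity. Hence, under the hypothesis $H_0: \rho = \rho_0$ the test statistic

$$(Z - \zeta_0) \sqrt{n - 3} \tag{29}$$

with

$$Z = \frac{1}{2} \ln \frac{1 + \hat\rho}{1 - \hat\rho} \quad \text{and} \quad \zeta_0 = \frac{1}{2} \ln \frac{1 + \rho_0}{1 - \rho_0} \quad (\text{cf. Eq. (27)}) \tag{30}$$

is asymptotically standard normally distributed.

The asymptotic distribution of $Z$ also implies that an asymptotic confidence interval for $\rho$ is given by

$$P\!\left( \tanh\!\Big( Z - \frac{z_{1-\alpha/2}}{\sqrt{n-3}} \Big) < \rho < \tanh\!\Big( Z + \frac{z_{1-\alpha/2}}{\sqrt{n-3}} \Big) \right) \approx 1 - \alpha \tag{31}$$

for a given confidence level $1 - \alpha$ ($0 < \alpha < 1$), where $z_{1-\alpha/2}$ denotes the $(1 - \alpha/2)$-quantile of the standard normal distribution.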
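The one-sample procedures of this section, the statistic (25) for $H_0: \rho = 0$ and the Fisher test and confidence interval for a general $\rho_0$, can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names, the default level `alpha=0.05` and the example values are assumptions made for this sketch. The critical value of the $t$-distribution needed for (25) is not available in the standard library and has to be taken from a table or a statistics package.

```python
import math
from statistics import NormalDist


def t_statistic(r, n):
    """Statistic (25) for H0: rho = 0; t-distributed with n - 2 degrees of freedom under H0."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def fisher_z(r):
    """Fisher's Z-transformation (26); mathematically the same as atanh(r)."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))


def fisher_test(r, n, rho0, alpha=0.05):
    """Asymptotic two-sided test of H0: rho = rho0 using the statistic (29)."""
    stat = (fisher_z(r) - fisher_z(rho0)) * math.sqrt(n - 3)
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # quantile z_{1 - alpha/2}
    return stat, abs(stat) > critical


def fisher_confidence_interval(r, n, alpha=0.05):
    """Asymptotic confidence interval (31) for rho at level 1 - alpha."""
    half_width = NormalDist().inv_cdf(1.0 - alpha / 2.0) / math.sqrt(n - 3)
    z = fisher_z(r)
    return math.tanh(z - half_width), math.tanh(z + half_width)


# Hypothetical sample correlation coefficient and sample size.
r, n = 0.8, 28
t_value = t_statistic(r, n)
stat, reject = fisher_test(r, n, rho0=0.0)
lower, upper = fisher_confidence_interval(r, n)
```

The interval is computed on the $Z$ scale, where the variance is approximately $1/(n-3)$ regardless of $\rho$, and is mapped back with $\tanh$, the inverse of the transformation (26).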

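Moreover, an asymptotic test for comparing the correlation coefficients $\rho_1$ and $\rho_2$ of two normally distributed random vectors $\mathbf{X}$ and $\mathbf{Y}$ can also be constructed by Fisher's transformation. Let

$$\mathbf{X}_1, \mathbf{X}_2, \dots, \mathbf{X}_{n_1} \ \ (n_1 \ge 4) \ \text{ with } \ \mathbf{X}_i = \begin{pmatrix} X_{1i} \\ X_{2i} \end{pmatrix} \quad \text{and} \quad \mathbf{Y}_1, \mathbf{Y}_2, \dots, \mathbf{Y}_{n_2} \ \ (n_2 \ge 4) \ \text{ with } \ \mathbf{Y}_i = \begin{pmatrix} Y_{1i} \\ Y_{2i} \end{pmatrix} \tag{32}$$

be independent random samples from two two-dimensional normal populations $N(\boldsymbol{\mu}_1, \Sigma_1)$ and $N(\boldsymbol{\mu}_2, \Sigma_2)$, with the expectation vectors

$$E\mathbf{X}_i = \boldsymbol{\mu}_1 \ (i = 1, \dots, n_1), \qquad E\mathbf{Y}_i = \boldsymbol{\mu}_2 \ (i = 1, \dots, n_2),$$

the covariance matrices

$$\Gamma_{\mathbf{X}_i} = \Sigma_1 = \begin{pmatrix} \sigma_{X_1}^2 & \rho_1 \sigma_{X_1} \sigma_{X_2} \\ \rho_1 \sigma_{X_1} \sigma_{X_2} & \sigma_{X_2}^2 \end{pmatrix} \ (i = 1, \dots, n_1), \qquad \Gamma_{\mathbf{Y}_i} = \Sigma_2 = \begin{pmatrix} \sigma_{Y_1}^2 & \rho_2 \sigma_{Y_1} \sigma_{Y_2} \\ \rho_2 \sigma_{Y_1} \sigma_{Y_2} & \sigma_{Y_2}^2 \end{pmatrix} \ (i = 1, \dots, n_2)$$

and the correlation coefficients

$$\rho_1 = \rho_{X_{1i}, X_{2i}} \ (i = 1, \dots, n_1), \qquad \rho_2 = \rho_{Y_{1i}, Y_{2i}} \ (i = 1, \dots, n_2).$$

Under the hypothesis $H_0: \rho_1 = \rho_2$ (the correlation coefficients of both populations are equal) one uses the (test) statistic

$$T = \frac{Z_1 - Z_2}{\sqrt{\dfrac{1}{n_1 - 3} + \dfrac{1}{n_2 - 3}}}. \tag{33}$$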
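The statistic (33) is straightforward to evaluate once the two sample correlation coefficients are available. The following is a minimal sketch, assuming that $Z_1$ and $Z_2$ are the Fisher-transformed sample correlation coefficients of the two samples and that $H_0$ is rejected two-sidedly when $|t|$ exceeds the standard normal quantile $z_{1-\alpha/2}$; the function name `compare_correlations` and the example values are hypothetical.

```python
import math
from statistics import NormalDist


def compare_correlations(r1, n1, r2, n2, alpha=0.05):
    """Asymptotic two-sided test of H0: rho1 = rho2 based on the statistic (33).

    r1, r2 are sample correlation coefficients from independent samples
    of sizes n1 and n2 (both larger than 3).
    """
    if n1 <= 3 or n2 <= 3:
        raise ValueError("both sample sizes must exceed 3")
    z1 = math.atanh(r1)  # Fisher's Z-transformation of the first sample
    z2 = math.atanh(r2)  # Fisher's Z-transformation of the second sample
    t = (z1 - z2) / math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return t, abs(t) > critical


# Hypothetical sample correlations from two independent samples of size 103.
t_stat, reject = compare_correlations(0.8, 103, 0.2, 103)
```

With these hypothetical values the difference on the $Z$ scale is large relative to its standard deviation $\sqrt{0.02}$, so the sketch rejects $H_0$ at the 5% level.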

In the statistic $T$ of Eq. (33),

$$Z_1 = \frac{1}{2} \ln \frac{1 + \hat\rho_1}{1 - \hat\rho_1} \quad \text{and} \quad Z_2 = \frac{1}{2} \ln \frac{1 + \hat\rho_2}{1 - \hat\rho_2}, \tag{34}$$

with

$$\hat\rho_1 = \frac{\sum_{i=1}^{n_1} (X_{1i} - \bar X_1)(X_{2i} - \bar X_2)}{\sqrt{\sum_{i=1}^{n_1} (X_{1i} - \bar X_1)^2 \, \sum_{i=1}^{n_1} (X_{2i} - \bar X_2)^2}} \tag{35}$$

and

$$\hat\rho_2 = \frac{\sum_{i=1}^{n_2} (Y_{1i} - \bar Y_1)(Y_{2i} - \bar Y_2)}{\sqrt{\sum_{i=1}^{n_2} (Y_{1i} - \bar Y_1)^2 \, \sum_{i=1}^{n_2} (Y_{2i} - \bar Y_2)^2}},$$

where

$$\bar X_j = \frac{1}{n_1} \sum_{i=1}^{n_1} X_{ji} \quad \text{and} \quad \bar Y_j = \frac{1}{n_2} \sum_{i=1}^{n_2} Y_{ji} \quad (j = 1, 2).$$

Under $H_0$ the statistic $T$ is asymptotically standard normally distributed. Thus the hypothesis is to be rejected if, for a realization $t$ of $T$ based on concrete samples (cf. Eq. (32)),

$$|t| > z_{1-\alpha/2} \tag{36}$$

holds with respect to a given significance level $\alpha$ ($0 < \alpha < 1$).

Bibliography

Johnson N.L. and Kotz S. (1970), (1972). Distributions in Statistics (continuous univariate distributions-1, 2, continuous multivariate distributions). New York: John Wiley & Sons. [This has been a very important standard work for statistical research and applications for three decades].

Müller P.H. (ed.) (198?). Lexikon der Stochastik. 5. Auflage. Berlin: Akademie-Verlag. [This is a dictionary for all fields of stochastics with a comprehensive description of correlation analysis].

Muirhead R.J. (1982). Aspects of Multivariate Statistical Theory. New York: John Wiley & Sons.

[The book by Muirhead presents all aspects of modern multivariate statistics, especially correlation theory including canonical correlation].

Nollau V. (1979). Statistische Analyse. 2. Auflage. Basel und Stuttgart: Birkhäuser. [An important chapter of this book presents simple, partial and multiple correlation analysis].

Röhr M. (1987). Kanonische Korrelationsanalyse. Berlin: Akademie-Verlag. [This book presents a very comprehensive study of canonical correlation analysis with many applications].

Seber G.A.F. (1984). Multivariate Observations. New York: John Wiley & Sons. [This monograph deals with multivariate distributions, inference for the multivariate normal distribution, dimensional reductions and discriminant analysis, cluster analysis and MANOVA (multivariate analysis of variance and covariance)].

Srivastava M.S. and Carter E.M. (1983). An Introduction to Applied Multivariate Statistics. New York, Amsterdam, Oxford: North Holland. [This is a textbook with the main topics: multivariate techniques such as ANOVA, multivariate regression, discrimination and correlation].

Biographical Sketch

V. Nollau was born in the early 1940s and studied mathematics and theoretical physics at the Technical University of Dresden (Germany). He graduated in 1964 and obtained his doctorate in 1966 and his Dr. habil. in the early 1970s. From 1969 he was assistant professor at TU Dresden. His main research topics were operator theory, stochastic processes and random search. In the early 1970s he made his first contributions to stochastic optimization and the theory of decision processes. Since 1990 the author has been professor for stochastic analysis and control. He wrote several textbooks including "Statistische Analyse" (Linear Models in Statistics). The author is dean of the faculty of mathematics in Dresden.