The variance of a sum of independent variables is the sum of their variances, since covariances are zero. Therefore, $V(\bar{x}) = V\left(\frac{1}{n}\sum x_i\right) = \frac{1}{n^2}\sum V(x_i) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}$.

SAMPLE STATISTICS

A random sample $x_1, x_2, \ldots, x_n$ from a distribution $f(x)$ is a set of independently and identically distributed variables with $x_i \sim f(x)$ for all $i$. Their joint p.d.f. is

$$f(x_1, x_2, \ldots, x_n) = f(x_1) f(x_2) \cdots f(x_n) = \prod_{i=1}^{n} f(x_i).$$

The sample moments provide estimates of the moments of $f(x)$. We need to know how they are distributed.

The mean $\bar{x}$ of a random sample is an unbiased estimate of the population moment $\mu = E(x)$, since

$$E(\bar{x}) = E\left(\frac{\sum x_i}{n}\right) = \frac{1}{n}\sum E(x_i) = \frac{n\mu}{n} = \mu.$$

The variance of a sum of independent variables is the sum of their variances, since covariances are zero. Therefore

$$V(\bar{x}) = V\left(\frac{1}{n}\sum x_i\right) = \frac{1}{n^2}\sum V(x_i) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}.$$

Observe that $V(\bar{x}) \to 0$ as $n \to \infty$. Since $E(\bar{x}) = \mu$, the estimates become increasingly concentrated around the true population parameter. Such an estimate is said to be consistent.
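The result $V(\bar{x}) = \sigma^2/n$ can be checked by simulation. The following is a minimal sketch, not part of the original notes: the function name `variance_of_sample_mean`, the parameter values ($\mu = 5$, $\sigma = 2$), the number of replications and the seed are all illustrative assumptions.

```python
import random
import statistics


def variance_of_sample_mean(n, sigma=2.0, mu=5.0, replications=5000, seed=0):
    """Estimate V(x-bar) by drawing many normal samples of size n."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    means = [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(replications)
    ]
    # Variance of the simulated sample means across replications.
    return statistics.pvariance(means)


for n in (4, 16, 64):
    # The simulated value should be close to sigma^2 / n = 4 / n.
    print(n, round(variance_of_sample_mean(n), 4), "theory:", 2.0 ** 2 / n)
```

As $n$ grows, the simulated variance should shrink roughly in proportion to $1/n$, which is the sense in which $\bar{x}$ is consistent.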

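The sample variance is not an unbiased estimate of $\sigma^2 = V(x)$, since

$$\begin{aligned}
E(s^2) &= E\left\{\frac{1}{n}\sum (x_i - \bar{x})^2\right\} \\
&= E\left[\frac{1}{n}\sum \left\{(x_i - \mu) + (\mu - \bar{x})\right\}^2\right] \\
&= E\left[\frac{1}{n}\sum \left\{(x_i - \mu)^2 + 2(x_i - \mu)(\mu - \bar{x}) + (\mu - \bar{x})^2\right\}\right] \\
&= V(x) - 2E\{(\bar{x} - \mu)^2\} + E\{(\bar{x} - \mu)^2\} = V(x) - V(\bar{x}).
\end{aligned}$$

Here, we have used the result that

$$E\left\{\frac{1}{n}\sum (x_i - \mu)(\mu - \bar{x})\right\} = -E\{(\mu - \bar{x})^2\} = -V(\bar{x}).$$

It follows that

$$E(s^2) = V(x) - V(\bar{x}) = \sigma^2 - \frac{\sigma^2}{n} = \frac{\sigma^2 (n - 1)}{n}.$$

Therefore, $s^2$ is a biased estimator of the population variance. For an unbiased estimate, we should use

$$\hat{\sigma}^2 = \frac{n s^2}{n - 1} = \frac{\sum (x_i - \bar{x})^2}{n - 1}.$$

However, $s^2$ is still a consistent estimator, since $E(s^2) \to \sigma^2$ as $n \to \infty$ and also $V(s^2) \to 0$. The value of $V(s^2)$ depends on the distribution of the underlying population, which is often assumed to be normal.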
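The relation between the biased estimator $s^2$ (divisor $n$) and the unbiased estimator $\hat{\sigma}^2$ (divisor $n - 1$) can be seen on a small data set. This is only an illustrative sketch: the function name and the eight data values are made up for the example.

```python
import statistics


def biased_and_unbiased_variance(xs):
    """Return (s^2, sigma_hat^2): divisors n and n - 1 respectively."""
    n = len(xs)
    xbar = sum(xs) / n
    sum_sq = sum((x - xbar) ** 2 for x in xs)  # sum of squared deviations
    return sum_sq / n, sum_sq / (n - 1)


# Hypothetical data, chosen only so that the arithmetic is easy to follow.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
s2, sigma2_hat = biased_and_unbiased_variance(data)

# The standard library uses the same two conventions:
# pvariance divides by n, variance divides by n - 1.
print(s2, statistics.pvariance(data))
print(sigma2_hat, statistics.variance(data))
```

The two values always satisfy $\hat{\sigma}^2 = n s^2/(n - 1)$, so the difference between them vanishes as $n$ grows.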

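Theorem. Let $x_1, x_2, \ldots, x_n$ be a random sample from the normal population $N(\mu, \sigma^2)$. Then $y = \sum a_i x_i$ is normally distributed with

$$E(y) = \sum a_i E(x_i) = \mu \sum a_i \quad \text{and} \quad V(y) = \sum a_i^2 V(x_i) = \sigma^2 \sum a_i^2.$$

Any linear function of a set of normally distributed variables is normally distributed. If $x_i \sim N(\mu, \sigma^2)$; $i = 1, \ldots, n$ is a normal random sample, then $\bar{x} \sim N(\mu, \sigma^2/n)$.

Let $\mu = [\mu_1, \mu_2, \ldots, \mu_n]' = E(x)$ be the expected value of $x = [x_1, x_2, \ldots, x_n]'$ and let $\Sigma = [\sigma_{ij};\ i, j = 1, 2, \ldots, n]$ be the variance-covariance matrix. If $a = [a_1, a_2, \ldots, a_n]'$ is a constant vector, then $a'x \sim N(a'\mu, a'\Sigma a)$ is normally distributed with a mean of

$$E(a'x) = a'\mu = \sum_i a_i \mu_i$$

and a variance of

$$V(a'x) = a'\Sigma a = \sum_i \sum_j a_i a_j \sigma_{ij} = \sum_i a_i^2 \sigma_{ii} + \sum_i \sum_{j \neq i} a_i a_j \sigma_{ij}.$$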
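The two ways of writing the variance of a linear combination, as the full double sum $a'\Sigma a$ or as diagonal terms plus cross terms, can be compared numerically. This is a small sketch under stated assumptions: the function names, the vector `a` and the covariance matrix `cov` are hypothetical values chosen for illustration.

```python
def quadratic_form(a, cov):
    """Compute a' Sigma a as the double sum over all i and j."""
    n = len(a)
    return sum(a[i] * a[j] * cov[i][j] for i in range(n) for j in range(n))


def diagonal_plus_cross_terms(a, cov):
    """Compute the same variance as variance terms plus covariance terms."""
    n = len(a)
    diagonal = sum(a[i] ** 2 * cov[i][i] for i in range(n))
    cross = sum(
        a[i] * a[j] * cov[i][j] for i in range(n) for j in range(n) if j != i
    )
    return diagonal + cross


# Hypothetical weights and a symmetric, positive definite covariance matrix.
a = [1.0, -2.0, 0.5]
cov = [
    [4.0, 1.0, 0.0],
    [1.0, 9.0, -2.0],
    [0.0, -2.0, 1.0],
]
print(quadratic_form(a, cov), diagonal_plus_cross_terms(a, cov))

# Special case: equal weights 1/n and Sigma = sigma^2 * I give sigma^2 / n.
n, sigma2 = 4, 3.0
identity_cov = [[sigma2 if i == j else 0.0 for j in range(n)] for i in range(n)]
print(quadratic_form([1.0 / n] * n, identity_cov), sigma2 / n)
```

The special case at the end is the variance of the sample mean, written as a quadratic form.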

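Let $\iota = [1, 1, \ldots, 1]'$. Then, if $x = [x_1, x_2, \ldots, x_n]'$ has $x_i \sim N(\mu, \sigma^2)$ for all $i$, there is $x \sim N(\mu\iota, \sigma^2 I_n)$, where $\mu\iota = [\mu, \mu, \ldots, \mu]'$ and $I_n$ is an identity matrix of order $n$. Writing this explicitly, we have

$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \sim N\left( \begin{bmatrix} \mu \\ \mu \\ \vdots \\ \mu \end{bmatrix}, \begin{bmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{bmatrix} \right).$$

Then, there is

$$\bar{x} = (\iota'\iota)^{-1}\iota'x = \frac{1}{n}\iota'x \sim N(\mu, \sigma^2/n)$$

and

$$\frac{1}{n^2}\iota'\{\sigma^2 I_n\}\iota = \frac{\sigma^2}{n^2}\iota'\iota = \frac{\sigma^2}{n},$$

where we have used repeatedly the result that $\iota'\iota = n$.

If we do not know the form of the distribution from which the sample has been taken, we can still say that, under very general conditions, the distribution of $\bar{x}$ tends to normality as $n \to \infty$:

The Central Limit Theorem states that, if $x_1, x_2, \ldots, x_n$ is a random sample from a distribution with mean $\mu$ and variance $\sigma^2$, then the distribution of $\bar{x}$ tends to the normal distribution $N(\mu, \sigma^2/n)$ as $n \to \infty$. Equivalently, $(\bar{x} - \mu)/(\sigma/\sqrt{n})$ tends in distribution to the standard normal $N(0, 1)$ distribution.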
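The Central Limit Theorem can be illustrated with a clearly non-normal population. The sketch below is an assumption-laden illustration rather than anything from the original notes: it draws samples from the exponential distribution with rate 1 (mean 1, variance 1), standardises each sample mean, and reports how often the result falls within $\pm 1.96$, which would be about 95% of the time for an exact $N(0, 1)$ variate. The function name, sample size, replication count and seed are arbitrary choices.

```python
import math
import random


def clt_coverage(n=50, replications=4000, seed=1):
    """Fraction of standardised sample means lying in [-1.96, 1.96]."""
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0  # mean and standard deviation of exponential(1)
    inside = 0
    for _ in range(replications):
        xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        z = (xbar - mu) / (sigma / math.sqrt(n))  # standardised mean
        if abs(z) <= 1.96:
            inside += 1
    return inside / replications


# Should be near 0.95 if the normal approximation is good at this n.
print(clt_coverage())
```

The approximation is only asymptotic, so for small $n$ and a skewed population the fraction can differ noticeably from 0.95.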

Distribution of the sample variance

To describe the distribution of the sample variance, we need to define the chi-square distribution:

Definition. If $z \sim N(0, 1)$ is distributed as a standard normal variable, then $z^2 \sim \chi^2(1)$ is distributed as a chi-square variate with one degree of freedom. If $z_i \sim N(0, 1)$; $i = 1, 2, \ldots, n$ are a set of independent and identically distributed standard normal variates, then the sum of their squares is a chi-square variate of $n$ degrees of freedom:

$$u = \sum_{i=1}^{n} z_i^2 \sim \chi^2(n).$$

The mean and the variance of the $\chi^2(n)$ variate are $E(u) = n$ and $V(u) = 2n$ respectively.

Theorem. The sum of two independent chi-square variates is a chi-square variate with degrees of freedom equal to the sum of the degrees of freedom of its additive components. If $x \sim \chi^2(n)$ and $y \sim \chi^2(m)$, then $(x + y) \sim \chi^2(m + n)$.

If $x_1, x_2, \ldots, x_n$ is a random sample from a standard normal $N(0, 1)$ distribution, then

$$\sum x_i^2 \sim \chi^2(n).$$

If $x_1, x_2, \ldots, x_n$ is a random sample from a $N(\mu, \sigma^2)$ distribution, then $(x_i - \mu)/\sigma \sim N(0, 1)$ and, therefore,

$$\sum \frac{(x_i - \mu)^2}{\sigma^2} \sim \chi^2(n).$$

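Consider the identity

$$\begin{aligned}
\sum (x_i - \mu)^2 &= \sum \left(\{x_i - \bar{x}\} + \{\bar{x} - \mu\}\right)^2 \\
&= \sum \left(\{x_i - \bar{x}\}^2 + 2\{\bar{x} - \mu\}\{x_i - \bar{x}\} + \{\bar{x} - \mu\}^2\right) \\
&= \sum \{x_i - \bar{x}\}^2 + n\{\bar{x} - \mu\}^2,
\end{aligned}$$

which follows from the fact that the cross product term is $\{\bar{x} - \mu\}\sum\{x_i - \bar{x}\} = 0$. This decomposition of a sum of squares features in the following result:

The Decomposition of a Chi-square statistic. If $x_1, x_2, \ldots, x_n$ is a random sample from a normal $N(\mu, \sigma^2)$ distribution, then

$$\sum \frac{(x_i - \mu)^2}{\sigma^2} = \sum \frac{(x_i - \bar{x})^2}{\sigma^2} + n\frac{(\bar{x} - \mu)^2}{\sigma^2},$$

with

(1) $\sum \dfrac{(x_i - \mu)^2}{\sigma^2} \sim \chi^2(n)$,

(2) $\sum \dfrac{(x_i - \bar{x})^2}{\sigma^2} \sim \chi^2(n - 1)$,

(3) $n\dfrac{(\bar{x} - \mu)^2}{\sigma^2} \sim \chi^2(1)$,

where the statistics under (2) and (3) are independently distributed.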
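The sum-of-squares identity is purely algebraic, so it holds exactly for any data and any value of $\mu$. The following sketch checks it on a small, made-up data set; the function name, the data and the chosen $\mu = 4$ are illustrative assumptions.

```python
def decompose_sum_of_squares(xs, mu):
    """Return (total, within, between) for the sum-of-squares identity.

    total   = sum (x_i - mu)^2
    within  = sum (x_i - xbar)^2
    between = n * (xbar - mu)^2
    """
    n = len(xs)
    xbar = sum(xs) / n
    total = sum((x - mu) ** 2 for x in xs)
    within = sum((x - xbar) ** 2 for x in xs)
    between = n * (xbar - mu) ** 2
    return total, within, between


# Hypothetical data with sample mean 5, compared against mu = 4.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
total, within, between = decompose_sum_of_squares(data, mu=4.0)
print(total, within, between)  # total equals within + between
```

Dividing each of the three terms by $\sigma^2$ gives the statistics (1), (2) and (3) of the decomposition.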

Sampling Distributions

(1) If $u \sim \chi^2(m)$ and $v \sim \chi^2(n)$ are independent chi-square variates with $m$ and $n$ degrees of freedom respectively, then

$$F = \left\{\frac{u}{m} \bigg/ \frac{v}{n}\right\} \sim F(m, n),$$

which is the ratio of the chi-squares divided by their respective degrees of freedom, has an $F$ distribution of $m$ and $n$ degrees of freedom, denoted by $F(m, n)$.

(2) If $z \sim N(0, 1)$ is a standard normal variate and if $v \sim \chi^2(n)$ is a chi-square variate of $n$ degrees of freedom, and if the two variates are distributed independently, then the ratio

$$t = z \bigg/ \sqrt{\frac{v}{n}} \sim t(n)$$

has a $t$ distribution of $n$ degrees of freedom, denoted $t(n)$. Notice that

$$t^2 = \frac{z^2}{v/n} \sim \left\{\frac{\chi^2(1)}{1} \bigg/ \frac{\chi^2(n)}{n}\right\} = F(1, n).$$

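CONFIDENCE INTERVALS

Let $z \sim N(0, 1)$. From the tables of the standard normal, we can find numbers $a, b$ such that, for any $Q \in (0, 1)$, there is $P(a \leq z \leq b) = Q$. The interval $[a, b]$ is called a $Q \times 100\%$ confidence interval for $z$. The length of the interval is minimised when it is centred on $E(z) = 0$.

A confidence interval for the mean

Let $x_i \sim N(\mu, \sigma^2)$; $i = 1, \ldots, n$ be a random sample. Then

$$\bar{x} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \quad \text{and} \quad \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1).$$

Therefore, we can find numbers $\pm\beta$ such that

$$P\left(-\beta \leq \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \leq \beta\right) = Q.$$

But, the following events are equivalent:

$$\left(-\beta \leq \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \leq \beta\right) \iff \left(\bar{x} - \beta\frac{\sigma}{\sqrt{n}} \leq \mu \leq \bar{x} + \beta\frac{\sigma}{\sqrt{n}}\right).$$

The probability that $[\bar{x} - \beta\sigma/\sqrt{n},\ \bar{x} + \beta\sigma/\sqrt{n}]$ falls over $\mu$ is

$$P\left(\bar{x} - \beta\frac{\sigma}{\sqrt{n}} \leq \mu \leq \bar{x} + \beta\frac{\sigma}{\sqrt{n}}\right) = Q,$$

which means that we are $Q \times 100\%$ confident that $\mu$ lies in the interval.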
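When $\sigma$ is known, the interval $\bar{x} \pm \beta\sigma/\sqrt{n}$ can be computed directly, with $\beta$ taken from the standard normal distribution instead of a printed table. This is a minimal sketch: the function name, the data and the assumed value $\sigma = 2$ are hypothetical, and `statistics.NormalDist` requires Python 3.8 or later.

```python
import math
from statistics import NormalDist


def mean_interval_known_sigma(xs, sigma, q=0.95):
    """Q x 100% confidence interval for mu when sigma is known."""
    n = len(xs)
    xbar = sum(xs) / n
    # beta satisfies P(-beta <= z <= beta) = q for z ~ N(0, 1).
    beta = NormalDist().inv_cdf(0.5 + q / 2.0)
    half_width = beta * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# Hypothetical data; sigma = 2 is assumed known for the illustration.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
lower, upper = mean_interval_known_sigma(data, sigma=2.0)
print(lower, upper)
```

The interval is centred on $\bar{x}$, which is the choice that minimises its length for a given $Q$.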

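A confidence interval for $\mu$ when $\sigma^2$ is unknown

The unbiased estimate of $\sigma^2$ is $\hat{\sigma}^2 = \sum (x_i - \bar{x})^2/(n - 1)$. When $\sigma^2$ is replaced by $\hat{\sigma}^2$,

$$\frac{\sqrt{n}(\bar{x} - \mu)}{\sigma} \sim N(0, 1) \quad \text{is replaced by} \quad \frac{\sqrt{n}(\bar{x} - \mu)}{\hat{\sigma}} \sim t(n - 1).$$

To demonstrate this result, consider writing

$$\frac{\sqrt{n}(\bar{x} - \mu)}{\hat{\sigma}} = \left\{\frac{\sqrt{n}(\bar{x} - \mu)}{\sigma} \bigg/ \sqrt{\frac{\sum (x_i - \bar{x})^2}{\sigma^2 (n - 1)}}\right\},$$

and observe that $\sigma$ is cancelled from the numerator and the denominator. The denominator contains $\sum (x_i - \bar{x})^2/\sigma^2 \sim \chi^2(n - 1)$. The numerator is a standard normal variate. Therefore

$$\left\{N(0, 1) \bigg/ \sqrt{\frac{\chi^2(n - 1)}{n - 1}}\right\} \sim t(n - 1).$$

To construct a confidence interval, the numbers $\pm\beta$ from the table of the $N(0, 1)$ distribution are replaced by the corresponding numbers $\pm b$ from the $t(n - 1)$ table. Then

$$P\left(\bar{x} - b\frac{\hat{\sigma}}{\sqrt{n}} \leq \mu \leq \bar{x} + b\frac{\hat{\sigma}}{\sqrt{n}}\right) = Q.$$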
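The interval with estimated variance differs from the known-variance case only in using $\hat{\sigma}$ and a critical value $b$ from the $t(n - 1)$ table. The sketch below takes $b$ as an argument, since the Python standard library has no $t$ quantile function. The function name and data are hypothetical; the value $b = 2.365$ is the usual tabulated upper 2.5% point of $t(7)$ and should be checked against a table for other sample sizes or confidence levels.

```python
import math


def mean_interval_unknown_sigma(xs, b):
    """Confidence interval for mu using sigma-hat and a t(n-1) critical value b."""
    n = len(xs)
    xbar = sum(xs) / n
    # Unbiased variance estimate with divisor n - 1.
    sigma_hat = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
    half_width = b * sigma_hat / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# Hypothetical data with n = 8, so the t table is read at 7 degrees of freedom.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
b_95 = 2.365  # assumed table value for Q = 0.95 and 7 degrees of freedom
lower, upper = mean_interval_unknown_sigma(data, b_95)
print(lower, upper)
```

Because $b$ exceeds the corresponding normal value of about 1.96, the interval is wider than it would be if $\sigma$ were known to equal $\hat{\sigma}$.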

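A confidence interval for the difference between two means

Imagine a treatment that affects the mean of a normal population without affecting its variance. To establish a confidence interval for the change in the mean, take samples before and after the treatment. Before treatment, there is

$$x_i \sim N(\mu_x, \sigma^2);\ i = 1, \ldots, n \quad \text{and} \quad \bar{x} \sim N\left(\mu_x, \frac{\sigma^2}{n}\right),$$

and, after treatment, there is

$$y_j \sim N(\mu_y, \sigma^2);\ j = 1, \ldots, m \quad \text{and} \quad \bar{y} \sim N\left(\mu_y, \frac{\sigma^2}{m}\right).$$

Assuming that the samples are mutually independent, the difference between their means is

$$(\bar{x} - \bar{y}) \sim N\left(\mu_x - \mu_y, \frac{\sigma^2}{n} + \frac{\sigma^2}{m}\right).$$

Hence

$$\frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{\sqrt{\dfrac{\sigma^2}{n} + \dfrac{\sigma^2}{m}}} \sim N(0, 1).$$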

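If $\sigma^2$ were known, then, for any given value of $Q \in (0, 1)$, a number $\beta$ can be found from the $N(0, 1)$ table such that

$$P\left\{(\bar{x} - \bar{y}) - \beta\sqrt{\frac{\sigma^2}{n} + \frac{\sigma^2}{m}} \leq \mu_x - \mu_y \leq (\bar{x} - \bar{y}) + \beta\sqrt{\frac{\sigma^2}{n} + \frac{\sigma^2}{m}}\right\} = Q,$$

giving a confidence interval for $\mu_x - \mu_y$. Usually, $\sigma^2$ has to be estimated from the sample information. There are

$$\frac{\sum (x_i - \bar{x})^2}{\sigma^2} \sim \chi^2(n - 1) \quad \text{and} \quad \frac{\sum (y_j - \bar{y})^2}{\sigma^2} \sim \chi^2(m - 1),$$

which are independent variates with expectations equal to the numbers of their degrees of freedom. The sum of independent chi-squares is itself a chi-square with degrees of freedom equal to the sum of those of its constituent parts. Therefore,

$$\frac{\sum (x_i - \bar{x})^2 + \sum (y_j - \bar{y})^2}{\sigma^2} \sim \chi^2(n + m - 2)$$

has an expected value of $n + m - 2$, whence the unbiased estimate of the variance is

$$\hat{\sigma}^2 = \frac{\sum (x_i - \bar{x})^2 + \sum (y_j - \bar{y})^2}{n + m - 2}.$$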
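The pooled estimate of the common variance combines the squared deviations of both samples, each measured about its own mean, and divides by $n + m - 2$. A small sketch, with a hypothetical function name and made-up samples:

```python
def pooled_variance(xs, ys):
    """Unbiased estimate of a common variance from two independent samples."""
    n, m = len(xs), len(ys)
    xbar = sum(xs) / n
    ybar = sum(ys) / m
    # Each sample contributes deviations about its own mean.
    sum_sq = sum((x - xbar) ** 2 for x in xs) + sum((y - ybar) ** 2 for y in ys)
    return sum_sq / (n + m - 2)


# Hypothetical "before" and "after" samples of different sizes.
before = [1.0, 2.0, 3.0]
after = [2.0, 4.0, 6.0, 8.0]
print(pooled_variance(before, after))
```

Two degrees of freedom are lost because two means, $\bar{x}$ and $\bar{y}$, are estimated from the data.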

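If the estimate is used in place of the unknown value of $\sigma^2$, then we get

$$\frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{\sqrt{\dfrac{\hat{\sigma}^2}{n} + \dfrac{\hat{\sigma}^2}{m}}} = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{\sqrt{\dfrac{\sigma^2}{n} + \dfrac{\sigma^2}{m}}} \Bigg/ \sqrt{\frac{\sum (x_i - \bar{x})^2 + \sum (y_j - \bar{y})^2}{\sigma^2 (n + m - 2)}} \sim \frac{N(0, 1)}{\sqrt{\dfrac{\chi^2(n + m - 2)}{n + m - 2}}} = t(n + m - 2),$$

which is the basis for a confidence interval.

A confidence interval for the variance

If $x_i \sim N(\mu, \sigma^2)$; $i = 1, \ldots, n$ is a random sample, then $\sum (x_i - \bar{x})^2/(n - 1)$ is an unbiased estimate of the variance and $\sum (x_i - \bar{x})^2/\sigma^2 \sim \chi^2(n - 1)$. Therefore, from the appropriate chi-square table, we can find numbers $\alpha$ and $\beta$ such that

$$P\left(\alpha \leq \frac{\sum (x_i - \bar{x})^2}{\sigma^2} \leq \beta\right) = Q$$

for some chosen $Q \in (0, 1)$. From this, it follows that

$$P\left(\frac{1}{\alpha} \geq \frac{\sigma^2}{\sum (x_i - \bar{x})^2} \geq \frac{1}{\beta}\right) = Q \iff P\left(\frac{\sum (x_i - \bar{x})^2}{\beta} \leq \sigma^2 \leq \frac{\sum (x_i - \bar{x})^2}{\alpha}\right) = Q,$$

and the latter provides a confidence interval for $\sigma^2$.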
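The interval for the variance is obtained by dividing the sum of squared deviations by the two chi-square critical values. The sketch below takes $\alpha$ and $\beta$ as arguments, since the Python standard library has no chi-square quantile function. The function name and data are hypothetical; the values 1.690 and 16.013 are the usual tabulated 2.5% and 97.5% points of $\chi^2(7)$ and should be checked against a table before being reused.

```python
def variance_interval(xs, alpha, beta):
    """Confidence interval for sigma^2 from chi-square(n-1) points alpha < beta."""
    n = len(xs)
    xbar = sum(xs) / n
    sum_sq = sum((x - xbar) ** 2 for x in xs)
    # Dividing by the larger critical value gives the lower limit.
    return sum_sq / beta, sum_sq / alpha


# Hypothetical data with n = 8, so the table is read at 7 degrees of freedom.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
alpha_point, beta_point = 1.690, 16.013  # assumed table values for Q = 0.95
lower, upper = variance_interval(data, alpha_point, beta_point)
print(lower, upper)
```

Unlike the intervals for the mean, this interval is not symmetric about the point estimate, because the chi-square distribution is skewed.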

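The confidence interval for the ratio of two variances

Imagine a treatment that affects the variance of a normal population. It is possible that the mean is also affected. Let $x_i \sim N(\mu_x, \sigma_x^2)$; $i = 1, \ldots, n$ be a random sample taken from the population before treatment and let $y_j \sim N(\mu_y, \sigma_y^2)$; $j = 1, \ldots, m$ be a random sample taken after treatment. Then

$$\frac{\sum (x_i - \bar{x})^2}{\sigma_x^2} \sim \chi^2(n - 1) \quad \text{and} \quad \frac{\sum (y_j - \bar{y})^2}{\sigma_y^2} \sim \chi^2(m - 1)$$

are independent chi-squared variates, and hence

$$F = \left\{\frac{\sum (x_i - \bar{x})^2}{\sigma_x^2 (n - 1)} \bigg/ \frac{\sum (y_j - \bar{y})^2}{\sigma_y^2 (m - 1)}\right\} \sim F(n - 1, m - 1).$$

It is possible to find numbers $\alpha$ and $\beta$ such that $P(\alpha \leq F \leq \beta) = Q$, where $Q \in (0, 1)$ is some chosen probability value. Given such values, we may make the following probability statement:

$$P\left(\alpha\,\frac{\sum (y_j - \bar{y})^2\,(n - 1)}{\sum (x_i - \bar{x})^2\,(m - 1)} \leq \frac{\sigma_y^2}{\sigma_x^2} \leq \beta\,\frac{\sum (y_j - \bar{y})^2\,(n - 1)}{\sum (x_i - \bar{x})^2\,(m - 1)}\right) = Q.$$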