Lecture 3. Properties of Summary Statistics: Sampling Distribution

Lecture 3. Properties of Summary Statistics: Sampling Distribution

Main Theme
How can we use math to justify that our numerical summaries from the sample are good summaries of the population?

Lecture Summary
Today, we focus on two summary statistics of the sample and study their theoretical properties:
- Sample mean: X̄ = (1/n) Σ_{i=1}^n X_i
- Sample variance: S² = (1/(n−1)) Σ_{i=1}^n (X_i − X̄)²
They are aimed at getting an idea about the population mean and the population variance (i.e. parameters).
First, we'll study, on average, how well our statistics do in estimating the parameters.
Second, we'll study the distribution of the summary statistics, known as sampling distributions.

Setup
Let X_1, …, X_n ∈ R^p be i.i.d. samples from the population F_θ.
- F_θ: distribution of the population (e.g. normal) with features/parameters θ. Often the distribution AND the parameters are unknown. That's why we sample from it, to get an idea of them!
- i.i.d.: independent and identically distributed. Every time we sample, we redraw n members from the population and obtain (X_1, …, X_n). This provides a representative sample of the population.
- R^p: dimension of X_i. For simplicity, we'll consider univariate cases (i.e. p = 1).

Loss Function
How good are our numerical summaries (i.e. statistics) at capturing features of the population (i.e. parameters)?
Loss function: measures how good the statistic is at estimating the parameter.
- 0 loss: the statistic is the perfect estimator for the parameter.
- ∞ loss: the statistic is a terrible estimator for the parameter.
Example: l(T, θ) = (T − θ)², where T is the statistic and θ is the parameter. Called squared-error loss.

Sadly...
It is impossible to compute the values of the loss function. Why?
- We don't know what the parameter is! (Since it's an unknown feature of the population, and we're trying to study it!)
- More importantly, the statistic is random! It changes every time we take a different sample from the population. Thus, the value of our loss function changes per sample.

A Remedy
A more manageable question: on average, how good is our statistic at estimating the parameter?
Risk (i.e. expected loss): the average loss incurred after repeated sampling, R(θ) = E[l(T, θ)].
Risk is a function of the parameter.
For the squared-error loss, we have R(θ) = E[(T − θ)²].

Bias-Variance Trade-Off: Squared-Error Risk
After some algebra, we obtain another expression for the squared-error risk:
R(θ) = E[(T − θ)²] = (E[T] − θ)² + E[(T − E[T])²] = Bias_θ(T)² + Var(T)
- Bias: on average, how far the statistic is from the parameter (i.e. accuracy): Bias_θ(T) = E[T] − θ.
- Variance: how variable the statistic is (i.e. precision).
In estimation, there is always a bias-variance tradeoff!
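
The decomposition above can be checked numerically. The following sketch (not from the lecture; the shrinkage estimator T = 0.5·X̄ and all parameter values are illustrative choices) estimates risk, bias, and variance by Monte Carlo and confirms that risk = bias² + variance:

```python
import random
import statistics

random.seed(0)

# Illustrative setup (my choice, not the lecture's): estimate mu with the
# deliberately biased shrinkage estimator T = 0.5 * sample mean.
mu, sigma, n, reps = 2.0, 1.0, 10, 20000

ts = []
for _ in range(reps):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    ts.append(0.5 * sum(xs) / n)  # one draw of the biased estimator T

risk = sum((t - mu) ** 2 for t in ts) / reps  # Monte Carlo E[(T - mu)^2]
bias = sum(ts) / reps - mu                    # Monte Carlo E[T] - mu
var = statistics.pvariance(ts)                # Monte Carlo Var(T)

# The decomposition holds exactly for the empirical distribution of T:
print(risk, bias ** 2 + var)
```

Shrinking toward 0 buys a smaller variance (0.25·σ²/n) at the cost of a nonzero bias, which is the tradeoff the slide describes.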

Sample Mean: Bias, Variance, and Risk
Let T(X_1, …, X_n) = (1/n) Σ_{i=1}^n X_i. We want to see how well the sample mean estimates the population mean. We'll use squared-error loss.
- Bias_μ(T) = 0, i.e. the sample mean is unbiased for the population mean. Interpretation: on average, the sample mean will be close to the population mean, μ.
- Var(T) = σ²/n. Interpretation: the sample mean is precise up to order 1/n. That is, we decrease the variability of our estimate of the population mean by a factor of 1/n.
- Thus, R(μ) = σ²/n. Interpretation: on average, the squared distance between the sample mean and the population mean is σ²/n. This holds for all population means μ.
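
A quick simulation makes Var(X̄) = σ²/n concrete (the parameter values below are illustrative choices, not from the lecture): draw many samples, compute X̄ for each, and compare the empirical variance of the means to σ²/n.

```python
import random
import statistics

random.seed(1)

# Illustrative parameters (my choice): sigma^2 / n = 4 / 25 = 0.16
mu, sigma, n, reps = 0.0, 2.0, 25, 40000

means = []
for _ in range(reps):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    means.append(sum(xs) / n)  # one draw of the sample mean X̄

emp_var = statistics.pvariance(means)  # Monte Carlo Var(X̄)
theory = sigma ** 2 / n                # the slide's formula
print(emp_var, theory)
```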

Sample Variance and Bias
Let T(X_1, …, X_n) = (1/(n−1)) Σ_{i=1}^n (X_i − X̄)². We want to see how well the sample variance estimates the population variance. We'll use squared-error loss.
- Bias_{σ²}(T) = 0, i.e. the sample variance is unbiased for the population variance. Interpretation: on average, the sample variance will be close to the population variance, σ².
- Var(T) = ?, R(σ²) = ? Depends on assumptions about fourth moments.
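
The role of the n−1 divisor can be seen by simulation. This sketch (illustrative parameter values, my choice) averages both the n−1 divisor estimator S² and the naive n divisor version over many samples; only the former centers on σ².

```python
import random

random.seed(2)

# Illustrative parameters (my choice): sigma^2 = 9, small samples of n = 5
mu, sigma, n, reps = 0.0, 3.0, 5, 60000

s2_sum = 0.0  # running sum of the (n-1)-divisor estimator
v2_sum = 0.0  # running sum of the n-divisor estimator
for _ in range(reps):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    ss = sum((x - xbar) ** 2 for x in xs)
    s2_sum += ss / (n - 1)  # S^2, the unbiased sample variance
    v2_sum += ss / n        # divisor n: biased downward

mean_s2 = s2_sum / reps  # ≈ sigma^2 = 9
mean_v2 = v2_sum / reps  # ≈ (n-1)/n * sigma^2 = 7.2
print(mean_s2, mean_v2)
```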

In Summary
We studied how good, on average, our statistics are at estimating the parameters.
The sample mean, X̄, and the sample variance, S², are unbiased for the population mean, μ, and the population variance, σ², respectively.

But...
But what about the distribution of our summary statistics? So far, we have only studied the statistics' average behavior.
Sampling distribution!

Sampling Distribution when F is Normal
Case 1 (Sample Mean): Suppose F is a normal distribution with mean μ and variance σ² (denoted N(μ, σ²)). Then X̄ is distributed as
X̄ = (1/n) Σ_{i=1}^n X_i ~ N(μ, σ²/n)
Proof: use the fact that the X_i ~ N(μ, σ²) are i.i.d.

Sampling Distribution Example
Suppose you want to know the probability that your sample mean, X̄, is within ε > 0 of the population mean. We assume σ is known and X_i ~ N(μ, σ²), i.i.d. Then
P(|X̄ − μ| ≤ ε) = P(|X̄ − μ|/(σ/√n) ≤ ε/(σ/√n)) = P(|Z| ≤ ε√n/σ)
where Z ~ N(0, 1).
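
This probability can be evaluated with the standard normal CDF, using the identity P(|Z| ≤ z) = 2Φ(z) − 1 = erf(z/√2). The sketch below (illustrative numbers of my choosing, arranged so ε√n/σ = 1) computes it exactly via math.erf and verifies it by Monte Carlo:

```python
import math
import random

random.seed(3)

# Illustrative numbers (my choice): eps * sqrt(n) / sigma = 0.5 * 4 / 2 = 1
mu, sigma, n, eps = 5.0, 2.0, 16, 0.5

z = eps * math.sqrt(n) / sigma
# P(|Z| <= z) = 2*Phi(z) - 1 = erf(z / sqrt(2)) for Z ~ N(0, 1)
exact = math.erf(z / math.sqrt(2))

# Monte Carlo check: draw many sample means and count how often |X̄ - mu| <= eps
reps = 50000
hits = 0
for _ in range(reps):
    xbar = sum(random.gauss(mu, sigma) for _ in range(n)) / n
    if abs(xbar - mu) <= eps:
        hits += 1

print(exact, hits / reps)  # both ≈ 0.68 here, since z = 1
```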

Sampling Distribution when F is Normal
Case 1 (Sample Variance): Suppose F is a normal distribution with mean μ and variance σ² (denoted N(μ, σ²)). Then (n−1)S²/σ² is distributed as
(n−1)S²/σ² = (1/σ²) Σ_{i=1}^n (X_i − X̄)² ~ χ²_{n−1}
where χ²_{n−1} is the chi-square distribution with n−1 degrees of freedom.

Some Preliminaries from Stat 430
Fact 0: (X̄, X_1 − X̄, …, X_n − X̄) is jointly normal.
Proof: Because X̄ and the X_i − X̄ are linear combinations of normal random variables, they must be jointly normal.
Fact 1: For any i = 1, …, n, we have Cov(X̄, X_i − X̄) = 0.
Proof: E[X̄(X_i − X̄)] = E[X̄ X_i] − E[X̄²] = (μ² + σ²/n) − (μ² + σ²/n) = 0, and E[X̄] E[X_i − X̄] = μ(μ − μ) = 0. Thus, Cov(X̄, X_i − X̄) = E[X̄(X_i − X̄)] − E[X̄] E[X_i − X̄] = 0.

Since X̄ and X_i − X̄ are jointly normal, the zero covariance between them implies that X̄ and X_i − X̄ are independent.
Furthermore, because S² = (1/(n−1)) Σ_{i=1}^n (X_i − X̄)² is a function of the X_i − X̄, S² is independent of X̄.

Fact 2: If W = U + V with W ~ χ²_{a+b}, V ~ χ²_b, and U and V independent, then U ~ χ²_a.
Proof: use moment generating functions.
Now we can prove the result (see blackboard). Write
W = Σ_{i=1}^n (X_i − μ)²/σ² = Σ_{i=1}^n ((X_i − X̄) + (X̄ − μ))²/σ² = (n−1)S²/σ² + (X̄ − μ)²/(σ²/n) = U + V
with W ~ χ²_n, V = (X̄ − μ)²/(σ²/n) ~ χ²_1, and U = (n−1)S²/σ² independent of V (by Fact 1).
Thus, U = (n−1)S²/σ² ~ χ²_{n−1}.
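
The χ²_{n−1} claim can be sanity-checked through its moments: a chi-square with k degrees of freedom has mean k and variance 2k. This sketch (illustrative parameters, my choice) simulates (n−1)S²/σ² and compares its empirical mean and variance to n−1 and 2(n−1):

```python
import random
import statistics

random.seed(4)

# Illustrative parameters (my choice); with n = 8, the target chi-square
# distribution has n - 1 = 7 degrees of freedom: mean 7, variance 14.
mu, sigma, n, reps = 1.0, 2.0, 8, 50000

ws = []
for _ in range(reps):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    s2 = sum((x - xbar) ** 2 for x in xs) / (n - 1)  # sample variance S^2
    ws.append((n - 1) * s2 / sigma ** 2)             # should be chi^2_{n-1}

print(sum(ws) / reps, statistics.pvariance(ws))  # ≈ 7 and ≈ 14
```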

Sampling Distribution when F is not Normal
Case 2: Suppose F is an arbitrary distribution with mean μ and variance σ² (denoted F(μ, σ²)). Then, as n → ∞,
√n (X̄ − μ)/σ → N(0, 1)
Proof: use the fact that the X_i ~ F(μ, σ²) are i.i.d. and apply the central limit theorem.

Properties
- As the sample size increases, X̄ is unbiased (asymptotically unbiased).
- As the sample size increases, X̄ approaches the normal distribution at rate 1/√n.
- Even if we don't have infinite sample size, using N(μ, σ²/n) as an approximation to the distribution of X̄ is meaningful for large samples. How large? General rule of thumb: n ≥ 30.

Example
Suppose X_i ~ Exp(λ), i.i.d. Remember, E[X_i] = 1/λ and Var(X_i) = 1/λ².
Then, for large enough n, X̄ ≈ N(1/λ, 1/(nλ²)).
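
The exponential example is easy to simulate (the values of λ, n, and the number of replications below are illustrative choices): the sample means should center on 1/λ with variance 1/(nλ²), even though the underlying data are far from normal.

```python
import random
import statistics

random.seed(5)

# Illustrative parameters (my choice): lambda = 2, so E[X_i] = 1/2
lam, n, reps = 2.0, 50, 30000

# Draw many sample means of n exponential variables each
means = [sum(random.expovariate(lam) for _ in range(n)) / n
         for _ in range(reps)]

print(sum(means) / reps)            # ≈ 1/lam = 0.5
print(statistics.pvariance(means))  # ≈ 1/(n * lam^2) = 0.005
```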

An Experiment

Lecture Summary
- Risk: the average loss/mistake that a statistic makes in estimating a population parameter. Bias-variance tradeoff for squared-error loss.
- X̄ and S² are unbiased for the population mean and the population variance.
- Sampling distribution: the distribution of the statistic used for estimating a population parameter.
- If the population is normally distributed: X̄ ~ N(μ, σ²/n) and (n−1)S²/σ² ~ χ²_{n−1}.
- If the population is not normally distributed: √n (X̄ − μ)/σ → N(0, 1), or X̄ ≈ N(μ, σ²/n).