Chapter 1
Random Variables, Sampling and Estimation

1.1 Introduction

This chapter covers the most important basic statistical theory you need in order to understand the econometric material that will be coming in the next chapters. The key topics that we will review are the following:

Descriptive statistics, e.g. mean and variance.
Probability, e.g. events, relative frequency, marginal and conditional probability distributions.
Random variables, probability distributions, and expectations.
Sampling, e.g. simple random sampling.
Estimation, e.g. the distinction between an estimator and an estimate.
Statistical inference: t and F tests.

1.2 Probabilities

1.2.1 Events

Random experiment. A process leading to two or more possible outcomes, with uncertainty as to which outcome will occur. Examples: the flip of a coin, the toss of a die, a student takes a class and either obtains an A or does not.

Sample space. The set of all basic outcomes of a random experiment.

When flipping a coin, S = {head, tail}. When taking a class, S = {A, B, C, D, F, drop}. When tossing a die, S = {1, 2, 3, 4, 5, 6}. No two basic outcomes can occur simultaneously.

Event. A subset of basic outcomes in the sample space. Event E1: pass the class; the subset of basic outcomes is {A, B, C}.

Intersection of events. When two events E1 and E2 have some basic outcomes in common, their intersection is denoted by E1 ∩ E2. Event E1: individuals with a college degree. Event E2: individuals who are married. E1 ∩ E2: individuals who have a college degree and are married.

Joint probability. The probability that the intersection occurs.

Mutually exclusive events. E1 and E2 are mutually exclusive if E1 ∩ E2 is empty.

Union of events. Denoted by E1 ∪ E2: at least one of these events occurs, either E1, E2, or both.

Complement. The complement of E is denoted by Ē and is the set of basic outcomes of a random experiment that belong to S but not to E; E is, in turn, the complement of Ē. E and Ē are mutually exclusive events.

1.2.2 Probability postulates

Given a random experiment, we want to determine the probability that a particular event will occur. A probability is a measure from 0 to 1.

0 means the event will not occur; 1 means the event is certain. When the outcomes are equally likely to occur, the probability of an event E is

P(E) = N_E / N

where N_E is the number of outcomes in event E and N is the total number of outcomes in the sample space S.

Example 1: Flip of a coin. If event E is heads, then P(E) = 1/2, since N_E = 1 and N = 2.

Example 2: Event E is winning the lottery. If there are 1000 lottery tickets and you bought 2, then P(E) = 2/1000 = 0.002.

Some probability rules

P(E ∪ Ē) = P(E) + P(Ē) = 1, so that P(Ē) = 1 − P(E).

Conditional probability. P(E1 | E2) is the probability that E1 occurs, given that E2 has already occurred:

P(E1 | E2) = P(E1 ∩ E2) / P(E2), provided that P(E2) > 0.

Addition rule. P(E1 ∪ E2) = P(E1) + P(E2) − P(E1 ∩ E2).

Statistically independent events. P(E1 ∩ E2) = P(E1)P(E2), in which case P(E1 | E2) = P(E1)P(E2) / P(E2) = P(E1).
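As a quick sketch, these rules can be checked by enumeration on the 36 equally likely outcomes of two dice (anticipating the example of Section 1.3). The events E1 and E2 below are my own illustrative choices, not from the text.

from fractions import Fraction

S = [(g, r) for g in range(1, 7) for r in range(1, 7)]   # sample space, N = 36

def P(event):
    # P(E) = N_E / N for equally likely outcomes
    return Fraction(sum(1 for s in S if event(s)), len(S))

E1 = lambda s: s[0] + s[1] >= 10        # sum of the dice is at least 10
E2 = lambda s: s[0] == 6                # green die shows a 6
both = lambda s: E1(s) and E2(s)        # intersection E1 ∩ E2
union = lambda s: E1(s) or E2(s)        # union E1 ∪ E2

# Addition rule: P(E1 ∪ E2) = P(E1) + P(E2) - P(E1 ∩ E2)
assert P(union) == P(E1) + P(E2) - P(both)

# Conditional probability: P(E1 | E2) = P(E1 ∩ E2) / P(E2)
print(P(both) / P(E2))                  # -> 1/2: given a green 6, red must be >= 4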

1.3 Discrete random variables and expectations

1.3.1 Discrete random variables

Random variable. A variable that takes numerical values determined by the outcome of a random experiment. Examples: hourly wage, GDP, inflation, the number shown when tossing a die. Notation: the random variable X can take the possible values x_1, x_2, ..., x_n.

Discrete random variable. A random variable that takes a countable number of values. Example: number of years of education.

Continuous random variable. A random variable that can take any value on an interval. Examples: wage, GDP, exact weight.

Consider tossing two dice (green and red). This yields 36 possible outcomes, because the green die can take 6 possible values and the red die can also take 6 values: 6 × 6 = 36. Let us define the random variable X to be the sum of the two dice. Then X can take 11 possible values, from 2 to 12. This information is summarized in the following tables.

Table 1.1 Outcomes with two dice

red \ green    1    2    3    4    5    6
    1          2    3    4    5    6    7
    2          3    4    5    6    7    8
    3          4    5    6    7    8    9
    4          5    6    7    8    9   10
    5          6    7    8    9   10   11
    6          7    8    9   10   11   12

Table 1.2 Frequencies and probability distribution

Value of X        2     3     4     5     6     7     8     9    10    11    12
Frequency         1     2     3     4     5     6     5     4     3     2     1
Probability (p)  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
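Table 1.2 can be reproduced by brute-force enumeration; a minimal sketch:

from collections import Counter

# Sum of the dice for each of the 36 equally likely (green, red) outcomes.
freq = Counter(g + r for g in range(1, 7) for r in range(1, 7))

for x in range(2, 13):
    print(f"X = {x:2d}   frequency = {freq[x]}   probability = {freq[x]}/36")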

1.3.2 Expected value of random variables

Let E(X) be the expected value of the random variable X. The expected value of a discrete random variable is the weighted average of all its possible values, taking the probability of each outcome as its weight. If the random variable X can take the particular values x_1, x_2, ..., x_n and the probability of x_i is given by p_i, then the expected value is given by:

E(X) = x_1 p_1 + x_2 p_2 + \cdots + x_n p_n = \sum_{i=1}^{n} x_i p_i.   (1.1)

We can also write the expected value as E(X) = \mu_X. For the previous example we can calculate that the expected value is:

E(X) = 2(1/36) + 3(2/36) + \cdots + 12(1/36) = 252/36 = 7   (1.2)

Table 1.3 Expected value of X, two dice example

 X       p       X·p
 2      1/36     2/36
 3      2/36     6/36
 4      3/36    12/36
 5      4/36    20/36
 6      5/36    30/36
 7      6/36    42/36
 8      5/36    40/36
 9      4/36    36/36
10      3/36    30/36
11      2/36    22/36
12      1/36    12/36
Total   E(X) = \sum_{i=1}^{n} x_i p_i = 252/36 = 7

1.3.3 Expected value rules

E(X + Y + Z) = E(X) + E(Y) + E(Z)   (1.3)

E(bX) = b E(X) for a constant b   (1.4)

E(b) = b   (1.5)

For the example where Y = b_1 + b_2 X, with b_1 and b_2 constants, we want to calculate E(Y):

E(Y) = E(b_1 + b_2 X) = E(b_1) + E(b_2 X) = b_1 + b_2 E(X)   (1.6)
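A short sketch verifying equation (1.2) and the linearity rule (1.6); the constants b_1 = 3 and b_2 = 2 are arbitrary illustrative values:

from fractions import Fraction

# Probabilities from Table 1.2: p(x) = (6 - |x - 7|)/36 for x = 2, ..., 12.
p = {x: Fraction(6 - abs(x - 7), 36) for x in range(2, 13)}

EX = sum(x * px for x, px in p.items())
print(EX)                        # -> 7, as in equation (1.2)

b1, b2 = 3, 2                    # arbitrary illustrative constants
EY = sum((b1 + b2 * x) * px for x, px in p.items())
assert EY == b1 + b2 * EX        # equation (1.6): E(Y) = b1 + b2*E(X)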

1.3.4 Variance of a discrete random variable

Let var(X) be the variance of the random variable X. var(X) is a useful measure of the dispersion of its probability distribution. It is defined as the expected value of the square of the difference between X and its mean, that is, (X - \mu_X)^2, where \mu_X is the population mean of X:

var(X) = \sigma_X^2 = E[(X - \mu_X)^2]   (1.7)
       = (x_1 - \mu_X)^2 p_1 + (x_2 - \mu_X)^2 p_2 + \cdots + (x_n - \mu_X)^2 p_n = \sum_{i=1}^{n} (x_i - \mu_X)^2 p_i   (1.8)

Taking the square root of the variance (\sigma_X^2) one obtains the standard deviation, \sigma_X. The standard deviation also serves as a measure of dispersion of the probability distribution. A useful way to write the variance is:

\sigma_X^2 = E(X^2) - \mu_X^2.   (1.9)

From the previous example of tossing two dice, the population variance can be calculated as follows:

Table 1.4 Population variance, X from the two dice example

 X       p     X - \mu_X   (X - \mu_X)^2   (X - \mu_X)^2 p
 2      1/36      -5            25              0.69
 3      2/36      -4            16              0.89
 4      3/36      -3             9              0.75
 5      4/36      -2             4              0.44
 6      5/36      -1             1              0.14
 7      6/36       0             0              0.00
 8      5/36       1             1              0.14
 9      4/36       2             4              0.44
10      3/36       3             9              0.75
11      2/36       4            16              0.89
12      1/36       5            25              0.69
Total                                           5.83

1.3.5 Probability density

Because discrete random variables, by definition, can only take a countable number of values, they are easy to summarize graphically. The probability distribution is the graph that links all the values that a random variable can take with their corresponding probabilities. For the two dice example above, see Figure 1.1.
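A sketch confirming the total in Table 1.4 and the equivalence of the definitional formula (1.8) and the shortcut (1.9):

from fractions import Fraction

p = {x: Fraction(6 - abs(x - 7), 36) for x in range(2, 13)}  # Table 1.2
mu = sum(x * px for x, px in p.items())                      # = 7

var_def = sum((x - mu) ** 2 * px for x, px in p.items())     # equation (1.8)
var_alt = sum(x ** 2 * px for x, px in p.items()) - mu ** 2  # equation (1.9)

assert var_def == var_alt
print(var_def, float(var_def))   # -> 35/6 ≈ 5.83, the total in Table 1.4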

Fig. 1.1 Discrete probabilities, X from the two dice example

1.4 Continuous random variables

1.4.1 Probability density

Continuous random variables can take any value on an interval. This means that they can take an infinite number of different values, hence it is not possible to obtain a graph like the one presented in Figure 1.1 for a continuous random variable. Instead, we define the probability of the random variable lying within a given interval. For example, the probability that the height of an individual is between 5.5 and 6 feet. This is depicted in Figure 1.2 as the shaded area below the probability density curve for the values of X between 5.5 and 6. The probability of the random variable X written as a function of the random variable is known as the probability density function, written f(x). Then, with a little math, we can easily find the area under the curve. Recall that the area under a curve can be obtained by taking the integral.

Probability density function. A function that describes the relative likelihood of a random variable taking a value at a given point.

\int_{5.5}^{6} f(x)\,dx = 0.18   (1.10)

\int_{0}^{\infty} f(x)\,dx = 1

Fig. 1.2 Continuous probabilities, X from the height example

The first line in the equation above calculates the integral under the curve f(x) between the points 5.5 and 6. The second line shows that the whole area under the curve presented in Figure 1.2 is equal to one. This is for the same reason that the bars in Figure 1.1 also sum to one: the total probability is always equal to one.

1.4.2 Normal distribution

The normal distribution is the most widely known continuous probability distribution. The graph associated with its probability density function has a bell shape and is known as the Gaussian function or bell curve. Its probability density function is given by:

f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}   (1.11)

where \mu is the mean and \sigma^2 is the variance. Figure 1.2 is an example of this distribution.
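A sketch of how an interval probability like (1.10) is computed under a normal density. The text does not give the mean and variance of the height example, so mu = 5.7 and sigma = 0.25 feet are assumed values here, chosen only for illustration:

import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

mu, sigma = 5.7, 0.25              # assumed parameters, in feet
pdf = lambda x: norm.pdf(x, mu, sigma)

area, _ = quad(pdf, 5.5, 6.0)      # the shaded area in Figure 1.2
check = norm.cdf(6.0, mu, sigma) - norm.cdf(5.5, mu, sigma)
print(area, check)                 # the two numbers agree

total, _ = quad(pdf, -np.inf, np.inf)
print(total)                       # -> 1.0: total probability is one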

1.4.3 Expected value and variance of a continuous random variable

The basic difference between a discrete and a continuous random variable is that the second can take on infinitely many possible values, hence the summation signs used to calculate the expected value and the variance of a discrete random variable cannot be used for a continuous random variable. Instead, we use integral signs. For the expected value we have:

E(X) = \int x f(x)\,dx   (1.12)

where the integration is performed over the interval for which f(x) is defined. For the variance we have:

\sigma_X^2 = E[(X - \mu_X)^2] = \int (x - \mu_X)^2 f(x)\,dx   (1.13)

1.5 Covariance and correlation

1.5.1 Covariance

When dealing with two variables, the first question you want to answer is whether these variables move together or whether they move in opposite directions. The covariance helps us answer that question. For two random variables X and Y, the covariance is defined as:

cov(X, Y) = \sigma_{XY} = E[(X - \mu_X)(Y - \mu_Y)]   (1.14)

where \mu_X and \mu_Y are the population means of X and Y, respectively. When two random variables are independent, their covariance is equal to zero. When \sigma_{XY} > 0 we say that the variables move together; when \sigma_{XY} < 0 they move in opposite directions.
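Equations (1.12) and (1.13) can be evaluated numerically; a minimal sketch using the uniform density on [0, 1], an illustrative choice not taken from the text:

from scipy.integrate import quad

# f(x) = 1 on [0, 1], zero elsewhere (the uniform density on [0, 1]).
f = lambda x: 1.0

EX, _ = quad(lambda x: x * f(x), 0, 1)               # equation (1.12)
var, _ = quad(lambda x: (x - EX) ** 2 * f(x), 0, 1)  # equation (1.13)
print(EX, var)                                       # -> 0.5 and 1/12 ≈ 0.0833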

1.5.2 Correlation

One concern when using cov(X, Y) as a measure of association is that the result is measured in the units of X times the units of Y. The correlation coefficient, which is dimensionless, overcomes this difficulty. For variables X and Y the correlation coefficient is defined as:

corr(X, Y) = \rho_{XY} = \frac{\sigma_{XY}}{\sqrt{\sigma_X^2 \sigma_Y^2}}   (1.15)

The correlation coefficient is a number between -1 and 1. When it is positive, we say that there is a positive correlation between X and Y and that these two variables move in the same direction. When it is negative, we say that they move in opposite directions.

1.6 Sampling and estimators

Notice that in the two dice example we know the population characteristics, that is, the probability distribution. From this probability distribution it is easy to obtain the population mean and variance. However, most of the time we need to rely on a data set to get estimates of the population parameters (e.g. the mean and the variance). In that case the estimates of the population parameters are obtained using estimators, and the sample needs to have certain characteristics. Estimators and sampling are the subject of this section.

1.6.1 Sampling

The most common way to obtain a sample from the population is through simple random sampling.

Simple random sampling. A procedure to obtain a sample from the population, where each of the observations is chosen randomly and entirely by chance. This means that each observation in the population has the same probability of being chosen. Once the sample of the random variable X has been generated, each of the observations can be denoted by {x_1, x_2, ..., x_n}. [1]

[1] The textbook Dougherty (2007) makes the distinction between the specific values of the random variable X before and after they are known, and emphasizes this distinction by using uppercase and lowercase letters. This distinction is useful only in some cases, which is why most textbooks do not make it. We will not emphasize the distinction and will use only lowercase letters.
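As a sketch, a simple random sample of a finite population can be drawn so that each unit has the same chance of selection; the population values below are made up:

import numpy as np

rng = np.random.default_rng(42)
population = np.arange(1000)                 # a made-up population of 1000 units
sample = rng.choice(population, size=25, replace=False)  # each unit equally likely
print(sample)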

1.6.2 Estimators

Estimator. A general rule (mathematical formula) for estimating an unknown population parameter given a sample of data. For example, an estimator for the population mean is the sample mean:

\bar{x} = \frac{1}{n}(x_1 + x_2 + \cdots + x_n) = \frac{1}{n} \sum_{i=1}^{n} x_i.   (1.16)

An interesting feature of this estimator is that the variance of \bar{x} is 1/n times the variance of X. The derivation is the following:

\sigma_{\bar{x}}^2 = var(\bar{x})   (1.17)
                   = var\{\frac{1}{n}(x_1 + x_2 + \cdots + x_n)\}   (1.18)
                   = \frac{1}{n^2} var\{x_1 + x_2 + \cdots + x_n\}   (1.19)
                   = \frac{1}{n^2}\{var(x_1) + var(x_2) + \cdots + var(x_n)\}   (1.20)
                   = \frac{1}{n^2}\{\sigma_X^2 + \sigma_X^2 + \cdots + \sigma_X^2\}   (1.21)
                   = \frac{1}{n^2}\{n \sigma_X^2\} = \frac{\sigma_X^2}{n}   (1.22)

Step (1.20) uses the fact that the observations in a simple random sample are independent, so the covariances are zero and the variance of the sum is the sum of the variances. Graphically, this result is shown in Figure 1.3. The distribution of X has a higher variance (it is more dispersed) than the distribution of \bar{x}.
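A Monte Carlo sketch of equation (1.22); the population values (normal with \sigma = 3, n = 25) are illustrative assumptions:

import numpy as np

rng = np.random.default_rng(0)
n, sigma, reps = 25, 3.0, 100_000            # so sigma^2 / n = 9/25 = 0.36

xbars = rng.normal(0.0, sigma, size=(reps, n)).mean(axis=1)
print(xbars.var())                           # close to 0.36, as (1.22) predicts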

Fig. 1.3 Probability density functions of X and \bar{x}.

1.7 Unbiasedness and efficiency

1.7.1 Unbiasedness

Because estimators are random variables, we can take expectations of estimators. If the expectation of an estimator is equal to the true population parameter, then we say that the estimator is unbiased. Let \theta be the population parameter and let \hat{\theta} be a point estimator of \theta. Then \hat{\theta} is unbiased if:

E(\hat{\theta}) = \theta   (1.23)

Example. The sample mean of X is an unbiased estimator of the population mean \mu_X:

E(\bar{x}) = E\left(\frac{1}{n} \sum_{i=1}^{n} x_i\right) = \frac{1}{n} \sum_{i=1}^{n} E(x_i) = \frac{1}{n} \sum_{i=1}^{n} \mu_X = \frac{1}{n} n \mu_X = \mu_X   (1.24)

Unbiased estimator. An estimator is unbiased if its expected value is equal to the true population parameter. The bias of an estimator is the difference between its expected value and the true population parameter:

Bias(\hat{\theta}) = E(\hat{\theta}) - \theta   (1.25)

1.7.2 Efficiency

It is not only important that an estimator is on average correct (unbiased), but also that it has a high probability of being close to the true parameter. When comparing two estimators, \hat{\theta}_1 and \hat{\theta}_2, we say that \hat{\theta}_1 is more efficient if var(\hat{\theta}_1) < var(\hat{\theta}_2). A comparison of the efficiency of these two estimators is presented in Figure 1.4. The estimator with the higher variance, \hat{\theta}_2, is more dispersed.

Most efficient estimator. The estimator with the smallest variance among all unbiased estimators.
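As an illustration of efficiency (my own example, not from the text): for a normal population both the sample mean and the sample median are unbiased for the population mean, but the mean has the smaller sampling variance, so it is the more efficient of the two.

import numpy as np

rng = np.random.default_rng(1)
samples = rng.normal(0.0, 1.0, size=(100_000, 51))   # 100,000 samples of n = 51

means = samples.mean(axis=1)
medians = np.median(samples, axis=1)
print(means.var(), medians.var())   # both center on 0; the mean's variance is smaller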

Fig. 1.4 Efficiency of estimators \hat{\theta}_1 and \hat{\theta}_2, with var(\hat{\theta}_1) < var(\hat{\theta}_2).

1.7.3 Unbiasedness versus efficiency

Both unbiasedness and efficiency are desired properties of an estimator. However, there may be conflicts in the selection between two estimators \hat{\theta}_1 and \hat{\theta}_2 if, for example, \hat{\theta}_1 is more efficient but also biased. This case is presented in Figure 1.5. The simplest way to select between these two estimators is to pick the one that yields the smallest mean square error (MSE):

MSE(\hat{\theta}) = var(\hat{\theta}) + Bias(\hat{\theta})^2   (1.26)
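A sketch of the MSE criterion (1.26) by simulation, comparing two estimators of a population variance: dividing the sum of squared deviations by n (biased) or by n - 1 (unbiased). The normal population and sample size are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(2)
sigma2 = 4.0                                      # true population variance
samples = rng.normal(0.0, 2.0, size=(200_000, 10))

for ddof in (0, 1):                               # ddof=0: 1/n (biased); ddof=1: 1/(n-1)
    est = samples.var(axis=1, ddof=ddof)
    bias = est.mean() - sigma2
    mse = ((est - sigma2) ** 2).mean()
    print(ddof, bias, mse)

In this setup the biased 1/n estimator actually has the smaller MSE, which is exactly the kind of trade-off Figure 1.5 depicts.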

Fig. 1.5 \hat{\theta}_2 is unbiased, but \hat{\theta}_1 is more efficient.

1.8 Estimators for the variance, covariance, and correlation

While we have already seen the population formulas for the variance, covariance and correlation, it is important to keep in mind that we do not have the whole population. The data sets we will be working with are just samples from the populations. The formula for the sample variance is:

s_X^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2   (1.27)

Notice how we changed the notation from \sigma^2 to s^2. The first denotes the population variance, while the second refers to the sample variance. An estimator for the population covariance is given by:

s_{XY} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}).   (1.28)

Finally, the formula for the correlation coefficient, r_{XY}, is:

r_{XY} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}}.   (1.29)
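A sketch computing (1.27)-(1.29) on made-up data and cross-checking against numpy's built-ins (ddof=1 selects the 1/(n-1) form):

import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0])     # made-up data
y = np.array([1.0, 3.0, 2.0, 5.0])
n = len(x)

s2_x = ((x - x.mean()) ** 2).sum() / (n - 1)                  # equation (1.27)
s_xy = ((x - x.mean()) * (y - y.mean())).sum() / (n - 1)      # equation (1.28)
s2_y = ((y - y.mean()) ** 2).sum() / (n - 1)
r_xy = s_xy / np.sqrt(s2_x * s2_y)                            # equation (1.29)

assert np.isclose(s2_x, x.var(ddof=1))
assert np.isclose(s_xy, np.cov(x, y)[0, 1])                   # np.cov uses 1/(n-1) by default
assert np.isclose(r_xy, np.corrcoef(x, y)[0, 1])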

1.9 Asymptotic properties of estimators

Asymptotic properties of estimators refers to their properties when the number of observations in the sample grows large and approaches infinity.

Fig. 1.6 The estimator is biased for small samples, but consistent (distributions shown for n = 40, 250, and 1000).

1.9.1 Consistency

An estimator \hat{\theta} is said to be consistent if it converges to the true population parameter as the sample size grows large, so that any bias becomes smaller as n increases. Consistency is important because many of the most common estimators used in econometrics are biased; the minimum we should expect from these estimators is that the bias becomes small as we are able to obtain larger data sets. Figure 1.6 illustrates the concept of consistency by showing how an estimator of the population parameter \theta becomes unbiased as n grows.

1.9.2 Central limit theorem

Having normally distributed random variables is important because we can then construct, for example, confidence intervals for the mean. However, what if a random variable does not follow a normal distribution? The central limit theorem gives us the answer.

Central limit theorem. States the conditions under which the mean of a sufficiently large number of independent random variables (with finite mean and variance) will approximate a normal distribution. Hence, even if we do not know the underlying distribution of a random variable, we will still be able to construct confidence intervals that will be approximately valid.

Fig. 1.7 Distribution of the sample mean of a uniform distribution, for n = 10, 20, and 100.

As a numerical example, let us assume that the random variable X follows a uniform distribution on [-0.5, 0.5]. Hence, it is equally likely that this random variable takes any value within this range. Figure 1.7 shows the distribution of the average of this random variable for n = 10, 20, and 100. All three of these distributions look very close to a normal distribution.
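A sketch of the experiment behind Figure 1.7: sample means of draws from the uniform [-0.5, 0.5] distribution concentrate, and their variance shrinks like \sigma_X^2/n (here \sigma_X^2 = 1/12):

import numpy as np

rng = np.random.default_rng(3)
for n in (10, 20, 100):
    means = rng.uniform(-0.5, 0.5, size=(50_000, n)).mean(axis=1)
    # One uniform draw has variance 1/12, so var of the mean should be near 1/(12n).
    print(n, round(means.var(), 5), round(1 / (12 * n), 5))

Plotting a histogram of `means` for each n reproduces the increasingly normal, increasingly concentrated shapes described in the text.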