http://wiki.stat.ucla.edu/socr/index.php/SOCR_Courses_2008_Thomson_ECON261
INTRODUCTION TO SAMPLING DISTRIBUTIONS
By Grace Thomson
Intro to Sampling

In this chapter we will learn about 3 important topics:
1. Sampling error
2. Sampling distribution of the mean
3. Sampling distribution of a proportion

This chapter introduces sampling and its objectives. In Chapter 1 we studied some techniques for sampling and data collection. Remember when we talked about systematic random sampling and stratified sampling, among others? Chapter 6 gets into the requirements for ensuring that the sample you have chosen meets quality and validity criteria.

1. Sampling error

We have discussed before how effective it is to work with a sample instead of a large population, for economic and logistic reasons. But once you have your sample, new questions arise:
Is the sample mean equal to the population mean, $\bar{x} = \mu$?
If $\bar{x} \neq \mu$, how close is the sample mean to the actual population mean?
Is the mean of a sample of size $n$ a good estimate of the population mean?

Samples are different:
There are many combinations.
The sample mean may be different.
The sample % may be different.
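Do you need to increase $n$ to make the sample mean closer to the population mean ($\bar{x} \approx \mu$)?

Objective of sampling
To gather data that mirrors a population. However, we rarely know whether this objective has been achieved; to check, we would need information on the whole population. The sample needs to be chosen randomly to avoid bias, that is, to ensure that it reflects the characteristics of the population.

Sampling error
The difference between a sample value (statistic) and the corresponding population value (parameter):
Sampling error = $\bar{x} - \mu$, where $\bar{x} = \frac{\sum x}{n}$ and $\mu = \frac{\sum x}{N}$.

Simple random sampling
Each possible sample of a given size has an equal chance of being selected. For a population of size $N$ and samples of size $n$ drawn without replacement, the number of possible samples is the number of combinations $\binom{N}{n}$.

Sample mean: changes depending on the sample we take.
Population mean: always the same, no matter how many times we calculate it.

The potential for extreme sampling error is greater when smaller samples are used. However, a larger sample is no guarantee of a smaller error in any particular case.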
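The behavior of the sampling error can be sketched with a short simulation. This is only an illustration under stated assumptions: the population of 200 values is made up (uniform between 0 and 5), and the helper name `sampling_error` is hypothetical, not part of the course material.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of N = 200 values (invented for illustration).
population = [random.uniform(0, 5) for _ in range(200)]
mu = statistics.mean(population)  # parameter: population mean

def sampling_error(sample_size):
    """Draw one simple random sample without replacement; return x_bar - mu."""
    sample = random.sample(population, sample_size)
    return statistics.mean(sample) - mu

# Repeat the sampling many times for several sample sizes.
mean_abs_error = {}
for n in (5, 20, 80):
    errors = [abs(sampling_error(n)) for _ in range(1000)]
    mean_abs_error[n] = statistics.mean(errors)
    print(f"n={n:3d}  largest |error|={max(errors):.3f}  "
          f"average |error|={mean_abs_error[n]:.3f}")
```

The average absolute error should shrink as n grows, while individual samples of a larger size can still miss by more than some samples of a smaller size.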
2. Sampling distribution of the mean

Business applications use simple random sampling.

True sampling distribution
The distribution of the possible values of a statistic for a given-size random sample selected from a population. A given sample's statistic may fall above or below the population value.

We can use Excel features for sampling; let's review the procedure. Say we want to pick random samples of 10 observations ($n = 10$) out of a population of size 200. We know that the population mean is $\mu = 2.505$. To select repeated samples in Excel:

Tools > Data Analysis > Sampling > Population (X1 to X200) > Random sampling, n = 10 > Output option (on the same page) > OK

You can then calculate the sample mean, the standard deviation and all the other statistics that you have learned. If you repeat this same sampling operation 500 times, you can build a histogram of the means of the samples and compare it with the frequency distribution of the population.

[Figure: histogram of the sampling distribution of 500 sample means, n = 10 (mean of sample means = 2.41), shown against the frequency distribution of the population of 200 observations (μ = 2.505). Vertical axis: frequency.]
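The same repeated-sampling exercise can be sketched outside Excel. The block below is a rough Python equivalent, not the Excel procedure itself; the population values are invented (the original 200 observations are not reproduced here), so the numbers will not match 2.505 or 2.41.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

N, n, repeats = 200, 10, 500

# Hypothetical population of 200 observations (assumption, for illustration).
population = [round(random.uniform(0, 5), 2) for _ in range(N)]
mu = statistics.mean(population)
sigma = statistics.pstdev(population)  # population standard deviation

# Draw 500 simple random samples of size 10 and keep each sample mean.
sample_means = [statistics.mean(random.sample(population, n))
                for _ in range(repeats)]
mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)

print(f"population mean      = {mu:.3f}")
print(f"mean of sample means = {mean_of_means:.3f}")
print(f"population sd        = {sigma:.3f}")
print(f"sd of sample means   = {sd_of_means:.3f}")

# Text histogram of the 500 sample means, in bins of width 0.25.
bin_width = 0.25
counts = {}
for m in sample_means:
    bin_start = int(m / bin_width) * bin_width
    counts[bin_start] = counts.get(bin_start, 0) + 1
for bin_start in sorted(counts):
    print(f"{bin_start:4.2f} {'#' * (counts[bin_start] // 2)}")
```

The histogram of sample means should look roughly bell-shaped and be much narrower than the spread of the population itself.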
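Observations:
1. The sampling distribution takes the shape of a bell curve.
2. $\bar{\bar{x}} = 2.41$ is the mean of the sample means, versus $\mu = 2.505$, the mean of the population. They are almost equal: $\bar{x}$ is an unbiased estimator of the parameter. An estimator is unbiased when the average of all possible values of the sample statistic equals the parameter.
3. $\sigma = 1.507 > S = 0.421$: the sample means are less spread out than the population values.

[Figure: histogram of the sampling distribution of 500 sample means, n = 20 (mean of sample means = 2.53, S = 0.376). Vertical axis: frequency.]

As $n$ increases, the distribution of the sample mean becomes shaped more like a normal distribution.

It is almost impossible to calculate a TRUE sampling distribution, as there are so many ways to choose samples, and each of them may have a different mean, standard deviation and other statistics. We won't know which one is right unless we compare it to the population (if we happen to have it available). Therefore, to make the process simpler, we can use two theorems:

Theorem 6-1
If a population is normally distributed with mean $\mu$ and standard deviation $\sigma$, the sampling distribution of the sample mean is also normally distributed, with
$\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
(used when the population is normally distributed).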
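We can use the standard normal distribution to draw conclusions about the behavior of parameters by looking at the statistics. We use the $z$ value to express the sampling distribution of $\bar{x}$; $z$ tells how distant $\bar{x}$ is from $\mu$:

$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

However, if $n \geq 5\%$ of $N$ (a large sample!) and sampling is without replacement, we use the finite population correction factor:

$z = \frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}}$

Theorem 6-2: The Central Limit Theorem
For any population with mean $\mu$ and standard deviation $\sigma$, the sampling distribution of the sample mean has
$\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
and is approximately normal if $n$ is sufficiently large (used when the population is not normally distributed, e.g. weight, or income in a region). The larger the sample size, the better the approximation to the normal distribution.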
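As a worked illustration of the z value and the finite population correction, here is a minimal Python sketch. The function name `z_for_sample_mean` is hypothetical, and the inputs reuse the chapter's example figures under the assumption that 1.507 is the population standard deviation and 2.41 is an observed sample mean.

```python
import math
from statistics import NormalDist

def z_for_sample_mean(x_bar, mu, sigma, n, N=None):
    """z value of a sample mean; applies the finite population
    correction when N is given and n is at least 5% of N."""
    standard_error = sigma / math.sqrt(n)
    if N is not None and n >= 0.05 * N:
        standard_error *= math.sqrt((N - n) / (N - 1))
    return (x_bar - mu) / standard_error

# Assumed inputs: mu = 2.505, sigma = 1.507, n = 10, N = 200, x_bar = 2.41.
z_plain = z_for_sample_mean(2.41, 2.505, 1.507, 10)       # no correction
z_fpc = z_for_sample_mean(2.41, 2.505, 1.507, 10, N=200)  # n is 5% of N

# Probability of a sample mean at or below 2.41 under the normal model.
prob_below = NormalDist().cdf(z_fpc)

print(f"z without correction = {z_plain:.3f}")
print(f"z with correction    = {z_fpc:.3f}")
print(f"P(x_bar <= 2.41)     = {prob_below:.3f}")
```

Because the correction factor is below 1, it shrinks the standard error and pushes z slightly further from zero.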
What is a sufficiently large $n$? It depends on the shape of the population distribution:
Symmetric: $n = 2$ or $3$ is enough to provide normally distributed sampling distributions.
Highly skewed: $n$ must be 25 or 30.
$n > 30$: a conservative definition of a sufficiently large sample size.

3. Sampling distribution for proportions

When information about the population is given in proportions, the sampling procedure requires slight modifications to apply the Central Limit Theorem. Let's explain:

Population proportion: $p = \frac{X}{N}$
Sample proportion: $\bar{p} = \frac{x}{n}$
Sampling error $= \bar{p} - p$

Notice that population variables are capitalized and sample variables are lowercase.

$\bar{p}$ has the BINOMIAL as its underlying distribution, but when $np$ and $n(1 - p)$ are large, $\bar{p}$ is treated as normally distributed.
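The proportion definitions can be tried on a small made-up case. In this sketch the population (200 items, 60 of which have the attribute of interest) and the sample size of 50 are assumptions chosen only for illustration.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: N = 200 items, X = 60 of them have the attribute.
N, X = 200, 60
population = [1] * X + [0] * (N - X)
p = X / N  # population proportion

# One simple random sample of n = 50 items, drawn without replacement.
n = 50
sample = random.sample(population, n)
x = sum(sample)   # number of sampled items with the attribute
p_bar = x / n     # sample proportion
sampling_error = p_bar - p

print(f"p = {p:.3f}, p_bar = {p_bar:.3f}, sampling error = {sampling_error:+.3f}")

# Quantities that must be large before treating p_bar as normal.
print(f"n*p = {n * p:.1f}, n*(1-p) = {n * (1 - p):.1f}")
```

Rerunning with a different seed gives a different sample proportion, while p stays fixed.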
Sampling distribution of $\bar{p}$

$\mu_{\bar{p}} = p$

$\sigma_{\bar{p}} = \sqrt{\frac{p(1 - p)}{n}}$

$z = \frac{\bar{p} - p}{\sigma_{\bar{p}}}$

If $n > 5\%$ of $N$:

$\sigma_{\bar{p}} = \sqrt{\frac{p(1 - p)}{n}} \sqrt{\frac{N - n}{N - 1}}$

SOCR CLT Experiments
http://wiki.stat.ucla.edu/socr/index.php/SOCR_EduMaterials_Activities_GeneralCentralLimitTheorem

To start this experiment, go to SOCR Experiments (socr.ucla.edu/htmls/SOCR_Experiments.html) and select the SOCR Sampling Distribution CLT Experiment from the drop-down list of experiments in the left panel. The image below shows the interface to this experiment. Notice the main control widgets on this image (boxed in blue and pointed to by arrows). The generic control buttons on the top allow you to do one or multiple steps/runs, and to stop and reset the experiment. The two tabs in the main frame provide graphical access to the results of the experiment (Histograms and Summaries) or to the distribution selection panel (Distributions). Remember that choosing sample sizes <= 16 will animate the samples (second graphing row), whereas larger sample sizes (N > 20) will only show the updates of the sampling distributions (bottom two graphing rows).
Experiment 1

Expand your Experiment panel (right panel) by clicking/dragging the vertical split-pane bar. Choose the two sample sizes for the two statistics to be 10. Press the step button a few times (2-5) to see the experiment run several times. Notice how data is being sampled from the native population (the distribution of the process on the top). For each step, the process of drawing 2 samples of 10 observations will generate 2 sample statistics for the 2 parameters of interest (these default to mean and variance). At each step, you can see the plots of all sample values, as well as the computed sample statistics for each parameter. The sample values are shown on the second-row graph, below the distribution of the process, and the two sample statistics are plotted on the bottom two rows. If we run this experiment many times, the bottom two graphs/histograms become good approximations to the corresponding sampling distributions. If we did this infinitely many times, these two graphs would become the sampling distributions of the chosen sample statistics (as the observations/measurements are independent within each sample and between samples). Finally, press the Refresh Stats Table button on the top to see the sample summary statistics for the native population distribution (row 1), the last sample (row 2) and the two sampling distributions, in this case mean and variance (rows 3 and 4).

Experiment 2

For this experiment we'll look at the mean, standard deviation, skewness and kurtosis of the sample average and the sample variance (these are two sample data-driven statistical estimates). Choose sample sizes of 50 for both estimates (mean and variance). Select the Fit Normal Curve check-boxes for both sample distributions. Step through the experiment a few times (by clicking the Run button) and then click the Refresh Stats Table button on the top to see the sample summary statistics. Try to understand and relate these sample-distribution statistics to their analogues from the native population (on the top row). For example, the mean of the multiple sample averages is about the same as the mean of the native population, but the standard deviation of the sampling distribution of the average is about $\frac{\sigma}{\sqrt{n}}$, where $\sigma$ is the standard deviation of the original native process/distribution.
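The relationship checked in Experiment 2 can also be imitated offline. The sketch below is not the SOCR applet; it assumes an exponential native distribution with rate 1 (so its mean and standard deviation are both 1), a sample size of 50, and 2000 repeated samples.

```python
import math
import random
import statistics

random.seed(7)  # fixed seed so the sketch is repeatable

n, runs = 50, 2000
sigma = 1.0  # sd of the assumed native distribution (exponential, rate 1)

sample_means = []
sample_variances = []
for _ in range(runs):
    sample = [random.expovariate(1.0) for _ in range(n)]
    sample_means.append(statistics.mean(sample))
    sample_variances.append(statistics.variance(sample))

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)
predicted_sd = sigma / math.sqrt(n)  # sigma / sqrt(n)
mean_of_variances = statistics.mean(sample_variances)

print(f"mean of sample averages = {mean_of_means:.3f} (native mean 1.000)")
print(f"sd of sample averages   = {sd_of_means:.3f} (sigma/sqrt(n) = {predicted_sd:.3f})")
print(f"mean of sample variances = {mean_of_variances:.3f} (native variance 1.000)")
```

With more runs, the simulated standard deviation of the sample averages should settle closer to the predicted value.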
Experiment 3

Now let's select any of the SOCR Distributions, sample from it repeatedly and see if the central limit theorem is valid for the process we have selected. Try Normal, Poisson, Beta, Gamma, Cauchy and other continuous or discrete distributions. Are our empirical results in agreement with the CLT? Go to the Distributions tab on the top of the graphing panel. Reset the experiments panel (button on the top). Select a distribution from the drop-down list of distributions. Choose appropriate parameters for your distribution, if any, and click the Sample from this Current Distribution button to send this distribution to the graphing panel in the Histograms and Summaries tab. Go to this panel and again run the experiment several times. Notice how we now sample from a non-normal distribution for the first time. In this case we chose the Beta distribution (α = 6.7, β = 0.5).

Experiment 4

Suppose the distribution we want to sample from is not included in the list of SOCR Distributions under the Distributions tab. We can then draw a shape for a hypothetical distribution by clicking and dragging the mouse in the top graphing canvas (Histograms and Summaries tab panel). This way you can construct contiguous and discontinuous, symmetric and asymmetric, unimodal and multi-modal, leptokurtic and mesokurtic, and other types of distributions. In the figure below, we demonstrate this functionality to study differences between two data-driven estimates of the population center: sample mean and sample median. Look how the sampling distribution of the sample average is very close to normal, whereas the sampling distribution of the sample median is not.
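Experiment 3 can be approximated without the applet. The following sketch samples from the Beta distribution (α = 6.7, β = 0.5) used above and measures how skewed the sample averages are for a small and a larger sample size. The sample sizes 2 and 50, the 2000 runs and the helper names are assumptions made for illustration.

```python
import random
import statistics

random.seed(3)  # fixed seed so the sketch is repeatable

def skewness(values):
    """Moment-based skewness: average cubed standardized deviation."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def simulate_sample_means(n, runs=2000):
    """Means of `runs` samples of size n from Beta(6.7, 0.5)."""
    return [statistics.mean([random.betavariate(6.7, 0.5) for _ in range(n)])
            for _ in range(runs)]

# A normal distribution has skewness 0, so values closer to 0
# indicate a sampling distribution that looks more normal.
skew_by_n = {}
for n in (2, 50):
    skew_by_n[n] = skewness(simulate_sample_means(n))
    print(f"n={n:2d}  skewness of sample averages = {skew_by_n[n]:+.3f}")
```

The native Beta distribution here is strongly left-skewed, so the skewness for n = 2 should be clearly negative and the skewness for n = 50 much closer to zero, in line with the CLT.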
Questions

What effects will asymmetry, gaps and continuity of the native distribution have on the applicability of the CLT, or on the asymptotic distribution of various sample statistics?

When can we reasonably expect statistics other than the sample mean to have CLT properties?

If a native process has $\sigma_X = 10$ and we take a sample of size 10, what will $\sigma_{\bar{x}}$ be? Does it depend on the shape of the original process? How large should the sample size be so that [...]?