Big Picture. 5. Data, Estimates, and Models: quantifying the accuracy of estimates.
- Elaine Taylor
- 5 years ago
5. Data, Estimates, and Models: quantifying the accuracy of estimates

5.1 Estimating a Normal Mean
5.2 The Distribution of the Normal Sample Mean
5.3 Normal data, confidence interval for $\mu$, $\sigma$ known
5.4 Normal data, confidence interval for $\mu$, $\sigma$ unknown (the t distribution)
5.5 Bernoulli data, confidence interval for p
5.6 The Central Limit Theorem and a General Approximate Confidence Interval for $\mu$

Big Picture

We now move to using data to estimate the parameters of models. We begin by considering estimation of the mean of a distribution. The mean $\mu$ and variance $\sigma^2$ are the two parameters that describe a Normal model. We saw that as the sample size gets big, the sample average $\bar{x}$ should get close to the mean $\mu$. What determines how close? Can we quantify the accuracy?
5.1 Estimating the Mean of a Normal distribution

Consider a plant which fills cereal boxes. The manager needs to know how much cereal is going into the boxes, at least on average. How accurate will the sample average be as an estimate for the true mean?

The setup

The distribution of cereal box weights is Normal$(345, 15^2)$, so the true mean (long-run average weight) is 345. The manager doesn't know $\mu = 345$, so she randomly grabs boxes that have been filled and uses the sample average of their weights as an estimate for the unknown true mean (345).
Our Approach

First, I'll show that if we know the true distribution of cereal box weights (say $N(345, 15^2)$), we can describe how likely it is that the estimate constructed from the sample average lies near or far from the true value.

Next, we'll use the results from above to quantify the accuracy of our estimates in the realistic setting where we don't know the true value of the mean.

Here are the time series and histogram of the observed weights for 500 boxes:

[Figure: time series of weights against observation number ("Looks iid") and histogram of weights ("Histogram looks Normal!")]

The weights of cereal boxes are iid normal with $\mu = 345$ and $\sigma = 15$.
With 500 observations, our guess for $\mu$, the sample average $\bar{x}$, is probably pretty good (it comes out very close to 345). But what if you had fewer observations? Suppose you only had the first 10! How would you guess?

[Figure: the first 10 weights, with lines marking the sample average of the first 10 observations and the sample average of all 500]

The solid black line is the sample average of the first 10 observations. It is further from the true value 345.

We saw that the sample average of a large number of iid draws should converge to the mean of the distribution we are drawing from. In our cereal box example, the weights are iid draws from a $N(345, 15^2)$. This means that the sample average should be close to 345. In general:

$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \approx E(X_i) = \mu$ (for large $n$)
Given a sample of size $n$ of observations that look iid normal, the sample mean

$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$

is our estimate of $\mu = E(X_i)$.

$\mu$ is sometimes called the population mean since it is the mean of the entire population of all potential values, while the sample mean is just the average of some of them.

5.2 The Distribution of the Normal Sample Mean

How bad can the estimate be if you only have 10 observations? To investigate this we perform a conceptual experiment.

Let's take our 500 observations and break them up into 50 groups of 10 consecutive observations. Each group represents a sample of size 10 that you might have gotten. For each group we calculate the mean. This will show us what kinds of values we could get for the average of just 10 observations.
I want to see how noisy the sample average is when we have a sample of size 10, so I will look at a bunch of sample averages constructed using different datasets of 10 observations. We will look at how close or far the sample averages lie from the true mean. In reality we would have just a single sample of size 10; we could have gotten any of the 50 samples we look at.

[Figure: the 500 weights in 50 groups of 10. The little solid segments are plotted at the mean of the corresponding 10 numbers.]
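The conceptual experiment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 500 weights are simulated from $N(345, 15^2)$ rather than taken from the course data, and the helper name `group_means` and the seed are arbitrary choices.

```python
import random
import statistics


def group_means(data, group_size):
    """Average each block of `group_size` consecutive observations."""
    return [
        statistics.mean(data[i:i + group_size])
        for i in range(0, len(data) - group_size + 1, group_size)
    ]


# Simulated stand-in for the 500 observed box weights (assumption: N(345, 15^2)).
rng = random.Random(0)  # fixed seed so the sketch is reproducible
weights = [rng.gauss(345, 15) for _ in range(500)]

# 50 groups of 10 consecutive observations, one sample average per group.
means_of_10 = group_means(weights, 10)

print(len(means_of_10))
print("sd of individual weights:", round(statistics.stdev(weights), 1))
print("sd of the 50 averages:   ", round(statistics.stdev(means_of_10), 1))
```

The averages of 10 should be visibly less spread out than the individual weights, which is the pattern the histogram of sample averages is meant to show.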
Here is the histogram of the 50 sample averages. These are the 50 sample averages, not the individual cereal boxes.

[Figure: Histogram of 50 sample averages (frequency against sample average)]

They look Normal too!! So the distribution of the types of values we get for our sample averages looks Normal too!

Suppose the manager is about to grab a new sample of size 10, using observations $X_1, \ldots, X_{10}$, and use that sample average

$\bar{X} = \frac{1}{10}\sum_{i=1}^{10} X_i$

as their estimate for the mean. What values might they get for the sample average? Recall that empirically we found the histogram above for our conceptual experiment.
With the new sample, the manager could get any value like the ones we saw in our conceptual experiment (or other values). When we take a new sample it is like a random outcome. Why is it random? Because the data are random outcomes. Each $X_i$ is a random draw from a $N(345, 15^2)$.

Key idea: Before we get the sample, each $X_i$ is random. So we think of the sample mean as a random variable!! It is a linear combination of iid Normals!

Q: What is the value that we will get for the first observation, $X_1$?
A: It's unknown. It will be the outcome of a random draw from a $N(345, 15^2)$.
So, the big idea is that before we collect our observations, we can think of the sample average as a random variable. When we finally take our sample it gives us one realization of the sample average. It is random because it is a linear combination of iid random variables. Note that the notation will remain the same, but we now think of the sample average before we take the sample as random.

$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$

$E(\bar{X}) = E\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{1}{n}\, n\mu = \mu$

Since the expected value of $\bar{X}$ is equal to the thing we are trying to estimate, $\mu$, we say our $\bar{X}$ is an unbiased estimate of the population mean.
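What is the variance of the sample average?

$\mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\sum_{i=1}^{n} \mathrm{Var}(X_i) = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n}$

So the sample average is unbiased and the variance of the sample average can be quantified. Ideally we would like the variance to be small, so that the sample average is close to the mean.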
The variance of the sample average depends on two things: the variance of the population from which we are sampling, $\sigma^2$, and the sample size $n$.

The variability of our sample average decreases with larger sample sizes (larger values of $n$).

The variability of our sample average is larger when the population variance is larger. Larger population variance means that our individual draws of the $X$'s are more spread out.

Why don't any covariances appear in the variance of $\bar{X}$? The $X_i$ must be independent. Does this make sense?
Fact: since the average is a linear combination of independent Normals, it is also Normally distributed.

Let $X_i \sim N(\mu, \sigma^2)$ iid. Then

$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$

This is the same $\sigma^2$. In the top line it represents the variance of the distribution of cereal box weights. In the second line, the ratio $\sigma^2/n$ provides the variance of sample averages constructed by averaging $n$ cereal box weights.

[Figure: Relationship between the distribution of cereal box weights and the sample average of ten; densities labeled "Cereal Box" and "Average of 10"]
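Example

With $n = 50$: $\bar{X} \sim N\left(345, \frac{15^2}{50}\right)$

For different sample sizes we get different distributions for the sample averages, for example with $n = 10$: $\bar{X} \sim N\left(345, \frac{15^2}{10}\right)$

[Figure: densities of $\bar{x}$ for different sample sizes]

For different sample sizes, the normal curves tell us how close we can expect our estimate to be to the true value! In general $\mathrm{Var}(\bar{X}) = \sigma^2/n$. If we assume $\mu = 345$ and $\sigma = 15$, then with $n = 10$ we have $E(\bar{X}) = 345$ and $\mathrm{Var}(\bar{X}) = 15^2/10$.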
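To make "how close we can expect our estimate to be" concrete, here is a small sketch that evaluates the chance that the sample average lands within a chosen distance of the true mean. It assumes $\mu = 345$ and $\sigma = 15$ as in the cereal example; the function name `prob_within` and the distance of 5 are illustrative choices, not part of the original slides.

```python
from math import sqrt
from statistics import NormalDist


def prob_within(c, mu=345.0, sigma=15.0, n=10):
    """P(|Xbar - mu| < c) when Xbar ~ N(mu, sigma^2 / n)."""
    xbar = NormalDist(mu, sigma / sqrt(n))  # sd of the sample mean is sigma/sqrt(n)
    return xbar.cdf(mu + c) - xbar.cdf(mu - c)


# Chance that the sample average is within 5 of the true mean,
# for a few sample sizes.
for n in (10, 50, 500):
    print(n, round(prob_within(5, n=n), 3))
```

The probability rises toward 1 as $n$ grows, which is the numerical counterpart of the narrower density curves.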
5.3 Confidence Intervals: How do we use the results from the previous section when we don't know $\mu$?

We just figured out that if we sample from a $N(\mu, \sigma^2)$, we can figure out what kind of sample averages we will get from a sample of size $n$. What we really want to know is, given a sample average, where do we think $\mu$ is?

At first, we will still assume that we know $\sigma$ but we don't know $\mu$. In the next section we will relax this unrealistic assumption. We are assuming that the data are iid normal.
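First let's add a bit of notation. Let

$\sigma_{\bar{X}} = \sqrt{\frac{\sigma^2}{n}} = \frac{\sigma}{\sqrt{n}}$

This will simplify the look of the formulas and emphasize that the sample mean has its own standard deviation.

Now we standardize:

$\bar{X} \sim N(\mu, \sigma_{\bar{X}}^2)$, so $\frac{\bar{X} - \mu}{\sigma_{\bar{X}}} \sim N(0, 1)$

$\Pr\left(-2 < \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} < 2\right) = .95$ (really the 2 is 1.96!!)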
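so,

$\Pr(\mu - 2\sigma_{\bar{X}} < \bar{X} < \mu + 2\sigma_{\bar{X}}) = .95$

This says that there is a 95% chance that the sample mean $\bar{x}$ lies within two standard deviations of the true mean. Remember that $\sigma_{\bar{X}}$ gets smaller as the sample size gets larger. So we should expect the sample mean to be closer to the true mean in larger samples.

[Figure: density of $\bar{X} \sim N(345, 15^2/10)$ with the central region marked "95% chance that $\bar{x}$ falls in here"]

Next let's rearrange some more to get something useful! If there is a 95% chance that $\bar{X}$ lies within two standard deviations of $\mu$, then there is a 95% chance that $\mu$ lies within two standard deviations of $\bar{x}$. Mathematically, we rearrange the last inequality to get:

$\Pr(\mu - 2\sigma_{\bar{X}} < \bar{X} < \mu + 2\sigma_{\bar{X}}) = .95 \;\Rightarrow\; \Pr(\bar{X} - 2\sigma_{\bar{X}} < \mu < \bar{X} + 2\sigma_{\bar{X}}) = .95$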
For iid normal data with known standard deviation $\sigma$, a 95% confidence interval for the true mean $\mu$ is

$\bar{x} \pm 2\sigma_{\bar{X}}$

95% of the time the true value will be contained in the interval.

95% CI for $\mu$: all values we cannot rule out based on the data.

A picture of the process: our sample gives us a value for $\bar{x}$. We want to ask what values for $\mu$ are reasonable.

[Figure: sampling distributions of $\bar{x}$ centered at two candidate values $\mu_1$ and $\mu_2$, with the observed $\bar{x}$ marked]

Consider a possible value $\mu_1$. The red curve is the sampling distribution of $\bar{x}$. Is this reasonable? NO. If that were the right value of $\mu$, it's extremely unlikely we'd see an $\bar{x}$ like the one we got in the data.

Consider $\mu_2$. If this were the right value of $\mu$, it's perfectly possible we'd see an $\bar{x}$ like the one we saw in the data.
Example: Remember our weight data? Given 500 observations, what do we know about $\mu$? Assume $\sigma = 15$.

$\sigma_{\bar{X}} = 15/\sqrt{500} = .67$
In Excel, use the formula: =15/sqrt(500)

The 95% CI is $\bar{x} \pm 2(.67)$, reported as (343.5, 346.7).
In Excel, use the formulas: =(sample average)-2*.67 and =(sample average)+2*.67

Example: Given just the first 10 observations, what do we know about $\mu$? Assume $\sigma = 15$.

$\sigma_{\bar{X}} = 15/\sqrt{10} = 4.74$
In Excel, use the formula: =15/sqrt(10)

The sample average was 348.5. The 95% CI is (339.02, 357.98).
In Excel, use the formulas: =348.5-2*4.74 and =348.5+2*4.74
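The same known-$\sigma$ calculation can be sketched in Python instead of Excel. The function name `z_interval` is made up for illustration, and the sample average 348.5 for the first 10 boxes is an assumption (it is the value consistent with the lower limit 339.02 in the example), not a number read from the data.

```python
from math import sqrt


def z_interval(xbar, sigma, n, z=2.0):
    """Interval xbar +/- z * sigma / sqrt(n) for a mean with known sigma."""
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


# Standard deviation of the sample mean for the two sample sizes in the text.
print(round(15 / sqrt(500), 2), round(15 / sqrt(10), 2))

# Assumed sample average of the first 10 boxes (see the note above).
lo, hi = z_interval(348.5, sigma=15, n=10)
print(round(lo, 2), round(hi, 2))
```

Passing `z=1.96` instead of the default 2 gives the slightly narrower interval based on the exact Normal quantile.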
Confidence intervals answer the basic questions: what do you think the parameter is, and how sure are you? In particular, a 95% CI means that if we took 100 samples and created 100 different confidence intervals, we would expect 95 of them to contain the true (but unknown) value.

small interval: good, you know a lot
big interval: bad, you don't know much

Clearly there is nothing special (outside of convention) in using a 95% CI. We can construct any confidence interval we like. For example, a 68% CI is given by

$\bar{x} \pm \sigma_{\bar{X}}$

More generally we can compute a $100(1-\alpha)\%$ confidence interval by

$\bar{x} \pm z_{\alpha/2}\, \sigma_{\bar{X}}$

where $z_{\alpha/2}$ is the value with $\Pr(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$ for $Z \sim N(0, 1)$.
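As a rough sketch of where such multipliers come from, the standard Normal quantile can be computed with Python's `statistics.NormalDist`. The helper name `z_crit` and the particular $\alpha$ values are illustrative choices.

```python
from statistics import NormalDist


def z_crit(alpha):
    """Value z with P(-z < Z < z) = 1 - alpha for Z ~ N(0, 1)."""
    return NormalDist().inv_cdf(1 - alpha / 2)


# alpha, alpha/2 and the matching multiplier for a few confidence levels.
for alpha in (0.32, 0.10, 0.05, 0.01):
    print(alpha, alpha / 2, round(z_crit(alpha), 3))
```

The multiplier for $\alpha = .05$ comes out at about 1.96, which is why "2" is only a convenient rounding, and $\alpha = .32$ gives a multiplier close to 1, matching the 68% interval.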
Here are some tabulated values:

[Table: $\alpha$, $\alpha/2$, $z_{\alpha/2}$]

The $(1-\alpha)100\%$ C.I. for $\mu$ is then given by

$\bar{x} \pm z_{\alpha/2}\, \sigma_{\bar{X}}$

5.4 Normal data, confidence interval for $\mu$, $\sigma$ unknown

Now we will extend our CI to the more realistic situation where $\sigma$ is unknown. Typically you don't know $\sigma$, so we have to estimate it as well.

How do we estimate $\sigma$? Just as we now think of the sample mean as an estimate of $\mu$, we can think of the sample sd as an estimate of $\sigma$.
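Estimating $\sigma$

$s_x^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$

is our estimate for $\sigma^2$. We divide by $n-1$ so that the estimator is unbiased.

Fact: $E(s_x^2) = \sigma^2$

The estimate of $\sigma$ is

$s_x = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}$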
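A minimal sketch of this estimator, written out by hand and cross-checked against the standard library. The five weights are made-up numbers for illustration, not course data, and `sample_sd` is a hypothetical helper name.

```python
import statistics
from math import sqrt


def sample_sd(xs):
    """Sample standard deviation with the n - 1 divisor."""
    n = len(xs)
    xbar = sum(xs) / n
    return sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))


# Hypothetical weights, for illustration only.
data = [340.0, 352.5, 331.0, 360.2, 345.3]
print(sample_sd(data))
print(statistics.stdev(data))  # the library version also divides by n - 1
```

`statistics.pstdev` is the variant that divides by $n$ instead; it is the population formula and is not the estimator used here.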
Now our big idea is that in the formula, instead of using $\sigma_{\bar{X}} = \sigma/\sqrt{n}$, we use an estimate of it:

$se(\bar{x}) = \frac{s_x}{\sqrt{n}}$

This is called the standard error. Clearly, it is an estimate of the true standard deviation $\sigma_{\bar{X}}$.

We might think that

$\frac{\bar{X} - \mu}{se(\bar{X})} \approx N(0, 1)$ (squiggly lines mean "approximately distributed as")

giving the CI:

$\bar{x} \pm 2\, se(\bar{x})$ (just replace $\sigma_{\bar{X}}$ with its estimate)

This is approximately right for large $n$ ($n > 30$). But it turns out that for iid normal data we can get an exact result. First we need to learn about the t distribution.
The t distribution

The t is just another continuous distribution. It has one parameter called the degrees of freedom, which is usually denoted by the symbol $\nu$. Each value of $\nu$ gives you a different distribution.

[Figure: Comparison of Normal and t distributions for different values of $\nu$]
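When $\nu$ is bigger than about 30, the t is very much like the standard normal.

[Figure: two nearly identical densities (one is t with 30 df, the other is standard normal) and the t distribution with $\nu = 3$ df]

For smaller $\nu$, it puts more probability in the tails. For our Normal mean problem we use $\nu = n - 1$.

Now let $t_{n-1,.025}$ be such that, for $T$ a t random variable with $n-1$ df,

$P(-t_{n-1,.025} < T < t_{n-1,.025}) = .95$

[Figure: t density $f(x)$ with $-t_{n-1,.025}$ and $t_{n-1,.025}$ marked on the x axis]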
For $n - 1$ greater than about 30, the $t_{n-1}$ is so much like the standard normal that $t_{n-1,.025} \approx 2$. For smaller $n$, the t value gets bigger than 2.

Here is a table of t values and $\nu$. We can see that for $\nu > 30$ (or even about 20) the t value is about 2.

[Table: degrees of freedom $\nu$ against $t_{\nu,.025}$ (probability .025 in each tail)]

In Excel, use the formula: =TINV(0.05, 10)
Excel gives us: 2.228

There is .025 probability less than -2.228 and .025 probability greater than 2.228 for the t distribution with 10 degrees of freedom.
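Our basic result is

$\frac{\bar{X} - \mu}{se(\bar{X})} \sim t_{n-1}$

For small $n$, the t distribution accounts for our estimation of $\sigma$ with $s_x$. Thus,

$\Pr\left(-t_{n-1,.025} < \frac{\bar{X} - \mu}{se(\bar{X})} < t_{n-1,.025}\right) = .95$

Just as before, we can rearrange this to obtain the interval:

$\bar{x} \pm t_{n-1,.025}\, se(\bar{x})$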
An exact 95% confidence interval for $\mu$ with $\sigma$ unknown is

$\bar{x} \pm t_{n-1,.025}\, se(\bar{x})$

Using the t value instead of the z value will make the interval bigger for smaller $n$. This reflects the fact that we are not sure that our estimate for $\sigma$ is quite right.

Example: Back to our weight data. With $n = 500$ the sample sd is $s_x = 15.455$ and the sample mean is $\bar{x}$. The t distribution with $\nu = 499$ is just like the standard normal, so the t value is about 2.

$se(\bar{x}) = 15.455/\sqrt{500} = .69$

CI: $\bar{x}$ +/- 1.4

In Excel, use the formulas: =(sample mean)-2*.69 and =(sample mean)+2*.69
[Output: T Confidence Intervals for weights, with columns Variable, N, Mean, StDev, SE Mean, 95.0% CI, where $se(\bar{x}) = s_x/\sqrt{n}$]

[Figure: Histogram of weights (with 95% t-confidence interval for the mean)]

For the first 10 observations, the sample sd = 14.6 and the sample mean was 348.5. The $t_{9,.025}$ value is 2.262.

$se(\bar{x}) = 14.6/\sqrt{10} = 4.62$

CI: 348.5 +/- 2.262 * 4.62 = 348.5 +/- 10.4 = (338.1, 358.9)
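The t interval for the first 10 boxes can be sketched as below. Python's standard library has no t quantile function, so the critical value is passed in from a table: 2.262 is the standard two-sided 95% value for 9 degrees of freedom. The summary numbers 348.5 (sample mean) and 14.6 (sample sd) are treated as assumptions, reconstructed to be consistent with the reported interval, and `t_interval` is an illustrative name.

```python
from math import sqrt


def t_interval(xbar, s, n, t_crit):
    """Interval xbar +/- t_crit * s / sqrt(n); t_crit comes from a t table."""
    se = s / sqrt(n)  # standard error of the sample mean
    return xbar - t_crit * se, xbar + t_crit * se


# Assumed summary statistics for the first 10 boxes; t value for 9 df.
lo, hi = t_interval(xbar=348.5, s=14.6, n=10, t_crit=2.262)
print(round(lo, 1), round(hi, 1))
```

Swapping `t_crit` for 2 shows how much narrower the interval would look if the extra uncertainty from estimating $\sigma$ were ignored.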
[Output: T Confidence Intervals for weights10, with columns Variable, N, Mean, StDev, SE Mean, 95.0% CI]

[Figure: Histogram of weights10 (with 95% t-confidence interval for the mean)]

Example: Let's get a 95% CI for the true mean of Canadian returns.

In Excel, use the pull-down menu: StatPro > Statistical Inference > One-sample analysis

[Output: Results for one-sample analysis for canada. Summary measures: sample size 107, sample mean, sample standard deviation. Confidence interval for mean: confidence level 95.0%, sample mean, std error of mean, degrees of freedom 106, lower limit, upper limit.]

[Figure: Histogram of canada (with 95% t-confidence interval for the mean)]

Here $se(\bar{x}) = s_x/\sqrt{n}$ and the interval is $\bar{x} \pm t_{n-1,.025}\, se(\bar{x})$.

Is the confidence interval big?
Example: 95% CI for the true mean of the NYSE stock index over the same period.

[Output: T Confidence Intervals for nyse, with columns Variable, N, Mean, StDev, SE Mean, 95.0% CI]

[Figure: Histogram of nyse (with 95% t-confidence interval for the mean)]

Of course, just as for the case of the Normal, we can find any confidence interval that we would like. The $(1-\alpha)100\%$ C.I. for $\mu$ is then given by

$\bar{x} \pm t_{\alpha/2,\,n-1}\, se(\bar{x})$

where $t_{\alpha/2,\,n-1}$ is defined similarly to $z_{\alpha/2}$ for the $N(0, 1)$.
5.5 Bernoulli data, confidence interval for p

Now we consider confidence intervals for p given iid Bernoulli observations. Suppose we had this data, where 1 means a default and 0 means no default.

[Figure: the 0/1 default indicators plotted against index]

What do you think the true default rate is, and how sure are you?

Our data consist of $n$ Bernoulli outcomes where a mortgage either defaults (1) or does not (0). Our best estimate of p will be the sample fraction of defaults. That is:

$\hat{p} = \frac{1}{n}\sum_{i=1}^{n} x_i$

For our data it is 12/50.
We play the same game as before: before we take our sample we ask, what can happen? This time the outcomes are realizations of iid Bernoulli(p). The sum of iid Bernoullis has a Binomial distribution, so the numerator is the outcome of a Binomial($n$, p), where $n$ is the sample size and p is the parameter we want to know.

For iid Bernoulli data, the estimate of p is

$\hat{p} = \frac{\text{observed number of successes in the } n \text{ trials}}{\text{number of trials}} = \frac{Y}{n}, \quad Y \sim B(n, p)$
Before we get a sample of size $n$, what kind of estimate can we expect to get?

$E(\hat{p}) = E\left(\frac{Y}{n}\right) = \frac{1}{n}E(Y) = \frac{np}{n} = p$ (unbiased)

$\mathrm{Var}(\hat{p}) = \frac{1}{n^2}\mathrm{Var}(Y) = \frac{np(1-p)}{n^2} = \frac{p(1-p)}{n}$

Two things:
1) The variance of $\hat{p}$ is again decreasing in the sample size.
2) The variance of $\hat{p}$ depends on the value of p.

Unlike the normal case, only approximate results are available. Since our estimate is a linear combination of independent Bernoullis, the central limit theorem tells us that it should be approximately normal:

$\hat{p} \approx N\left(p, \frac{p(1-p)}{n}\right)$
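We make a final approximation:

$se(\hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$

so,

$\frac{\hat{p} - p}{se(\hat{p})} \approx N(0, 1)$

The 95% interval for the true proportion p is

$\hat{p} \pm 2\, se(\hat{p})$

In our example, with $\hat{p} = 12/50 = .24$, the interval would be found in Excel with the formulas:
=.24-2*sqrt(.24*(1-.24)/50)
=.24+2*sqrt(.24*(1-.24)/50)

Excel gives us the values (.119, .361).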
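The Excel formulas above translate directly into a short Python sketch. The function name `proportion_interval` is an illustrative choice; the inputs 12 and 50 come from the default-rate example.

```python
from math import sqrt


def proportion_interval(successes, n, z=2.0):
    """Approximate interval p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    se = sqrt(p_hat * (1 - p_hat) / n)  # estimated standard error of p_hat
    return p_hat - z * se, p_hat + z * se


# 12 defaults out of 50 mortgages, as in the example.
lo, hi = proportion_interval(12, 50)
print(round(lo, 3), round(hi, 3))
```

With only 50 observations the interval is wide, so the data leave a lot of uncertainty about the true default rate.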
Example: Remember the discrimination case? We used .07 for p. Not counting the firm being sued, we had 1128 partners, 77 of which were female, so $\hat{p} \approx .07$.

The confidence interval is

$.07 \pm 2\sqrt{\frac{.07(1 - .07)}{1128}} = (.0548, .0852)$

This interval tells us where we think p is.
Suppose we had only 100 partners, 7% of whom are female. The interval would be

$.07 \pm 2\sqrt{\frac{.07(1 - .07)}{100}} = (.02, .12)$

This interval is much bigger, telling us that with only 100 observations, our estimate could be a lot farther from the truth. 100 observations carry less information than 1128.

National Poll of likely voters (CNN)
Trump tops Clinton 58% to 56% in unfavorable poll by CNN/ORC

The CNN/ORC Poll was conducted by telephone in October among a random national sample of 1,017 adults, including 779 who were determined to be likely voters. The margin of sampling error for results among the sample of likely voters is plus or minus 3.5 percentage points.

What is the margin of error? It is actually a 95% confidence interval:

$se(\hat{p}) = \sqrt{\frac{.58(1 - .58)}{779}} = .0177$

$2 \cdot se(\hat{p}) = 2(.0177) = .035 = 3.5\%$
Why is the Margin of Error often 3%?

Generally the sample size is a little over 1000. The numerator of the standard error depends on $\hat{p}$, so why do the reported errors not depend on the value of the estimate of p? The media uses the largest interval, which is obtained when $\hat{p} = .5$:

$2 \cdot se(\hat{p}) = 2\sqrt{\frac{.5(1 - .5)}{1000}} = 2(.0158) = .0316 \approx 3\%$

So if the estimate for p is different from .5, the confidence interval (margin of error) will be smaller than 3%. This means that we are at least 95% sure that the true value of p lies within plus or minus three percentage points of the estimate.
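A small sketch of the margin-of-error arithmetic, using the poll figures quoted above (58% among 779 likely voters) and the worst case $\hat{p} = .5$ with 1000 respondents. The function name `margin_of_error` is an illustrative choice.

```python
from math import sqrt


def margin_of_error(p_hat, n, z=2.0):
    """Half-width of the approximate 95% interval for a proportion."""
    return z * sqrt(p_hat * (1 - p_hat) / n)


# The poll's likely-voter sample.
print(round(margin_of_error(0.58, 779), 4))

# Worst case p_hat = 0.5 with a sample of 1000, compared with other values.
for p_hat in (0.5, 0.3, 0.1):
    print(p_hat, round(margin_of_error(p_hat, 1000), 4))
```

Because $\hat{p}(1-\hat{p})$ is largest at $\hat{p} = .5$, quoting the margin computed there is conservative for every other estimate from the same sample.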
5.6 The Central Limit Theorem and a General Approximate Confidence Interval for $\mu$

Suppose we are willing to assume that our data are iid but not willing to assume that they are normally distributed, and they are not Bernoulli. We might still want to estimate $\mu = E(X_i)$.

It turns out that the approach we used for normal data is approximately correct (with large sample sizes):

$\bar{X} \approx N\left(\mu, \frac{\sigma^2}{n}\right) \approx N\left(\mu, se(\bar{X})^2\right)$

(the first squiggle is the CLT; for the second squiggle we just hope our estimate of $\sigma$ is good)

so,

$\frac{\bar{X} - \mu}{se(\bar{X})} \approx N(0, 1)$
This is an extremely powerful result. It says that even if we don't know that the distribution of the population we are sampling from is Normal, the sample average will still be approximately Normally distributed in large samples!

Given iid observations $X_i$, an approximate 95% confidence interval for $\mu = E(X_i)$ is given by

$\bar{x} \pm 2\, se(\bar{x})$
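One way to see this claim in action is a small simulation. The sketch below is an illustration under assumptions that are not in the slides: the data are drawn from an exponential distribution (clearly not Normal) with true mean 1, each sample has 100 observations, and the function name `coverage` and the seed are arbitrary.

```python
import random
from math import sqrt
from statistics import mean, stdev


def coverage(n=100, reps=1000, true_mean=1.0, seed=1):
    """Fraction of simulated samples whose interval xbar +/- 2*se covers true_mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # Skewed, non-Normal data: exponential draws with the given mean.
        xs = [rng.expovariate(1 / true_mean) for _ in range(n)]
        xbar = mean(xs)
        se = stdev(xs) / sqrt(n)  # standard error from the sample sd
        if xbar - 2 * se < true_mean < xbar + 2 * se:
            hits += 1
    return hits / reps


print(coverage())
```

If the approximation is working, the printed fraction should be in the neighborhood of .95; rerunning with a much smaller `n` shows the coverage drifting lower for such skewed data.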
MATH/STAT 352: Lecture 15
MATH/STAT 352: Lecture 15 Sectios 5.2 ad 5.3. Large sample CI for a proportio ad small sample CI for a mea. 1 5.2: Cofidece Iterval for a Proportio Estimatig proportio of successes i a biomial experimet
More informationStatistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample.
Statistical Iferece (Chapter 10) Statistical iferece = lear about a populatio based o the iformatio provided by a sample. Populatio: The set of all values of a radom variable X of iterest. Characterized
More informationStatistics 511 Additional Materials
Cofidece Itervals o mu Statistics 511 Additioal Materials This topic officially moves us from probability to statistics. We begi to discuss makig ifereces about the populatio. Oe way to differetiate probability
More informationTopic 9: Sampling Distributions of Estimators
Topic 9: Samplig Distributios of Estimators Course 003, 2016 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be
More informationEcon 325 Notes on Point Estimator and Confidence Interval 1 By Hiro Kasahara
Poit Estimator Eco 325 Notes o Poit Estimator ad Cofidece Iterval 1 By Hiro Kasahara Parameter, Estimator, ad Estimate The ormal probability desity fuctio is fully characterized by two costats: populatio
More informationConfidence Intervals for the Population Proportion p
Cofidece Itervals for the Populatio Proportio p The cocept of cofidece itervals for the populatio proportio p is the same as the oe for, the samplig distributio of the mea, x. The structure is idetical:
More information7-1. Chapter 4. Part I. Sampling Distributions and Confidence Intervals
7-1 Chapter 4 Part I. Samplig Distributios ad Cofidece Itervals 1 7- Sectio 1. Samplig Distributio 7-3 Usig Statistics Statistical Iferece: Predict ad forecast values of populatio parameters... Test hypotheses
More informationChapter 8: STATISTICAL INTERVALS FOR A SINGLE SAMPLE. Part 3: Summary of CI for µ Confidence Interval for a Population Proportion p
Chapter 8: STATISTICAL INTERVALS FOR A SINGLE SAMPLE Part 3: Summary of CI for µ Cofidece Iterval for a Populatio Proportio p Sectio 8-4 Summary for creatig a 100(1-α)% CI for µ: Whe σ 2 is kow ad paret
More informationEstimation for Complete Data
Estimatio for Complete Data complete data: there is o loss of iformatio durig study. complete idividual complete data= grouped data A complete idividual data is the oe i which the complete iformatio of
More informationFrequentist Inference
Frequetist Iferece The topics of the ext three sectios are useful applicatios of the Cetral Limit Theorem. Without kowig aythig about the uderlyig distributio of a sequece of radom variables {X i }, for
More informationSTA Learning Objectives. Population Proportions. Module 10 Comparing Two Proportions. Upon completing this module, you should be able to:
STA 2023 Module 10 Comparig Two Proportios Learig Objectives Upo completig this module, you should be able to: 1. Perform large-sample ifereces (hypothesis test ad cofidece itervals) to compare two populatio
More informationMATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4
MATH 30: Probability ad Statistics 9. Estimatio ad Testig of Parameters Estimatio ad Testig of Parameters We have bee dealig situatios i which we have full kowledge of the distributio of a radom variable.
More informationTopic 9: Sampling Distributions of Estimators
Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be
More informationUnderstanding Samples
1 Will Moroe CS 109 Samplig ad Bootstrappig Lecture Notes #17 August 2, 2017 Based o a hadout by Chris Piech I this chapter we are goig to talk about statistics calculated o samples from a populatio. We
More informationEstimation of a population proportion March 23,
1 Social Studies 201 Notes for March 23, 2005 Estimatio of a populatio proportio Sectio 8.5, p. 521. For the most part, we have dealt with meas ad stadard deviatios this semester. This sectio of the otes
More informationDiscrete Mathematics for CS Spring 2008 David Wagner Note 22
CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 22 I.I.D. Radom Variables Estimatig the bias of a coi Questio: We wat to estimate the proportio p of Democrats i the US populatio, by takig
More informationChapter 6 Sampling Distributions
Chapter 6 Samplig Distributios 1 I most experimets, we have more tha oe measuremet for ay give variable, each measuremet beig associated with oe radomly selected a member of a populatio. Hece we eed to
More informationChapter 18 Summary Sampling Distribution Models
Uit 5 Itroductio to Iferece Chapter 18 Summary Samplig Distributio Models What have we leared? Sample proportios ad meas will vary from sample to sample that s samplig error (samplig variability). Samplig
More informationTopic 9: Sampling Distributions of Estimators
Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be
More informationKLMED8004 Medical statistics. Part I, autumn Estimation. We have previously learned: Population and sample. New questions
We have previously leared: KLMED8004 Medical statistics Part I, autum 00 How kow probability distributios (e.g. biomial distributio, ormal distributio) with kow populatio parameters (mea, variace) ca give
More informationComparing Two Populations. Topic 15 - Two Sample Inference I. Comparing Two Means. Comparing Two Pop Means. Background Reading
Topic 15 - Two Sample Iferece I STAT 511 Professor Bruce Craig Comparig Two Populatios Research ofte ivolves the compariso of two or more samples from differet populatios Graphical summaries provide visual
More informationA quick activity - Central Limit Theorem and Proportions. Lecture 21: Testing Proportions. Results from the GSS. Statistics and the General Population
A quick activity - Cetral Limit Theorem ad Proportios Lecture 21: Testig Proportios Statistics 10 Coli Rudel Flip a coi 30 times this is goig to get loud! Record the umber of heads you obtaied ad calculate
More informationMBACATÓLICA. Quantitative Methods. Faculdade de Ciências Económicas e Empresariais UNIVERSIDADE CATÓLICA PORTUGUESA 9. SAMPLING DISTRIBUTIONS
MBACATÓLICA Quatitative Methods Miguel Gouveia Mauel Leite Moteiro Faculdade de Ciêcias Ecoómicas e Empresariais UNIVERSIDADE CATÓLICA PORTUGUESA 9. SAMPLING DISTRIBUTIONS MBACatólica 006/07 Métodos Quatitativos
More informationIntroduction to Econometrics (3 rd Updated Edition) Solutions to Odd- Numbered End- of- Chapter Exercises: Chapter 3
Itroductio to Ecoometrics (3 rd Updated Editio) by James H. Stock ad Mark W. Watso Solutios to Odd- Numbered Ed- of- Chapter Exercises: Chapter 3 (This versio August 17, 014) 015 Pearso Educatio, Ic. Stock/Watso
More informationChapter 22. Comparing Two Proportions. Copyright 2010 Pearson Education, Inc.
Chapter 22 Comparig Two Proportios Copyright 2010 Pearso Educatio, Ic. Comparig Two Proportios Comparisos betwee two percetages are much more commo tha questios about isolated percetages. Ad they are more
More informationConfidence intervals summary Conservative and approximate confidence intervals for a binomial p Examples. MATH1005 Statistics. Lecture 24. M.
MATH1005 Statistics Lecture 24 M. Stewart School of Mathematics ad Statistics Uiversity of Sydey Outlie Cofidece itervals summary Coservative ad approximate cofidece itervals for a biomial p The aïve iterval
More information1 Inferential Methods for Correlation and Regression Analysis
1 Iferetial Methods for Correlatio ad Regressio Aalysis I the chapter o Correlatio ad Regressio Aalysis tools for describig bivariate cotiuous data were itroduced. The sample Pearso Correlatio Coefficiet
More informationExpectation and Variance of a random variable
Chapter 11 Expectatio ad Variace of a radom variable The aim of this lecture is to defie ad itroduce mathematical Expectatio ad variace of a fuctio of discrete & cotiuous radom variables ad the distributio
More informationEconomics Spring 2015
1 Ecoomics 400 -- Sprig 015 /17/015 pp. 30-38; Ch. 7.1.4-7. New Stata Assigmet ad ew MyStatlab assigmet, both due Feb 4th Midterm Exam Thursday Feb 6th, Chapters 1-7 of Groeber text ad all relevat lectures
More informationLecture 2: Monte Carlo Simulation
STAT/Q SCI 43: Itroductio to Resamplig ethods Sprig 27 Istructor: Ye-Chi Che Lecture 2: ote Carlo Simulatio 2 ote Carlo Itegratio Assume we wat to evaluate the followig itegratio: e x3 dx What ca we do?
More informationResampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.
Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator
More informationEcon 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chi-square Distribution, Student s t distribution 1.
Eco 325/327 Notes o Sample Mea, Sample Proportio, Cetral Limit Theorem, Chi-square Distributio, Studet s t distributio 1 Sample Mea By Hiro Kasahara We cosider a radom sample from a populatio. Defiitio
More informationChapter 22. Comparing Two Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc.
Chapter 22 Comparig Two Proportios Copyright 2010, 2007, 2004 Pearso Educatio, Ic. Comparig Two Proportios Read the first two paragraphs of pg 504. Comparisos betwee two percetages are much more commo
More informationTopic 10: Introduction to Estimation
Topic 0: Itroductio to Estimatio Jue, 0 Itroductio I the simplest possible terms, the goal of estimatio theory is to aswer the questio: What is that umber? What is the legth, the reactio rate, the fractio
More informationFACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures
FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING Lectures MODULE 5 STATISTICS II. Mea ad stadard error of sample data. Biomial distributio. Normal distributio 4. Samplig 5. Cofidece itervals
More informationHypothesis Testing. Evaluation of Performance of Learned h. Issues. Trade-off Between Bias and Variance
Hypothesis Testig Empirically evaluatig accuracy of hypotheses: importat activity i ML. Three questios: Give observed accuracy over a sample set, how well does this estimate apply over additioal samples?
More informationInterval Estimation (Confidence Interval = C.I.): An interval estimate of some population parameter is an interval of the form (, ),
Cofidece Iterval Estimatio Problems Suppose we have a populatio with some ukow parameter(s). Example: Normal(,) ad are parameters. We eed to draw coclusios (make ifereces) about the ukow parameters. We
More informationHomework 5 Solutions
Homework 5 Solutios p329 # 12 No. To estimate the chace you eed the expected value ad stadard error. To do get the expected value you eed the average of the box ad to get the stadard error you eed the
More informationChapter 23: Inferences About Means
Chapter 23: Ifereces About Meas Eough Proportios! We ve spet the last two uits workig with proportios (or qualitative variables, at least) ow it s time to tur our attetios to quatitative variables. For
More informationSTATS 200: Introduction to Statistical Inference. Lecture 1: Course introduction and polling
STATS 200: Itroductio to Statistical Iferece Lecture 1: Course itroductio ad pollig U.S. presidetial electio projectios by state (Source: fivethirtyeight.com, 25 September 2016) Pollig Let s try to uderstad
More informationProperties and Hypothesis Testing
Chapter 3 Properties ad Hypothesis Testig 3.1 Types of data The regressio techiques developed i previous chapters ca be applied to three differet kids of data. 1. Cross-sectioal data. 2. Time series data.
More informationComputing Confidence Intervals for Sample Data
Computig Cofidece Itervals for Sample Data Topics Use of Statistics Sources of errors Accuracy, precisio, resolutio A mathematical model of errors Cofidece itervals For meas For variaces For proportios
More informationSampling Distributions, Z-Tests, Power
Samplig Distributios, Z-Tests, Power We draw ifereces about populatio parameters from sample statistics Sample proportio approximates populatio proportio Sample mea approximates populatio mea Sample variace
More informationLecture 3. Properties of Summary Statistics: Sampling Distribution
Lecture 3 Properties of Summary Statistics: Samplig Distributio Mai Theme How ca we use math to justify that our umerical summaries from the sample are good summaries of the populatio? Lecture Summary
More informationIf, for instance, we were required to test whether the population mean μ could be equal to a certain value μ
STATISTICAL INFERENCE INTRODUCTION Statistical iferece is that brach of Statistics i which oe typically makes a statemet about a populatio based upo the results of a sample. I oesample testig, we essetially
More information(7 One- and Two-Sample Estimation Problem )
34 Stat Lecture Notes (7 Oe- ad Two-Sample Estimatio Problem ) ( Book*: Chapter 8,pg65) Probability& Statistics for Egieers & Scietists By Walpole, Myers, Myers, Ye Estimatio 1 ) ( ˆ S P i i Poit estimate:
More informationGoodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen)
Goodess-of-Fit Tests ad Categorical Data Aalysis (Devore Chapter Fourtee) MATH-252-01: Probability ad Statistics II Sprig 2019 Cotets 1 Chi-Squared Tests with Kow Probabilities 1 1.1 Chi-Squared Testig................
More informationRead through these prior to coming to the test and follow them when you take your test.
Math 143 Sprig 2012 Test 2 Iformatio 1 Test 2 will be give i class o Thursday April 5. Material Covered The test is cummulative, but will emphasize the recet material (Chapters 6 8, 10 11, ad Sectios 12.1
ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 1 MATH00030 SEMESTER / Statistics
ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 1 MATH00030 SEMESTER 1 2018/2019 DR. ANTHONY BROWN 8. Statistics 8.1. Measures of Centre: Mean, Median and Mode. If we have a series of numbers then
Instructor: Judith Canner Spring 2010 CONFIDENCE INTERVALS How do we make inferences about the population parameters?
CONFIDENCE INTERVALS How do we make inferences about the population parameters? The sampling distribution allows us to quantify the variability in sample statistics including how they differ from the parameter
Eco411 Lab: Central Limit Theorem, Normal Distribution, and Journey to Girl State
Eco411 Lab: Central Limit Theorem, Normal Distribution, and Journey to Girl State 1. Some students may wonder why the magic number 1.96 or 2 (called critical values) is so important in statistics. Where do they
Continuous Data that can take on any real number (time/length) based on sample data. Categorical data can only be named or categorised
Question 1. (Topics 1-3) A population consists of all the members of a group about which you want to draw a conclusion (Greek letters (μ, σ, Ν) are used). A sample is the portion of the population selected for
Chapter 8: Estimating with Confidence
Chapter 8: Estimating with Confidence Section 8.2 The Practice of Statistics, 4th edition For AP* STARNES, YATES, MOORE Chapter 8 Estimating with Confidence 8.1 Confidence Intervals: The Basics 8.2 8.3 Estimating
Agreement of CI and HT. Lecture 13 - Tests of Proportions. Example - Waiting Times
Significance level vs. confidence level. Agreement of CI and HT. Lecture 13 - Tests of Proportions. Sta102 / BME102 Colin Rundel October 15, 2014. Confidence intervals and hypothesis tests (almost) always agree, as long
October 25, 2018 BIM 105 Probability and Statistics for Biomedical Engineers 1
October 25, 2018 BIM 105 Probability and Statistics for Biomedical Engineers. Population parameters and Sample Statistics. Inferences
Random Variables, Sampling and Estimation
Chapter 1 Random Variables, Sampling and Estimation 1.1 Introduction This chapter will cover the most important basic statistical theory you need in order to understand the econometric material that will be coming
Module 1 Fundamentals in statistics
Normal Distribution. Repeated observations that differ because of experimental error often vary about some central value in a roughly symmetrical distribution in which small deviations occur much more frequently
MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND.
XI-1 (1074) MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND. R. E. D. WOOLSEY AND H. S. SWANSON XI-2 (1075) STATISTICAL DECISION MAKING Advanced
Because it tests for differences between multiple pairs of means in one test, it is called an omnibus test.
Math 308 Spring 2018 Classes 19 and 20: Analysis of Variance (ANOVA) Page 1 of 6 Introduction ANOVA is a statistical procedure for determining whether three or more sample means were drawn from populations with equal
PH 425 Quantum Measurement and Spin Winter SPINS Lab 1
PH 425 Quantum Measurement and Spin Winter 23 SPINS Lab 1. Measure the spin projection $S_z$ along the z-axis. This is the experiment that is ready to go when you start the program, as shown below. Each atom is measured
Simulation. Two Rules For Inverting A Distribution Function
Simulation. Two Rules For Inverting A Distribution Function. Rule 1. If F(x) = u is constant on an interval $[x_1, x_2)$, then the uniform value u is mapped onto $x_2$ through the inversion process. Rule 2. If there is a jump
Mathacle. PSet Stats, Concepts In Statistics Level Number Name: Date: Confidence Interval Guesswork with Confidence
PSet Stats, Concepts In Statistics. Confidence Interval Guesswork with Confidence. VII. CONFIDENCE INTERVAL 7.1. Significance Level and Confidence Interval (CI) The Significance Level: The significance level, often denoted
STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. Comments:
Recall: STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS Comments: 1. So far we have estimates of the parameters $\beta_0$ and $\beta_1$, but have no idea how good these estimates are. Assumption: $E(Y \mid x) = \beta_0 + \beta_1 x$ (linear conditional
6.3 Testing Series With Positive Terms
6.3 Testing Series With Positive Terms 6.3.1 Review of what is known up to now. In theory, testing a series $\sum_{i=1}^{\infty} a_i$ for convergence amounts to finding the sequence of partial
Recall the study where we estimated the difference between mean systolic blood pressure levels of users of oral contraceptives and non-users, x - y.
Testing Statistical Hypotheses. Recall the study where we estimated the difference between mean systolic blood pressure levels of users of oral contraceptives and non-users, x - y. Such studies are sometimes viewed
Statisticians use the word population to refer to the total number of (potential) observations under consideration
6 Sampling Distributions. Statisticians use the word population to refer to the total number of (potential) observations under consideration. The population is just the set of all possible outcomes in our sample space
Class 23. Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Marquette University MATH 1700
Class 23 Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Copyright 2017 by D.B. Rowe. Agenda: Recap Chapter 9.1; Lecture Chapter 9.2; Review Exam 6; Problem Solving Session.
Inferential Statistics. Inference Process. Inferential Statistics and Probability a Holistic Approach. Inference Process.
Inferential Statistics and Probability a Holistic Approach. Inference Process. Chapter 8 Point Estimation and Confidence Intervals. This Course Material by Maurice Geraghty is licensed under a Creative Commons Attribution-ShareAlike
32 estimating the cumulative distribution function
32 estimating the cumulative distribution function 4.6 types of confidence intervals/bands. Let F be a class of distribution functions F and let θ be some quantity of interest, such as the mean of F or the whole function
MEASURES OF DISPERSION (VARIABILITY)
POLI 300 Handout #7 N. R. Miller MEASURES OF DISPERSION (VARIABILITY) While measures of central tendency indicate what value of a variable is (in one sense or other, e.g., mode, median, mean), average or central
Stat 421-SP2012 Interval Estimation Section
Stat 421-SP2012 Interval Estimation Section 11.1-11.2. We now understand (Chapter 10) how to find point estimators of an unknown parameter. However, a point estimate does not provide any information about the uncertainty (possible
Simple Random Sampling
Simple Random Sampling. Professor Ron Fricker, Naval Postgraduate School, Monterey, California. Reading: 3/26/13 Scheaffer et al. chapter 4. Goals for this Lecture: Define simple random sampling (SRS) and discuss
Power and Type II Error
Statistical Methods I (EXST 7005) Page 57 Power and Type II Error. Since we don't actually know the value of the true mean (or we wouldn't be hypothesizing something else), we cannot know in practice the type II error
Statistics 300: Elementary Statistics
Statistics 300: Elementary Statistics Sections 7-2, 7-3, 7-4, 7-5 Parameter Estimation. Point Estimate: Best single value to use. Question: What is the probability this estimate is the correct value? Parameter Estimation
Parameter, Statistic and Random Samples
Parameter, Statistic and Random Samples. A parameter is a number that describes the population. It is a fixed number, but in practice we do not know its value. A statistic is a function of the sample data, i.e.,
Lecture 5: Parametric Hypothesis Testing: Comparing Means. GENOME 560, Spring 2016 Doug Fowler, GS
Lecture 5: Parametric Hypothesis Testing: Comparing Means. GENOME 560, Spring 2016 Doug Fowler, GS (dfowler@uw.edu). Review from last week: What is a confidence interval?
Data Analysis and Statistical Methods Statistics 651
Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Suhasini Subba Rao. Review of testing: Example. The administrator of a nursing home wants to do a time and motion
STAT431 Review. X = n. n )
STAT431 Review I. Results related to normal distribution. Expected value and variance: (a) $E(aX + bY) = aE(X) + bE(Y)$, $\mathrm{Var}(aX + bY) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y)$ provided X and Y are independent. Normal distributions: (a) $Z \sim N(0, 1)$ (b) X ~ N(µ,
Introducing Sample Proportions
Introducing Sample Proportions. Probability and statistics. Answers & Notes. TI-Nspire Investigation Student 60 min 7 8 9 0 Introduction A 00 survey of attitudes to climate change, conducted in Australia by the CSIRO,
Understanding Dissimilarity Among Samples
Announcements: Midterm is Wed. Review sheet is on class webpage (in the list of lectures) and will be covered in discussion on Monday. Two sheets of notes are allowed, same rules as for the one sheet last time. Office
Problems from 9th edition of Probability and Statistical Inference by Hogg, Tanis and Zimmerman:
Math 224 Fall 2017 Homework 4 Drew Armstrong. Problems from 9th edition of Probability and Statistical Inference by Hogg, Tanis and Zimmerman: Section 2.3, Exercises 16(a,d), 18. Section 2.4, Exercises 13, 14. Section
ENGI 4421 Confidence Intervals (Two Samples) Page 12-01
ENGI 4421 Confidence Intervals (Two Samples) Page 12-01 Two Sample Confidence Interval for a Difference in Population Means [Navidi sections 5.4-5.7; Devore chapter 9] From the central limit theorem, we know that, for sufficiently
Chapter 6 Part 5. Confidence Intervals t distribution chi square distribution. October 23, 2008
Chapter 6 Part 5 Confidence Intervals: t distribution, chi square distribution. October 23, 2008. There will be no help session on Monday, October 27. Goal: To clearly understand the link between probability and confidence
Tests of Hypotheses Based on a Single Sample (Devore Chapter Eight)
Tests of Hypotheses Based on a Single Sample (Devore Chapter Eight) MATH-252-01: Probability and Statistics II Spring 2018 Contents 1 Hypothesis Tests illustrated with z-tests 1.1 Overview of Hypothesis Testing
BIOS 4110: Introduction to Biostatistics. Breheny. Lab #9
BIOS 4110: Introduction to Biostatistics Breheny Lab #9 The Central Limit Theorem is very important in the realm of statistics, and today's lab will explore the application of it in both categorical and continuous
Y i n. i=1. = 1 [number of successes] number of successes = n
Eco 371 Problem Set # Answer Sheet 3. In this question, you are asked to consider a Bernoulli random variable Y, with a success probability $\Pr(Y = 1) = p$. You are told that you have n draws from this distribution and
Since X n /n P p, we know that X n (n. Xn (n X n ) Using the asymptotic result above to obtain an approximation for fixed n, we obtain
Assignment 9 Exercise 5.5 Let $X_n \sim \text{binomial}(n, p)$, where $p \in (0, 1)$ is unknown. Obtain confidence intervals for p in two different ways: (a) Since $\sqrt{n}\,(X_n/n - p) \xrightarrow{d} N[0, p(1-p)]$, the variance of the limiting distribution depends only on p. Use the
Binomial Distribution
Overview: Example: coin tossed three times; Definition; Formula. Recall that a r.v. is discrete if there are either a finite number of possible
CONFIDENCE INTERVALS STUDY GUIDE
CONFIDENCE INTERVALS STUDY GUIDE Last unit, we discussed how sample statistics vary. Under the right conditions, sample statistics like means and proportions follow a Normal distribution, which allows us to calculate
Topic 6 Sampling, hypothesis testing, and the central limit theorem
CSE 103: Probability and statistics Fall 2010 Topic 6 Sampling, hypothesis testing, and the central limit theorem 6.1 The binomial distribution. Let X be the number of heads when a coin of bias p is tossed n times. The distribution
April 18, 2017 CONFIDENCE INTERVALS AND HYPOTHESIS TESTING, UNDERGRADUATE MATH 526 STYLE
April 18, 2017 CONFIDENCE INTERVALS AND HYPOTHESIS TESTING, UNDERGRADUATE MATH 526 STYLE TERRY SOO Abstract: These notes are adapted from when I taught Math 526 and meant to give a quick introduction to confidence
Common Large/Small Sample Tests 1/55
Common Large/Small Sample Tests 1/55 Test of Hypothesis for the Mean (σ Known). Convert sample result ($\bar{x}$) to a z value. Hypothesis Tests for µ. Consider the test $H_0: \mu = \mu_0$, $H_1: \mu > \mu_0$. σ Known (Assume the population
PRACTICE PROBLEMS FOR THE FINAL
PRACTICE PROBLEMS FOR THE FINAL Math 36Q Fall 25 Professor Hoh. Below is a list of practice questions for the Final Exam. I would suggest also going over the practice problems and exams for Exam 1 and Exam 2 to
Discrete Mathematics for CS Spring 2007 Luca Trevisan Lecture 22
CS 70 Discrete Mathematics for CS Spring 2007 Luca Trevisan Lecture 22 Another Important Distribution: The Geometric Distribution. Question: A biased coin with Heads probability p is tossed repeatedly until the first
7.1 Convergence of sequences of random variables
Chapter 7 Limit Theorems. Throughout this section we will assume a probability space $(\Omega, \mathcal{F}, P)$, in which is defined an infinite sequence of random variables $(X_n)$ and a random variable X. The fact that for every infinite
Exam II Covers. STA 291 Lecture 19. Exam II Next Tuesday 5-7pm Memorial Hall (Same place as exam I) Makeup Exam 7:15pm 9:15pm Location CB 234
STA 291 Lecture 19 Exam II Next Tuesday 5-7pm Memorial Hall (Same place as exam I) Makeup Exam 7:15pm 9:15pm Location CB 234. Exam II Covers Chapter 9 10.1; 10.2; 10.3; 10.4; 10.6
Class 27. Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Marquette University MATH 1700
Class 27 Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Copyright 2013 by D.B. Rowe. Agenda: Skip Recap Chapter 10.5 and 10.6; Lecture Chapter 11.1-11.2; Review Chapters 9 and 10
Roberto's Notes on Series Chapter 2: Convergence tests Section 7. Alternating series
Roberto's Notes on Series Chapter 2: Convergence tests Section 7 Alternating series. What you need to know already: All basic convergence tests for eventually positive series. What you can learn here: A test for series
BIOSTATS 640 Intermediate Biostatistics Frequently Asked Questions Topic 1 FAQ 1 Review of BIOSTATS 540 Introductory Biostatistics
BIOSTATS 640 Intermediate Biostatistics Frequently Asked Questions Topic 1 FAQ 1 Review of BIOSTATS 540 Introductory Biostatistics. I'm confused about the jargon and notation, especially population versus sample. Could
PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 9
Hypothesis testing. PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 9. Statistical inference is that branch of Statistics in which one typically makes a statement about a population based upon the results of a sample. I