Distribution of Sample Proportions
Probability and statistics
Answers & Teacher Notes
TI-Nspire Investigation
Student 90 min 7 8 9 10 11 12

Introduction

From previous activity:
This activity assumes knowledge of the material covered in the activity Introducing sample proportions, which introduced statistical inference. This involves using a sample statistic, in this case the proportion of a sample with a particular attribute, to estimate the corresponding population parameter: the true population proportion. In that activity, it was assumed that the count X of successes in a sample is a binomial random variable, $X \sim \mathrm{Bi}(n, p)$, where n is the sample size and p is the proportion of successes in the population. The sample proportion is also a random variable, $\hat{P} = X/n$. It was shown that the mean of $\hat{P}$ is $E(\hat{P}) = p$, and the standard deviation of $\hat{P}$ is $\mathrm{SD}(\hat{P}) = \sqrt{p(1-p)/n}$.

Overview of this activity:
In this activity you will investigate further the distribution of the sample proportion, $\hat{P}$; in particular, the effect of changing the sample size and the population proportion. A model for the distribution of sample proportions is also explored.

Why simulate repeated random sampling?
In statistical inference we see how trustworthy a procedure is by asking what would happen if we repeated it many times. This leads to the idea of the sampling distribution of the statistic. The sampling distribution of the sample proportion is the probability distribution of values taken by this statistic in all possible samples of the same size from the same population. In practice, when conducting opinion polls and the like, it isn't feasible to repeat the sampling many times. However, simulated random sampling, using TI-Nspire, allows us to imagine the results of all the possible random samples that the pollster didn't take, as illustrated in the following exercise.

Mobile Internet Subscriber Simulation
Open the TI-Nspire document Dist_proportions. Navigate to Page 1.2 and follow the instructions to seed the random number generator.

Australian Bureau of Statistics (ABS) data shows that in 2015, 50% of Australian internet service subscribers had a mobile wireless connection [1].
The population proportion of mobile subscribers in this population is therefore known to be p = 0.5.

[1] http://www.abs.gov.au/ausstats/abs@.nsf/mf/8153.0

Texas Instruments 2015. You may copy, communicate and modify this material for non-commercial educational purposes
Navigate to Page 1.4. Assume that the single result shown represents the sample proportion $\hat{p}$ of mobile subscribers from a surveyed random sample of 10 internet subscribers (n = 10), drawn from the large population for which p = 0.5.

We can imitate carrying out the survey many more times using random number simulation. Each slider increment will draw a new random sample, and the sample proportion for that sample will be added to the spreadsheet and to the graph. Add 99 samples, so that the slider value is 100.

To reset the simulation, set the slider value to 1. Then click on the spreadsheet cell A= and press [key].

Question 1
a. What are the lowest and highest observed values of $\hat{p}$?
Answer: Answers will vary.
b. What is the modal value of the sample proportions?
Answer: Answers will vary.
c. What is the mean of the sample proportions (shown by the vertical line) for the 100 samples?
Answer: Answers will vary, but should be close to 0.5.
d. How close is the mean of the sample proportions to the population proportion?
Answer: Answers will vary, but the mean of the sample proportions and the population proportion are likely to be close.

Reset the simulation and navigate to Page 1.5. Use the slider to repeat the simulation of 100 samples. A histogram of the distribution of sample proportions will emerge. Navigate to Page 1.6, where the summary statistics for the histogram are calculated.

Question 2
Use the histogram, dot plot (Page 1.4) and summary statistics to describe, as fully as possible, the distribution of sample proportions for n = 10 and p = 0.5.
Answer: The distribution is likely to be centred close to 0.5, and likely to be quite spread out within the interval [0, 1]. The shape might be fairly symmetrical. The mode and mean are likely to be near 0.5.

Changing the sample size
Suppose that for the mobile internet subscriber survey, we change the size of the sample that is drawn from the population with p = 0.5.

Question 3
Predict how the shape of the distribution of the sample proportion will change as the sample size increases.
Answer: Answers will vary.

Navigate to Page 2.2.
When you select a value of n with the slider, 200 random samples of that sample size are drawn. The spinner allows you to redraw a new set of 200 samples of that size. Using the slider, progressively change the sample size from 10 to 250.
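For teachers who want to reproduce this step outside TI-Nspire, the following is a minimal Python sketch of the same idea: 200 random samples are drawn for each sample size from a population with p = 0.5. The function name `simulate_sample_proportions`, the seed and the particular sample sizes are illustrative assumptions, not part of the TI-Nspire document.

```python
import random
import statistics

def simulate_sample_proportions(n, p, num_samples=200, seed=1):
    """Draw num_samples random samples of size n; return each sample proportion."""
    rng = random.Random(seed)  # seeded so that a run can be repeated
    proportions = []
    for _ in range(num_samples):
        # Each subscriber is counted as a "success" with probability p.
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return proportions

# Illustrative sample sizes between 10 and 250 (an assumption, not a prescribed list).
for n in (10, 50, 100, 250):
    p_hats = simulate_sample_proportions(n, p=0.5)
    print(
        f"n = {n:>3}: mean = {statistics.mean(p_hats):.4f}, "
        f"SD = {statistics.stdev(p_hats):.4f}, "
        f"distinct values = {len(set(p_hats))}"
    )
```

The printed mean should stay near 0.5 while the standard deviation shrinks as n grows; exact values depend on the seed.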
Navigate to Page 2.3. Repeat the above on this page.

Question 4
a. As the sample size increases, what aspects of the distribution of sample proportions remain the same?
Answer: As n increases, the mean and mode of the distribution are likely to remain near 0.5. The distribution is also likely to remain fairly symmetrical.
b. As the sample size increases, describe how the distribution changes.
Answer: As n increases, the most prominent changes are likely to be a decrease in the spread of the distribution, and an increase in the number of possible values that $\hat{p}$ can take (e.g. for n = 10, $\hat{p}$ can only change in increments of 0.1, but for n = 100, $\hat{p}$ can change in increments of 0.01).
c. How did your prediction in Question 3 compare with what you actually observed?
Answer: Answers will vary.

Navigate to Page 2.4, where the theoretical standard deviation of $\hat{P}$ and observed summary statistics for the 200 samples are displayed.

Question 5
a. For sample sizes of 10, 50, 100 and 200, record the theoretical and observed standard deviation for the distribution of sample proportions, correct to 4 decimal places.

  n     Theoretical SD   Observed SD
  10    0.1581           (answers will vary)
  50    0.0707           (answers will vary)
  100   0.0500           (answers will vary)
  200   0.0354           (answers will vary)

Answer: Observed values will vary, but are likely to be close to the corresponding theoretical values.
b. What aspect of the distribution is measured by the standard deviation?
Answer: $\mathrm{SD}(\hat{P})$ is a measure of the spread of the values of $\hat{p}$.
c. What trend do you observe in the value of the standard deviation as the sample size increases?
Answer: As n increases, $\mathrm{SD}(\hat{P})$, the spread of the values of $\hat{p}$, decreases.
d. How does the formula for the standard deviation of $\hat{P}$ explain this trend?
Answer: $\mathrm{SD}(\hat{P}) = \sqrt{p(1-p)/n} \propto 1/\sqrt{n}$; therefore, as $n \to \infty$ (large sample size), $\mathrm{SD}(\hat{P}) \to 0$ (the spread of the distribution of $\hat{P}$ is small).
e. In terms of using a sample proportion to estimate the true population proportion, explain why a small standard deviation for the distribution of $\hat{P}$ is desirable.
Answer: Since the distribution is centred at p, a small spread makes it more likely that the sample proportion is close to the true value of the population proportion.
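As a worked check of the theoretical values in Question 5a, and assuming p = 0.5 as in the mobile internet subscriber survey, the formula for the standard deviation reduces to $0.5/\sqrt{n}$:

```latex
\[
\mathrm{SD}(\hat{P}) = \sqrt{\frac{p(1-p)}{n}}
                     = \sqrt{\frac{0.5 \times 0.5}{n}}
                     = \frac{0.5}{\sqrt{n}}
\]
\[
\begin{aligned}
n = 10:  &\quad 0.5/\sqrt{10}  \approx 0.1581 \\
n = 50:  &\quad 0.5/\sqrt{50}  \approx 0.0707 \\
n = 100: &\quad 0.5/\sqrt{100} = 0.0500 \\
n = 200: &\quad 0.5/\sqrt{200} \approx 0.0354
\end{aligned}
\]
```

Because the standard deviation is proportional to $1/\sqrt{n}$, quadrupling the sample size halves it: moving from n = 50 to n = 200 takes the value from about 0.0707 to about 0.0354.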
Changing the population proportion
In this section, you will explore the sampling distribution for samples of a fixed size, drawn from populations with different population proportions, within the interval 0.05 ≤ p ≤ 0.95.
Question 6
Suppose that you draw repeated random samples of size n = 10 from different populations for which the population proportions are p = 0.5, 0.6, 0.7, 0.8, 0.9.
a. Predict how the shape and features of the distribution of sample proportions will change as the value of p increases.
Answer: Answers will vary.
Suppose that you draw repeated random samples of size n = 10 from different populations for which the population proportions are p = 0.5, 0.4, 0.3, 0.2, 0.1.
b. Predict how the shape and features of the distribution of sample proportions will change as the value of p decreases.
Answer: Answers will vary.

Navigate to Page 3.2. Set the sample size to n = 10 and use the other slider to systematically increase the population proportion from 0.05 to 0.95. Navigate to Page 3.3 and observe the same data in a histogram.

Question 7
a. For what value(s) of p is the distribution least symmetrical?
Answer: Least symmetrical for p at or near p = 0.05 and p = 0.95.
b. For what value(s) of p is the distribution most symmetrical?
Answer: Most symmetrical for p at or near p = 0.5.
c. Apart from symmetry, what other aspects of the distribution change as the value of p changes?
Answer: The skew of the distribution: positively skewed for p < 0.5 and negatively skewed for p > 0.5. The amount of skew increases as the value of p moves away from 0.5. The spread of the distribution is greatest for p = 0.5, and the spread decreases as the value of p moves away from 0.5.

Set the sample size to n = 20 and systematically increase the population proportion from 0.05 to 0.95. Repeat for n = 50 and n = 100. As you systematically change values, navigate to Page 3.4, where the theoretical standard deviation of $\hat{P}$ is calculated.

Question 8
a. Use the results from Page 3.4 to complete the table below, correct to 4 decimal places.

            p = 0.05   p = 0.25   p = 0.5   p = 0.75   p = 0.95
  n = 10    0.0689     0.1369     0.1581    0.1369     0.0689
  n = 100   0.0218     0.0433     0.0500    0.0433     0.0218

b. What trends do you notice in the table of values? How do these trends accord with the observed changes to the shape of the distribution of $\hat{P}$?
Answer:
• For a given sample size, the value of $\mathrm{SD}(\hat{P})$ is greatest for p = 0.5.
• For a given sample size, the value of $\mathrm{SD}(\hat{P})$ is symmetrical about p = 0.5.
The above dot points accord with the observed spread of the distribution of $\hat{P}$.
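The table in Question 8a can be checked with a short Python sketch that applies the formula $\mathrm{SD}(\hat{P}) = \sqrt{p(1-p)/n}$ directly. The function name `sd_of_sample_proportion` is an illustrative assumption; Page 3.4 of the TI-Nspire document performs the equivalent calculation.

```python
import math

def sd_of_sample_proportion(n, p):
    """Theoretical standard deviation of the sample proportion: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)

p_values = (0.05, 0.25, 0.5, 0.75, 0.95)

# One row per sample size, one column per population proportion.
for n in (10, 100):
    row = [f"{sd_of_sample_proportion(n, p):.4f}" for p in p_values]
    print(f"n = {n:>3}:", "  ".join(row))
```

Since p(1 - p) is unchanged when p is replaced by 1 - p, each row reads the same from either end, and the largest entry sits at p = 0.5.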
Focus on the population proportions p = 0.1 and p = 0.9
You will now examine more closely the distribution of $\hat{P}$ for p = 0.1. Navigate to Page 3.2. Set the population proportion to p = 0.1 and use the other slider to systematically increase the sample size from 10 to 100. Navigate to Page 3.3 and observe the same data in a histogram. Repeat for p = 0.9.

Question 9
a. For what value(s) of the sample size n is the distribution least symmetrical?
Answer: For p = 0.1 or p = 0.9, the distribution of $\hat{P}$ is least symmetrical for n = 10.
b. For what value(s) of n is the distribution most symmetrical?
Answer: For p = 0.1 or p = 0.9, the distribution of $\hat{P}$ is most symmetrical for n = 100.

Theoretical distribution of $\hat{P}$ from Bi(n, p)
Navigate to Page 4.3. Recall that $\hat{P} = X/n$, where $X \sim \mathrm{Bi}(n, p)$. In the spreadsheet, the probabilities of 0, 1, ..., n successes are calculated for the theoretical distribution Bi(n, p).

Navigate to Page 4.2. The sliders allow the values of the parameters of Bi(n, p) to be varied, and the resulting proportions of successes are plotted against their probabilities. On Page 4.2, set the value of the population proportion to 0.1. Systematically increase the sample size from 10 to 100.

Question 10
What changes do you observe in the plot as the sample size increases?
Answer: The number of values that $\hat{p}$ can take increases, and the "business end" of the graph (i.e. the end where the probabilities are not near zero) becomes less skewed and more symmetrical.

Repeat the previous procedure for population proportions from 0.2 to 0.9.

Question 11
What are some similarities and differences, for corresponding values of n and p, between the plot on Page 4.2 and the graph on Page 3.2?
Answer: The basic shapes and other features of the two representations are essentially the same for corresponding combinations of n and p values.

Modelling how sample proportions vary from sample to sample
On Page 3.2 you observed that as the sample size increases, there are more possible values of the sample proportion. Consequently, for large sample sizes, the dot plot starts to resemble a continuous distribution.
You also observed that for larger values of n, the distribution of $\hat{P}$ was fairly symmetrical; not just for population proportions close to 0.5, but also for, say, p = 0.1.
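The two observations above can also be checked numerically from the theoretical distribution of $\hat{P}$. The Python sketch below computes the probabilities of 0, 1, ..., n successes for Bi(n, p), as the Page 4.3 spreadsheet does, and then summarises asymmetry with a skewness coefficient (the standardised third central moment). The skewness measure and the function names are assumptions introduced here for illustration; the activity itself judges symmetry visually.

```python
import math

def sample_proportion_pmf(n, p):
    """Return {k/n: P(X = k)} for X ~ Bi(n, p), with k = 0, 1, ..., n."""
    return {
        k / n: math.comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n + 1)
    }

def skewness(pmf):
    """Standardised third central moment of a discrete distribution."""
    mean = sum(x * pr for x, pr in pmf.items())
    variance = sum((x - mean) ** 2 * pr for x, pr in pmf.items())
    third_moment = sum((x - mean) ** 3 * pr for x, pr in pmf.items())
    return third_moment / variance**1.5

# Population proportion fixed at 0.1, sample size increasing.
for n in (10, 50, 100):
    pmf = sample_proportion_pmf(n, 0.1)
    print(f"n = {n:>3}: possible values = {len(pmf)}, skewness = {skewness(pmf):.3f}")
```

For p = 0.1 the skewness falls from about 0.84 at n = 10 to about 0.27 at n = 100, while the number of possible values of $\hat{p}$ grows from 11 to 101.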
In this section, you will explore the viability of modelling the discrete sampling distribution of $\hat{P}$ with a continuous normal distribution, $N(\mu_{\hat{P}}, \sigma_{\hat{P}}^2)$, where $\mu_{\hat{P}} = E(\hat{P}) = p$ and $\sigma_{\hat{P}} = \sqrt{p(1-p)/n}$.

Navigate to Page 5.2. A normal pdf (probability density function) curve is superimposed on the histogram of the distribution of $\hat{P}$. Use the sliders to adjust the sample size, n, and the population proportion, p.

Question 12
Based on a visual comparison of the histogram and the corresponding normal pdf curve, for what values of n and p does the normal distribution appear to be a good fit?
Answer: For any given value of p, the best fit is likely to occur for larger sample sizes, such as n = 100. For any given value of n, the best fit is likely to occur for values of p that are close to 0.5.

Navigate to Page 5.3, which shows a normal probability plot. You can adjust the values of n and p.

Question 13
What do you think this normal probability plot is showing?
Answer: Answers will vary. Hopefully, students will appreciate that it is a plot of actual values of $\hat{p}$ versus expected values for a normally distributed random variable.

Navigate to Page 5.4, which combines Pages 5.2 and 5.3 into a single split page. In the left-hand panel, the normal probability plot is displayed, with the expected z on the vertical axis, where z is the standard normal random variable.

Question 14
a. What is the significance of the regression line shown in the normal probability plot?
Answer: The regression line shows what the plot will look like if $\hat{p}$ is normally distributed, i.e. it is the line for which actual values correspond to predicted values.
b. How does the normal probability plot tell you whether $N(\mu_{\hat{P}}, \sigma_{\hat{P}}^2)$ is a good fit for the distribution of $\hat{P}$?
Answer: The closer the scatterplot is to the line, the better the fit, and the closer $\hat{P}$ is to normality.
c. Based on your observations of the normal probability plot, for what values of n and p is the normal distribution model most appropriate?
Answer: As with Question 12, for any given value of p, the best fit is likely to occur for larger sample sizes, such as n = 100.
For any given value of n, the best fit is likely to occur for values of p that are close to 0.5.

Normal approximation to the theoretical distribution of $\hat{P}$ from Bi(n, p)
Navigate to Page 6.2. On this split page, the left-hand panel shows the same information as previously observed on Page 4.2: the theoretical distribution of $\hat{P}$, based on the Bi(n, p) distribution. The right-hand panel shows the normal distribution with mean and standard deviation corresponding to those of $\hat{P}$ when $X \sim \mathrm{Bi}(n, p)$; that is, $N(\mu_{\hat{P}}, \sigma_{\hat{P}}^2)$, where $\mu_{\hat{P}} = E(\hat{P}) = p$ and $\sigma_{\hat{P}}^2 = p(1-p)/n$. Adjust the values of n and p.

Question 14
a. As the values of n and p are varied, what are some similarities and differences between the normal and binomial distributions displayed on Page 6.2?
Answer: For p > 0.5 or p < 0.5, the binomial distribution graph resembles the normal graph for larger values of n.
b. From the graphs on Page 6.2, for what values of n and p is the normal distribution model of the binomial distribution most appropriate? Does this accord with your observations on Page 5.4?
Answer: The behaviour of these graphs is likely to be analogous to the graphs on Page 5.4.

Question 15
Write a brief summary, in point form, of what you have learnt about the distribution of sample proportions as a result of doing this activity.
Answer: Answers will vary.

Follow up on this activity
In this activity we considered the use of proportions from samples to estimate population proportions. The follow-up activity, Confidence intervals for proportions, explores obtaining intervals within which we are reasonably sure (with a given level of confidence) that the value of the population proportion will lie.

END OF ACTIVITY
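Teacher note: the visual comparisons made on Pages 5.2 to 6.2 can be supplemented with a rough numerical check. The Python sketch below compares the exact cumulative distribution of $\hat{P}$ (from Bi(n, p)) with the cumulative distribution of the normal model with mean p and variance p(1 - p)/n, evaluated at each possible value k/n. The function name `max_cdf_gap` and the choice of this gap as a measure of fit are assumptions made for illustration; they are not part of the TI-Nspire document.

```python
import math
from statistics import NormalDist

def max_cdf_gap(n, p):
    """Largest absolute difference, over the values k/n, between the exact
    CDF of the sample proportion and the CDF of the normal model."""
    model = NormalDist(mu=p, sigma=math.sqrt(p * (1 - p) / n))
    cumulative = 0.0
    gap = 0.0
    for k in range(n + 1):
        # Accumulate P(X <= k) for X ~ Bi(n, p).
        cumulative += math.comb(n, k) * p**k * (1 - p) ** (n - k)
        gap = max(gap, abs(cumulative - model.cdf(k / n)))
    return gap

for p in (0.1, 0.5):
    for n in (10, 100):
        print(f"p = {p}, n = {n:>3}: largest CDF gap = {max_cdf_gap(n, p):.3f}")
```

A smaller gap indicates a closer match. The gap should shrink as n increases and, for a fixed n, should be smaller for p = 0.5 than for p = 0.1, in line with the answers to Questions 12 and 14.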