Question 1. (Topics 1-3)
A population consists of all the members of a group about which you want to draw a conclusion (Greek letters (μ, σ, Ν) are used).
A sample is the portion of the population selected for analysis (Roman letters ($\bar{x}$, s, n) are used for sample data).
A parameter is a numerical measure that describes a characteristic of a population.
A statistic is a numerical measure that describes a characteristic of a sample.
Class intervals: Width of interval $= \dfrac{\text{range}}{\text{no. of desired class groupings}}$
Arithmetic Mean: $\bar{X} = \dfrac{X_1 + X_2 + \dots + X_n}{n}$
Median (Position): $\dfrac{n+1}{2}$
Range: $X_{\max} - X_{\min}$
Z Score: $Z = \dfrac{X - \bar{X}}{S}$; outliers have $Z > 3.0$ or $Z < -3.0$
Measures of Central Tendency: Arithmetic Mean, Median, Mode
Quartile (Position): $Q_1 = 0.25(n+1)$, $Q_2 = 0.50(n+1)$, $Q_3 = 0.75(n+1)$
Inter-Quartile Range: $IQR = Q_3 - Q_1$
Measures of Dispersion: Variance, Standard Deviation, Coefficient of Variation
Covariance tells us only the direction of association
Sample coefficient of correlation r: $r = \dfrac{\text{covar}}{s_x s_y}$, where $s_x$ and $s_y$ come from the S.Dev formula

Numerical Descriptive Measures
Reordered data: 3, 4, 7, 9
Variance: firstly find $\bar{x} = 5.75$
Sample Variance: $s^2 = \dfrac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}$
$= \dfrac{(3-5.75)^2 + (4-5.75)^2 + (7-5.75)^2 + (9-5.75)^2}{4-1}$
$= \dfrac{(-2.75)^2 + (-1.75)^2 + (1.25)^2 + (3.25)^2}{3}$
$= \dfrac{7.5625 + 3.0625 + 1.5625 + 10.5625}{3} = \dfrac{22.75}{3} = 7.58$
Standard deviation: $s = \sqrt{s^2} = \sqrt{7.58} = 2.75$
Coefficient of variation: $CV = \dfrac{s}{\bar{x}} \times 100\% = \dfrac{2.75}{5.75} \times 100\% = 47.9\%$

Question 3 Continuous Probability Distribution
Find the following probabilities
1. $P(Z < -1.67) = 0.0475$. Read straight from the table. Note: for P(Z < 1.846) we can only look up z values to two decimal places, so round 1.846 up to 1.85.
2. $P(Z > -2.78) = 1 - P(Z < -2.78) = 1 - 0.0027 = 0.9973$
3. $P(0.15 < Z < 1.99) = P(Z < 1.99) - P(Z < 0.15) = 0.9767 - 0.5596 = 0.4171$
Solve the following inverse problems for the standard normal distribution
$P(Z > z) = 0.01$: look up the Inverse Normal Table, which gives $P(Z > 2.3263) = 0.01$.
The Inverse table only gives the Z values for upper-tail areas, but because the normal distribution is symmetric about zero, we find the upper-tail Z value, and the lower-tail Z value that we need is the same value but negative.
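The table look-ups above can be cross-checked numerically. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library; the variable names are illustrative and are not part of the original notes.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
Z = NormalDist(mu=0.0, sigma=1.0)

# 1. Lower-tail area, the value read straight from the cumulative table.
p1 = Z.cdf(-1.67)

# 2. Upper-tail area via the complement rule: P(Z > z) = 1 - P(Z < z).
p2 = 1 - Z.cdf(-2.78)

# 3. Area between two Z values: difference of two cumulative areas.
p3 = Z.cdf(1.99) - Z.cdf(0.15)

# Inverse problem: the z value that leaves 0.01 in the upper tail.
z_upper = Z.inv_cdf(1 - 0.01)
# By symmetry, the matching lower-tail value is the same number, negated.
z_lower = -z_upper

print(round(p1, 4), round(p2, 4), round(p3, 4), round(z_upper, 4))
```

Small differences from the printed tables in the fourth decimal place are possible, because the tables are rounded.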
Find the two values of Z (symmetrically distributed around the mean) such that the following statement is true: $P(Z_{LOWER} < Z < Z_{UPPER}) = 0.80$
Each tail will have an area of 0.10, so looking up the Inverse table to get the two Z values: $Z_{LOWER} = -1.2816$, $Z_{UPPER} = 1.2816$
$P(-1.2816 < Z < 1.2816) = 0.80$

Sampling Distribution cont.
Descriptive Statistics - Collect, Present, Characterise data
Inferential Statistics - Drawing conclusions about a population based on sample data
Numerical data is measured on a natural numerical scale (age)
  Continuous - Data that can take on any real number (time/length)
  Discrete - Countable number of responses (cannot have 0.5)
Categorical data can only be named or categorised
  Nominal - No order, no response is considered better (gender)
  Ordinal - There is an order (very good, good, average)
Time Series - Data collected through time (Months sales for May)
Cross Sectional - Collected for a point in time (My height today)
Frequency Distributions - summary table in which data are arranged into numerically ordered classes or intervals
Ordered array: sequence of data in rank order

Sample of 4: (2, 3), (7, 9), (4, 5), (4, 6)
$\bar{x} = \dfrac{2 + 7 + 4 + 4}{4} = \dfrac{17}{4} = 4.25$
$\bar{y} = \dfrac{3 + 9 + 5 + 6}{4} = \dfrac{23}{4} = 5.75$

  x    y    (x - x̄)   (y - ȳ)   (x - x̄)(y - ȳ)
  2    3    -2.25     -2.75      6.19
  7    9     2.75      3.25      8.94
  4    5    -0.25     -0.75      0.19
  4    6    -0.25      0.25     -0.06
                         Sum:   15.26

covariance $= \dfrac{\sum (x - \bar{x})(y - \bar{y})}{n - 1} = \dfrac{15.26}{4 - 1} = 5.09$ (Direction)
correlation $r = \dfrac{\text{covar}}{s_x s_y} = \dfrac{5.09}{2.06 \times 2.5} = 0.99$ (Strength)

Continuous Probability Distribution cont.
Between what two values of Z (symmetrically distributed around the mean) will 68.26% of all possible Z values be contained?
Each tail has an area α = 0.1587 (i.e. (1 - 0.6826)/2), so if we use the Cumulative Normal Distribution table and look for the area of 0.1587, we find that P(Z < -1) = 0.1587. Therefore the right tail, where Z > +1, has the same area. So the two values of Z that we are looking for are -1 and +1, i.e. P(-1 < Z < 1) = 0.6826, as in the diagram. Using the Inverse Normal table, we can only look up an area to two decimal places: 0.16 (i.e.
0.1587 rounded to two decimal places), and we would conclude that the two values of Z were Z = 0.9945 and Z = -0.9945, i.e. P(-0.9945 < Z < 0.9945) = 0.68

Interpreting Correlation Coefficient
  r                  Interpretation
  r = -1             PERFECT negative linear
  -1 < r ≤ -0.7      STRONG negative linear
  -0.7 < r ≤ -0.3    MODERATE negative linear
  -0.3 < r < 0       WEAK negative linear
  r = 0              No relationship
  0 < r < 0.3        WEAK positive linear
  0.3 ≤ r < 0.7      MODERATE positive linear
  0.7 ≤ r < 1        STRONG positive linear
  r = 1              PERFECT positive linear

Population mean: μ
Sample mean: $\bar{X}$
Population variance: $\sigma^2$
Sample Proportion: p
Standard Deviation: S
Variance: $S^2$

Question 4. Sampling Distribution / Estimation cont. / Confidence Intervals. Estimation
Quantitative: Is it for μ?
  No: $\chi^2 = \dfrac{(n-1)S^2}{\sigma^2}$
  Yes: Is σ known?
    No: $t = \dfrac{\bar{X} - \mu}{S/\sqrt{n}}$
    Yes: $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
Qualitative: $Z = \dfrac{p - \pi}{\sqrt{\pi(1-\pi)/n}}$
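The correlation interpretation table, applied to the sample of four (x, y) pairs from the covariance example, can be sketched as follows. The function names `sample_cov_and_r` and `interpret_r` are hypothetical helpers introduced for illustration, and the boundary handling (0.3 and 0.7 treated as inclusive) is an assumption based on the table.

```python
from math import sqrt

def sample_cov_and_r(xs, ys):
    """Return (sample covariance, sample correlation r) using n - 1 divisors."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    s_x = sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    return cov, cov / (s_x * s_y)

def interpret_r(r):
    """Map r to the wording used in the interpretation table."""
    if r == 0:
        return "No relationship"
    direction = "positive" if r > 0 else "negative"
    size = abs(r)
    if size == 1:
        strength = "PERFECT"
    elif size >= 0.7:
        strength = "STRONG"
    elif size >= 0.3:
        strength = "MODERATE"
    else:
        strength = "WEAK"
    return f"{strength} {direction} linear"

# Sample of 4: (2, 3), (7, 9), (4, 5), (4, 6)
cov, r = sample_cov_and_r([2, 7, 4, 4], [3, 9, 5, 6])
print(round(cov, 2), round(r, 2), interpret_r(r))
```

The covariance only gives the direction of the association; the label comes from r, which is scaled by both standard deviations.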
Question 2 Simple Linear Regression & Probability
Probability & Discrete Probability Distributions
Binomial Distribution (Question will provide n, x and % (portion))

Question 5 Hypothesis testing
Hypothesis Testing cont.
Two population Proportion Example: Two Sample (Rejection region: use inverse normal table)
Pooled-Variance t Test Example: Two Sample (Sigma Unknown, Variance Equal, Assume n of at least 30 (Central Limit Theorem))
  $df = n_1 + n_2 - 2 = 1000 + 1000 - 2 = 1998$
  Critical value: $t_{0.05,\,1998} = 1.6449$
F Test Example: Two Sample (F table for reject regions)
  $F_U = F_{0.025,\,99,\,71} \approx F_{0.025,\,60,\,60} = 1.67$
  $F_U^* = F_{0.025,\,71,\,99} \approx F_{0.025,\,60,\,60} = 1.67$
  $F_L = \dfrac{1}{F_U^*} = \dfrac{1}{1.67} = 0.599$
Analysis of Variance (ANOVA)
BSB123 Data Analysis Semester 2 2015 Workshop 8 (Week 10) Estimation

Question 1 The quality control manager at a light bulb factory needs to estimate the mean life of a large shipment of light bulbs. The standard deviation is 100 hours. A random sample of 64 light bulbs indicates a sample mean life of 350 hours.
(a) Construct a 95% confidence interval estimate of the population mean life of light bulbs in this shipment.
(b) Do you think that the manufacturer has the right to state that the light bulbs last an average of 400 hours? Explain.
The first approach is purely to say it's outside the confidence interval. The second approach is to take that value of 400 and convert it to a Z value, so you can determine how probable the observed sample mean would be if the statement were correct.
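Both approaches to part (b) can be checked with a short calculation. This is a minimal sketch, assuming the stated standard deviation of 100 hours is the known population σ (so a Z interval applies); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

sigma = 100.0       # population standard deviation (hours), taken as known
n = 64              # sample size
x_bar = 350.0       # sample mean life (hours)
confidence = 0.95

# Critical Z value that leaves (1 - confidence) / 2 in each tail.
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Standard error of the mean and the resulting interval.
std_error = sigma / sqrt(n)
margin = z_crit * std_error
lower, upper = x_bar - margin, x_bar + margin

# Second approach: how many standard errors is the sample mean from 400?
claimed_mean = 400.0
z_claim = (x_bar - claimed_mean) / std_error
# Chance of a sample mean this low or lower if the true mean were 400.
p_if_claim_true = NormalDist().cdf(z_claim)

print(f"95% CI: ({lower:.2f}, {upper:.2f}) hours")
print(f"Z for the claim: {z_claim:.1f}, probability: {p_if_claim_true:.6f}")
```

The claimed 400 hours lies above the upper limit of the interval, and the Z value shows the sample mean sits several standard errors below the claim.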
(c) Must you assume that the population of light bulb life is normally distributed? Explain.
No, because the sample size is greater than 30. According to the CLT (central limit theorem), the sampling distribution of the mean will be at least approximately normal. In other words, if we have 30 observations or more, under the CLT we have an approximately Normal sampling distribution.

Question 2 If $\bar{X} = 75$, $S = 24$, $n = 36$, and assuming that the population is normally distributed, construct a 95% confidence interval estimate of the population mean μ.
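Because σ is unknown here and only S is given, the interval uses the t distribution with n - 1 degrees of freedom. The sketch below is one way to set it out; the critical value 2.0301 is taken from a standard t table for 35 degrees of freedom and 0.025 in the upper tail, and is entered by hand because the Python standard library has no t distribution.

```python
from math import sqrt

x_bar = 75.0    # sample mean
s = 24.0        # sample standard deviation
n = 36          # sample size

df = n - 1      # degrees of freedom for the t distribution

# Assumption: t critical value read from a t table (df = 35, upper tail 0.025).
t_crit = 2.0301

# Standard error uses S, since sigma is unknown.
std_error = s / sqrt(n)
margin = t_crit * std_error
lower, upper = x_bar - margin, x_bar + margin

print(f"df = {df}, standard error = {std_error:.2f}")
print(f"95% CI for the mean: ({lower:.2f}, {upper:.2f})")
```

If SciPy is available, the same critical value could be obtained from its t distribution instead of a table.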
Question 3 A study conducted by the Australian Stock Exchange found that 46% of 2,405 Australian adults surveyed in 2006 held shares, either directly or indirectly through managed funds or self-managed superannuation funds (2006 Australian Share Ownership Study, ASX).
(a) Construct a 95% confidence interval for the proportion of Australian adults who held shares in 2006.
When dealing with population proportions we always use a Z.
(b) Interpret the interval constructed in (a).
As above. I am 95% confident that the true proportion of Australian adults who held shares in 2006 is between 44% and 48%.
(c) To conduct a follow-up study to estimate the population proportion of adults who currently hold shares to within ±0.01 with 95% confidence, how many adults would you interview?
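Parts (a) and (c) can be worked through numerically. This is a minimal sketch under stated assumptions: the sample size calculation uses the 2006 estimate p = 0.46 as the planning value, which is one common choice rather than the only one, and p = 0.5 is shown alongside it as the more conservative alternative. Variable names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

p = 0.46        # sample proportion holding shares
n = 2405        # adults surveyed

# Critical Z value for 95% confidence (0.025 in each tail).
z_crit = NormalDist().inv_cdf(0.975)

# (a) Confidence interval for the population proportion.
std_error = sqrt(p * (1 - p) / n)
lower = p - z_crit * std_error
upper = p + z_crit * std_error

# (c) Sample size needed for a sampling error of 0.01.
# n = Z^2 * p * (1 - p) / e^2, always rounded UP to a whole person.
e = 0.01
n_planning = ceil(z_crit ** 2 * p * (1 - p) / e ** 2)    # assumes p = 0.46
n_conservative = ceil(z_crit ** 2 * 0.25 / e ** 2)        # assumes p = 0.5

print(f"95% CI: ({lower:.4f}, {upper:.4f})")
print(f"Sample size with p = 0.46: {n_planning}")
print(f"Sample size with p = 0.50: {n_conservative}")
```

The interval agrees with the 44% to 48% interpretation in (b). The required sample is far larger than the original survey because the tolerated error is halved relative to the margin achieved in (a).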