Hypothesis Testing and Confidence Intervals (Part 1): Using the Standard Normal

Lecture 8, Justin Kern, April 2, 2017

Inferential Statistics
- Hypothesis Testing
  - One sample mean / proportion
  - Two sample means / proportions
  - ANOVA (more than two sample means)
  - Chi-Square (Goodness of Fit, Independence). If we have time.
- Confidence Intervals
  - One sample mean / proportion
  - Two sample means / proportions
- Correlation and Regression
  - Relationship between two variables

Inferential Statistics
In real life, we usually do not know the true characteristics of the population of interest.
- What is the mean weight of teenagers in the US?
- How many hours do UIUC students spend studying per week on average?
- What proportion of people in Europe suffer from depression?
In order to find out something about the population, we conduct a research study by collecting data from a sample from the population. We collect data from a sample because it is almost always impossible or highly impractical to collect data from the entire population.
This is actually the whole point of statistics: we want to infer something about the population by analyzing data from a sample from that population.
Samples should be representative of the population of interest. Proper representation is generally achieved by taking a random sample.

Hypothesis testing
Suppose you are a researcher, making some claim (i.e., a hypothesis). To evaluate this claim, it is necessary to collect data, and then test the claim (using the data) against some sort of benchmark. To make this rigorous, it is necessary to define the hypothesis in a quantitative way.
Example: Suppose a researcher claims that a new early intervention technique increases the learning capabilities of autistic children. This can be evaluated using the mean of test scores for autistic children.
- Test score mean for kids getting the new intervention technique is $\mu_1$.
- Test score mean for kids getting the older intervention technique is $\mu_2$.
If $\mu_1$ is greater than $\mu_2$, then the claim can be supported. Unfortunately, we don't know $\mu_1$ or $\mu_2$, so they must be estimated. As we know, estimation involves uncertainty in the estimate, so that must be taken into account.

Hypothesis testing
The standard way to evaluate claims is by using the hypothesis testing (or significance testing) framework.
Definition: Hypothesis testing is a method for testing a claim or hypothesis about a parameter in a population, using data measured in a sample.
In this method, we test some hypothesis by determining the likelihood that a sample statistic could have been selected, if the hypothesis regarding the population parameter were true. To make this precise, we need to have a standard framework for making decisions about whether the data support a hypothesis or not.

Mean-Centered Variable
Suppose we have observations on a variable, $x_1, \ldots, x_n$. Take one observation, say the $i$th value, $x_i$. How can we compare it to the rest of the dataset? One way to compare data is to compare the distance of observations from their mean. Thus, if we mean-center our variable, we have a new variable. For instance:
$$y_i = x_i - \bar{x}.$$
The mean of this variable is 0. The variance of this variable is $s_x^2$.
Note that all observations are in terms of distances from the mean, which allows for improved comparison of observations.
- A positive $y_i$ means that $x_i$ is above the mean.
- A negative $y_i$ means that $x_i$ is below the mean.
The magnitude of the distance is unclear, though, as it depends on variability in the dataset.
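
A short derivation, using only the definitions on this slide, shows why centering sets the mean to 0 and leaves the variance unchanged. The new symbols $\bar{y}$ and $s_y^2$ (the mean and variance of the centered variable) are notation introduced here, and the sample variance is assumed to use the $n - 1$ denominator.

```latex
\bar{y} = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})
        = \frac{1}{n}\sum_{i=1}^{n} x_i - \bar{x}
        = \bar{x} - \bar{x} = 0,
\qquad
s_y^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2
      = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2
      = s_x^2 .
```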

Mean-Centered Variable
Example: Take values of $x$ as 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10.
- $\bar{x} = 5$
- Low variability ($s_x = 3.317$)
- Mean-centered values: -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5
Now take values of $x$ as -20, -15, -10, -5, 0, 5, 10, 15, 20, 25, 30.
- $\bar{x} = 5$
- High variability ($s_x = 16.583$)
- Mean-centered values: -25, -20, -15, -10, -5, 0, 5, 10, 15, 20, 25
In the first set, being 5 below the mean means you are very far from the mean. In the second set, being 5 below the mean means you are not far from the mean.
Thus, to compare values and have a sense of placement within a dataset, variability must be handled properly.
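
The two datasets in this example can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the helper name `mean_center` is made up for illustration, and the standard deviation is assumed to be the sample version (n - 1 denominator), which matches the values on the slide.

```python
import statistics


def mean_center(values):
    """Return each value minus the sample mean of `values`."""
    mean = statistics.mean(values)
    return [v - mean for v in values]


low = list(range(0, 11))         # 0, 1, ..., 10
high = list(range(-20, 31, 5))   # -20, -15, ..., 30

for name, data in (("low variability", low), ("high variability", high)):
    centered = mean_center(data)
    sd = statistics.stdev(data)  # sample standard deviation (n - 1 denominator)
    print(f"{name}: mean = {statistics.mean(data)}, s_x = {sd:.3f}")
    print(f"  mean-centered values: {centered}")
```

Both datasets have the same mean, so a centered value of -5 looks identical in each; only the standard deviation reveals how unusual that distance is.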

Standardized Variable
To handle this issue, we can simply divide our mean-centered observations by the standard deviation of the data:
$$z_i = \frac{x_i - \bar{x}}{s_x}.$$
Note:
- The mean of $z$ is 0.
- The sd of $z$ is 1.
We call $z$ here a standardized variable, or a z-score. Each value of $z$ describes the number of standard deviations above or below the mean that a given observation is.

Standardized Variable (Example)
The mean and standard deviation of an IQ test are 100 and 15, respectively.
What is the z-score associated with an IQ score of 140?
$$z = \frac{140 - 100}{15} = \frac{40}{15} = 2.67$$
An IQ of 140 is 2.67 standard deviations above the mean!
What is the z-score associated with an IQ score of 90?
$$z = \frac{90 - 100}{15} = \frac{-10}{15} = -0.67$$
An IQ of 90 is 0.67 standard deviations below the mean!
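
The same arithmetic can be written as a small Python function. This is only a sketch: the name `z_score` and its parameters are chosen here for illustration, and the mean and standard deviation are taken as given (100 and 15) rather than estimated from data.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd


iq_mean, iq_sd = 100, 15  # values given in the IQ example

for score in (140, 90):
    z = z_score(score, iq_mean, iq_sd)
    direction = "above" if z >= 0 else "below"
    print(f"IQ {score}: z = {z:.2f}, i.e. {abs(z):.2f} SDs {direction} the mean")
```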

The Central Limit Theorem (CLT)
Suppose that we draw a simple random sample of size $n$ from any population distribution (with finite mean and variance). When $n$ is large enough, the Central Limit Theorem states that the sampling distribution of the sample mean $\bar{X}$ is approximately normal. That is,
$$\bar{X}_n \sim N\left(\mu_x, \frac{\sigma_x^2}{n}\right), \quad \text{as } n \to \infty.$$
If we standardize this result, then we find that as $n \to \infty$,
$$\frac{\bar{X}_n - \mu_x}{\sigma_x / \sqrt{n}} = \bar{Z}_n \sim N(0, 1).$$
"Large enough" is, in general, $n \geq 30$.
We will use this result in hypothesis testing.
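
A small simulation can make the theorem concrete. The sketch below is an illustration under stated assumptions, not part of the lecture: it draws samples from an Exponential(1) population (a skewed distribution with mean 1 and standard deviation 1, chosen here only as an example), uses n = 30, and checks that the standardized sample means behave roughly like a standard normal. The function name, the seed, and the number of repetitions are arbitrary choices.

```python
import math
import random
import statistics


def standardized_sample_means(n, reps, rng):
    """Draw `reps` samples of size `n` from an Exponential(1) population
    and return the standardized sample means (x_bar - mu) / (sigma / sqrt(n))."""
    mu, sigma = 1.0, 1.0  # mean and sd of the Exponential(1) population
    z_values = []
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        x_bar = statistics.fmean(sample)
        z_values.append((x_bar - mu) / (sigma / math.sqrt(n)))
    return z_values


rng = random.Random(0)  # fixed seed so the run is repeatable
z_values = standardized_sample_means(n=30, reps=5000, rng=rng)

# Share of standardized means within 1.96 of zero (about 0.95 for a standard normal).
share_within = sum(abs(z) <= 1.96 for z in z_values) / len(z_values)

print(f"mean of z: {statistics.fmean(z_values):.3f}")
print(f"sd of z:   {statistics.stdev(z_values):.3f}")
print(f"share with |z| <= 1.96: {share_within:.3f}")
```

If the approximation is reasonable, the mean should be close to 0, the standard deviation close to 1, and the share within 1.96 close to 0.95, even though the population itself is far from normal.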

Hypothesis Testing
Hypothesis testing is a process that involves testing tentative guesses (hypotheses) about relationships in a population. It can be viewed as a process of gathering evidence for (or against) a specific claim, typically regarding a research question being studied by a researcher. The researcher is concerned with testing whether or not the hypothesis can be supported empirically.
The null hypothesis, denoted by $H_0$, is the hypothesis in question. The researcher tests whether the data support or fail to support the null hypothesis.
The opposing hypothesis, called the alternative hypothesis, and denoted by $H_1$ or $H_A$, is the hypothesis that is accepted if the data fail to support the null.

Hypothesis Testing Procedure
1. Form a null hypothesis ($H_0$) and an alternative hypothesis ($H_1$).
2. Determine the rules for making a decision (i.e., rules for accepting or rejecting the null hypothesis).
3. Gather data! This is your evidence to support or reject $H_0$.
4. Use laws of probability/statistical sampling to test $H_0$ vs. $H_1$. Use the appropriate test statistic for testing $H_0$.
5. Accept or reject $H_0$ based on decision rules.
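
The five steps can be sketched in code for one simple case. Everything specific below is an assumption made for illustration and does not come from the slides: a two-sided test of a single mean, a known population standard deviation, a 0.05 significance level, and a made-up sample (n = 36, sample mean 106) compared against the IQ benchmark of 100 and 15. The function name `one_sample_z_test` is hypothetical; `statistics.NormalDist` is part of the Python standard library (3.8 and later).

```python
import math
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided z-test of H0: mu = mu0 against H1: mu != mu0, with sigma known.

    Returns (z, critical value, p-value, reject H0?).
    """
    standard_normal = NormalDist()  # mean 0, sd 1
    # Step 4: standardize the sample mean, as in the CLT.
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    # Step 2: decision rule, reject H0 when |z| exceeds the critical value.
    critical = standard_normal.inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - standard_normal.cdf(abs(z)))
    # Step 5: apply the decision rule.
    return z, critical, p_value, abs(z) > critical


# Step 1: H0: mu = 100, H1: mu != 100.
# Step 3: hypothetical data, invented for this sketch.
z, critical, p_value, reject = one_sample_z_test(
    sample_mean=106, mu0=100, sigma=15, n=36
)
print(f"z = {z:.2f}, critical value = {critical:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if reject else "Fail to reject H0")
```

The decision rule is fixed before the data are examined, which mirrors the order of the steps in the procedure.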