REGRESSION MODELS ANOVA

1 REGRESSION MODELS ANOVA

2 RECAP
Continuous outcome?
NO: Logistic regression and other methods
YES: Linear regression
Examine main effects, considering predictors of interest and confounders
Test effect modification if scientifically relevant
Compute and plot residuals; assess influence
Do the assumptions appear reasonable?
NO: Modify approach
YES: REPORT

3 COMING UP NEXT: ANOVA, a special case of linear regression
What if the independent variables of interest are categorical?
In this case, comparing the mean of the continuous outcome in the different categories may be of interest.
This is what is called ANalysis Of VAriance.
We will show that it is just a special case of linear regression.

4 ANOVA, a special case of linear regression
LINEAR REGRESSION
One-way Analysis of Variance: one categorical POI
Two-way Analysis of Variance: two categorical POIs
Analysis of Covariance: one categorical POI + one continuous predictor
Uses dummy variables to represent categorical variables!

5 Outline
Motivation: We will consider some examples of ANOVA and show that they are special cases of linear regression
ANOVA as a regression model
Dummy variables
One-way ANOVA models
Contrasts
Multiple comparisons
Two-way ANOVA models
Interactions
ANCOVA models

6 ANOVA/ANCOVA: Motivation
Let's investigate whether genetic factors are associated with cholesterol levels.
Ideally, you would have a confirmatory analysis of scientific hypotheses formulated prior to data collection.
Alternatively, you could consider an exploratory analysis: hypothesis generation for future studies.

7 ANOVA/ANCOVA: Motivation
Scientific hypotheses of interest:
Assess the effect of rs174548 on cholesterol levels.
Assess the effect of rs174548 and sex on cholesterol levels. Does the effect of rs174548 on cholesterol differ between males and females?
Assess the effect of rs174548 and age on cholesterol levels. Does the effect of rs174548 on cholesterol differ depending on the subject's age?

8 ANOVA: One-Way Model
Motivation:
Scientific question: Assess the effect of rs174548 on cholesterol levels.

9 Motivation: Example
Here are some descriptive summaries:
> tapply(chol, factor(rs174548), mean)
> tapply(chol, factor(rs174548), sd)

10 Motivation: Example
Another way of getting the same results:
> by(chol, factor(rs174548), mean)
> by(chol, factor(rs174548), sd)
Each call prints one value per genotype group: factor(rs174548): 0, factor(rs174548): 1 and factor(rs174548): 2.

11 Motivation: Example
Is rs174548 associated with cholesterol?
R command:
> boxplot(chol ~ factor(rs174548))

12 Motivation: Example
Another graphical display (mean of chol for each level of factor(rs174548)):
R command:
> plot.design(chol ~ factor(rs174548))

13 Motivation: Example
Feature: categorical/qualitative predictor
How do the mean responses compare across different groups?

14 REGRESSION MODELS
One-way ANOVA as a regression model

15 ANalysis Of VAriance Models (ANOVA)
Compares the means of several populations
Assumptions for the classical ANOVA framework:
Independence
Normality
Equal variances

16 ANalysis Of VAriance Models (ANOVA)
Compares the means of several populations

17 ANalysis Of VAriance Models (ANOVA)
Compares the means of several populations
Counter-intuitive name!

18 ANalysis Of VAriance Models (ANOVA)
In both data sets, the true population means are: 3 (A), 5 (B), 7 (C).
Situation 1: low variance within groups A, B, C
Situation 2: high variance within groups A, B, C
Where do you expect to detect a difference between population means?

19 ANalysis Of VAriance Models (ANOVA)
Compares the means of several populations
Counter-intuitive name!
Underlying concept: to assess whether the population means are equal, ANOVA compares
the variation between the sample means (MSR) to
the natural variation of the observations within the samples (MSE).
The larger the MSR compared to the MSE, the more support there is for a difference in the population means!
The ratio MSR/MSE is the F-statistic.
We can make these comparisons with multiple linear regression: the different groups are represented with dummy variables.
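The MSR/MSE comparison can be sketched in a few lines of R. This is a minimal illustration on simulated data, not the cholesterol data set: the variable names (group, y) and the within-group standard deviation are assumptions chosen only to mimic the three-group picture with true means 3, 5 and 7.

```r
# Hypothetical simulated data: three groups with true means 3, 5 and 7
set.seed(42)
group <- factor(rep(c("A", "B", "C"), each = 20))
y <- rnorm(60, mean = rep(c(3, 5, 7), each = 20), sd = 2)

K <- nlevels(group)  # number of groups
n <- length(y)       # total sample size

# Group mean repeated for every observation in that group
group_mean <- ave(y, group)

# Variation between the sample means (regression mean square)
msr <- sum((group_mean - mean(y))^2) / (K - 1)

# Natural variation within the samples (residual mean square)
mse <- sum((y - group_mean)^2) / (n - K)

f_by_hand <- msr / mse
p_by_hand <- pf(f_by_hand, K - 1, n - K, lower.tail = FALSE)

# The same F-statistic taken from the regression fit
f_from_lm <- anova(lm(y ~ group))[["F value"]][1]
print(c(by_hand = f_by_hand, from_lm = f_from_lm))
```

Raising sd in the simulation inflates the MSE while leaving the true means unchanged, which is the "high variance within groups" situation where a difference is harder to detect.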

20 ANOVA as a multiple regression model
Dummy variables:
Suppose you have a categorical variable C with k categories 0, 1, 2, ..., k-1.
To represent that variable we can construct k-1 dummy variables of the form
$x_j = 1$ if $C = j$, and $x_j = 0$ otherwise, for $j = 1, \dots, k-1$.
The omitted category (here category 0) is the reference group.

21 ANOVA as a multiple regression model
Dummy variables: back to our motivating example
Predictor: rs174548 (coded 0 = C/C, 1 = C/G, 2 = G/G)
Outcome (Y): cholesterol
Let's take C/C as the reference group.
$x_1 = 1$ if the code is 1 (C/G), and $x_1 = 0$ otherwise
$x_2 = 1$ if the code is 2 (G/G), and $x_2 = 0$ otherwise
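As a small check of this coding, the sketch below builds the two dummy variables by hand and compares them with the design matrix that R constructs for a factor. The genotype vector geno is a made-up example, not the study data.

```r
# Hypothetical genotype codes (0 = C/C, 1 = C/G, 2 = G/G)
geno <- c(0, 1, 2, 1, 0, 2)

# Dummy variables built by hand, with C/C (code 0) as the reference group
x1 <- 1 * (geno == 1)
x2 <- 1 * (geno == 2)

# Design matrix R builds for a factor: an intercept column plus k-1 dummies
X <- model.matrix(~ factor(geno))
print(cbind(X, x1, x2))
```

The second and third columns of X coincide with x1 and x2; the reference group is the one whose row has zeros in both.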

22 ANOVA as a multiple regression model
rs174548   Mean cholesterol   $x_1$   $x_2$
C/C        $\mu_0$            0       0
C/G        $\mu_1$            1       0
G/G        $\mu_2$            0       1

23 ANOVA as a multiple regression model
Regression with dummy variables: example
Model: $E[Y \mid x_1, x_2] = \beta_0 + \beta_1 x_1 + \beta_2 x_2$
Interpretation of model parameters?

24 ANOVA as a multiple regression model
Mean      Regression model
$\mu_0$   $\beta_0$
$\mu_1$   $\beta_0 + \beta_1$
$\mu_2$   $\beta_0 + \beta_2$

25 ANOVA as a multiple regression model
Regression with dummy variables: example
Model: $E[Y \mid x_1, x_2] = \beta_0 + \beta_1 x_1 + \beta_2 x_2$
Interpretation of model parameters?
$\mu_0 = \beta_0$: mean cholesterol when rs174548 is C/C
$\mu_1 = \beta_0 + \beta_1$: mean cholesterol when rs174548 is C/G
$\mu_2 = \beta_0 + \beta_2$: mean cholesterol when rs174548 is G/G

26 ANOVA as a multiple regression model
Regression with dummy variables: example
Model: $E[Y \mid x_1, x_2] = \beta_0 + \beta_1 x_1 + \beta_2 x_2$
Interpretation of model parameters?
$\mu_0 = \beta_0$: mean cholesterol when rs174548 is C/C
$\mu_1 = \beta_0 + \beta_1$: mean cholesterol when rs174548 is C/G
$\mu_2 = \beta_0 + \beta_2$: mean cholesterol when rs174548 is G/G
Alternatively:
$\beta_1$: difference in mean cholesterol levels between the groups with rs174548 equal to C/G and C/C ($\mu_1 - \mu_0$).
$\beta_2$: difference in mean cholesterol levels between the groups with rs174548 equal to G/G and C/C ($\mu_2 - \mu_0$).
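The correspondence between coefficients and group means can be verified numerically. The sketch below uses simulated data; the names geno and chol, the sample size and the effect sizes are all assumptions standing in for the cholesterol data set.

```r
# Hypothetical simulated data (0 = C/C, 1 = C/G, 2 = G/G)
set.seed(1)
geno <- sample(0:2, 300, replace = TRUE, prob = c(0.5, 0.35, 0.15))
chol <- 180 + 7 * (geno == 1) + 3 * (geno == 2) + rnorm(300, sd = 20)

fit <- lm(chol ~ factor(geno))
b <- unname(coef(fit))                       # b[1] = beta0, b[2] = beta1, b[3] = beta2
mu <- as.vector(tapply(chol, geno, mean))    # sample means of the three groups

# Group means rebuilt from the coefficients: beta0, beta0 + beta1, beta0 + beta2
mu_from_coef <- c(b[1], b[1] + b[2], b[1] + b[3])
print(cbind(group_mean = mu, from_coefficients = mu_from_coef))
```

The two columns agree, so the intercept is the reference-group mean and each slope is a difference from it.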

27 ANOVA: One-Way Model
Goal: Compare the means of K independent groups (defined by a categorical predictor)
Statistical hypotheses:
(Global) null hypothesis: $H_0: \mu_0 = \mu_1 = \dots = \mu_{K-1}$ or, equivalently, $H_0: \beta_1 = \beta_2 = \dots = \beta_{K-1} = 0$
Alternative hypothesis: $H_1$: not all means are equal
If the means of the groups are not all equal (i.e., you rejected the above $H_0$), determine which ones are different (multiple comparisons).

28 Estimation and Inference
Global hypotheses:
$H_0: \mu_0 = \mu_1 = \dots = \mu_{K-1}$ vs. $H_1$: not all means are equal
$H_0: \beta_1 = \beta_2 = \dots = \beta_{K-1} = 0$
Analysis of variance table (group $i$, observation $j$, total sample size $n$):
Source       df      SS                                                   MS                   F
Regression   K-1     $SSR = \sum_{i,j} (\bar{y}_i - \bar{y})^2$           $MSR = SSR/(K-1)$    $MSR/MSE$
Residual     n-K     $SSE = \sum_{i,j} (y_{ij} - \bar{y}_i)^2$            $MSE = SSE/(n-K)$
Total        n-1     $SST = \sum_{i,j} (y_{ij} - \bar{y})^2$

29 ANOVA: One-Way Model
How to fit a one-way model as a regression problem?
Need to use dummy variables
Create them on your own (can be tedious!)
Most software packages will do this for you
R creates dummy variables in the background as long as you state that you have a categorical variable (may need to use: factor)

30 ANOVA: One-Way Model
By hand:
Creating dummy variables:
> dummy1 = 1*(rs174548==1)
> dummy2 = 1*(rs174548==2)
Fitting the ANOVA model:
> fit0 = lm(chol ~ dummy1 + dummy2)
> summary(fit0)
Call: lm(formula = chol ~ dummy1 + dummy2)
Coefficients: (Intercept), Pr(>|t|) < 2e-16 ***; dummy1, **; dummy2
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error on 397 degrees of freedom
F-statistic on 2 and 397 DF
> anova(fit0)
Analysis of Variance Table
Response: chol
Rows: dummy1 (**), dummy2, Residuals

31 ANOVA: One-Way Model
Better: Let R do it for you!
> fit1.1 = lm(chol ~ factor(rs174548))
> summary(fit1.1)
Call: lm(formula = chol ~ factor(rs174548))
Coefficients: (Intercept), Pr(>|t|) < 2e-16 ***; factor(rs174548)1, **; factor(rs174548)2
Residual standard error on 397 degrees of freedom
F-statistic on 2 and 397 DF
> anova(fit1.1)
Analysis of Variance Table
Response: chol
Rows: factor(rs174548) (*), Residuals

32 ANOVA: One-Way Model
Your turn!
Compare the model fit results (fit0 and fit1.1).
What do you conclude?

33 ANOVA: One-Way Model
The two fits side by side:
> fit0 = lm(chol ~ dummy1 + dummy2)
> summary(fit0)
> anova(fit0)
> fit1.1 = lm(chol ~ factor(rs174548))
> summary(fit1.1)
> anova(fit1.1)
Both summaries report the residual standard error on 397 degrees of freedom and the F-statistic on 2 and 397 DF. anova(fit0) lists dummy1 (**) and dummy2 on separate rows, while anova(fit1.1) lists a single factor(rs174548) row (*).

34 ANOVA: One-Way Model
Same output as on the previous slide, plus:
> 1-pf(4.4865,2,397)
> 1-pf((( )/2)/481,2,397)
The first call converts the overall F-statistic (4.4865 on 2 and 397 DF) into its p-value. The second rebuilds the same F-statistic from anova(fit0): the dummy1 and dummy2 sums of squares are added and divided by 2, and the result is divided by the residual mean square (481).
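The comparison of the hand-coded and factor-based fits, and the pooling of the two 1-df rows into the overall F-test, can be reproduced on simulated data. Everything about the data below (names geno and chol, sample size, effect sizes) is an assumption for illustration; only the sequence of calls follows the slides.

```r
# Hypothetical simulated data (0 = C/C, 1 = C/G, 2 = G/G)
set.seed(1)
geno <- sample(0:2, 300, replace = TRUE, prob = c(0.5, 0.35, 0.15))
chol <- 180 + 7 * (geno == 1) + 3 * (geno == 2) + rnorm(300, sd = 20)

# Dummy variables by hand versus letting R create them
dummy1 <- 1 * (geno == 1)
dummy2 <- 1 * (geno == 2)
fit_dummy  <- lm(chol ~ dummy1 + dummy2)
fit_factor <- lm(chol ~ factor(geno))

# Overall F-statistic with its degrees of freedom, and the p-value via pf()
fstat <- summary(fit_factor)$fstatistic
p_overall <- pf(fstat[["value"]], fstat[["numdf"]], fstat[["dendf"]],
                lower.tail = FALSE)

# anova(fit_dummy) splits the 2-df regression sum of squares into two 1-df rows;
# pooling them and dividing by the residual mean square recovers the overall F
tab <- anova(fit_dummy)
f_pooled <- (sum(tab[["Sum Sq"]][1:2]) / 2) / tab[["Mean Sq"]][3]

print(c(overall_F = fstat[["value"]], pooled_F = f_pooled, p_value = p_overall))
```

Using lower.tail = FALSE is equivalent to the 1 - pf(...) form on the slide and is slightly more accurate for very small p-values.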

35 ANOVA: One-Way Model
> fit1.1 = lm(chol ~ factor(rs174548))
> summary(fit1.1)
> anova(fit1.1)
Let's interpret the regression model results!
What is the interpretation of the regression model coefficients?

36 ANOVA: One-Way Model
> fit1.1 = lm(chol ~ factor(rs174548))
> summary(fit1.1)
Interpretation:
Estimated mean cholesterol for the C/C group: the (Intercept) estimate, in mg/dl
Estimated difference in mean cholesterol levels between the C/G and C/C groups: the factor(rs174548)1 estimate, in mg/dl
Estimated difference in mean cholesterol levels between the G/G and C/C groups: the factor(rs174548)2 estimate, in mg/dl

37 ANOVA: One-Way Model
> fit1.1 = lm(chol ~ factor(rs174548))
> summary(fit1.1)
> anova(fit1.1)
The overall F-test (on 2 and 397 DF) shows a significant p-value.
We reject the null hypothesis that the mean cholesterol levels are the same across the groups defined by rs174548.
This does not tell us which groups are different! (Need to perform multiple comparisons! More soon.)

38 ANOVA: One-Way Model
Alternative form (better if you will perform multiple comparisons):
> fit1.2 = lm(chol ~ -1 + factor(rs174548))
> summary(fit1.2)
Call: lm(formula = chol ~ -1 + factor(rs174548))
Coefficients: factor(rs174548)0, factor(rs174548)1 and factor(rs174548)2, each with Pr(>|t|) < 2e-16 ***
Residual standard error on 397 degrees of freedom
F-statistic: 9383 on 3 and 397 DF, p-value: < 2.2e-16
> anova(fit1.2)
Analysis of Variance Table
Response: chol
Rows: factor(rs174548) (Pr(>F) < 2.2e-16 ***), Residuals
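A short sketch of what removing the intercept does, again on simulated data with assumed names (geno, chol): each coefficient becomes a group mean rather than a difference from the reference group, while the fitted values are unchanged.

```r
# Hypothetical simulated data (0 = C/C, 1 = C/G, 2 = G/G)
set.seed(1)
geno <- sample(0:2, 300, replace = TRUE, prob = c(0.5, 0.35, 0.15))
chol <- 180 + 7 * (geno == 1) + 3 * (geno == 2) + rnorm(300, sd = 20)

fit_ref   <- lm(chol ~ factor(geno))        # reference-group coding
fit_means <- lm(chol ~ -1 + factor(geno))   # one coefficient per group mean

group_means <- as.vector(tapply(chol, geno, mean))
print(cbind(coefficient = unname(coef(fit_means)), group_mean = group_means))

# The F-statistic printed by summary(fit_means) has 3 numerator df: it tests
# whether all three means are zero, not whether they are equal to each other
print(summary(fit_means)$fstatistic)
```

This is why the F-statistic on 3 and 397 DF in the no-intercept output is so large: it answers a different question from the global test of equal means.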

39 ANOVA: One-Way Model
How about this one? How is rs174548 being treated now?
Compare the model fit results from fit1.1 and fit2.
> fit2 = lm(chol ~ rs174548)
> summary(fit2)
Call: lm(formula = chol ~ rs174548)
Coefficients: (Intercept), Pr(>|t|) < 2e-16 ***; rs174548, **
Residual standard error on 398 degrees of freedom
F-statistic on 1 and 398 DF
> anova(fit2)
Analysis of Variance Table
Response: chol
Rows: rs174548 (**), Residuals

40 ANOVA: One-Way Model
> fit2 = lm(chol ~ rs174548)
> summary(fit2)
> anova(fit2)
Model: $E[Y \mid x] = \beta_0 + \beta_1 x$, where Y: cholesterol, x: rs174548
Interpretation of model parameters?
$\beta_0$: mean cholesterol in the C/C group [the (Intercept) estimate, in mg/dl]
$\beta_1$: mean cholesterol difference between the C/G and C/C groups, or between the G/G and C/G groups [the rs174548 estimate, in mg/dl]
This model presumes that the differences between consecutive groups are the same (in this example, a linear dose effect of the allele), so it is more restrictive than the ANOVA model!
Back to the ANOVA model.
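Because the linear-dose model is nested in the ANOVA model, the two can be compared with a partial F-test. This comparison is not shown on the slides; the sketch below is one way it might be done, on simulated data with assumed names (geno, chol).

```r
# Hypothetical simulated data (0 = C/C, 1 = C/G, 2 = G/G)
set.seed(1)
geno <- sample(0:2, 300, replace = TRUE, prob = c(0.5, 0.35, 0.15))
chol <- 180 + 7 * (geno == 1) + 3 * (geno == 2) + rnorm(300, sd = 20)

fit_linear <- lm(chol ~ geno)           # genotype as a number: one slope
fit_anova  <- lm(chol ~ factor(geno))   # genotype as a factor: two dummies

# Partial F-test of the single extra parameter in the ANOVA model
cmp <- anova(fit_linear, fit_anova)
print(cmp)
```

A small p-value here would suggest that the equal-step assumption of the linear model does not hold; a large one means the data do not contradict it.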

41 ANOVA: One-Way Model
> fit1.1 = lm(chol ~ factor(rs174548))
> summary(fit1.1)
> anova(fit1.1)
We rejected the null hypothesis that the mean cholesterol levels are the same across the groups defined by rs174548 (overall F-test on 2 and 397 DF).
What are the groups with differences in means?
MULTIPLE COMPARISONS (coming up)

42 One-Way ANOVA allowing for unequal variances
We can also perform one-way ANOVA allowing for unequal variances:
> oneway.test(chol ~ factor(rs174548))
One-way analysis of means (not assuming equal variances)
data: chol and factor(rs174548)
F = , num df = 2.000, denom df = , p-value =
We reject the null hypothesis that the mean cholesterol levels are the same across the groups defined by rs174548.
What are the groups with differences in means?
MULTIPLE COMPARISONS (coming up)

43 One-Way ANOVA with robust standard errors
> library(gee)
> summary(gee(chol ~ factor(rs174548), id=seq(1,length(chol))))
Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27
running glm to get initial regression estimate
GEE: GENERALIZED LINEAR MODELS FOR DEPENDENT DATA
gee S-function, version 4.13 modified 98/01/27 (1998)
Model:
Link: Identity
Variance to Mean Relation: Gaussian
Correlation Structure: Independent
Call: gee(formula = chol ~ factor(rs174548), id = seq(1, length(chol)))
Coefficients table columns: Estimate, Naive S.E., Naive z, Robust S.E., Robust z
Rows: (Intercept), factor(rs174548)1, factor(rs174548)2
Number of Iterations: 1

44 Kruskal-Wallis Test
Non-parametric analogue to the one-way ANOVA
Based on ranks
In our example:
> kruskal.test(chol ~ factor(rs174548))
Kruskal-Wallis rank sum test
data: chol by factor(rs174548)
Kruskal-Wallis chi-squared = , df = 2, p-value =
Conclusion: Evidence that the cholesterol distribution is not the same across all groups.
With the global null rejected, you can also perform pairwise comparisons [Wilcoxon rank sum], but adjust for multiplicities!
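The pairwise follow-up can be sketched with pairwise.wilcox.test from base R, which runs all pairwise Wilcoxon rank sum tests and adjusts the p-values in one call. The data are simulated, and the choice of Bonferroni here is only an example of an adjustment, not a recommendation from the slides.

```r
# Hypothetical simulated data (0 = C/C, 1 = C/G, 2 = G/G)
set.seed(1)
geno <- sample(0:2, 300, replace = TRUE, prob = c(0.5, 0.35, 0.15))
chol <- 180 + 7 * (geno == 1) + 3 * (geno == 2) + rnorm(300, sd = 20)

# Global rank-based test across the three groups
kw <- kruskal.test(chol ~ factor(geno))
print(kw)

# Pairwise Wilcoxon rank sum tests with Bonferroni-adjusted p-values
pw <- pairwise.wilcox.test(chol, factor(geno), p.adjust.method = "bonferroni")
print(pw$p.value)
```

The result is a lower-triangular matrix with one adjusted p-value per pair of groups.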

45 REGRESSION METHODS
MULTIPLE COMPARISONS

46 ANOVA: One-Way Model
What are the groups with differences in means?
MULTIPLE COMPARISONS:
Pairwise comparisons: $\mu_0 = \mu_1$? $\mu_0 = \mu_2$? $\mu_1 = \mu_2$?
Non-pairwise comparison: $(\mu_1 + \mu_2)/2 = \mu_0$?

47 Multiple Comparisons: Family-wise error rates
Illustrating the multiple comparison problem
Truth: all null hypotheses are true
Tests: pairwise comparisons, each at the 5% level
What is the probability of rejecting at least one?
Number of groups = K
Number of pairwise comparisons: $C = K(K-1)/2$
$P(\text{at least one significant}) = 1 - (1 - 0.05)^C$ (treating the C tests as independent)
That is, if you have three groups and make all pairwise comparisons, each at the 5% level, your family-wise error rate (the probability of making at least one false rejection) is over 14%!
Need to address this issue! Several methods!!!
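The family-wise error rate formula is easy to evaluate directly. The helper name fwer below is hypothetical, and the calculation treats the C tests as independent, exactly as the formula above does.

```r
# Family-wise error rate for all pairwise comparisons among K groups,
# each tested at level alpha (hypothetical helper; assumes independent tests)
fwer <- function(K, alpha = 0.05) {
  C <- K * (K - 1) / 2      # number of pairwise comparisons
  1 - (1 - alpha)^C
}

# Three groups give C = 3 comparisons
print(fwer(3))

# The rate grows quickly with the number of groups
print(sapply(3:6, fwer))
```

For K = 3 the value is 1 - 0.95^3 = 0.142625, the "over 14%" quoted above.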

48 Multiple Comparisons
Several methods, all available in R:
None (no adjustment)
Bonferroni
Holm
Hochberg
Hommel
BH
BY
FDR

49 Multiple Comparisons
Bonferroni adjustment: for C tests performed, use level $\alpha/C$ (or multiply the p-values by C).
Simple
Conservative
Must decide on the number of tests beforehand
Widely applicable
Can be done without software!

50 Multiple Comparisons
FDR (False Discovery Rate)
Less conservative procedure for multiple comparisons
Among the rejected hypotheses, FDR controls the expected proportion of incorrectly rejected null hypotheses (that is, type I errors).
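The adjustments can be compared on a toy set of p-values with p.adjust from base R. The three raw p-values are made up for illustration.

```r
# Hypothetical raw p-values from C = 3 pairwise comparisons
p_raw <- c(0.01, 0.02, 0.03)

# Bonferroni: multiply each p-value by the number of tests (capped at 1)
p_bonf <- p.adjust(p_raw, method = "bonferroni")

# Holm: step-down version of Bonferroni, never more conservative than it
p_holm <- p.adjust(p_raw, method = "holm")

# Benjamini-Hochberg: controls the false discovery rate
p_bh <- p.adjust(p_raw, method = "BH")

print(rbind(raw = p_raw, bonferroni = p_bonf, holm = p_holm, BH = p_bh))
```

At the 5% level, Bonferroni keeps only the first comparison, while Holm and BH keep all three, which illustrates how different criteria can lead to different conclusions.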

51 Multiple Comparisons
This option considers all pairwise comparisons.
> ## call library for multiple comparisons
> library(multcomp)
> ## fit model
> fit1 = lm(chol ~ -1 + factor(rs174548))
> ## all pairwise comparisons
> ## -- first, define matrix of contrasts
> M = contrMat(table(rs174548), type="Tukey")
> M
Multiple Comparisons of Means: Tukey Contrasts
> ## -- second, obtain estimates for multiple comparisons
> mc = glht(fit1, linfct = M)
(glht stands for general linear hypothesis testing)

52 Multiple Comparisons
> ## -- third, adjust the p-values (or not) for multiple comparisons
> summary(mc, test=adjusted("none"))
Simultaneous Tests for General Linear Hypotheses
Multiple Comparisons of Means: Tukey Contrasts
Fit: lm(formula = chol ~ -1 + factor(rs174548))
Linear Hypotheses: 1 - 0 == 0 (**); 2 - 0 == 0; 2 - 1 == 0
(Adjusted p values reported -- none method)

53 Multiple Comparisons
> summary(mc, test=adjusted("bonferroni"))
Simultaneous Tests for General Linear Hypotheses
Multiple Comparisons of Means: Tukey Contrasts
Fit: lm(formula = chol ~ -1 + factor(rs174548))
Linear Hypotheses: 1 - 0 == 0 (*); 2 - 0 == 0; 2 - 1 == 0
(Adjusted p values reported -- bonferroni method)

54 Multiple Comparisons
> summary(mc, test=adjusted("fdr"))
Simultaneous Tests for General Linear Hypotheses
Multiple Comparisons of Means: Tukey Contrasts
Fit: lm(formula = chol ~ -1 + factor(rs174548))
Linear Hypotheses: 1 - 0 == 0 (*); 2 - 0 == 0; 2 - 1 == 0
(Adjusted p values reported -- fdr method)

55 Multiple Comparisons
What about using other adjustment methods?
For example, we used:
> summary(mc, test=adjusted("bonferroni"))
(all pairwise comparisons, with Bonferroni adjustment)
> summary(mc, test=adjusted("fdr"))
(all pairwise comparisons, with FDR adjustment)
Other options are:
> summary(mc, test=adjusted("holm"))
> summary(mc, test=adjusted("hochberg"))
> summary(mc, test=adjusted("hommel"))
> summary(mc, test=adjusted("BH"))
> summary(mc, test=adjusted("BY"))
Results, in this particular example, are basically the same, but they don't need to be! Different criteria could lead to different results!

56 Summary: Relationships
GOAL: Comparison of means across K groups
One-way ANOVA: $H_0: \mu_0 = \mu_1 = \dots = \mu_{K-1}$ vs. $H_1$: not all means are equal
Multiple regression: model $E[Y \mid \text{groups}] = \beta_0 + \beta_1\,\text{group}_2 + \dots + \beta_{K-1}\,\text{group}_K$, where group 1 is the reference group; $H_0: \beta_1 = \beta_2 = \dots = \beta_{K-1} = 0$ vs. $H_1$: not all $\beta_i$ are equal to zero
Rejected $H_0$? YES: multiple comparisons (control the overall $\alpha$), e.g. Bonferroni: $\alpha$/#comparisons

57 REGRESSION METHODS
Two-way ANOVA models

58 ANOVA: Two-Way Model
Motivation:
Scientific question: Assess the effect of rs174548 and sex on cholesterol levels.

59 ANOVA: Two-Way Model
Factors: A and B
Goals:
Test for main effect of A
Test for main effect of B
Test for interaction effect of A and B

60 ANOVA: Two-Way Model
To simplify the discussion, assume that factor A has three levels, while factor B has two levels.
                  Factor A
Factor B     $A_1$        $A_2$        $A_3$
$B_1$        $\mu_{11}$   $\mu_{21}$   $\mu_{31}$
$B_2$        $\mu_{12}$   $\mu_{22}$   $\mu_{32}$

61 ANOVA: Two-Way Model
[Figure: two panels of means plotted against the levels $A_1$, $A_2$, $A_3$, with one line for $B_1$ and one for $B_2$]
Parallel lines = no interaction
Lines are not parallel = interaction

62 ANOVA: Two-Way Model
Recall:
Categorical variables can be represented with dummy variables
Interactions are represented with cross-products

63 ANOVA: Two-Way Model
Model 1: $E[Y \mid A_2, A_3, B_2] = \beta_0 + \beta_1 A_2 + \beta_2 A_3 + \beta_3 B_2$
What are the means in each combination-group?
         $A_1$                          $A_2$                                    $A_3$
$B_1$    $\mu_{11} = \beta_0$           $\mu_{21} = \beta_0 + \beta_1$           $\mu_{31} = \beta_0 + \beta_2$
$B_2$    $\mu_{12} = \beta_0 + \beta_3$ $\mu_{22} = \beta_0 + \beta_1 + \beta_3$ $\mu_{32} = \beta_0 + \beta_2 + \beta_3$

64 ANOVA: Two-Way Model
Model 1: $E[Y \mid A_2, A_3, B_2] = \beta_0 + \beta_1 A_2 + \beta_2 A_3 + \beta_3 B_2$
         $A_1$                          $A_2$                                    $A_3$
$B_1$    $\mu_{11} = \beta_0$           $\mu_{21} = \beta_0 + \beta_1$           $\mu_{31} = \beta_0 + \beta_2$
$B_2$    $\mu_{12} = \beta_0 + \beta_3$ $\mu_{22} = \beta_0 + \beta_1 + \beta_3$ $\mu_{32} = \beta_0 + \beta_2 + \beta_3$
Model with no interaction:
The difference in means between groups defined by factor B does not depend on the level of factor A.
The difference in means between groups defined by factor A does not depend on the level of factor B.

65 ANOVA: Two-Way Model
Model 2: $E[Y \mid A_2, A_3, B_2] = \beta_0 + \beta_1 A_2 + \beta_2 A_3 + \beta_3 B_2 + \beta_4 A_2 B_2 + \beta_5 A_3 B_2$
What are the means in each combination-group?
         $A_1$                          $A_2$                                              $A_3$
$B_1$    $\mu_{11} = \beta_0$           $\mu_{21} = \beta_0 + \beta_1$                     $\mu_{31} = \beta_0 + \beta_2$
$B_2$    $\mu_{12} = \beta_0 + \beta_3$ $\mu_{22} = \beta_0 + \beta_1 + \beta_3 + \beta_4$ $\mu_{32} = \beta_0 + \beta_2 + \beta_3 + \beta_5$

66 ANOVA: Two-Way Model
Three (possible) tests:
Interaction of A and B (may want to start here). Rejection would imply that the differences between the means of A depend on the level of B (and vice versa), so stop.
Main effect of A. Test only if there is no interaction.
Main effect of B. Test only if there is no interaction.
[Note: If you have one observation per cell, you cannot test the interaction!]

67 ANOVA: Two-Way Model
Model without interaction: $E[Y \mid A_2, A_3, B_2] = \beta_0 + \beta_1 A_2 + \beta_2 A_3 + \beta_3 B_2$
How do we test for the main effect of factor A? $H_0: \beta_1 = \beta_2 = 0$ vs. $H_1$: $\beta_1$ or $\beta_2$ not zero
How do we test for the main effect of factor B? $H_0: \beta_3 = 0$ vs. $H_1$: $\beta_3$ not zero

68 ANOVA: Two-Way Model
Model with interaction: $E[Y \mid A_2, A_3, B_2] = \beta_0 + \beta_1 A_2 + \beta_2 A_3 + \beta_3 B_2 + \beta_4 A_2 B_2 + \beta_5 A_3 B_2$
How do we test for interactions? $H_0: \beta_4 = \beta_5 = 0$ vs. $H_1$: $\beta_4$ or $\beta_5$ not zero
IMPORTANT: If you reject the null, do not test main effects!!!
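The interaction test amounts to comparing the additive model with the model that adds the cross-product terms. A minimal sketch on simulated data follows; the variable names (sex, geno, chol), the 0/1 coding of sex and all effect sizes are assumptions for illustration.

```r
# Hypothetical simulated data: sex (0/1) and genotype (0 = C/C, 1 = C/G, 2 = G/G)
set.seed(2)
n <- 400
sex  <- rbinom(n, 1, 0.5)
geno <- sample(0:2, n, replace = TRUE, prob = c(0.5, 0.35, 0.15))
chol <- 175 + 10 * sex + 5 * (geno == 1) + 12 * sex * (geno == 1) +
  rnorm(n, sd = 20)

fit_add <- lm(chol ~ factor(sex) + factor(geno))   # beta0 .. beta3
fit_int <- lm(chol ~ factor(sex) * factor(geno))   # adds beta4 and beta5

# Partial F-test of H0: beta4 = beta5 = 0 (2 numerator df)
cmp <- anova(fit_add, fit_int)
print(cmp)
```

If this test rejects, the main-effect coefficients no longer describe a single effect of either factor, which is the reason for not testing main effects after a significant interaction.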

69 ANOVA: Two-Way Model (without interaction)
> fit1 = lm(chol ~ factor(sex) + factor(rs174548))
> summary(fit1)
Call: lm(formula = chol ~ factor(sex) + factor(rs174548))
Coefficients: (Intercept), Pr(>|t|) < 2e-16 ***; factor(sex)1, ***; factor(rs174548)1, **; factor(rs174548)2
Residual standard error on 396 degrees of freedom
F-statistic: 12.2 on 3 and 396 DF, p-value: 1.196e-07
> anova(fit0,fit1)
Analysis of Variance Table
Model 1: chol ~ factor(sex)
Model 2: chol ~ factor(sex) + factor(rs174548)
The comparison is significant (**). Here fit0 is the sex-only model, as the Model 1 line shows.

70 ANOVA: Two-Way Model (without interaction)
> fit1 = lm(chol ~ factor(sex) + factor(rs174548))
> summary(fit1)
> anova(fit0,fit1)
Interpretation of results:
Estimated mean cholesterol for the male C/C group: the (Intercept) estimate, in mg/dl
Estimated difference in mean cholesterol levels between females and males, adjusted for genotype: the factor(sex)1 estimate, in mg/dl
Estimated difference in mean cholesterol levels between the C/G and C/C groups, adjusted for sex: the factor(rs174548)1 estimate, in mg/dl
Estimated difference in mean cholesterol levels between the G/G and C/C groups, adjusted for sex: the factor(rs174548)2 estimate, in mg/dl
There is evidence that cholesterol is associated with sex (p < 0.001).
There is evidence that cholesterol is associated with genotype (p = 0.005).

71 ANOVA: Two-Way Model (without interaction)
In words:
Adjusting for sex, the difference in mean cholesterol comparing C/G to C/C is the factor(rs174548)1 estimate, and comparing G/G to C/C it is the factor(rs174548)2 estimate.
This difference does not depend on sex (this is because the model does not have an interaction between sex and genotype!).

72 ANOVA: Two-Way Model (with interaction)
> fit2 = lm(chol ~ factor(sex) * factor(rs174548))
> summary(fit2)
Call: lm(formula = chol ~ factor(sex) * factor(rs174548))
Coefficients: (Intercept), Pr(>|t|) < 2e-16 ***; factor(sex)1, *; factor(rs174548)1; factor(rs174548)2; factor(sex)1:factor(rs174548)1, **; factor(sex)1:factor(rs174548)2
Residual standard error on 394 degrees of freedom
F-statistic: 9.14 on 5 and 394 DF, p-value: 3.062e-08

73 ANOVA: Model comparison
> anova(fit1,fit2)
Analysis of Variance Table
Model 1: chol ~ factor(sex) + factor(rs174548)
Model 2: chol ~ factor(sex) * factor(rs174548)
The comparison is significant (*).

74 ANOVA: Two-Way Model (with interaction)
> fit2 = lm(chol ~ factor(sex) * factor(rs174548))
> summary(fit2)
> anova(fit1,fit2)
Interpretation of results:
Estimated mean cholesterol for the male C/C group: the (Intercept) estimate, in mg/dl
Estimated mean cholesterol for the female C/C group: (Intercept) + factor(sex)1, in mg/dl
Estimated mean cholesterol for the male C/G group: (Intercept) + factor(rs174548)1, in mg/dl
Estimated mean cholesterol for the female C/G group: (Intercept) + factor(sex)1 + factor(rs174548)1 + factor(sex)1:factor(rs174548)1, in mg/dl
There is evidence for an interaction between sex and genotype (p = 0.015).
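The way cell means are assembled from the interaction-model coefficients can be checked numerically. The sketch uses simulated data with assumed names and assumes sex is coded 0 = male, 1 = female, matching the interpretation above.

```r
# Hypothetical simulated data: sex (0 = male, 1 = female), genotype (0, 1, 2)
set.seed(2)
n <- 400
sex  <- rbinom(n, 1, 0.5)
geno <- sample(0:2, n, replace = TRUE, prob = c(0.5, 0.35, 0.15))
chol <- 175 + 10 * sex + 5 * (geno == 1) + 12 * sex * (geno == 1) +
  rnorm(n, sd = 20)

fit_int <- lm(chol ~ factor(sex) * factor(geno))
b <- unname(coef(fit_int))
# Coefficient order: intercept, sex1, geno1, geno2, sex1:geno1, sex1:geno2

male_CC   <- b[1]
female_CC <- b[1] + b[2]
male_CG   <- b[1] + b[3]
female_CG <- b[1] + b[2] + b[3] + b[5]

# Because this model has one parameter per cell, these equal the sample cell means
print(c(male_CC = male_CC, female_CC = female_CC,
        male_CG = male_CG, female_CG = female_CG))
print(mean(chol[sex == 1 & geno == 1]))
```

The female C/G mean needs four coefficients because that cell is off the reference level on both factors and also picks up the interaction term.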

75 SUMMARY: Two-Way ANOVA
Significant interaction?
NO: Interpret the main effects of factor A and factor B
YES: Interpret the effect of factor A on the mean response for each level of factor B (or the effect of factor B on the mean response for each level of factor A)

76 REGRESSION METHODS
ANCOVA (aka ANACOVA)

77 ANalysis of COVAriance Models (ANCOVA)
Motivation:
Scientific question: Assess the effect of rs174548 on cholesterol levels adjusting for age

78 ANalysis of COVAriance Models (ANCOVA)
ANOVA with one or more continuous variables
Equivalent to regression with dummy variables and continuous variables
The primary comparison of interest is across k groups defined by a categorical variable, but the k groups may differ on some other potential predictor or confounder variables [also called covariates].

79 ANalysis of COVAriance Models (ANCOVA)
To facilitate discussion, assume
Y: continuous response (e.g. cholesterol)
X: continuous variable (e.g. age)
Z: dummy variable (e.g. indicator of C/G or G/G versus C/C)
Model: $Y = \beta_0 + \beta_1 X + \beta_2 Z + \beta_3 XZ + \varepsilon$, where $\beta_3$ multiplies the interaction term
Note that:
$Z = 0 \Rightarrow E[Y \mid X, Z = 0] = \beta_0 + \beta_1 X$
$Z = 1 \Rightarrow E[Y \mid X, Z = 1] = (\beta_0 + \beta_2) + (\beta_1 + \beta_3) X$
This model allows for different intercepts/slopes for each group.

80 ANCOVA
Testing coincident lines: $H_0: \beta_2 = 0, \beta_3 = 0$
Compares the overall model with the reduced model $Y = \beta_0 + \beta_1 X + \varepsilon$
Testing parallelism: $H_0: \beta_3 = 0$
Compares the overall model with the reduced model $Y = \beta_0 + \beta_1 X + \beta_2 Z + \varepsilon$
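Both tests are nested-model comparisons and can be sketched with anova() on simulated data. The names (age, geno, chol) and effect sizes are assumptions; with three genotype groups there are two dummy variables, so the degrees of freedom are 4 and 2 rather than the 2 and 1 of the single-dummy model written above.

```r
# Hypothetical simulated data: age in years and genotype (0, 1, 2)
set.seed(3)
n <- 400
age  <- round(runif(n, 30, 80))
geno <- sample(0:2, n, replace = TRUE, prob = c(0.5, 0.35, 0.15))
chol <- 160 + 0.3 * age + 6 * (geno == 1) + rnorm(n, sd = 20)

fit_age      <- lm(chol ~ age)                  # one line for all groups
fit_parallel <- lm(chol ~ factor(geno) + age)   # group-specific intercepts
fit_full     <- lm(chol ~ factor(geno) * age)   # group-specific intercepts and slopes

# Coincident lines: are all genotype terms (intercepts and slopes) zero?
coincident <- anova(fit_age, fit_full)

# Parallelism: are the genotype-by-age interaction terms zero?
parallel <- anova(fit_parallel, fit_full)

print(coincident)
print(parallel)
```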

81 ANCOVA
> fit0 = lm(chol ~ factor(rs174548))
> summary(fit0)
Call: lm(formula = chol ~ factor(rs174548))
Coefficients: (Intercept), Pr(>|t|) < 2e-16 ***; factor(rs174548)1, **; factor(rs174548)2
Residual standard error on 397 degrees of freedom
F-statistic on 2 and 397 DF
> anova(fit0)
Analysis of Variance Table
Response: chol
Rows: factor(rs174548) (*), Residuals

82 ANCOVA
> fit1 = lm(chol ~ factor(rs174548) + age)
> summary(fit1)
Call: lm(formula = chol ~ factor(rs174548) + age)
Coefficients: (Intercept), Pr(>|t|) < 2e-16 ***; factor(rs174548)1, **; factor(rs174548)2; age, ***
Residual standard error on 396 degrees of freedom
F-statistic on 3 and 396 DF, p-value: 5.778e-06
> anova(fit0,fit1)
Analysis of Variance Table
Model 1: chol ~ factor(rs174548)
Model 2: chol ~ factor(rs174548) + age
The comparison is significant (***).

83 ANCOVA
[Figure: total cholesterol (mg/dl) against age (years), with separate groups for C/C, C/G and G/G]

84 ANCOVA
> fit2 = lm(chol ~ factor(rs174548) * age)
> summary(fit2)
Call: lm(formula = chol ~ factor(rs174548) * age)
Coefficients: (Intercept), Pr(>|t|) < 2e-16 ***; factor(rs174548)1; factor(rs174548)2; age, **; factor(rs174548)1:age; factor(rs174548)2:age
Residual standard error on 394 degrees of freedom
F-statistic on 5 and 394 DF, p-value: 4.065e

85 ANCOVA
Test of coincident lines
> fit0 = lm(chol ~ age)
> summary(fit0)
Call: lm(formula = chol ~ age)
Coefficients: (Intercept), Pr(>|t|) < 2e-16 ***; age, ***
Residual standard error on 398 degrees of freedom
F-statistic on 1 and 398 DF, p-value: 4.522e-05
> anova(fit0,fit2)
Analysis of Variance Table
Model 1: chol ~ age
Model 2: chol ~ factor(rs174548) * age
The comparison is significant (*).

86 ANCOVA
Test of parallel lines
> anova(fit1,fit2)
Analysis of Variance Table
Model 1: chol ~ factor(rs174548) + age
Model 2: chol ~ factor(rs174548) * age
No significance stars are shown for this comparison.

87 ANCOVA
[Figure: total cholesterol (mg/dl) against age (years), with separate groups for C/C, C/G and G/G]

88 ANCOVA
In summary:
If the slopes are not equal, then age is an effect modifier:
$E[Y \mid x, z] = \beta_0 + \beta_1 x + \beta_2 (CG) + \beta_3 (GG) + \beta_4\, x \cdot (CG) + \beta_5\, x \cdot (GG)$
If the slopes are the same:
$E[Y \mid x, z] = \beta_0 + \beta_1 x + \beta_2 (CG) + \beta_3 (GG)$

89 ANCOVA
If the slopes are the same, $E[Y \mid x, z] = \beta_0 + \beta_1 x + \beta_2 (CG) + \beta_3 (GG)$, then one can obtain adjusted means for the three genotypes using the mean age $\bar{x}$ over all groups.
For example, the adjusted means for the three groups would be
C/C: $Y(\text{adj}) = \hat\beta_0 + \hat\beta_1 \bar{x}$
C/G: $Y(\text{adj}) = (\hat\beta_0 + \hat\beta_2) + \hat\beta_1 \bar{x}$
G/G: $Y(\text{adj}) = (\hat\beta_0 + \hat\beta_3) + \hat\beta_1 \bar{x}$

90 ANCOVA
> ## mean cholesterol for different genotypes adjusted by age
> predict(fit1, newdata=data.frame(age=mean(age),rs174548=0))
> predict(fit1, newdata=data.frame(age=mean(age),rs174548=1))
> predict(fit1, newdata=data.frame(age=mean(age),rs174548=2))
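The adjusted means from predict() can be checked against the coefficient formulas. This sketch uses simulated data with assumed names (age, geno, chol) and computes all three adjusted means in one call by passing a three-row data frame.

```r
# Hypothetical simulated data: age in years and genotype (0, 1, 2)
set.seed(3)
n <- 400
age  <- round(runif(n, 30, 80))
geno <- sample(0:2, n, replace = TRUE, prob = c(0.5, 0.35, 0.15))
chol <- 160 + 0.3 * age + 6 * (geno == 1) + rnorm(n, sd = 20)

# Parallel-lines ANCOVA model: common age slope, genotype-specific intercepts
fit1 <- lm(chol ~ factor(geno) + age)

# One row per genotype, all evaluated at the overall mean age
new_groups <- data.frame(age = mean(age), geno = 0:2)
adj_means <- predict(fit1, newdata = new_groups)

# Same quantities from the coefficients (order: intercept, geno1, geno2, age)
b <- unname(coef(fit1))
adj_by_hand <- c(b[1], b[1] + b[2], b[1] + b[3]) + b[4] * mean(age)

print(cbind(predict = unname(adj_means), by_hand = adj_by_hand))
```

Note that the age coefficient is last in R's ordering for this formula, whereas the slide formulas label the slope $\beta_1$; only the labelling differs.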

91 SUMMARY: ANCOVA Sigificat Iteractio? (slopes are differet?) NO Cotrol for potetial cofouder? YES YES Iterpret the differece i meas of the respose for give values of the cotiuous variable Compute adjusted meas at the commo X mea 231

92 Summary
We have considered:
ANOVA and ANCOVA
Interpretation
Estimation
Interaction
Multiple comparisons

REGRESSION METHODS. Logistic regression

REGRESSION METHODS. Logistic regression REGRESSION METHODS Logistic regressio 233 RECAP: Biary Outcome? NO Cotiuous Outcome? YES Liear Regressio/ANOVA NO Other Methods YES Odds ratio as measure of associatio? Relative risk as measure of associatio?

More information

Stat 139 Homework 7 Solutions, Fall 2015

Stat 139 Homework 7 Solutions, Fall 2015 Stat 139 Homework 7 Solutios, Fall 2015 Problem 1. I class we leared that the classical simple liear regressio model assumes the followig distributio of resposes: Y i = β 0 + β 1 X i + ɛ i, i = 1,...,,

More information

Regression, Inference, and Model Building

Regression, Inference, and Model Building Regressio, Iferece, ad Model Buildig Scatter Plots ad Correlatio Correlatio coefficiet, r -1 r 1 If r is positive, the the scatter plot has a positive slope ad variables are said to have a positive relatioship

More information

Because it tests for differences between multiple pairs of means in one test, it is called an omnibus test.

Because it tests for differences between multiple pairs of means in one test, it is called an omnibus test. Math 308 Sprig 018 Classes 19 ad 0: Aalysis of Variace (ANOVA) Page 1 of 6 Itroductio ANOVA is a statistical procedure for determiig whether three or more sample meas were draw from populatios with equal

More information

Chapter 13, Part A Analysis of Variance and Experimental Design

Chapter 13, Part A Analysis of Variance and Experimental Design Slides Prepared by JOHN S. LOUCKS St. Edward s Uiversity Slide 1 Chapter 13, Part A Aalysis of Variace ad Eperimetal Desig Itroductio to Aalysis of Variace Aalysis of Variace: Testig for the Equality of

More information

REGRESSION AND ANALYSIS OF VARIANCE. Motivation. Module structure

REGRESSION AND ANALYSIS OF VARIANCE. Motivation. Module structure REGRESSION AND ANALYSIS OF VARIANCE 1 Motivatio Objective: Ivestigate associatios betwee two or more variables What tools do you already have? t-test Compariso of meas i two populatios What will we cover

More information

Chapter 22. Comparing Two Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc.

Chapter 22. Comparing Two Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc. Chapter 22 Comparig Two Proportios Copyright 2010, 2007, 2004 Pearso Educatio, Ic. Comparig Two Proportios Read the first two paragraphs of pg 504. Comparisos betwee two percetages are much more commo

More information

1 Inferential Methods for Correlation and Regression Analysis

1 Inferential Methods for Correlation and Regression Analysis 1 Iferetial Methods for Correlatio ad Regressio Aalysis I the chapter o Correlatio ad Regressio Aalysis tools for describig bivariate cotiuous data were itroduced. The sample Pearso Correlatio Coefficiet

More information

3/3/2014. CDS M Phil Econometrics. Types of Relationships. Types of Relationships. Types of Relationships. Vijayamohanan Pillai N.

3/3/2014. CDS M Phil Econometrics. Types of Relationships. Types of Relationships. Types of Relationships. Vijayamohanan Pillai N. 3/3/04 CDS M Phil Old Least Squares (OLS) Vijayamohaa Pillai N CDS M Phil Vijayamoha CDS M Phil Vijayamoha Types of Relatioships Oly oe idepedet variable, Relatioship betwee ad is Liear relatioships Curviliear

More information

Sample Size Determination (Two or More Samples)

Sample Size Determination (Two or More Samples) Sample Sie Determiatio (Two or More Samples) STATGRAPHICS Rev. 963 Summary... Data Iput... Aalysis Summary... 5 Power Curve... 5 Calculatios... 6 Summary This procedure determies a suitable sample sie

More information

Chapter 22. Comparing Two Proportions. Copyright 2010 Pearson Education, Inc.

Chapter 22. Comparing Two Proportions. Copyright 2010 Pearson Education, Inc. Chapter 22 Comparig Two Proportios Copyright 2010 Pearso Educatio, Ic. Comparig Two Proportios Comparisos betwee two percetages are much more commo tha questios about isolated percetages. Ad they are more

More information

Properties and Hypothesis Testing

Properties and Hypothesis Testing Chapter 3 Properties ad Hypothesis Testig 3.1 Types of data The regressio techiques developed i previous chapters ca be applied to three differet kids of data. 1. Cross-sectioal data. 2. Time series data.

More information

y ij = µ + α i + ɛ ij,

y ij = µ + α i + ɛ ij, STAT 4 ANOVA -Cotrasts ad Multiple Comparisos /3/04 Plaed comparisos vs uplaed comparisos Cotrasts Cofidece Itervals Multiple Comparisos: HSD Remark Alterate form of Model I y ij = µ + α i + ɛ ij, a i

More information

Lecture 8: Non-parametric Comparison of Location. GENOME 560, Spring 2016 Doug Fowler, GS

Lecture 8: Non-parametric Comparison of Location. GENOME 560, Spring 2016 Doug Fowler, GS Lecture 8: No-parametric Compariso of Locatio GENOME 560, Sprig 2016 Doug Fowler, GS (dfowler@uw.edu) 1 Review What do we mea by oparametric? What is a desirable locatio statistic for ordial data? What

More information

2 1. The r.s., of size n2, from population 2 will be. 2 and 2. 2) The two populations are independent. This implies that all of the n1 n2

2 1. The r.s., of size n2, from population 2 will be. 2 and 2. 2) The two populations are independent. This implies that all of the n1 n2 Chapter 8 Comparig Two Treatmets Iferece about Two Populatio Meas We wat to compare the meas of two populatios to see whether they differ. There are two situatios to cosider, as show i the followig examples:

More information

Agenda: Recap. Lecture. Chapter 12. Homework. Chapt 12 #1, 2, 3 SAS Problems 3 & 4 by hand. Marquette University MATH 4740/MSCS 5740

Agenda: Recap. Lecture. Chapter 12. Homework. Chapt 12 #1, 2, 3 SAS Problems 3 & 4 by hand. Marquette University MATH 4740/MSCS 5740 Ageda: Recap. Lecture. Chapter Homework. Chapt #,, 3 SAS Problems 3 & 4 by had. Copyright 06 by D.B. Rowe Recap. 6: Statistical Iferece: Procedures for μ -μ 6. Statistical Iferece Cocerig μ -μ Recall yes

More information

Simple Linear Regression

Simple Linear Regression Simple Liear Regressio 1. Model ad Parameter Estimatio (a) Suppose our data cosist of a collectio of pairs (x i, y i ), where x i is a observed value of variable X ad y i is the correspodig observatio

More information

TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics

TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics Explorig Data: Distributios Look for overall patter (shape, ceter, spread) ad deviatios (outliers). Mea (use a calculator): x = x 1 + x 2 + +

More information

Grant MacEwan University STAT 252 Dr. Karen Buro Formula Sheet

Grant MacEwan University STAT 252 Dr. Karen Buro Formula Sheet Grat MacEwa Uiversity STAT 5 Dr. Kare Buro Formula Sheet Descriptive Statistics Sample Mea: x = x i i= Sample Variace: s = i= (x i x) = Σ i=x i (Σ i= x i) Sample Stadard Deviatio: s = Sample Variace =

More information

t distribution [34] : used to test a mean against an hypothesized value (H 0 : µ = µ 0 ) or the difference

t distribution [34] : used to test a mean against an hypothesized value (H 0 : µ = µ 0 ) or the difference EXST30 Backgroud material Page From the textbook The Statistical Sleuth Mea [0]: I your text the word mea deotes a populatio mea (µ) while the work average deotes a sample average ( ). Variace [0]: The

More information

Comparing Two Populations. Topic 15 - Two Sample Inference I. Comparing Two Means. Comparing Two Pop Means. Background Reading

Comparing Two Populations. Topic 15 - Two Sample Inference I. Comparing Two Means. Comparing Two Pop Means. Background Reading Topic 15 - Two Sample Iferece I STAT 511 Professor Bruce Craig Comparig Two Populatios Research ofte ivolves the compariso of two or more samples from differet populatios Graphical summaries provide visual

More information

SIMPLE LINEAR REGRESSION AND CORRELATION ANALYSIS

SIMPLE LINEAR REGRESSION AND CORRELATION ANALYSIS SIMPLE LINEAR REGRESSION AND CORRELATION ANALSIS INTRODUCTION There are lot of statistical ivestigatio to kow whether there is a relatioship amog variables Two aalyses: (1) regressio aalysis; () correlatio

More information

MidtermII Review. Sta Fall Office Hours Wednesday 12:30-2:30pm Watch linear regression videos before lab on Thursday

MidtermII Review. Sta Fall Office Hours Wednesday 12:30-2:30pm Watch linear regression videos before lab on Thursday Aoucemets MidtermII Review Sta 101 - Fall 2016 Duke Uiversity, Departmet of Statistical Sciece Office Hours Wedesday 12:30-2:30pm Watch liear regressio videos before lab o Thursday Dr. Abrahamse Slides

More information

STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. Comments:

STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. Comments: Recall: STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS Commets:. So far we have estimates of the parameters! 0 ad!, but have o idea how good these estimates are. Assumptio: E(Y x)! 0 +! x (liear coditioal

More information

Lecture 22: Review for Exam 2. 1 Basic Model Assumptions (without Gaussian Noise)

Lecture 22: Review for Exam 2. 1 Basic Model Assumptions (without Gaussian Noise) Lecture 22: Review for Exam 2 Basic Model Assumptios (without Gaussia Noise) We model oe cotiuous respose variable Y, as a liear fuctio of p umerical predictors, plus oise: Y = β 0 + β X +... β p X p +

More information

Correlation Regression

Correlation Regression Correlatio Regressio While correlatio methods measure the stregth of a liear relatioship betwee two variables, we might wish to go a little further: How much does oe variable chage for a give chage i aother

More information

UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL/MAY 2009 EXAMINATIONS ECO220Y1Y PART 1 OF 2 SOLUTIONS

UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL/MAY 2009 EXAMINATIONS ECO220Y1Y PART 1 OF 2 SOLUTIONS PART of UNIVERSITY OF TORONTO Faculty of Arts ad Sciece APRIL/MAY 009 EAMINATIONS ECO0YY PART OF () The sample media is greater tha the sample mea whe there is. (B) () A radom variable is ormally distributed

More information

Biostatistics for Med Students. Lecture 2

Biostatistics for Med Students. Lecture 2 Biostatistics for Med Studets Lecture 2 Joh J. Che, Ph.D. Professor & Director of Biostatistics Core UH JABSOM JABSOM MD7 February 22, 2017 Lecture Objectives To uderstad basic research desig priciples

More information

Lecture 5: Parametric Hypothesis Testing: Comparing Means. GENOME 560, Spring 2016 Doug Fowler, GS

Lecture 5: Parametric Hypothesis Testing: Comparing Means. GENOME 560, Spring 2016 Doug Fowler, GS Lecture 5: Parametric Hypothesis Testig: Comparig Meas GENOME 560, Sprig 2016 Doug Fowler, GS (dfowler@uw.edu) 1 Review from last week What is a cofidece iterval? 2 Review from last week What is a cofidece

More information

STA Learning Objectives. Population Proportions. Module 10 Comparing Two Proportions. Upon completing this module, you should be able to:

STA Learning Objectives. Population Proportions. Module 10 Comparing Two Proportions. Upon completing this module, you should be able to: STA 2023 Module 10 Comparig Two Proportios Learig Objectives Upo completig this module, you should be able to: 1. Perform large-sample ifereces (hypothesis test ad cofidece itervals) to compare two populatio

More information

Continuous Data that can take on any real number (time/length) based on sample data. Categorical data can only be named or categorised

Continuous Data that can take on any real number (time/length) based on sample data. Categorical data can only be named or categorised Questio 1. (Topics 1-3) A populatio cosists of all the members of a group about which you wat to draw a coclusio (Greek letters (μ, σ, Ν) are used) A sample is the portio of the populatio selected for

More information

1 Models for Matched Pairs

1 Models for Matched Pairs 1 Models for Matched Pairs Matched pairs occur whe we aalyse samples such that for each measuremet i oe of the samples there is a measuremet i the other sample that directly relates to the measuremet i

More information

Describing the Relation between Two Variables

Describing the Relation between Two Variables Copyright 010 Pearso Educatio, Ic. Tables ad Formulas for Sulliva, Statistics: Iformed Decisios Usig Data 010 Pearso Educatio, Ic Chapter Orgaizig ad Summarizig Data Relative frequecy = frequecy sum of

More information

Recall the study where we estimated the difference between mean systolic blood pressure levels of users of oral contraceptives and non-users, x - y.

Recall the study where we estimated the difference between mean systolic blood pressure levels of users of oral contraceptives and non-users, x - y. Testig Statistical Hypotheses Recall the study where we estimated the differece betwee mea systolic blood pressure levels of users of oral cotraceptives ad o-users, x - y. Such studies are sometimes viewed

More information

Additional Notes and Computational Formulas CHAPTER 3

Additional Notes and Computational Formulas CHAPTER 3 Additioal Notes ad Computatioal Formulas APPENDIX CHAPTER 3 1 The Greek capital sigma is the mathematical sig for summatio If we have a sample of observatios say y 1 y 2 y 3 y their sum is y 1 + y 2 +

More information

Linear Regression Models

Linear Regression Models Liear Regressio Models Dr. Joh Mellor-Crummey Departmet of Computer Sciece Rice Uiversity johmc@cs.rice.edu COMP 528 Lecture 9 15 February 2005 Goals for Today Uderstad how to Use scatter diagrams to ispect

More information

Module 4: Regression Methods: Concepts and Applications

Module 4: Regression Methods: Concepts and Applications Module 4: Regression Methods: Concepts and Applications Example Analysis Code Rebecca Hubbard, Mary Lou Thompson July 11-13, 2018 Install R Go to http://cran.rstudio.com/ (http://cran.rstudio.com/) Click

More information

Common Large/Small Sample Tests 1/55

Common Large/Small Sample Tests 1/55 Commo Large/Small Sample Tests 1/55 Test of Hypothesis for the Mea (σ Kow) Covert sample result ( x) to a z value Hypothesis Tests for µ Cosider the test H :μ = μ H 1 :μ > μ σ Kow (Assume the populatio

More information

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE IN STATISTICS, 017 MODULE 4 : Liear models Time allowed: Oe ad a half hours Cadidates should aswer THREE questios. Each questio carries

More information

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen)

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen) Goodess-of-Fit Tests ad Categorical Data Aalysis (Devore Chapter Fourtee) MATH-252-01: Probability ad Statistics II Sprig 2019 Cotets 1 Chi-Squared Tests with Kow Probabilities 1 1.1 Chi-Squared Testig................

More information

Correlation. Two variables: Which test? Relationship Between Two Numerical Variables. Two variables: Which test? Contingency table Grouped bar graph

Correlation. Two variables: Which test? Relationship Between Two Numerical Variables. Two variables: Which test? Contingency table Grouped bar graph Correlatio Y Two variables: Which test? X Explaatory variable Respose variable Categorical Numerical Categorical Cotigecy table Cotigecy Logistic Grouped bar graph aalysis regressio Mosaic plot Numerical

More information

Sample Size Estimation in the Proportional Hazards Model for K-sample or Regression Settings Scott S. Emerson, M.D., Ph.D.

Sample Size Estimation in the Proportional Hazards Model for K-sample or Regression Settings Scott S. Emerson, M.D., Ph.D. ample ie Estimatio i the Proportioal Haards Model for K-sample or Regressio ettigs cott. Emerso, M.D., Ph.D. ample ie Formula for a Normally Distributed tatistic uppose a statistic is kow to be ormally

More information

Important Formulas. Expectation: E (X) = Σ [X P(X)] = n p q σ = n p q. P(X) = n! X1! X 2! X 3! X k! p X. Chapter 6 The Normal Distribution.

Important Formulas. Expectation: E (X) = Σ [X P(X)] = n p q σ = n p q. P(X) = n! X1! X 2! X 3! X k! p X. Chapter 6 The Normal Distribution. Importat Formulas Chapter 3 Data Descriptio Mea for idividual data: X = _ ΣX Mea for grouped data: X= _ Σf X m Stadard deviatio for a sample: _ s = Σ(X _ X ) or s = 1 (Σ X ) (Σ X ) ( 1) Stadard deviatio

More information

Dr. Maddah ENMG 617 EM Statistics 11/26/12. Multiple Regression (2) (Chapter 15, Hines)

Dr. Maddah ENMG 617 EM Statistics 11/26/12. Multiple Regression (2) (Chapter 15, Hines) Dr Maddah NMG 617 M Statistics 11/6/1 Multiple egressio () (Chapter 15, Hies) Test for sigificace of regressio This is a test to determie whether there is a liear relatioship betwee the depedet variable

More information

This is an introductory course in Analysis of Variance and Design of Experiments.

This is an introductory course in Analysis of Variance and Design of Experiments. 1 Notes for M 384E, Wedesday, Jauary 21, 2009 (Please ote: I will ot pass out hard-copy class otes i future classes. If there are writte class otes, they will be posted o the web by the ight before class

More information

TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics

TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics Explorig Data: Distributios Look for overall patter (shape, ceter, spread) ad deviatios (outliers). Mea (use a calculator): x = x 1 + x 2 + +

More information

Chapter 6 Sampling Distributions

Chapter 6 Sampling Distributions Chapter 6 Samplig Distributios 1 I most experimets, we have more tha oe measuremet for ay give variable, each measuremet beig associated with oe radomly selected a member of a populatio. Hece we eed to

More information

MA 575, Linear Models : Homework 3

MA 575, Linear Models : Homework 3 MA 575, Liear Models : Homework 3 Questio 1 RSS( ˆβ 0, ˆβ 1 ) (ŷ i y i ) Problem.7 Questio.7.1 ( ˆβ 0 + ˆβ 1 x i y i ) (ȳ SXY SXY x + SXX SXX x i y i ) ((ȳ y i ) + SXY SXX (x i x)) (ȳ y i ) SXY SXX SY

More information

MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND.

MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND. XI-1 (1074) MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND. R. E. D. WOOLSEY AND H. S. SWANSON XI-2 (1075) STATISTICAL DECISION MAKING Advaced

More information

Output Analysis (2, Chapters 10 &11 Law)

Output Analysis (2, Chapters 10 &11 Law) B. Maddah ENMG 6 Simulatio Output Aalysis (, Chapters 10 &11 Law) Comparig alterative system cofiguratio Sice the output of a simulatio is radom, the comparig differet systems via simulatio should be doe

More information

Overview. p 2. Chapter 9. Pooled Estimate of. q = 1 p. Notation for Two Proportions. Inferences about Two Proportions. Assumptions

Overview. p 2. Chapter 9. Pooled Estimate of. q = 1 p. Notation for Two Proportions. Inferences about Two Proportions. Assumptions Chapter 9 Slide Ifereces from Two Samples 9- Overview 9- Ifereces about Two Proportios 9- Ifereces about Two Meas: Idepedet Samples 9-4 Ifereces about Matched Pairs 9-5 Comparig Variatio i Two Samples

More information

Chapter 12 Correlation

Chapter 12 Correlation Chapter Correlatio Correlatio is very similar to regressio with oe very importat differece. Regressio is used to explore the relatioship betwee a idepedet variable ad a depedet variable, whereas correlatio

More information

University of California, Los Angeles Department of Statistics. Simple regression analysis

University of California, Los Angeles Department of Statistics. Simple regression analysis Uiversity of Califoria, Los Ageles Departmet of Statistics Statistics 100C Istructor: Nicolas Christou Simple regressio aalysis Itroductio: Regressio aalysis is a statistical method aimig at discoverig

More information

Lecture 7: Non-parametric Comparison of Location. GENOME 560, Spring 2016 Doug Fowler, GS

Lecture 7: Non-parametric Comparison of Location. GENOME 560, Spring 2016 Doug Fowler, GS Lecture 7: No-parametric Compariso of Locatio GENOME 560, Sprig 2016 Doug Fowler, GS (dfowler@uw.edu) 1 Review How ca we set a cofidece iterval o a proportio? 2 Review How ca we set a cofidece iterval

More information

Response Variable denoted by y it is the variable that is to be predicted measure of the outcome of an experiment also called the dependent variable

Response Variable denoted by y it is the variable that is to be predicted measure of the outcome of an experiment also called the dependent variable Statistics Chapter 4 Correlatio ad Regressio If we have two (or more) variables we are usually iterested i the relatioship betwee the variables. Associatio betwee Variables Two variables are associated

More information

S Y Y = ΣY 2 n. Using the above expressions, the correlation coefficient is. r = SXX S Y Y

S Y Y = ΣY 2 n. Using the above expressions, the correlation coefficient is. r = SXX S Y Y 1 Sociology 405/805 Revised February 4, 004 Summary of Formulae for Bivariate Regressio ad Correlatio Let X be a idepedet variable ad Y a depedet variable, with observatios for each of the values of these

More information

Chapters 5 and 13: REGRESSION AND CORRELATION. Univariate data: x, Bivariate data (x,y).

Chapters 5 and 13: REGRESSION AND CORRELATION. Univariate data: x, Bivariate data (x,y). Chapters 5 ad 13: REGREION AND CORRELATION (ectios 5.5 ad 13.5 are omitted) Uivariate data: x, Bivariate data (x,y). Example: x: umber of years studets studied paish y: score o a proficiecy test For each

More information

October 25, 2018 BIM 105 Probability and Statistics for Biomedical Engineers 1

October 25, 2018 BIM 105 Probability and Statistics for Biomedical Engineers 1 October 25, 2018 BIM 105 Probability ad Statistics for Biomedical Egieers 1 Populatio parameters ad Sample Statistics October 25, 2018 BIM 105 Probability ad Statistics for Biomedical Egieers 2 Ifereces

More information

STA6938-Logistic Regression Model

STA6938-Logistic Regression Model Dr. Yig Zhag STA6938-Logistic Regressio Model Topic -Simple (Uivariate) Logistic Regressio Model Outlies:. Itroductio. A Example-Does the liear regressio model always work? 3. Maximum Likelihood Curve

More information

University of California, Los Angeles Department of Statistics. Practice problems - simple regression 2 - solutions

University of California, Los Angeles Department of Statistics. Practice problems - simple regression 2 - solutions Uiversity of Califoria, Los Ageles Departmet of Statistics Statistics 00C Istructor: Nicolas Christou EXERCISE Aswer the followig questios: Practice problems - simple regressio - solutios a Suppose y,

More information

Good luck! School of Business and Economics. Business Statistics E_BK1_BS / E_IBA1_BS. Date: 25 May, Time: 12:00. Calculator allowed:

Good luck! School of Business and Economics. Business Statistics E_BK1_BS / E_IBA1_BS. Date: 25 May, Time: 12:00. Calculator allowed: School of Busiess ad Ecoomics Exam: Code: Examiator: Co-reader: Busiess Statistics E_BK_BS / E_IBA_BS dr. R. Heijugs dr. G.J. Frax Date: 5 May, 08 Time: :00 Duratio: Calculator allowed: Graphical calculator

More information

Worksheet 23 ( ) Introduction to Simple Linear Regression (continued)

Worksheet 23 ( ) Introduction to Simple Linear Regression (continued) Worksheet 3 ( 11.5-11.8) Itroductio to Simple Liear Regressio (cotiued) This worksheet is a cotiuatio of Discussio Sheet 3; please complete that discussio sheet first if you have ot already doe so. This

More information

A Relationship Between the One-Way MANOVA Test Statistic and the Hotelling Lawley Trace Test Statistic

A Relationship Between the One-Way MANOVA Test Statistic and the Hotelling Lawley Trace Test Statistic http://ijspccseetorg Iteratioal Joural of Statistics ad Probability Vol 7, No 6; 2018 A Relatioship Betwee the Oe-Way MANOVA Test Statistic ad the Hotellig Lawley Trace Test Statistic Hasthika S Rupasighe

More information

Final Examination Solutions 17/6/2010

Final Examination Solutions 17/6/2010 The Islamic Uiversity of Gaza Faculty of Commerce epartmet of Ecoomics ad Political Scieces A Itroductio to Statistics Course (ECOE 30) Sprig Semester 009-00 Fial Eamiatio Solutios 7/6/00 Name: I: Istructor:

More information

II. Descriptive Statistics D. Linear Correlation and Regression. 1. Linear Correlation

II. Descriptive Statistics D. Linear Correlation and Regression. 1. Linear Correlation II. Descriptive Statistics D. Liear Correlatio ad Regressio I this sectio Liear Correlatio Cause ad Effect Liear Regressio 1. Liear Correlatio Quatifyig Liear Correlatio The Pearso product-momet correlatio

More information

ST 305: Exam 3 ( ) = P(A)P(B A) ( ) = P(A) + P(B) ( ) = 1 P( A) ( ) = P(A) P(B) σ X 2 = σ a+bx. σ ˆp. σ X +Y. σ X Y. σ X. σ Y. σ n.

ST 305: Exam 3 ( ) = P(A)P(B A) ( ) = P(A) + P(B) ( ) = 1 P( A) ( ) = P(A) P(B) σ X 2 = σ a+bx. σ ˆp. σ X +Y. σ X Y. σ X. σ Y. σ n. ST 305: Exam 3 By hadig i this completed exam, I state that I have either give or received assistace from aother perso durig the exam period. I have used o resources other tha the exam itself ad the basic

More information

A statistical method to determine sample size to estimate characteristic value of soil parameters

A statistical method to determine sample size to estimate characteristic value of soil parameters A statistical method to determie sample size to estimate characteristic value of soil parameters Y. Hojo, B. Setiawa 2 ad M. Suzuki 3 Abstract Sample size is a importat factor to be cosidered i determiig

More information

Lecture 11 Simple Linear Regression

Lecture 11 Simple Linear Regression Lecture 11 Simple Liear Regressio Fall 2013 Prof. Yao Xie, yao.xie@isye.gatech.edu H. Milto Stewart School of Idustrial Systems & Egieerig Georgia Tech Midterm 2 mea: 91.2 media: 93.75 std: 6.5 2 Meddicorp

More information

(all terms are scalars).the minimization is clearer in sum notation:

(all terms are scalars).the minimization is clearer in sum notation: 7 Multiple liear regressio: with predictors) Depedet data set: y i i = 1, oe predictad, predictors x i,k i = 1,, k = 1, ' The forecast equatio is ŷ i = b + Use matrix otatio: k =1 b k x ik Y = y 1 y 1

More information

Statistical Hypothesis Testing. STAT 536: Genetic Statistics. Statistical Hypothesis Testing - Terminology. Hardy-Weinberg Disequilibrium

Statistical Hypothesis Testing. STAT 536: Genetic Statistics. Statistical Hypothesis Testing - Terminology. Hardy-Weinberg Disequilibrium Statistical Hypothesis Testig STAT 536: Geetic Statistics Kari S. Dorma Departmet of Statistics Iowa State Uiversity September 7, 006 Idetify a hypothesis, a idea you wat to test for its applicability

More information

There is no straightforward approach for choosing the warmup period l.

There is no straightforward approach for choosing the warmup period l. B. Maddah INDE 504 Discrete-Evet Simulatio Output Aalysis () Statistical Aalysis for Steady-State Parameters I a otermiatig simulatio, the iterest is i estimatig the log ru steady state measures of performace.

More information

Hypothesis Testing. Evaluation of Performance of Learned h. Issues. Trade-off Between Bias and Variance

Hypothesis Testing. Evaluation of Performance of Learned h. Issues. Trade-off Between Bias and Variance Hypothesis Testig Empirically evaluatig accuracy of hypotheses: importat activity i ML. Three questios: Give observed accuracy over a sample set, how well does this estimate apply over additioal samples?

More information

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n. Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator

More information

Assessment and Modeling of Forests. FR 4218 Spring Assignment 1 Solutions

Assessment and Modeling of Forests. FR 4218 Spring Assignment 1 Solutions Assessmet ad Modelig of Forests FR 48 Sprig Assigmet Solutios. The first part of the questio asked that you calculate the average, stadard deviatio, coefficiet of variatio, ad 9% cofidece iterval of the

More information

Algebra of Least Squares

Algebra of Least Squares October 19, 2018 Algebra of Least Squares Geometry of Least Squares Recall that out data is like a table [Y X] where Y collects observatios o the depedet variable Y ad X collects observatios o the k-dimesioal

More information

Agreement of CI and HT. Lecture 13 - Tests of Proportions. Example - Waiting Times

Agreement of CI and HT. Lecture 13 - Tests of Proportions. Example - Waiting Times Sigificace level vs. cofidece level Agreemet of CI ad HT Lecture 13 - Tests of Proportios Sta102 / BME102 Coli Rudel October 15, 2014 Cofidece itervals ad hypothesis tests (almost) always agree, as log

More information

6 Sample Size Calculations

6 Sample Size Calculations 6 Sample Size Calculatios Oe of the major resposibilities of a cliical trial statisticia is to aid the ivestigators i determiig the sample size required to coduct a study The most commo procedure for determiig

More information

Final Review. Fall 2013 Prof. Yao Xie, H. Milton Stewart School of Industrial Systems & Engineering Georgia Tech

Final Review. Fall 2013 Prof. Yao Xie, H. Milton Stewart School of Industrial Systems & Engineering Georgia Tech Fial Review Fall 2013 Prof. Yao Xie, yao.xie@isye.gatech.edu H. Milto Stewart School of Idustrial Systems & Egieerig Georgia Tech 1 Radom samplig model radom samples populatio radom samples: x 1,..., x

More information

Statistics 203 Introduction to Regression and Analysis of Variance Assignment #1 Solutions January 20, 2005

Statistics 203 Introduction to Regression and Analysis of Variance Assignment #1 Solutions January 20, 2005 Statistics 203 Itroductio to Regressio ad Aalysis of Variace Assigmet #1 Solutios Jauary 20, 2005 Q. 1) (MP 2.7) (a) Let x deote the hydrocarbo percetage, ad let y deote the oxyge purity. The simple liear

More information

GG313 GEOLOGICAL DATA ANALYSIS

GG313 GEOLOGICAL DATA ANALYSIS GG313 GEOLOGICAL DATA ANALYSIS 1 Testig Hypothesis GG313 GEOLOGICAL DATA ANALYSIS LECTURE NOTES PAUL WESSEL SECTION TESTING OF HYPOTHESES Much of statistics is cocered with testig hypothesis agaist data

More information

1036: Probability & Statistics

1036: Probability & Statistics 036: Probability & Statistics Lecture 0 Oe- ad Two-Sample Tests of Hypotheses 0- Statistical Hypotheses Decisio based o experimetal evidece whether Coffee drikig icreases the risk of cacer i humas. A perso

More information

Lecture 10: Performance Evaluation of ML Methods

Lecture 10: Performance Evaluation of ML Methods CSE57A Machie Learig Sprig 208 Lecture 0: Performace Evaluatio of ML Methods Istructor: Mario Neuma Readig: fcml: 5.4 (Performace); esl: 7.0 (Cross-Validatio); optioal book: Evaluatio Learig Algorithms

More information

MA238 Assignment 4 Solutions (part a)

MA238 Assignment 4 Solutions (part a) (i) Sigle sample tests. Questio. MA38 Assigmet 4 Solutios (part a) (a) (b) (c) H 0 : = 50 sq. ft H A : < 50 sq. ft H 0 : = 3 mpg H A : > 3 mpg H 0 : = 5 mm H A : 5mm Questio. (i) What are the ull ad alterative

More information
