McGill University. Faculty of Science MATH 204 PRINCIPLES OF STATISTICS II. Final Examination

Date: 20th April 2009
Time: 9am-2pm
Examiner: Dr David A Stephens
Associate Examiner: Dr Russell Steele

Please write your answers in the answer booklets provided.

This paper contains six questions. Each question carries 25 marks. Credit will be given for all questions attempted. The total mark available is 150, but rescaling of the final mark may occur.

Candidates may take one double-sided sheet of Letter-sized (216 × 279 mm) or A4-sized (210 × 297 mm) paper with notes into the examination room. Calculators may be used. Relevant statistical tables are provided. Dictionaries and translation dictionaries are permitted.

© 2009 McGill University
MATH 204 Page 1 of 16


1. In an experimental study of the efficacy of medication for blood pressure reduction, patients were recruited to the study and given the same amount of medication over the course of treatment. The patients were categorized by age:

Group 1: Age 40-49 (group mid point 45)
Group 2: Age 50-59 (group mid point 55)
Group 3: Age 60-69 (group mid point 65)
Group 4: Age 70-79 (group mid point 75)
Group 5: Age 80-89 (group mid point 85)

SPSS output for the analysis of these data is contained on page 9: the output contains the analyses carried out by two statisticians, Analyst 1 (two pieces of output) and Analyst 2 (three pieces of output). The SPSS data set contained the following variables:

Age : Age of patient in years
AgeGroup : Age group to which patient belonged
AgeMidPoint : Mid point of age group to which patient belonged
Y : Reduction in blood pressure over course of treatment

(a) Is this study balanced? Comment on the relevance of balance to the analyses carried out. 2 MARKS

(b) The SPSS ANOVA table for Analyst 1 is incomplete: three key entries (marked X) are missing. Write down the values of the three missing quantities, explaining how you computed each. 6 MARKS

(c) Explain the statistical hypothesis test carried out by Analyst 1: give the null hypothesis, the test statistic and the result of the test. 5 MARKS

(d) Explain the analysis carried out by Analyst 2. What type of model is being used? Explain the difference (in terms of the model used and the conclusions) between this analysis and the analysis carried out by Analyst 1. 8 MARKS

(e) Both Analyst 1 and Analyst 2 need to make certain assumptions in order for their statistical conclusions to be valid. List these assumptions. 4 MARKS

MATH 204 PRINCIPLES OF STATISTICS II (2009) Page 3 of 16

2. (a) An experimental study is to be carried out into the effect of two factors A (with a levels) and B (with b levels) on response Y. Explain how to carry out a complete, balanced replicated design with r replicates
(i) if A and B are treated identically as factors,
(ii) if B is regarded as a blocking factor.
Comment in particular on how the experimental units are assigned to treatments in the two cases, and how the statistical analyses of the two designs differ. 6 MARKS

(b) A balanced, complete experimental study with two factors A and B was carried out, and the analysis is presented on pages 10 and 11. Summarize relevant features of the SPSS output; comment in particular on
(i) the models being fitted, and the model you would regard as the most suitable for explaining the variation in the response,
(ii) the conclusions from the ANOVA results,
(iii) the overall quality of the model fit,
(iv) whether the assumptions necessary for ANOVA results to be valid are met here.
When describing the models, use the standard model notation such as A + B to represent the model with main effects only, and so on. 10 MARKS

(c) On the basis of the output given, report an estimate of the mean response for experimental units in the treatment group for
(i) Factor A level 5, Factor B level 4,
(ii) Factor A level 1, Factor B level 3. 6 MARKS

(d) Explain briefly how you would proceed to carry out an analysis of a two-factor experiment that was not balanced. 3 MARKS

MATH 204 PRINCIPLES OF STATISTICS II (2009) Page 4 of 16

3. A data set containing information on 400 schools in California is to be used to try to understand how to predict academic performance in the school, and how to modify future education policy in the state. The data set comprises four variables:

X1 : the academic performance in previous year (covariate)
X2 : the average class size in current year (covariate) (two values missing)
X3 : the total enrollment at the school in the current year (covariate)
Y : the academic performance in current year (response)

(a) Write down the formula, involving suitable β coefficients, of a multiple regression model for the expected response E[Y] when the model includes all three covariates as main effects. 3 MARKS

(b) Output from the SPSS analysis of the response data is given on pages 12 and 13. A summary of the model fit results is given in the following table:

Analysis      R²      Adj R²
Analysis 1    0986    097
Analysis 2    0985    097
Analysis 3    0986    097
Analysis 4    0380    044
Analysis 5    0985    097
Analysis 6    07      0029
Analysis 7    038     08

Explain the results in the analyses given, first listing the models fitted for each analysis, and then summarizing the conclusions you would report. Use the standard model notation to describe the models. 10 MARKS

(c) Predict the expected academic performance for a school whose academic performance in the previous year was 700, with average class size 20, and school enrollment 650, using the results from
(i) Analysis 1,
(ii) Analysis 3,
(iii) Analysis 4. 6 MARKS

(d) Explain why you would not recommend that prediction be made using the results for Analysis 1 for a school whose academic performance in the previous year was 700, but with current average class size 28 and school enrollment 800 from these data. 2 MARKS

(e) Another variable, X4, defined as the change in academic performance at the school between the previous year and the current year, is also included in the data set. Explain why X4 should not be used as a covariate in a model aimed at predicting Y. 4 MARKS

MATH 204 PRINCIPLES OF STATISTICS II (2009) Page 5 of 16

4. Optimal dosing of the anticoagulant drug warfarin is an important issue in the treatment of patients affected with blood clotting disorders. The optimal dose can be measured experimentally, and is thought to be dependent on the dietary level of vitamin K and genetic factors; in particular, the genotypes (polymorphisms) at two loci are thought to influence the level of optimal dose. An observational study involving 07 patients was carried out to study optimal dosing. The recorded data comprise the response (the recorded optimal dose level), two factor predictors and a single continuous covariate:

G1: a factor with three levels corresponding to locus one, corresponding to the three polymorphisms observed at that locus
G2: a factor with two levels corresponding to locus two, corresponding to the two polymorphisms observed at that locus
X: a covariate recording the dietary level of vitamin K

A series of models were fitted, and the results are summarized in a table on page 14.

(a) List the models that can be fitted using the pair of variables G1 and X only, and give the number of parameters that each of these models contains. 5 MARKS

(b) Explain, using sketches if necessary, what the model G1 + G2 + X + G2.X represents in terms of how the mean response changes in the different subgroups. Explain the interaction term carefully. 6 MARKS

(c) Using the table of results above and a model search strategy, find the most appropriate model in ANOVA-F terms. The ANOVA-F test for comparing nested models is based on the statistic

$$F = \frac{(SSE_R - SSE_C)/(k - g)}{SSE_C/(n - k - 1)}$$

where
SSE_R is the error sum of squares for the Reduced model, specified using g + 1 parameters including the intercept,
SSE_C is the error sum of squares for the Complete model, specified using k + 1 parameters including the intercept.
If the reduced model is an adequate simplification of the complete model, then F ∼ Fisher-F(k − g, n − k − 1). You do not have to perform all possible model comparisons between the eight models listed. 10 MARKS

A table of the Fisher-F distribution is provided on page 15. Entries in the table are the 0.05 tail quantile of the Fisher-F(ν1, ν2) distribution, for different values of ν1 and ν2.

(d) Report the estimate of the residual error variance σ² for your selected model. 2 MARKS

(e) Explain why models 3 and 4 cannot be compared using the procedure above. 2 MARKS

MATH 204 PRINCIPLES OF STATISTICS II (2009) Page 6 of 16
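The nested-model comparison in part (c) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `nested_f_statistic` and the numbers in the usage lines are hypothetical and are not taken from the table for Question 4.

```python
def nested_f_statistic(sse_reduced, sse_complete, g, k, n):
    """ANOVA-F statistic for a reduced model (g + 1 parameters) nested in a
    complete model (k + 1 parameters), both counts including the intercept.

    Returns the statistic and its degrees of freedom (k - g, n - k - 1).
    """
    if not 0 <= g < k < n - 1:
        raise ValueError("need 0 <= g < k < n - 1")
    numerator = (sse_reduced - sse_complete) / (k - g)
    denominator = sse_complete / (n - k - 1)
    return numerator / denominator, (k - g, n - k - 1)


# Hypothetical values for illustration only (NOT the exam data).
f_stat, dfs = nested_f_statistic(
    sse_reduced=120.0, sse_complete=100.0, g=2, k=4, n=25
)
print(f_stat, dfs)
```

The statistic would then be compared with the 0.05 tail quantile of Fisher-F(k − g, n − k − 1); the reduced model is retained when the statistic falls below that quantile.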

5. (a) A clinician has developed a new diagnostic test for early-onset Alzheimer's disease, and in a pilot study was able to predict the disease correctly on out of 15 occasions. Using data obtained, the test statistic S =, and the table of the Binomial distribution on page 16, carry out a test of the hypotheses

H0 : θ = 1/2
Ha : θ > 1/2

where θ is the probability of correctly predicting the disease. Compute the p-value in the test, and give the critical value if the rejection region was required to contain at most probability 0.05. 10 MARKS

Hint: as in the sign or Binomial test, if the null hypothesis is true, S ∼ Binomial(15, 1/2). Also, recall that in the case of a discrete test statistic T, the one-sided rejection region is defined by considering, for critical value C_R, the probability P[T ≥ C_R].

(b) The usual test for early-onset Alzheimer's is based on a genetic screening test. In a larger follow-up study of 75 patients who were ultimately determined to be sufferers from the disease, the diagnostic accuracy of the new test was compared with genetic screening, and the results are given in the following table.

                          Test Used
Alzheimer's Diagnosis     New    Usual    Total
Yes                        22       36       58
No                          8        9       17
Total                      30       45       75

Is there any evidence that diagnostic accuracy is the same for both tests? Justify your answer by carrying out a Chi-squared test. Report also the estimate of the odds ratio for correct diagnosis derived from this table. 9 MARKS

The table on page 16 contains the 0.05 and 0.01 tail quantiles of the Chisquared(ν) distribution, for ν = 1 to 20. The two forms of the Chi-squared statistic are also given.

(c) For the 75 patients in the previous study, that comprised 18 women and 57 men, the age at diagnosis was also recorded. Name two tests that can be used to assess whether there was a difference between age at diagnosis between women and men, describing briefly the assumptions that must hold for each test to be valid. 6 MARKS

MATH 204 PRINCIPLES OF STATISTICS II (2009) Page 7 of 16
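For the two-way table in part (b), the Chi-squared statistic and the odds ratio can be computed as in the sketch below. It assumes the table counts are 22, 36 (correct diagnosis) and 8, 9 (incorrect diagnosis) for the New and Usual tests respectively; the function names are hypothetical, and no continuity correction is applied.

```python
def chi_squared_statistic(table):
    """Pearson Chi-squared statistic for an r x c table of counts.

    Expected counts are (row total * column total) / grand total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic


def odds_ratio(table):
    """Odds ratio (a / c) / (b / d) for a 2 x 2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)


# Rows: diagnosis Yes / No; columns: New / Usual test.
counts = [[22, 36], [8, 9]]
x2 = chi_squared_statistic(counts)
ratio = odds_ratio(counts)
print(round(x2, 4), ratio)
```

For a 2 × 2 table the statistic has (2 − 1)(2 − 1) = 1 degree of freedom, so it would be compared with the ν = 1 entry of the Chisquared table.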

6. (a) Explain the different contexts in which an ANOVA F-test, the Kruskal-Wallis test, and the Friedman test are used. Comment in particular on the relevant experimental designs, and on the assumptions needed for each test to be valid. 6 MARKS

(b) In a study of three different therapies for migraine, a small randomized study was carried out in a sample of sixteen healthy subjects. Each subject had a migraine induced chemically, and then underwent a three hour course of therapy, at the end of which they reported their current migraine level using a score on a scale of 1 to 20, with 20 being highest pain level. The data recorded in the study are given below.

Group  1 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
Score  3 5 5 0 9 6 4 2 4 2 7 5 8 2 8 5

Carry out the Kruskal-Wallis test for these data. Recall that the Kruskal-Wallis test statistic is

$$H = \frac{12}{n(n+1)} \sum_{j=1}^{k} \frac{R_j^2}{n_j} - 3(n+1)$$

where R_j is the rank sum for group j = 1, 2, 3, and k is the number of groups being compared. If the relevant null hypothesis, H0, is true, then H is approximately distributed as Chisquared(k − 1). The table on page 16 contains the 0.05 and 0.01 tail quantiles of the Chisquared(ν) distribution, for ν = 1 to 20. 10 MARKS

(c) An ANOVA test was also considered for the data in the table above. In the test for this one-way, Completely Randomized Design, the sums of squares quantities were computed to be

SST = 203437    SSE = 42

Complete the Fisher-F test for the equality of the group means for these data. 5 MARKS

A table of the Fisher-F distribution is provided on page 15. Entries in the table are the 0.05 tail quantile of the Fisher-F(ν1, ν2) distribution, for different values of ν1 and ν2.

(d) For the test in (c), name a test that you would carry out in order to check one of the three assumptions underlying the ANOVA F-test. 2 MARKS

(e) If the sample size in this study was deemed not to be large enough for the approximate null distribution of statistic H to be valid, name a type of procedure that could be used to carry out a test of the hypothesis of interest. 2 MARKS

MATH 204 PRINCIPLES OF STATISTICS II (2009) Page 8 of 16
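The Kruskal-Wallis statistic H from part (b) can be computed as in the following sketch. It assumes tied observations receive the average of the ranks they span and, like the formula quoted in the question, applies no tie correction. The helper names and the example scores are hypothetical and are not the exam data.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value ties with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2 + 1  # average of ranks pos+1 .. end+1
        for idx in order[pos:end + 1]:
            ranks[idx] = shared_rank
        pos = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """H = 12 / (n (n + 1)) * sum_j R_j^2 / n_j - 3 (n + 1), no tie correction."""
    pooled = [x for group in groups for x in group]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])  # R_j for this group
        total += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical scores for three groups (NOT the exam data).
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(h)
```

With k groups, H would be compared with the tail quantile of Chisquared(k − 1).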

Output for Question Analyst Between-Subjects Factors AgeGroup 2 3 4 5 N 3 9 20 6 7 Tests of Between-Subjects Effects Dependent Variable:0 Source Corrected Intercept AgeGroup Error Total Corrected Total Type III Sum of Squares 66550 a 29820 66550 9256 298640 X df 4 X 70 75 74 a R Squared = 258 (Adjusted R Squared = 26) Mean Square 6637 29820 6637 2732 F 6089 094 X Sig 002 Analyst 2 Summary Mode Adjusted R Std Error of l R R Square Square the Estimate 09 a 02-002 86804 a Predictors: (Constant), AgeMidPoint ANOVA b Regression Residual Total Sum of Squares 3067 254739 257806 a Predictors: (Constant), AgeMidPoint df 73 74 Mean Square 3067 3490 F 879 Sig 352 a b Dependent Variable: Y Coefficients a (Constant) AgeMidPoint a Dependent Variable: Y Unstandardized Coefficients Standardized Coefficients B Std Error Beta t Sig 667-04 05 05-09 643-938 05 352 MATH 204 PRINCIPLES OF STATISTICS II (2009) Page 9 of 6

Output for Question 2: Part I Levene's Test of Equality of Error Variances a Dependent Variable:Y F df df2 Sig 38 9 40 9 Tests the null hypothesis that the error variance of the dependent variable is equal across groups a Design: Intercept + A + B + A * B Tests of Between-Subjects Effects Dependent Variable:Y Source Corrected Intercept A B A * B Error Total Type III Sum of Squares 2066 a 304425 70673 20227 9762 3283 546369 Corrected Total 24944 59 df 9 4 3 2 40 60 a R Squared = 87 (Adjusted R Squared = 809) Mean Square 087 304425 42668 6742 647 782 F 477 389255 54558 862 206 Parameter Estimates Sig 039 Dependent Variable:Y 95% Confidence Interval Parameter B Std Error t Sig Lower Bound Upper Bound Intercept 663 5 3258 002 63 2695 [A=] 887 722 228 227-573 2346 [A=2] 347 722 865 070-3 2806 [A=3] -2037 722-282 007-3496 -577 [A=4] -33 722-434 667-773 46 [A=5] 0 a [B=] 350 722 485 63-09 809 [B=2] -860 722-9 24-239 599 [B=3] -253 722-35 728-73 206 [B=4] 0 a [A=] * [B=] 327 02 350 003 53 528 [A=] * [B=2] 2947 02 2886 006 883 50 [A=] * [B=3] 2820 02 2762 009 756 4884 [A=] * [B=4] 0 a [A=2] * [B=] 843 02 805 079-22 3907 [A=2] * [B=2] 200 02 2056 046 036 464 [A=2] * [B=3] 297 02 29 773-767 236 [A=2] * [B=4] 0 a [A=3] * [B=] 507 02 475 48-557 357 [A=3] * [B=2] 430 02 400 69-634 3494 [A=3] * [B=3] 340 02 333 74-724 2404 [A=3] * [B=4] 0 a [A=4] * [B=] -480 02-470 64-2544 584 [A=4] * [B=2] 223 02 29 828-84 2287 [A=4] * [B=3] -77 02-73 864-224 887 [A=4] * [B=4] 0 a [A=5] * [B=] 0 a [A=5] * [B=2] 0 a [A=5] * [B=3] 0 a [A=5] * [B=4] 0 a a This parameter is set to zero because it is redundant MATH 204 PRINCIPLES OF STATISTICS II (2009) Page 0 of 6

Output for Question 2: Part II 2 Levene's Test of Equality of Error Variances a Dependent Variable:Y F df df2 Sig 738 9 40 759 Tests the null hypothesis that the error variance of the dependent variable is equal across groups a Design: Intercept + A + B Tests of Between-Subjects Effects Dependent Variable:Y Source Corrected Intercept A B Error Total Type III Sum of Squares 90899 a 304425 70673 20227 5045 546369 Corrected Total 24944 59 df 7 4 3 52 60 a R Squared = 789 (Adjusted R Squared = 76) Mean Square 2727 304425 42668 6742 982 F 27782 3023 43467 6868 Sig 00 3 Tests of Between-Subjects Effects Dependent Variable:Y Source Corrected Intercept A Error Total Type III Sum of Squares 70673 a 304425 70673 727 546369 Corrected Total 24944 59 df 4 4 55 60 a R Squared = 705 (Adjusted R Squared = 684) Mean Square 42668 304425 42668 296 F 32927 234924 32927 Sig Parameter Estimates Dependent Variable:Y 95% Confidence Interval Paramet er B Std Error t Sig Lower Bound Upper Bound Intercept 472 329 448 84 23 [A=] 333 465 6740 220 4064 [A=2] 2407 465 579 475 3338 [A=3] -27 465-2620 0-249 -286 [A=4] -422 465-907 368-353 50 [A=5] 0 a a This parameter is set to zero because it is redundant MATH 204 PRINCIPLES OF STATISTICS II (2009) Page of 6

Output for Question 3: Part I Descriptive Statistics N Performance 400 Performance in Prev Yr 400 Average class size 398 Enrollment 400 Valid N (listwise) 398 Minimum 369 333 4 30 Maximum 940 97 25 570 Mean 64762 602 96 48347 Std Deviation 42249 4736 369 226448 Analysis (Constant) Performance in Prev Yr Average class size Enrollment a Dependent Variable: Performance Unstandardized Coefficients Standardized Coefficients B Std Error Beta t Sig 75968 945 37-05 Coefficients a 773 009 92 006 978 00-024 4424 0685 5-262 880 009 Analysis 2 (Constant) Performance in Prev Yr Average class size a Dependent Variable: Performance Unstandardized Coefficients Standardized Coefficients B Std Error Beta t Sig 72043 952-275 Coefficients a 7234 008 905 986-003 480 36-304 76 Analysis 3 (Constant) Performance in Prev Yr Enrollment a Dependent Variable: Performance Unstandardized Coefficients Standardized Coefficients B Std Error Beta t Sig 77599 946-05 Coefficients a 6700 009 006 978-023 58 09902-2630 009 Analysis 4 (Constant) Average class size Enrollment a Dependent Variable: Performance Unstandardized Coefficients Standardized Coefficients B Std Error Beta t Sig 337602 2604-23 9295 486 029 Coefficients a 208-34 3633 4444-7278 MATH 204 PRINCIPLES OF STATISTICS II (2009) Page 2 of 6

Output for Question 3: Part II Analysis 5 (Constant) Performance in Prev Yr a Dependent Variable: Performance Unstandardized Coefficients Standardized Coefficients B Std Error Beta t Sig 66326 953 Coefficients a 589 008 985 2783 5236 Analysis 6 (Constant) Average class size a Dependent Variable: Performance Unstandardized Coefficients Standardized Coefficients B Std Error Beta t Sig 308337 775 9873 540 Coefficients a 7 323 3454 002 00 Analysis 7 (Constant) Enrollment Unstandardized Coefficients Standardized Coefficients B Std Error Beta t Sig 74425-200 a Dependent Variable: Performance 5933 030 Coefficients a -38 467-6695 400 400 200 200 Standardized Residual 000 Standardized Residual 000-200 -200-400 -400 400 600 Performance in Prev Yr 800 000 400 600 800 Unstandardized Predicted Value 000 Residual plots from fit in Analysis MATH 204 PRINCIPLES OF STATISTICS II (2009) Page 3 of 6

Table for Question 4 Note: p is the total number of parameters including the intercept Formula SSE p G G2 X 6787 2 2 G + G2 + X + GG2 + GX + G2X 683 0 3 G + G2 + X + GG2 + GX 6936 9 4 G + G2 + X + GG2 + G2X 756 8 5 G + G2 + X + GX + G2X 30992 8 6 G + G2 + X + GG2 7593 7 7 G + G2 + X 332 5 8 G + G2 + GG2 253246 6 MATH 204 PRINCIPLES OF STATISTICS II (2009) Page 4 of 6

Table of the Fisher-F distribution Entries in table are the α = 005 tail quantile of Fisher-F(ν, ν2) distribution ν given in columns, ν2 given in rows ν2\ν 2 3 4 5 6 7 8 9 0 2 3 4 5 6 7 8 9 20 2 22 23 24 25 6459950257224582306233992367723888240542488242982439244692453624595246462469224732247692480248324858248832490524926 2 85 900 96 925 930 933 935 937 938 940 940 94 942 942 943 943 944 944 944 945 945 945 945 945 946 3 03 955 928 92 90 894 889 885 88 879 876 874 873 87 870 869 868 867 867 866 865 865 864 864 863 4 77 694 659 639 626 66 609 604 600 596 594 59 589 587 586 584 583 582 58 580 579 579 578 577 577 5 66 579 54 59 505 495 488 482 477 474 470 468 466 464 462 460 459 458 457 456 455 454 453 453 452 6 599 54 476 453 439 428 42 45 40 406 403 400 398 396 394 392 39 390 388 387 386 386 385 384 383 7 559 474 435 42 397 387 379 373 368 364 360 357 355 353 35 349 348 347 346 344 343 343 342 34 340 8 532 446 407 384 369 358 350 344 339 335 33 328 326 324 322 320 39 37 36 35 34 33 32 32 3 9 52 426 386 363 348 337 329 323 38 34 30 307 305 303 30 299 297 296 295 294 293 292 29 290 289 0 496 40 37 348 333 322 34 307 302 298 294 29 289 286 285 283 28 280 279 277 276 275 275 274 273 484 398 359 336 320 309 30 295 290 285 282 279 276 274 272 270 269 267 266 265 264 263 262 26 260 2 475 389 349 326 3 300 29 285 280 275 272 269 266 264 262 260 258 257 256 254 253 252 25 25 250 3 467 38 34 38 303 292 283 277 27 267 263 260 258 255 253 25 250 248 247 246 245 244 243 242 24 4 460 374 334 3 296 285 276 270 265 260 257 253 25 248 246 244 243 24 240 239 238 237 236 235 234 5 454 368 329 306 290 279 27 264 259 254 25 248 245 242 240 238 237 235 234 233 232 23 230 229 228 6 449 363 324 30 285 274 266 259 254 249 246 242 240 237 235 233 232 230 229 228 226 225 224 224 223 7 445 359 320 296 28 270 26 255 249 245 24 238 235 233 23 229 227 226 224 223 222 22 220 29 28 8 44 355 36 293 277 266 258 25 246 24 237 234 23 229 227 225 223 222 220 29 28 27 26 25 24 9 438 352 33 290 274 263 254 
248 242 238 234 23 228 226 223 22 220 28 27 26 24 23 22 2 2 20 435 349 30 287 27 260 25 245 239 235 23 228 225 222 220 28 27 25 24 22 2 20 209 208 207 2 432 347 307 284 268 257 249 242 237 232 228 225 222 220 28 26 24 22 2 20 208 207 206 205 205 22 430 344 305 282 266 255 246 240 234 230 226 223 220 27 25 23 2 20 208 207 206 205 204 203 202 23 428 342 303 280 264 253 244 237 232 227 224 220 28 25 23 2 209 208 206 205 204 202 20 20 200 24 426 340 30 278 262 25 242 236 230 225 222 28 25 23 2 209 207 205 204 203 20 200 99 98 97 25 424 339 299 276 260 249 240 234 228 224 220 26 24 2 209 207 205 204 202 20 200 98 97 96 96 26 423 337 298 274 259 247 239 232 227 222 28 25 22 209 207 205 203 202 200 99 98 97 96 95 94 27 42 335 296 273 257 246 237 23 225 220 27 23 20 208 206 204 202 200 99 97 96 95 94 93 92 28 420 334 295 27 256 245 236 229 224 29 25 22 209 206 204 202 200 99 97 96 95 93 92 9 9 29 48 333 293 270 255 243 235 228 222 28 24 20 208 205 203 20 99 97 96 94 93 92 9 90 89 30 47 332 292 269 253 242 233 227 22 26 23 209 206 204 20 99 98 96 95 93 92 9 90 89 88 3 46 330 29 268 252 24 232 225 220 25 2 208 205 203 200 98 96 95 93 92 9 90 88 88 87 32 45 329 290 267 25 240 23 224 29 24 20 207 204 20 99 97 95 94 92 9 90 88 87 86 85 MATH 204 PRINCIPLES OF STATISTICS II (2009) Page 5 of 6

Table of the Binomial(15, 1/2) distribution

s           0       1       2       3       4       5       6       7
P[S = s]    0.0000  0.0005  0.0032  0.0139  0.0417  0.0916  0.1527  0.1964
P[S ≤ s]    0.0000  0.0005  0.0037  0.0176  0.0592  0.1509  0.3036  0.5000

s           8       9       10      11      12      13      14      15
P[S = s]    0.1964  0.1527  0.0916  0.0417  0.0139  0.0032  0.0005  0.0000
P[S ≤ s]    0.6964  0.8491  0.9408  0.9824  0.9963  0.9995  1.0000  1.0000

Table of the Chisquared(ν) distribution
Entries in table are the α = 0.05 and α = 0.01 tail quantiles.

ν           1       2       3       4       5       6       7       8       9       10
α = 0.05    3.841   5.991   7.815   9.488   11.070  12.592  14.067  15.507  16.919  18.307
α = 0.01    6.635   9.210   11.345  13.277  15.086  16.812  18.475  20.090  21.666  23.209

ν           11      12      13      14      15      16      17      18      19      20
α = 0.05    19.675  21.026  22.362  23.685  24.996  26.296  27.587  28.869  30.144  31.410
α = 0.01    24.725  26.217  27.688  29.141  30.578  32.000  33.409  34.805  36.191  37.566

The Chi-squared statistic takes one of the two forms depending on the test being carried out:

$$X^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} \qquad\qquad X^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(n_{ij} - \hat{n}_{ij})^2}{\hat{n}_{ij}}$$

MATH 204 PRINCIPLES OF STATISTICS II (2009) Page 16 of 16
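The Binomial(15, 1/2) table can be reproduced, and an upper-tail rejection region of the form P[S ≥ C_R] located, with the short sketch below. The function names are hypothetical; `critical_value` assumes the convention that C_R is the smallest value whose upper-tail probability does not exceed the chosen level.

```python
from math import comb


def binomial_pmf(s, n=15, theta=0.5):
    """P[S = s] for S ~ Binomial(n, theta)."""
    return comb(n, s) * theta ** s * (1 - theta) ** (n - s)


def upper_tail(c, n=15, theta=0.5):
    """P[S >= c] for S ~ Binomial(n, theta)."""
    return sum(binomial_pmf(s, n, theta) for s in range(c, n + 1))


def critical_value(alpha=0.05, n=15, theta=0.5):
    """Smallest c with P[S >= c] <= alpha, or None if no such c exists."""
    for c in range(n + 1):
        if upper_tail(c, n, theta) <= alpha:
            return c
    return None


# Print s, P[S = s] and P[S <= s], rounded to four decimals as in the table.
cumulative = 0.0
for s in range(16):
    cumulative += binomial_pmf(s)
    print(s, round(binomial_pmf(s), 4), round(cumulative, 4))

c_r = critical_value(alpha=0.05)
print(c_r)
```

For an observed value s of the test statistic, `upper_tail(s)` gives the one-sided p-value against the alternative θ > 1/2.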