EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY


HIGHER CERTIFICATE IN STATISTICS, 2017

MODULE 4 : Linear models

Time allowed: One and a half hours

Candidates should answer THREE questions.

Each question carries 20 marks. The number of marks allotted for each part-question is shown in brackets.

Graph paper and Official tables are provided.

Candidates may use calculators in accordance with the regulations published in the Society's "Guide to Examinations" (document Ex1).

The notation log denotes logarithm to base e. Logarithms to any other base are explicitly identified, e.g. log10.

Note also that $\binom{n}{r}$ is the same as ${}^{n}C_{r}$.

HC Module 4 2017. This examination paper consists of 8 printed pages. This front cover is page 1. Question 1 starts on page 3. © RSS 2017

There are 4 questions altogether in the paper.


1. A gas generation plant distills liquid air to produce oxygen. The percentage purity of the oxygen is thought to be linearly related to the amount of impurities in the air, as measured by the "pollution count" in parts per million by volume (ppm). The following data were collected on 15 successive days.

    Purity (%), y              93.3  92.0  92.4  91.7  94.0  94.6  93.6  93.1
    Pollution count (ppm), x   1.10  1.45  1.36  1.59  1.08  0.75  1.20  0.99

    Purity (%), y              93.2  92.9  92.2  91.3  90.1  91.6  91.9
    Pollution count (ppm), x   0.83  1.22  1.47  1.81  2.03  1.75  1.68

(i) Fit a linear regression to the data using $\sum x_i = 20.31$, $\sum y_i = 1387.9$, $S_{xx} = 1.959\,56$, $S_{xy} = -5.6846$ and $S_{yy} = 18.8693$, where $S_{xx} = \sum (x_i - \bar{x})^2$, $S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})$ and $S_{yy} = \sum (y_i - \bar{y})^2$. (4)

(ii) Construct the ANOVA table for this model and perform the appropriate hypothesis test using the 0.1% significance level. Hence write down the value of $s^2$, the estimate of the error variance. (9)

(iii) Find a 95% confidence interval for the slope of the regression equation. (3)

(iv) Find a 90% confidence interval for the mean purity on a day when the pollution count is 1.00. (4)

[You may use the fact that the estimated variance for the predicted mean response $\hat{\alpha} + \hat{\beta} x_0$ is $s^2 \left\{ \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right\}$.]
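The calculations asked for in Question 1 can be sketched from the summary statistics alone. The following is a minimal illustration, not a model answer: it assumes the totals are $\sum x_i = 20.31$ and $\sum y_i = 1387.9$ with $S_{xy} = -5.6846$ (these agree with the 15 data pairs), and the two t quantiles on 13 degrees of freedom are hard-coded from standard tables rather than computed. All variable names are illustrative.

```python
import math

# Summary statistics for the 15 days (assumed as stated in the lead-in).
n = 15
sum_x, sum_y = 20.31, 1387.9
s_xx, s_xy, s_yy = 1.95956, -5.6846, 18.8693

# Part (i): least-squares slope and intercept.
x_bar, y_bar = sum_x / n, sum_y / n
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Part (ii): ANOVA decomposition of S_yy into regression and residual parts.
ss_reg = s_xy ** 2 / s_xx
ss_res = s_yy - ss_reg
s2 = ss_res / (n - 2)        # estimate of the error variance
f_stat = ss_reg / s2         # compare with F on (1, n - 2) degrees of freedom

# Table values for t on 13 degrees of freedom (assumed from standard tables).
T13_UPPER_2_5 = 2.160        # for a two-sided 95% interval
T13_UPPER_5 = 1.771          # for a two-sided 90% interval

# Part (iii): 95% confidence interval for the slope.
se_slope = math.sqrt(s2 / s_xx)
slope_ci = (slope - T13_UPPER_2_5 * se_slope, slope + T13_UPPER_2_5 * se_slope)

# Part (iv): 90% confidence interval for the mean response at x0 = 1.00,
# using the variance formula quoted at the end of the question.
x0 = 1.00
mean_at_x0 = intercept + slope * x0
se_mean = math.sqrt(s2 * (1 / n + (x0 - x_bar) ** 2 / s_xx))
mean_ci = (mean_at_x0 - T13_UPPER_5 * se_mean, mean_at_x0 + T13_UPPER_5 * se_mean)

print(f"fitted line: y = {intercept:.3f} + ({slope:.3f}) x")
print(f"s^2 = {s2:.4f}, F = {f_stat:.2f}")
print(f"95% CI for slope: ({slope_ci[0]:.3f}, {slope_ci[1]:.3f})")
print(f"90% CI for mean purity at x0 = 1.00: ({mean_ci[0]:.2f}, {mean_ci[1]:.2f})")
```

The F statistic would still need to be compared with the tabulated 0.1% point of the F distribution on 1 and 13 degrees of freedom, which is not included here.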

2. (a) Sketch scatter diagrams to illustrate the following features of bivariate data. Comment briefly on each of your plots.

(i) Strong positive association, appropriately reflected by the product moment correlation coefficient. (2)

(ii) Weak negative association, appropriately reflected by the product moment correlation coefficient. (2)

(iii) Strong negative association, appropriately reflected by Spearman's rank correlation coefficient, but less satisfactorily by the product moment correlation coefficient. (2)

(iv) Strong association with a non-monotonic trend. (2)

(b) The following table shows diastolic (DBP) and systolic (SBP) blood pressure measurements (in mm Hg) for 10 randomly chosen cardiac patients.

    DBP    55   60   70   75   80   85   90   95  105  110
    SBP   125  115  120  135  105  145  130  200  190  150

$S_{xx} = 2962.5$, $S_{xy} = 3437.5$ and $S_{yy} = 8802.5$, where $S_{xx} = \sum (x_i - \bar{x})^2$, $S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})$ and $S_{yy} = \sum (y_i - \bar{y})^2$.

(i) Calculate the sample product-moment correlation coefficient of these data. Test at the 1% level the null hypothesis that $\rho = 0$ against the alternative hypothesis that $\rho > 0$, where $\rho$ is the population value of the product moment correlation coefficient. State any assumptions made in performing the test. (6)

(ii) Calculate the value of Spearman's rank correlation coefficient for these data, and carry out the corresponding test. State your conclusions clearly. (6)
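As a rough check on part (b), both correlation coefficients can be computed directly from the ten pairs. This is a sketch under stated assumptions: the SBP values are taken to be 125, 115, 120, 135, 105, 145, 130, 200, 190, 150 (which reproduce $S_{xy} = 3437.5$), DBP is treated as x, and the helper names are hypothetical. The simple ranking function assumes there are no tied values, which holds for these data.

```python
import math

# Assumed data: DBP as x, SBP as y (see lead-in).
dbp = [55, 60, 70, 75, 80, 85, 90, 95, 105, 110]
sbp = [125, 115, 120, 135, 105, 145, 130, 200, 190, 150]


def corrected_sums(x, y):
    """Return (S_xx, S_xy, S_yy), the corrected sums of squares and products."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_yy = sum((b - y_bar) ** 2 for b in y)
    return s_xx, s_xy, s_yy


def ranks(values):
    """Rank observations from 1 (smallest); valid only when there are no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rs(x, y):
    """Spearman's coefficient via 1 - 6 * sum(d^2) / (n (n^2 - 1)), no ties."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


s_xx, s_xy, s_yy = corrected_sums(dbp, sbp)
r = s_xy / math.sqrt(s_xx * s_yy)

# Test statistic for H0: rho = 0, referred to t on n - 2 degrees of freedom.
n = len(dbp)
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

r_s = spearman_rs(dbp, sbp)

print(f"r = {r:.4f}, t = {t_stat:.3f} on {n - 2} df")
print(f"Spearman r_s = {r_s:.4f}")
```

The t statistic and $r_s$ would then be compared with the one-sided 1% critical values from the Official tables; those values are deliberately not hard-coded here.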

3. An experiment was conducted to examine the effect of different lighting conditions on the number of eggs laid by a certain breed of chickens. The treatments were

O : control (natural daylight),
E : extended day (natural daylight extended by artificial light to a total of 14 hours),
F : flashlight (natural daylight plus flashes of light every 20 seconds through the night).

Twelve pens each containing 6 chickens were randomly allocated to the three treatments. The total number of eggs laid in a given period was recorded as follows.

    O   330  288  295  313
    E   372  340  343  341
    F   359  337  373  302

You are given that $\sum_{i=1}^{3} \sum_{j=1}^{4} y_{ij}^2 = 1\,337\,535$, where $y_{ij}$ is the number of eggs laid in the jth pen under the ith treatment.

(i) Write down an appropriate model for these data, explaining fully all the terms in the model. State any assumptions that are made for this model. (4)

(ii) Draw up an Analysis of Variance table for this model and test for differences in the treatments at the 10% significance level. (11)

(iii) Test whether there are treatment differences between the extended day (E) and flashlight (F) at the 5% significance level. (5)
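The sums of squares for the one-way layout in Question 3 can be sketched as follows. This is an illustration only: it assumes the pen totals are O = 330, 288, 295, 313; E = 372, 340, 343, 341; F = 359, 337, 373, 302 (which reproduce the quoted raw sum of squares 1 337 535), and it takes one common approach to part (iii), a t statistic for E against F using the pooled residual mean square. Function and variable names are hypothetical.

```python
import math

# Assumed data: total eggs per pen, four pens per treatment (see lead-in).
eggs = {
    "O": [330, 288, 295, 313],
    "E": [372, 340, 343, 341],
    "F": [359, 337, 373, 302],
}


def one_way_anova(groups):
    """Return the entries of a one-way ANOVA table as a dictionary."""
    observations = [y for pens in groups.values() for y in pens]
    n_total = len(observations)
    correction = sum(observations) ** 2 / n_total      # (grand total)^2 / N
    ss_total = sum(y * y for y in observations) - correction
    ss_treat = sum(sum(p) ** 2 / len(p) for p in groups.values()) - correction
    ss_error = ss_total - ss_treat
    df_treat = len(groups) - 1
    df_error = n_total - len(groups)
    ms_treat = ss_treat / df_treat
    ms_error = ss_error / df_error
    return {
        "ss_total": ss_total, "ss_treat": ss_treat, "ss_error": ss_error,
        "df_treat": df_treat, "df_error": df_error,
        "ms_treat": ms_treat, "ms_error": ms_error,
        "f": ms_treat / ms_error,
    }


table = one_way_anova(eggs)

# Raw sum of squares, for comparison with the value quoted in the question.
raw_ss = sum(y * y for pens in eggs.values() for y in pens)

# Part (iii): E versus F, using the pooled residual mean square on 9 df.
mean_e = sum(eggs["E"]) / len(eggs["E"])
mean_f = sum(eggs["F"]) / len(eggs["F"])
se_diff = math.sqrt(table["ms_error"] * (1 / len(eggs["E"]) + 1 / len(eggs["F"])))
t_ef = (mean_e - mean_f) / se_diff

print(f"SS treatments = {table['ss_treat']:.2f} on {table['df_treat']} df")
print(f"SS residual   = {table['ss_error']:.2f} on {table['df_error']} df")
print(f"F = {table['f']:.3f}, t(E vs F) = {t_ef:.3f}")
```

The F and t statistics would be referred to the F distribution on 2 and 9 degrees of freedom and the t distribution on 9 degrees of freedom respectively, using the Official tables.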

4. Data for the first year box office receipts (Y) have been collected for a number of movies. A project to model these receipts collects data on the total production costs (X1), promotional costs (X2) and any associated book sales (X3), all data being measured in millions of US dollars. Consider the edited computer output from three regression models, labelled A, B and C, as given below and on the next page.

(i) Briefly comment on the scatter plots, in Figures 1, 2 and 3, of the observed against fitted y for the three models. Relate your comments in each case to s, the square root of the mean square error. (4)

(ii) In Model A, test for the global significance, at the 5% level, of the regression model, stating clearly the null and alternative hypotheses that you are testing. State the null distribution of the test statistic. Interpret the statement "Multiple R-Squared: 0.9668" and explain how this quantity is calculated. (5)

(iii) By considering the output from all three models, say which of the explanatory variables should be included in the model and justify your answer using appropriate t tests. (6)

(iv) Which of the three models do you consider best describes the data? Justify your answer. (5)

Model A:

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   7.6760     6.7602   1.135   0.2995
    x1            3.6616     1.1178   3.276   0.0169 *
    x2            7.6211     1.6573   4.598   0.0037 **
    x3            0.8285     0.5394   1.536   0.1754
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 7.541 on 6 degrees of freedom
    Multiple R-squared: 0.9668, Adjusted R-squared: 0.9502
    F-statistic: 58.22 on 3 and 6 DF, p-value: 7.913e-05

Model B:

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   11.848      6.765   1.751  0.12334
    x1             4.228      1.153   3.667  0.00800 **
    x2             7.436      1.806   4.117  0.00448 **
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 8.241 on 7 degrees of freedom
    Multiple R-squared: 0.9537, Adjusted R-squared: 0.9405
    F-statistic: 72.14 on 2 and 7 DF, p-value: 2.131e-05

Model C:

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   23.163      9.625   2.407   0.0427 *
    x2            12.669      1.771   7.155 9.66e-05 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 13.17 on 8 degrees of freedom
    Multiple R-squared: 0.8648, Adjusted R-squared: 0.8479
    F-statistic: 51.19 on 1 and 8 DF, p-value: 9.665e-05

Figure 1: Model A, scatterplot of Y vs fitted values
Figure 2: Model B, scatterplot of Y vs fitted values
Figure 3: Model C, scatterplot of Y vs fitted values
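The link between the "Multiple R-squared" line and the overall F statistic and adjusted R-squared in each output can be illustrated with a short calculation. This is a sketch under one assumption not stated explicitly in the question: the number of movies is taken to be n = 10, inferred from the residual degrees of freedom (6, 7 and 8 for models with 3, 2 and 1 explanatory variables). Because the printed R-squared values are rounded to four decimals, the recomputed F statistics agree with the printed output only approximately. The function names are illustrative.

```python
def overall_f(r2, n, p):
    """Overall F statistic for a regression with p explanatory variables:
    F = (R^2 / p) / ((1 - R^2) / (n - p - 1))."""
    return (r2 / p) / ((1 - r2) / (n - p - 1))


def adjusted_r2(r2, n, p):
    """Adjusted R^2 = 1 - (1 - R^2) (n - 1) / (n - p - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Assumption: n = 10 observations, inferred from the residual degrees of freedom.
N_OBS = 10

# (Multiple R-squared, number of explanatory variables) as printed for each model.
models = {"A": (0.9668, 3), "B": (0.9537, 2), "C": (0.8648, 1)}

summary = {
    name: (overall_f(r2, N_OBS, p), adjusted_r2(r2, N_OBS, p))
    for name, (r2, p) in models.items()
}

for name, (f_value, adj) in summary.items():
    p = models[name][1]
    print(f"Model {name}: F = {f_value:.2f} on {p} and {N_OBS - p - 1} df, "
          f"adjusted R^2 = {adj:.4f}")
```

Comparing adjusted R-squared across the three models is one way to weigh the extra explanatory variables against the loss of residual degrees of freedom, alongside the individual t tests asked for in the question.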
