STAT5044: Regression and ANOVA, Fall 2011 Final Exam on Dec 14. Your Name:

Please make sure to specify all of your notation in each problem. GOOD LUCK!

Problem# 1. Consider the following model,

$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{1i}^2 + \beta_3 x_{2i} + \sigma\varepsilon_i, \quad i = 1, \ldots, n,$$

where $E(\varepsilon_i) = 0$, $\mathrm{Var}(\varepsilon_i) = 1$, and $\mathrm{Cov}(\varepsilon_i, \varepsilon_j) = 0$ for $i \neq j$. The parameters $(\beta_0, \beta_1, \beta_2, \beta_3, \sigma^2)$ are unknown.

(a) Under the assumption that the $\varepsilon_i$ are iid and normally distributed, we want to test $H_0: \beta_0 + \beta_1 \beta_2 + \beta_3 = 0$ vs $H_a$: not $H_0$. What is your test statistic and its distribution? Justify why your test statistic follows this distribution and provide your decision rule. Be sure to define all terms.

(b) We found that there is significant evidence to reject $H_0$ in (a). We then fit the simple linear model between $y$ and $x_1$. We found that the p-value obtained from the Shapiro-Wilk test was still smaller than the significance level 0.05. However, by taking the log transformation of $y$, we obtained a larger p-value from the Shapiro-Wilk test. Hence we consider the following simple linear regression of $\log(y)$ on $x_1$:

$$\log(y_i) = \gamma_0 + \gamma_1 x_{1i} + \sigma\varepsilon_i.$$

From this regression model, we observe that the p-value from the Breusch-Pagan test is smaller than the significance level. The scatter plot of the absolute residuals against $x_1$ shows a strong linear relationship between them. Under this situation, we want to obtain a prediction interval of $y_{new}$ for a given new $x_{1,new}$. Explain in a step by step manner how to obtain the prediction interval of $y_{new}$. Be sure to define all terms.
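One possible way to organize the steps asked for in part (b) is sketched below. It is not an official solution. It assumes a weighted least squares approach in which the standard deviation function is estimated by regressing the absolute OLS residuals on $x_1$, and it uses a normal critical value by default where a $t_{n-2}$ quantile would normally be used for small $n$. The function names `wls_line` and `prediction_interval` are hypothetical.

```python
import math
from statistics import NormalDist


def wls_line(x, z, w):
    """Weighted least squares fit of z = g0 + g1 * x with weights w."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    zbar = sum(wi * zi for wi, zi in zip(w, z)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxz = sum(wi * (xi - xbar) * (zi - zbar) for wi, xi, zi in zip(w, x, z))
    g1 = sxz / sxx
    g0 = zbar - g1 * xbar
    return g0, g1, xbar, sxx, sw


def prediction_interval(x, y, x_new, crit=None):
    """Approximate 95% prediction interval for y_new at x_new (sketch)."""
    n = len(x)
    z = [math.log(yi) for yi in y]
    ones = [1.0] * n

    # Step 1: ordinary least squares of log(y) on x, keep absolute residuals.
    g0, g1, *_ = wls_line(x, z, ones)
    abs_res = [abs(zi - g0 - g1 * xi) for xi, zi in zip(x, z)]

    # Step 2: regress |residuals| on x to estimate the SD function
    # (assumption: SD is linear in x, as the scatter plot suggests).
    a0, a1, *_ = wls_line(x, abs_res, ones)
    s_hat = [a0 + a1 * xi for xi in x]
    s_new = a0 + a1 * x_new
    if min(s_hat + [s_new]) <= 0:
        raise ValueError("fitted standard deviations must be positive")

    # Step 3: refit by weighted least squares with weights 1 / s_hat^2.
    w = [1.0 / s ** 2 for s in s_hat]
    g0, g1, xbar, sxx, sw = wls_line(x, z, w)
    mse = sum(wi * (zi - g0 - g1 * xi) ** 2
              for wi, xi, zi in zip(w, x, z)) / (n - 2)

    # Step 4: prediction error variance = new-error variance + fit variance.
    fit = g0 + g1 * x_new
    se = math.sqrt(mse * (s_new ** 2 + 1.0 / sw + (x_new - xbar) ** 2 / sxx))

    # Step 5: interval on the log scale, then back-transform to y.
    if crit is None:
        # Normal approximation; pass a t quantile with n - 2 df for small n.
        crit = NormalDist().inv_cdf(0.975)
    return math.exp(fit - crit * se), math.exp(fit + crit * se)
```

Because the interval is built for $\log(y_{new})$ and then exponentiated, its endpoints are always positive and the interval is not symmetric around the back-transformed fitted value.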

Problem# 2. An experiment analyzes imperfection rates for two processes used to fabricate silicon wafers for computer chips. For treatment A applied to 10 wafers, the numbers of imperfections are 8, 7, 6, 6, 3, 4, 4, 7, 2, 3, 4. Treatment B applied to 10 other wafers has 9, 9, 8, 14, 8, 13, 11, 5, 7, 6 imperfections. An experimenter wants to know whether the imperfection rates for the two treatments are the same or not.

(a) Construct a generalized linear model (GLM) for this experiment.

(b) Based on your GLM in (a), obtain the likelihood.

(c) Explain in a step by step manner how to estimate the parameters. Interpret your parameters.

(d) What are the distributions of your parameter estimators? Explain in a step by step manner how to calculate a confidence interval for each parameter.

(e) Explain in a step by step manner how to use the confidence interval to decide whether the imperfection rates for the two treatments are the same or not.
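As an illustration of parts (c) to (e), the sketch below assumes a Poisson GLM with log link, $\log(\mu_i) = \beta_0 + \beta_1 x_i$, where $x_i = 1$ for treatment B and $0$ for treatment A. For this two-group model the maximum likelihood estimates have a closed form (the fitted means equal the group sample means), so no iterative fitting is needed. The function name `two_group_poisson` is hypothetical, and the counts are copied as printed in the problem, where eleven values are listed for treatment A.

```python
import math
from statistics import NormalDist

# Counts as printed in the problem statement (eleven values appear for A).
treatment_a = [8, 7, 6, 6, 3, 4, 4, 7, 2, 3, 4]
treatment_b = [9, 9, 8, 14, 8, 13, 11, 5, 7, 6]


def two_group_poisson(a, b, level=0.95):
    """Closed-form MLE and Wald interval for log(mu) = beta0 + beta1 * x."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)

    # MLEs: fitted group means equal the sample means.
    beta0 = math.log(mean_a)           # log mean for treatment A
    beta1 = math.log(mean_b / mean_a)  # log ratio of means, B versus A

    # Large-sample SE of beta1 from the inverse Fisher information.
    se1 = math.sqrt(1.0 / sum(a) + 1.0 / sum(b))

    z = NormalDist().inv_cdf(0.5 + level / 2)
    ci = (beta1 - z * se1, beta1 + z * se1)
    return beta0, beta1, se1, ci


beta0, beta1, se1, ci = two_group_poisson(treatment_a, treatment_b)
same_rate_plausible = ci[0] <= 0.0 <= ci[1]  # does the interval cover 0?
print(beta0, beta1, se1, ci, same_rate_plausible)
```

Under this model, $\exp(\beta_1)$ is the ratio of the mean imperfection counts (B over A), so an interval for $\beta_1$ that excludes 0 suggests the two rates differ at the chosen level.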

Problem# 3. For each problem, identify what analysis or test you can perform. If necessary, give the model, null hypothesis, test statistic, and decision rule. Explain how you estimate the parameters in your models in detail. (Please make sure to specify all of your notation and indices.)

(i) The data refer to 10 army corps, each observed for 20 years. In 109 corps-years of exposure, there were no deaths, in 65 corps-years there was one death, and so on. We would like to test whether the probabilities of occurrence in these five categories follow a Poisson distribution.

    Number of deaths    Number of corps-years
    0                   109
    1                   65
    2                   22
    3                   3
    4                   1
    5                   0

(ii) A sample of 100 women suffers from dysmenorrhea. A new analgesic is claimed to provide greater relief than a standard one. After using each analgesic in a crossover experiment, 40 reported greater relief with the standard analgesic and 60 reported greater relief with the new one.

(iii) A study of homicides in a given year for a sample of cities might model the homicide rate, defined for a city as its number of homicides that year divided by its population size. We want to know how the rate depends on the city's unemployment rate, its residents' median income, and the percentage of residents having completed high school.

(iv) The dataset from Dalal, Fowlkes, and Hoadley contains data on O-rings on 23 U.S. space shuttle missions prior to the Challenger disaster of January 28, 1986. For each of the previous missions, the temperature at take-off and the pressure of a prelaunch test were recorded, along with the number of O-rings that failed out of six. Use these data to try to understand the probability of failure as a function of temperature, and of temperature and pressure. We want to estimate the probability of failure of an O-ring when the temperature was 31°F, the launch temperature on January 28, 1986. We also want to predict the probability of failure of an O-ring when the temperature is 31°F, the launch temperature on January 28, 1986.
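For part (i), one common choice is a Pearson chi-square goodness-of-fit test with the Poisson mean estimated from the data. The sketch below is only an illustration under stated assumptions: the pooling of the sparse upper categories into "3 or more" is a choice made here, not something the problem specifies, and the function name `poisson_gof` is hypothetical.

```python
import math

deaths = [0, 1, 2, 3, 4, 5]
corps_years = [109, 65, 22, 3, 1, 0]


def poisson_gof(values, counts, pool_from=3):
    """Pearson chi-square statistic for a fitted Poisson distribution.

    Assumes `values` are the consecutive integers 0, 1, 2, ...
    Categories with value >= pool_from are pooled into one cell.
    """
    n = sum(counts)
    # MLE of the Poisson mean: total deaths / total corps-years.
    lam = sum(k * c for k, c in zip(values, counts)) / n

    observed = [c for k, c in zip(values, counts) if k < pool_from]
    observed.append(sum(c for k, c in zip(values, counts) if k >= pool_from))

    probs = [math.exp(-lam) * lam ** k / math.factorial(k)
             for k in range(pool_from)]
    probs.append(1.0 - sum(probs))  # P(X >= pool_from)
    expected = [n * p for p in probs]

    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # df = cells - 1 - (number of estimated parameters, here one).
    df = len(observed) - 1 - 1
    return lam, stat, df


lam, stat, df = poisson_gof(deaths, corps_years)
print(lam, stat, df)
```

The statistic would then be compared with the upper critical value of a chi-square distribution with `df` degrees of freedom; a small statistic is consistent with the Poisson model.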

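Some formulas

Large-sample distribution of the log of the relative risk ($r$):

$$\log(\hat{r}) \sim N\left[\log\left(\frac{\pi_1}{\pi_2}\right),\ \frac{1-\pi_1}{\pi_1 n_1} + \frac{1-\pi_2}{\pi_2 n_2}\right]$$

Large-sample distribution of the log of the odds ratio ($\theta$):

$$\log(\hat{\theta}) \sim N\left[\log(\theta),\ \frac{1}{n\pi_{11}} + \frac{1}{n\pi_{12}} + \frac{1}{n\pi_{21}} + \frac{1}{n\pi_{22}}\right]$$

where $n$ is the total sample size and $\pi_{ij}$ is the proportion in cell $(i,j)$ of the contingency table.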
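To show how these two large-sample results are typically used, the sketch below computes Wald confidence intervals for the relative risk and the odds ratio from a 2x2 table, plugging sample proportions into the variance formulas (so $n\pi_{ij}$ is replaced by the observed cell count $n_{ij}$). The function name `rr_and_or_intervals` and the example counts are hypothetical and are not taken from the exam.

```python
import math
from statistics import NormalDist


def rr_and_or_intervals(n11, n12, n21, n22, level=0.95):
    """Wald intervals for relative risk and odds ratio in a 2x2 table.

    Rows are the two groups; column 1 is the outcome of interest.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    n1, n2 = n11 + n12, n21 + n22
    p1, p2 = n11 / n1, n21 / n2

    # Relative risk: the interval is built on the log scale.
    rr = p1 / p2
    se_log_rr = math.sqrt((1 - p1) / (p1 * n1) + (1 - p2) / (p2 * n2))
    rr_ci = (rr * math.exp(-z * se_log_rr), rr * math.exp(z * se_log_rr))

    # Odds ratio: variance of the log is the sum of reciprocal cell counts.
    odds_ratio = (n11 * n22) / (n12 * n21)
    se_log_or = math.sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)
    or_ci = (odds_ratio * math.exp(-z * se_log_or),
             odds_ratio * math.exp(z * se_log_or))

    return {"rr": rr, "rr_ci": rr_ci, "or": odds_ratio, "or_ci": or_ci}


# Hypothetical counts, for illustration only.
result = rr_and_or_intervals(30, 70, 15, 85)
print(result)
```

An interval for the relative risk or odds ratio that excludes 1 corresponds to an interval for its logarithm that excludes 0.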