UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics January, 2018

Work all problems. 60 points are needed to pass at the Master's level, 75 to pass at the PhD level.

1. Data from Kelley's Blue Book can be used to model the price of a used car. This problem uses the predictors Make and Mileage and ignores other predictors. Altogether there are 804 lines of data. A few of them are:

   Price  Mileage  Make  MakeName
   17314     8221     1  Buick
   17542     9135     1  Buick
   51154     2202     2  Cadillac
   11096    20334     3  Chevrolet

The codes for Make are 1 = Buick, 2 = Cadillac, 3 = Chevrolet, 4 = Pontiac, 5 = Saab, and 6 = Saturn. This problem ignores makes other than these six. Fitting a model in R yielded the following result.

   > summary ( lm ( Price ~ Mileage + Make, data=cars ) )

   Call:
   lm(formula = Price ~ Mileage + Make, data = cars)

   Residuals:
      Min      1Q  Median     3Q    Max
   -14194   -7008   -3753   6563  44852

   Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
   (Intercept)  2.790e+04  1.237e+03  22.549  < 2e-16 ***
   Mileage     -1.681e-01  4.184e-02  -4.018 6.42e-05 ***
   Make        -9.488e+02  2.579e+02  -3.680 0.000249 ***
   ---
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

   Residual standard error: 9714 on 801 degrees of freedom
   Multiple R-squared: 0.03675,    Adjusted R-squared: 0.03434
   F-statistic: 15.28 on 2 and 801 DF,  p-value: 3.079e-07

(a) 5 pts. What went wrong and how would you attempt to fix it?

(b) 5 pts. Sketch a scatterplot with Mileage on the x-axis and Price on the y-axis that indicates the presence of an interaction between Make and Mileage. You needn't show all 804 data points, just enough to illustrate an interaction. Explain how your plot shows an interaction.

(c) 5 pts. How would you tell R to fit a model with a Make by Mileage interaction?

(d) 5 pts. We might treat these six makes as a sample of size six from the population of all makes. And we might think that each make in the population has its own linear relationship between Price and Mileage. What kind of model could we use to learn, from these six makes, about the population of linear relationships? Be as specific as you can.

(e) 5 pts. How would you tell R to fit such a model?
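Part (a) turns on how Make entered the model: in the printed fit it is a single numeric column, which forces the six makes onto one equally spaced price scale. A minimal numeric sketch (hypothetical toy data, not the Kelley Blue Book file) contrasting that coding with dummy (factor) coding:

```python
import numpy as np

# Hypothetical toy data: 6 makes, 3 cars each (not the actual exam data).
rng = np.random.default_rng(0)
make = np.repeat(np.arange(1, 7), 3)           # codes 1..6, as in the exam
mileage = rng.uniform(2000, 25000, size=18)
price = rng.uniform(10000, 50000, size=18)

# Numeric coding: one column, one slope -- assumes equal price steps
# between consecutive make codes (Buick -> Cadillac -> Chevrolet -> ...).
X_numeric = np.column_stack([np.ones(18), mileage, make])

# Factor (dummy) coding: one indicator per make beyond a baseline make,
# allowing each make its own intercept with no equal-spacing assumption.
dummies = (make[:, None] == np.arange(2, 7)[None, :]).astype(float)
X_factor = np.column_stack([np.ones(18), mileage, dummies])

print(X_numeric.shape)  # (18, 3)
print(X_factor.shape)   # (18, 7)

# Both fit by ordinary least squares; the factor model spends 4 more
# degrees of freedom to buy that flexibility.
beta_num, *_ = np.linalg.lstsq(X_numeric, price, rcond=None)
beta_fac, *_ = np.linalg.lstsq(X_factor, price, rcond=None)
print(beta_num.shape, beta_fac.shape)  # (3,) (7,)
```

In R the same distinction is the difference between `Make` and `factor(Make)` in the model formula.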


2. The scatterplot below shows the Age and Length of smallmouth bass, a fish species, caught in West Bearskin Lake in Minnesota. This problem uses just the six- and seven-year-old fish; the full data set contains younger and older fish too.

   [Scatterplot: Length (roughly 200 to 350 mm) on the y-axis versus Age (6 or 7 years) on the x-axis]

A linear model lm ( Length ~ Age ) yields

                Estimate Std. Error t value Pr(>|t|)
   (Intercept)  148.971     34.101   4.369 2.36e-05 ***
   Age           17.271      5.303   3.257   0.0014 **

(a) 5 pts. What is the formula for the t statistic for Age?

(b) 10 pts. The t statistic for Age is large and the p-value is small, indicating that Age is a useful predictor of Length. Yet there is lots of overlap between the Lengths of six-year-old and seven-year-old fish. How can that be?
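For orientation on part (a): the t statistic in such a table is the estimate divided by its standard error. A quick numeric check against the printed Age row (values copied from the table above):

```python
# t statistic = estimate / standard error, here for the Age coefficient.
estimate = 17.271
std_error = 5.303
t_stat = estimate / std_error
print(round(t_stat, 3))  # 3.257, matching the printed t value
```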

3. Sometimes we know that a linear regression line goes through the origin, so we would adopt a model Y_i = β x_i + ε_i.

(a) 10 pts. Under the usual assumptions, derive the least-squares estimator of β.

(b) 15 pts. Derive the variance of the estimator.
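As a numeric sanity check on the closed forms this problem leads to (simulated data, not part of the exam): minimizing the sum of squared errors for the no-intercept model gives the estimator sum(x_i Y_i) / sum(x_i^2), whose variance under independent errors with variance sigma^2 is sigma^2 / sum(x_i^2).

```python
import numpy as np

# Simulated check (hypothetical data) for the through-the-origin model
# Y_i = beta * x_i + eps_i: minimizing sum((Y_i - b*x_i)^2) over b
# gives b_hat = sum(x*Y) / sum(x^2).
rng = np.random.default_rng(1)
x = rng.uniform(1.0, 10.0, size=50)
y = 2.5 * x + rng.normal(0.0, 1.0, size=50)

b_closed = np.sum(x * y) / np.sum(x * x)
b_lstsq, *_ = np.linalg.lstsq(x[:, None], y, rcond=None)
print(np.isclose(b_closed, b_lstsq[0]))  # True

# Monte Carlo check that Var(b_hat) = sigma^2 / sum(x_i^2)
# when the errors are independent with common variance sigma^2.
sigma = 1.0
draws = np.array([
    np.sum(x * (2.5 * x + rng.normal(0.0, sigma, 50))) / np.sum(x * x)
    for _ in range(20000)
])
print(np.var(draws))              # empirical variance of b_hat
print(sigma**2 / np.sum(x * x))   # theoretical variance, close to the above
```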

4. In nondestructive testing of aluminum blocks, an electromagnetic probe is used to detect flaws below the surface. The sensitivity Y of the probe is known to be related to the thickness X of the wire used to construct the coil in the probe. An investigator interested in understanding this relationship collected a random sample of measurements of thickness and sensitivity. For the thickness values considered in the experiment, suppose the assumptions for nonlinear regression hold with the regression function of Y on X given by

   E(Y | X = x) = μ_Y(x) = β_1 (1 - e^{-e^{-(β_2 + β_3 x)}}).

Using the R output below, answer the following questions:

(a) 10 pts. What are the least-squares estimates of the parameters β_1, β_2, β_3, and σ? How are the corresponding estimators defined?

(b) 10 pts. If β_1 and β_3 are known to be positive, show that no matter how thin the wire used, the average sensitivity can never exceed β_1 (1 - e^{-e^{-β_2}}). Denote this upper bound for the sensitivity by θ. Estimate θ.

(c) 15 pts. Obtain one-at-a-time 95% two-sided approximate confidence intervals for β_1 and β_2. Using these intervals, obtain an approximate confidence interval for θ with confidence coefficient greater than or equal to 0.90 (use the Bonferroni inequality, which states that P(A ∩ B) ≥ 1 - P(A^c) - P(B^c)).

   Formula: Sensitivity ~ beta_1 * (1 - exp(-exp(-(beta_2 + beta_3 * Thickness))))

   Parameters:
           Estimate Std. Error t value Pr(>|t|)
   beta_1   1.9484     0.4725   4.124 0.001199 **
   beta_2  -1.2699     0.6809  -1.865 0.084902 .
   beta_3  14.3630     2.7038   5.312 0.000141 ***
   ---
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

   Residual standard error: 0.1025 on 13 degrees of freedom

   Number of iterations to convergence: 6
   Achieved convergence tolerance: 8.168e-06
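As a numeric companion to part (c): one-at-a-time 95% intervals take the printed estimates plus or minus a t quantile with 13 residual degrees of freedom times the standard error, and the Bonferroni inequality then guarantees joint coverage of at least 0.90 for the pair. A hedged sketch (the quantile value t_{0.975,13} ≈ 2.160 is a standard table entry, and the endpoint-wise combination below is one standard route, assuming θ is monotone in each parameter):

```python
import math

# One-at-a-time 95% CIs: estimate +/- t_{0.975,13} * SE,
# with t_{0.975,13} ~= 2.160 taken from a t table.
t_crit = 2.160
b1, se1 = 1.9484, 0.4725
b2, se2 = -1.2699, 0.6809

ci1 = (b1 - t_crit * se1, b1 + t_crit * se1)   # 95% CI for beta_1
ci2 = (b2 - t_crit * se2, b2 + t_crit * se2)   # 95% CI for beta_2
print([round(v, 3) for v in ci1])  # roughly [0.928, 2.969]
print([round(v, 3) for v in ci2])  # roughly [-2.741, 0.201]

# Plug-in estimate of the upper bound theta = beta_1 * (1 - e^{-e^{-beta_2}}).
theta_hat = b1 * (1 - math.exp(-math.exp(-b2)))
print(round(theta_hat, 3))  # roughly 1.893

# theta is increasing in beta_1 and decreasing in beta_2, so combining
# the two 95% intervals endpoint-wise gives, by Bonferroni, an interval
# for theta with confidence at least 1 - 0.05 - 0.05 = 0.90.
theta_lo = ci1[0] * (1 - math.exp(-math.exp(-ci2[1])))
theta_hi = ci1[1] * (1 - math.exp(-math.exp(-ci2[0])))
print(round(theta_lo, 3), round(theta_hi, 3))
```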
