Final Exam. Name: Solution:


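Instructions. Answer all questions on the exam. Open books, open notes, but no electronic devices. The first 13 problems are worth 5 points each. The rest are worth 1 point each.

HW1. Suppose the classical regression model holds, with $\beta_0 = 10$, $\beta_1 = 1$ and $\sigma = 3$. Graph the conditional distribution of Y when X = 5. Put numbers on the horizontal axis.

Solution: [Graph not reproduced in this transcription.] Under the stated model, the conditional distribution of Y given X = 5 is normal with mean $\beta_0 + \beta_1(5) = 15$ and standard deviation $\sigma = 3$, so the graph is a bell curve centered at 15.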
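As a rough check on HW1, the sketch below computes the center of the conditional distribution and a set of horizontal-axis labels at 1, 2 and 3 standard deviations from it. The parameter values come from the problem statement; the variable names and the choice of tick positions are illustrative assumptions, not part of the exam.

```python
from statistics import NormalDist

# Parameter values stated in HW1
beta0, beta1, sigma = 10.0, 1.0, 3.0
x = 5.0

# Conditional mean of Y given X = 5 under the classical model
mean_y = beta0 + beta1 * x
cond = NormalDist(mu=mean_y, sigma=sigma)

# Illustrative axis labels: the mean and 1, 2, 3 standard deviations either side
ticks = [mean_y + k * sigma for k in range(-3, 4)]

# Height of the normal curve at its center
peak_density = cond.pdf(mean_y)

print(mean_y, ticks, peak_density)
```

Nearly all of the probability lies between the outermost ticks, so they make reasonable limits for the horizontal axis.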

HW2. You can estimate the slope, $\beta_1$, by using ordinary least squares (OLS), or by using least sum of absolute deviations (LAD). These will be different estimates of the same parameter. You can simulate many data sets according to the classical model and calculate both estimates for each simulated data set. Based on these simulations, how will you know that the OLS estimate of $\beta_1$ is better than the LAD estimate of $\beta_1$?

Solution: The histograms of the OLS and LAD estimates that result from the simulations will show that the OLS values tend to be closer to the true value of $\beta_1$. This will also be confirmed by the fact that the standard deviation of the simulated OLS estimates will be smaller than that of the LAD estimates.
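The simulation described in HW2 can be sketched as follows. This is a minimal illustration under assumed settings (sample size 20, 300 simulated data sets, parameters borrowed from HW1, a fixed seed), not the course's own code. The LAD fit uses a brute-force search over lines through pairs of data points, which relies on the fact that some LAD-optimal line passes through two observations.

```python
import random
import statistics
from itertools import combinations

def ols_slope(x, y):
    # Closed-form least squares slope
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def lad_slope(x, y):
    # Brute force: try the line through every pair of points and keep
    # the one with the smallest sum of absolute residuals.
    best_sad, best_slope = float("inf"), None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue
        b1 = (y[j] - y[i]) / (x[j] - x[i])
        b0 = y[i] - b1 * x[i]
        sad = sum(abs(yk - b0 - b1 * xk) for xk, yk in zip(x, y))
        if sad < best_sad:
            best_sad, best_slope = sad, b1
    return best_slope

def simulate(n_sims=300, n=20, beta0=10.0, beta1=1.0, sigma=3.0, seed=1):
    # All settings here are assumptions chosen for illustration
    rng = random.Random(seed)
    x = [rng.uniform(0, 10) for _ in range(n)]  # X held fixed across data sets
    ols, lad = [], []
    for _ in range(n_sims):
        y = [beta0 + beta1 * xi + rng.gauss(0, sigma) for xi in x]
        ols.append(ols_slope(x, y))
        lad.append(lad_slope(x, y))
    return ols, lad

ols, lad = simulate()
sd_ols, sd_lad = statistics.stdev(ols), statistics.stdev(lad)
print(sd_ols, sd_lad)
```

With normal errors, the printed standard deviation for OLS should come out smaller than the one for LAD, which is the comparison the solution describes.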

HW3. Suppose you fit a regression model using lm and get the following results.

    Call:
    lm(formula = Y ~ X)

    Residuals:
    Min 1Q Median 3Q Max
    [values lost in transcription]

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) [...]                       <2e-16 ***
    X           [...]                       <2e-16 ***
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: [...] on 98 degrees of freedom
    Multiple R-squared: [...], Adjusted R-squared: [...]
    F-statistic: 1.159e+04 on 1 and 98 DF, p-value: < 2.2e-16

Give an approximate 95% confidence interval for $\beta_1$.

Solution: [...] (0.092). Roughly $9.8 < \beta_1 < 10.2$.
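The arithmetic behind an approximate 95% interval can be sketched as below. Only the standard error 0.092 and the rough endpoints 9.8 and 10.2 survive in the transcription. The point estimate of 10.0 is an assumption (the midpoint of the stated interval), and the multiplier 2 is the usual rule of thumb, so treat the numbers as illustrative.

```python
# ASSUMPTION: the estimate is missing from the transcription; 10.0 is the
# midpoint of the interval quoted in the solution.
estimate = 10.0
std_error = 0.092   # quoted in the solution text
multiplier = 2.0    # rule-of-thumb multiplier for an approximate 95% interval

lower = estimate - multiplier * std_error
upper = estimate + multiplier * std_error
print(round(lower, 3), round(upper, 3))
```

Rounded to one decimal place, these endpoints agree with the interval given in the solution.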

Midterm. The Toluca study has Y = workhours and X = lotsize for n = 25 jobs. The standard error of $\hat{\beta}_1$ is the estimated standard deviation of $\hat{\beta}_1$. To have a standard deviation of $\hat{\beta}_1$, there must be many values of $\hat{\beta}_1$. In the context of the Toluca study, describe what those many values of $\hat{\beta}_1$ refer to.

Solution: The observed workhours data values (25 of them) are just one sample of potentially observable workhours data values from the conditional distributions specified by the observed lotsize data values. Every other such sample of 25 workhours values, matched with the original lotsize values, will give a different $\hat{\beta}_1$. The many values of $\hat{\beta}_1$ refer to these values, one for each such randomly sampled data set.
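The idea of "many values of $\hat{\beta}_1$" can be made concrete with a small simulation. The sketch below keeps 25 lot sizes fixed and repeatedly draws new workhours values from the classical model, collecting one slope estimate per data set. The lot sizes and the true parameter values are hypothetical stand-ins, since the Toluca data are not part of this document.

```python
import math
import random
import statistics

def ols_slope(x, y):
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(2017)

# HYPOTHETICAL lot sizes (n = 25) and true parameters, for illustration only
lot_size = [20 + 10 * (i % 11) for i in range(25)]
beta0, beta1, sigma = 60.0, 3.5, 50.0

slopes = []
for _ in range(2000):
    # One potentially observable sample of 25 workhours values,
    # matched with the same lot sizes every time
    work_hours = [beta0 + beta1 * x + rng.gauss(0, sigma) for x in lot_size]
    slopes.append(ols_slope(lot_size, work_hours))

# Standard deviation of the simulated slopes versus the classical-model formula
sd_of_slopes = statistics.stdev(slopes)
mean_x = statistics.fmean(lot_size)
sxx = sum((x - mean_x) ** 2 for x in lot_size)
theoretical_sd = sigma / math.sqrt(sxx)
print(sd_of_slopes, theoretical_sd)
```

The standard error reported by regression software is an estimate of the spread of this collection of slopes.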

HW4. Suppose you fit the model $\ln(Y) = \beta_0 + \beta_1 X + \varepsilon$, and you get the following output from lm.

    Call:
    lm(formula = lny ~ X)

    Residuals:
    Min 1Q Median 3Q Max
    [values lost in transcription]

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) [...]                       <2e-16 ***
    X           [...]                       <2e-16 ***
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: [...] on 98 degrees of freedom
    Multiple R-squared: [...], Adjusted R-squared: [...]
    F-statistic: 2839 on 1 and 98 DF, p-value: < 2.2e-16

Give the back-transformed estimated model to predict Y as a function of X.

Solution: The predicted value of lny is [...] + [...] X (the numeric estimates are lost in this transcription). So the predicted value of Y is exp([...] + [...] X).
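The back-transformation in HW4 amounts to exponentiating the fitted line for ln(Y). Because the estimated coefficients are missing from the transcription, the sketch below uses hypothetical values; the function names are also illustrative.

```python
import math

# HYPOTHETICAL estimates standing in for the lost lm output
b0_hat, b1_hat = 1.0, 0.5

def predict_ln_y(x):
    # Fitted line on the log scale
    return b0_hat + b1_hat * x

def predict_y(x):
    # Back-transformed prediction on the original scale
    return math.exp(predict_ln_y(x))

print(predict_y(0.0), predict_y(2.0))
```

Under a normal model for ln(Y), the exponentiated prediction estimates the conditional median of Y rather than its conditional mean.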

HW5. Recall the charitable contributions data set. Let Y = ln(charitable.cont) (logarithm of charitable contributions), X1 = DEPS (dependents), X2 = ln(income.agi) (logarithm of adjusted gross income), and consider the classical regression model $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$. Interpret the parameter $\beta_1$ in this specific context.

Solution: Consider two conditional distributions of Y:
(i) where $X_1 = 2$ and $X_2 = 9$;
(ii) where $X_1 = 1$ and $X_2 = 9$.
The conditional mean of Y in case (i) is $\beta_0 + \beta_1(2) + \beta_2(9)$. The conditional mean of Y in case (ii) is $\beta_0 + \beta_1(1) + \beta_2(9)$. So $\beta_1$ is the difference in conditional means of Y (= ln(charitable.cont)) for those two cases. In general, $\beta_1$ is the increase in the mean of Y associated with one more dependent, holding ln(income.agi) fixed.

HW6. Consider the R simulation code

    X1 = rnorm(100)
    X2 = rnorm(100)
    Y = [...] 100*X1 + 50*X2 + rnorm(100,0,4)
    summary(lm(Y ~ X1 + X2))

(Part of the right-hand side of the third line is lost in this transcription; the coefficient 100 on X1 is taken from the solution.) There is an F test in the lm output. In the context of this simulation, what hypothesis is being tested by the F test? Is the hypothesis true or false in this case?

Solution: The F test is testing the null (restricted) model where both $\beta_1$ and $\beta_2$ are zero. Here, $\beta_1 = 100$ and $\beta_2 = 50$, so the null model is false.

HW7. An estimated conditional mean function using an interaction model is $\hat{Y} = 2 + 3X_1 + 1X_2 - 4X_1X_2$. Both X1 and X2 are interval variables, both having ranges from 0 to 1. Graph (i) the estimated conditional mean of Y as a function of X1, when X2 = 0, and (ii) the estimated conditional mean of Y as a function of X1, when X2 = 1, on the same axes.

Solution:
When X2 = 0, $\hat{Y} = 2 + 3X_1 + 1(0) - 4X_1(0) = 2 + 3X_1$.
When X2 = 1, $\hat{Y} = 2 + 3X_1 + 1(1) - 4X_1(1) = 3 - X_1$.

[Graph not reproduced in this transcription: the two lines plotted against X1 on the same axes.]

(Interaction is apparent in that the effect of X1 on Y depends strongly on the value of X2: when X2 = 0, X1 has a positive effect on Y, and when X2 = 1, X1 has a negative effect on Y.)
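The two lines in HW7 can be tabulated directly from the fitted function $\hat{Y} = 2 + 3X_1 + 1X_2 - 4X_1X_2$, where the signs are inferred from the worked solution (the results $2 + 3X_1$ and $3 - X_1$). The grid of X1 values below is an arbitrary choice for illustration.

```python
def y_hat(x1, x2):
    # Estimated conditional mean from the interaction model
    return 2 + 3 * x1 + 1 * x2 - 4 * x1 * x2

# Arbitrary grid over the stated range of X1
grid = [0.0, 0.25, 0.5, 0.75, 1.0]

line_x2_0 = [y_hat(x1, 0) for x1 in grid]  # reduces to 2 + 3*X1
line_x2_1 = [y_hat(x1, 1) for x1 in grid]  # reduces to 3 - X1

for x1, a, b in zip(grid, line_x2_0, line_x2_1):
    print(x1, a, b)
```

The first line rises from 2 to 5 while the second falls from 3 to 2, and the two cross at X1 = 0.25.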

HW8. Consider the data pronunciation data set. The Y variable is the pronunciation indicator and the X variable is region of the U.S., either West, South, North Central, or East. The region variable is coded using indicator variables and the model is $Y = \beta_0 + \beta_1 \text{West} + \beta_2 \text{South} + \beta_3 \text{NorthCentral} + \varepsilon$. Interpret $\beta_0$ and $\beta_1$ in the context of this study.

Solution: Consider the four groups:
West: Mean of Y is $\beta_0 + \beta_1$
South: Mean of Y is $\beta_0 + \beta_2$
North Central: Mean of Y is $\beta_0 + \beta_3$
East: Mean of Y is $\beta_0$
So $\beta_0$ is the mean of Y in the East, and $\beta_1$ is the difference between the mean of Y in the West and the mean of Y in the East. (Note: Here the mean of Y is actually the probability of pronouncing "day-tuh", since the Y variable is binary.)

HW9. In the data pronunciation data set, the Y variable is Y = 1 if the survey respondent pronounces data as "day-tuh", and is Y = 0 if the survey respondent pronounces data as "daa-tuh". The X variable is the age of the survey respondent. Here are two fitted models.

    fit0 = glm(Y ~ 1, family = "binomial")
    fit1 = glm(Y ~ X, family = "binomial")

Explain (i) how to get the likelihood ratio chi-square statistic from the log likelihoods of these two fitted models, and (ii) how the degrees of freedom are determined.

Solution: (i) You get the log likelihoods for the fits using logLik(fit0) and logLik(fit1). The chi-square statistic is 2(logLik(fit1)) - 2(logLik(fit0)). (ii) There are two parameters in fit1 ($\beta_0$ and $\beta_1$ of the logistic regression model) and one in fit0 (just $\beta_0$), so the degrees of freedom is 2 - 1 = 1.
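The likelihood ratio calculation in HW9 can be sketched as below. The two log-likelihood values are hypothetical, since no fitted output appears in the document. The p-value uses the identity that a chi-square variable with 1 degree of freedom is a squared standard normal, so this sketch only covers df = 1.

```python
import math

def lr_test(loglik_null, loglik_full, df):
    # Likelihood ratio chi-square statistic
    stat = 2 * loglik_full - 2 * loglik_null
    if df != 1:
        raise ValueError("this sketch only handles df = 1")
    # P(chi-square_1 > stat) = P(|Z| > sqrt(stat)) = erfc(sqrt(stat / 2))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# HYPOTHETICAL log likelihoods for fit0 (null) and fit1 (full)
stat, p_value = lr_test(loglik_null=-68.0, loglik_full=-64.5, df=1)
print(stat, p_value)
```

Here the degrees of freedom equal the difference in parameter counts between the two models, 2 - 1 = 1, as in the solution.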

HW10. When data are count data (like in the financial planners example), why does the classical regression model give a bad estimate of the conditional distribution $p(y \mid x)$? Draw a graph or graphs as part of your answer.

Solution: [Graph not reproduced in this transcription: the estimated $p(y \mid x)$ for a 45-year-old female, using the classical model.] This is obviously a bad estimate of $p(y \mid x)$ because it predicts that 1.4, 0.1, etc. financial planners might be used. It even predicts that negative numbers of financial planners might be used!
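The problem described in HW10 can be quantified with a small calculation: a normal conditional distribution fitted to small counts places noticeable probability on negative values and spreads the rest over non-integers. The fitted mean and standard deviation below are hypothetical, as the original graph and estimates are not available.

```python
from statistics import NormalDist

# HYPOTHETICAL fitted mean and standard deviation for a count outcome
mu_hat, sigma_hat = 0.9, 1.1
fitted = NormalDist(mu_hat, sigma_hat)

# Probability the classical model assigns to impossible negative counts
p_negative = fitted.cdf(0.0)

# Probability assigned to the interval between the counts 1 and 2
# (a continuous model gives zero probability to any single point)
p_between_1_and_2 = fitted.cdf(2.0) - fitted.cdf(1.0)

print(round(p_negative, 3), round(p_between_1_and_2, 3))
```

With these assumed values, roughly a fifth of the fitted distribution sits below zero.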

HW11. Suppose your data look like this:

    Y    X
    [data values lost in transcription]

Briefly state pros and cons of using Poisson regression with these data. Which has more weight here, pro or con? (Recall: pro = benefit, con = disadvantage.)

Solution: Pro: There are 0s in the Y variable, and that is one indication that Poisson might be appropriate. Con: The non-zero numbers should not be too large. Here they are huge, and not capable of being explained by a Poisson model in which there are also 0s. So the con has more weight here and one should not use Poisson regression.
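The "con" in HW11 follows from the Poisson probability of a zero, $P(Y = 0) = e^{-\lambda}$: if the mean is large enough to produce huge counts, zeros are essentially impossible. The sketch below illustrates this with hypothetical means, since the data values are lost.

```python
import math

def poisson_pmf(k, lam):
    # Poisson probability computed on the log scale for numerical stability
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

# HYPOTHETICAL means: one small, one large
p_zero_small_mean = poisson_pmf(0, 2.0)    # zeros are common
p_zero_large_mean = poisson_pmf(0, 50.0)   # zeros are essentially impossible

print(p_zero_small_mean, p_zero_large_mean)
```

A single Poisson mean therefore cannot account for both frequent zeros and very large counts at similar X values.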

HW12. You can construct a 90% prediction interval for your Y variable, given X = 10, using quantile regressions. Explain how to do this.

Solution: Run two quantile regressions, one with $\tau = 0.05$ and the other with $\tau = 0.95$. Plug X = 10 into the two estimated linear models to get the endpoints of the interval.
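The procedure in HW12 can be sketched end to end on simulated data. The data-generating settings below are assumptions for illustration, and the quantile regression fit is a brute-force search over lines through pairs of points (an optimal line for the check loss passes through two observations), not the algorithm a statistics package would use.

```python
import random
from itertools import combinations

def check_loss(residual, tau):
    # Quantile-regression ("pinball") loss for a single residual
    return residual * (tau - (1.0 if residual < 0 else 0.0))

def fit_quantile_line(x, y, tau):
    # Try the line through every pair of points; keep the smallest total loss
    best_loss, best_b0, best_b1 = float("inf"), 0.0, 0.0
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue
        b1 = (y[j] - y[i]) / (x[j] - x[i])
        b0 = y[i] - b1 * x[i]
        loss = sum(check_loss(yk - b0 - b1 * xk, tau) for xk, yk in zip(x, y))
        if loss < best_loss:
            best_loss, best_b0, best_b1 = loss, b0, b1
    return best_b0, best_b1

# ASSUMED simulation settings, chosen only to have data to fit
rng = random.Random(12)
x = [rng.uniform(0, 20) for _ in range(60)]
y = [10 + xi + rng.gauss(0, 3) for xi in x]

# Two quantile regressions, then plug in X = 10
lo0, lo1 = fit_quantile_line(x, y, 0.05)
hi0, hi1 = fit_quantile_line(x, y, 0.95)
lower = lo0 + lo1 * 10
upper = hi0 + hi1 * 10
print(lower, upper)
```

The pair (lower, upper) is the estimated 90% prediction interval for Y at X = 10.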

Quiz 1. When is $E(Y \mid X = x)$ truly equal to $\beta_0 + \beta_1 x$?
A. When there are three or more levels of the X variable
B. When the test for linearity passes (p > .05)
C. When $E(Y \mid X = x)$ increases as x increases
D. When X and Y are independent (here, $E(Y \mid X = x) = \beta_0 + 0x$)

Quiz 2. What are the maximum likelihood estimates when you assume a Laplace distribution for $p(y \mid x)$?
A. Quantile regression estimates (assuming $\tau = 0.5$)
B. Ordinary least squares estimates
C. Weighted least squares estimates
D. Generalized least squares estimates

Quiz 3. If $\hat{\beta}_1$ is an unbiased estimator of $\beta_1$, then
A. $\hat{\beta}_1$ is a good estimator of $\beta_1$
B. $\hat{\beta}_1$ is sometimes equal to $\beta_1$
C. $\hat{\beta}_1$ is close to $\beta_1$
D. The mean of the probability distribution of potentially observable values of $\hat{\beta}_1$ is exactly equal to $\beta_1$

Quiz 4. Under the classical regression model, what happens to the confidence interval for $E(Y \mid X = x)$ when n increases?
A. It gets wider
B. It approaches $\beta_0 + \beta_1 x \pm 0$
C. It approaches $\beta_0 + \beta_1 x \pm 1.96\sigma$

Quiz 5. When checking an assumption using a testing (p-value based) method, you
A. reject the assumption when p > .05
B. reject the assumption when p < .05
C. fail to reject the assumption when p < .05
D. fail to accept the assumption when p > .05

Quiz 6. If there is heteroscedasticity, which statement must be true?
A. $\mathrm{Var}(Y \mid X = 20) = \mathrm{Var}(Y \mid X = 120)$
B. The hypothesis test rejects the homoscedastic model
C. $\mathrm{Var}(Y \mid X = 20)$ is significantly different from $\mathrm{Var}(Y \mid X = 120)$
D. $\mathrm{Var}(Y \mid X = a)$ is different from $\mathrm{Var}(Y \mid X = b)$, for some values a and b

Quiz 7. When might you transform Y but not X? (Pick one answer only.)
A. When there are outliers in your X data
B. When your conditional Y distribution is normal
C. When $E(Y \mid X = x)$ is a nonlinear function of x
D. When the Box-Cox method indicates that $\lambda = 1$ is a good choice

Quiz 8. When is the transformation 1/Y easy to justify?
A. When the units of measurement of Y are ratio units
B. When the distribution of Y is lognormal
C. When the Box-Cox method indicates that $\lambda = 0$ is a good choice
D. When the Box-Cox method indicates that $\lambda = 1$ is a good choice

Quiz 9. Suppose the data come from the model $Y \mid X = x \sim N(\beta_0 + \beta_1 x, \sigma^2)$. Given X = x, the best predictor of Y is
A. $\beta_0 + \beta_1 x$
B. $\beta_0 + \beta_1 x + \varepsilon$
C. $\hat{\beta}_0 + \hat{\beta}_1 x$
D. $\hat{\beta}_0 + \hat{\beta}_1 x + \hat{\varepsilon}$

Quiz 10. Assuming the same model as in Quiz 9 above, $\sigma$ is the
A. standard deviation of Y
B. standard deviation of X
C. standard deviation of Y when X = 10
D. standard deviation of X when Y = 10

Quiz 11. Which matrix tells you how much sample-to-sample variation there is in the potentially observable values of the OLS estimated regression coefficients?
A. $I$
B. $(X^\top X)^{-1}$
C. $\sigma^2 I$
D. $\sigma^2 (X^\top X)^{-1}$

Quiz 12. Your regression model is $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$. Lack of multicollinearity in your regression model is indicated when $R^2$ is close to 0.0 for which of the following? (Select only one.)
A. lm(Y ~ X1)
B. lm(Y ~ X2)
C. lm(Y ~ X1 + X2)
D. lm(X1 ~ X2)

Quiz 13. Consider the model $E(Y \mid X_1 = x_1, X_2 = x_2) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$. When graphed three-dimensionally, this function looks like
A. A plane
B. A line
C. A twisted plane
D. A bell curve

Quiz 14. Consider a two-way ANOVA with interaction to predict price of a home as a function of region (A or B) and whether the home has a cellar (Yes or No). The model is

$\text{Price} = \beta_0 + \beta_1 \text{Region.A} + \beta_2 \text{Cellar.Yes} + \beta_3 \text{Region.A} \times \text{Cellar.Yes} + \varepsilon$

Region.A = 1 if the home is in Region A, and Region.A = 0 if the home is in Region B. Cellar.Yes = 1 if the home has a cellar, and Cellar.Yes = 0 if the home does not have a cellar. The mean price of a home in Region B that has a cellar is
[answer choices lost in transcription]

Quiz 15. Two models were analyzed:

    fit1 = lm(CHARITY ~ DEPS)
    fit2 = lm(CHARITY ~ as.factor(DEPS))

Here, CHARITY is a measure of charitable contributions and DEPS is the number of dependents claimed on a tax form, taking the values 0, 1, 2, 3, 4, 5 and 6 in the data set. Select one of the following choices.
A. The correct functional specification assumption is true in fit1
B. The correct functional specification assumption is true in fit2
C. The normality assumption is true in fit1
D. The normality assumption is true in fit2

Quiz 16. Models to predict Hans' graduate GPA using his GRE quant and verbal scores are

    fit0 = lm(gpa ~ 1)
    fit1 = lm(gpa ~ GRE.quant)
    fit2 = lm(gpa ~ GRE.quant + GRE.verbal)

Which model gives a prediction having least bias?
A. fit0
B. fit1
C. fit2

Quiz 17. Suppose the variance function is $\mathrm{Var}(Y \mid X = x) = [\ldots]\,x$ (any symbols preceding x are lost in this transcription). Using this function, the maximum likelihood estimates are weighted least squares estimates, with weights equal to
A. $\sigma$
B. $\sigma^2$
C. $1/x$
D. $1/x^2$

Quiz 18. What is the benefit of using heteroscedasticity-consistent standard errors, as opposed to the ordinary standard errors, when there is heteroscedasticity?
A. The percentage of 95% confidence intervals for $\beta_1$ that contain $\beta_1$ becomes closer to 95%
B. The p-value for testing that $\beta_1 = 0$ becomes exactly correct
C. The linearity assumption becomes approximately valid
D. The $R^2$ statistic becomes higher

Quiz 19. If the probability is 0.75, then the odds ratio is
A. 1/4
B. 4.0
C. 1/3
D. 3.0

Quiz 20. In the ordinal logistic regression model, Pr(Y = 1 | X = x) = [formula garbled in transcription; surviving fragment: "1 2 1x)"]. Thus, if $\beta_1$ is negative, then the probability that Y = 1
A. increases as X increases
B. decreases as X increases
C. does not change as X increases

Quiz 21. The negative binomial regression model (NBRM) requires you to estimate one more parameter than does the Poisson regression model (PRM). If you estimate a NBRM when the data generating process is correctly modelled as a PRM, you can expect that
A. Your estimated parameters will have higher variances
B. Your estimated parameters will be biased
C. Your true parameters will have higher variances
D. Your true parameters will be biased

Quiz 22. Give an example of a censored data value.
A. A data value that is known only to be more than 4.5
B. A data value that is an outlier
C. A data value that has been excluded from the study
D. A data value from a lognormal distribution
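For Quiz 19, the conversion between a probability and its odds is a one-line formula, odds = p / (1 - p). The sketch below shows it in both directions; the function names are illustrative. Strictly speaking this quantity is the odds, and an odds ratio compares two such odds.

```python
def odds(p):
    # Odds in favor of an event with probability p
    return p / (1 - p)

def prob_from_odds(o):
    # Inverse conversion: probability from odds
    return o / (1 + o)

print(odds(0.75), prob_from_odds(3.0))
```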

Quiz 23. The Cox proportional hazards regression model is not fully parametric. What is nonparametric about the model?
A. The mean is a nonparametric function of the X variables
B. The baseline distribution is not a parametric distribution
C. The variance is a nonparametric function of the X variables
D. The hazard function is a nonparametric function of the X variables

Quiz 24. When is an outlier in X space a serious problem?
A. When you use OLS estimates
B. When you use ML estimates
C. When there is also a large absolute standardized residual
D. When there is also a small Cook's D statistic

Quiz 25. Consider the classical regression model $Y \mid X = x \sim N(\beta_0 + \beta_1 x, \sigma^2)$. Assuming this model is true, the function that relates the 0.975 quantile of the distribution of Y to X = x is
A. $y_{0.975} = \beta_0 + \beta_1 x$
B. [...] x [...] 1.96 [partly lost in transcription]
C. [...] x [...] [partly lost in transcription]
D. [...] x [...] [partly lost in transcription]

Quiz 26. Neural networks have what advantage over the classical linear regression model?
A. They are more easily interpreted
B. They allow nonlinear conditional mean functions with interactions
C. They allow non-normal distributions
D. They allow heteroscedasticity

Quiz 27. Which is called p-hacking?
A. Trying different Winsorizing thresholds (e.g., 95%, 98%, 99%, 99.5%, etc.) until your desired result becomes statistically significant.
B. Trying different models (e.g., lm(Y ~ X1 + X2), lm(Y ~ X1 + X3), lm(Y ~ X1 + X2 + X3), lm(Y ~ X1 + X2 + X3 + X4)) until you get a p-value for X1 that is less than [value lost in transcription].
C. Trying models where you violate the inclusion principle in the hope of proving your desired conclusion.
D. All of the above.
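For Quiz 25, a quantile of a normal conditional distribution is its mean plus a standard normal quantile times the standard deviation. The sketch below evaluates the 0.975 quantile numerically; the parameter values passed in are borrowed from HW1 purely for illustration, and the function name is an assumption.

```python
from statistics import NormalDist

# Standard normal 0.975 quantile (approximately 1.96)
z = NormalDist().inv_cdf(0.975)

def conditional_quantile(x, beta0, beta1, sigma, p=0.975):
    # p-th quantile of N(beta0 + beta1 * x, sigma^2)
    return NormalDist(beta0 + beta1 * x, sigma).inv_cdf(p)

# Illustrative values: beta0 = 10, beta1 = 1, sigma = 3, X = 5
q = conditional_quantile(5.0, 10.0, 1.0, 3.0)
print(round(z, 4), round(q, 3))
```

The result equals the conditional mean plus z times the standard deviation, so the quantile is a linear function of x with the same slope as the mean.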


Announcements. Unit 6: Simple Linear Regression Lecture : Introduction to SLR. Poverty vs. HS graduate rate. Modeling numerical variables Announcements Announcements Unit : Simple Linear Regression Lecture : Introduction to SLR Statistics 1 Mine Çetinkaya-Rundel April 2, 2013 Statistics 1 (Mine Çetinkaya-Rundel) U - L1: Introduction to SLR

More information

Final Exam - Solutions

Final Exam - Solutions Ecn 102 - Analysis of Economic Data University of California - Davis March 19, 2010 Instructor: John Parman Final Exam - Solutions You have until 5:30pm to complete this exam. Please remember to put your

More information

UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics January, 2018

UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics January, 2018 UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics January, 2018 Work all problems. 60 points needed to pass at the Masters level, 75 to pass at the PhD

More information

Correlation and Simple Linear Regression

Correlation and Simple Linear Regression Correlation and Simple Linear Regression Sasivimol Rattanasiri, Ph.D Section for Clinical Epidemiology and Biostatistics Ramathibodi Hospital, Mahidol University E-mail: sasivimol.rat@mahidol.ac.th 1 Outline

More information

Introduction to Regression

Introduction to Regression Introduction to Regression ιατµηµατικό Πρόγραµµα Μεταπτυχιακών Σπουδών Τεχνο-Οικονοµικά Συστήµατα ηµήτρης Φουσκάκης Introduction Basic idea: Use data to identify relationships among variables and use these

More information

Sociology 593 Exam 1 Answer Key February 17, 1995

Sociology 593 Exam 1 Answer Key February 17, 1995 Sociology 593 Exam 1 Answer Key February 17, 1995 I. True-False. (5 points) Indicate whether the following statements are true or false. If false, briefly explain why. 1. A researcher regressed Y on. When

More information

Formal Statement of Simple Linear Regression Model

Formal Statement of Simple Linear Regression Model Formal Statement of Simple Linear Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters X i is a known constant, the value of the predictor

More information

Week 7.1--IES 612-STA STA doc

Week 7.1--IES 612-STA STA doc Week 7.1--IES 612-STA 4-573-STA 4-576.doc IES 612/STA 4-576 Winter 2009 ANOVA MODELS model adequacy aka RESIDUAL ANALYSIS Numeric data samples from t populations obtained Assume Y ij ~ independent N(μ

More information

General Linear Model (Chapter 4)

General Linear Model (Chapter 4) General Linear Model (Chapter 4) Outcome variable is considered continuous Simple linear regression Scatterplots OLS is BLUE under basic assumptions MSE estimates residual variance testing regression coefficients

More information

STATISTICS 110/201 PRACTICE FINAL EXAM

STATISTICS 110/201 PRACTICE FINAL EXAM STATISTICS 110/201 PRACTICE FINAL EXAM Questions 1 to 5: There is a downloadable Stata package that produces sequential sums of squares for regression. In other words, the SS is built up as each variable

More information

Introduction to Linear regression analysis. Part 2. Model comparisons

Introduction to Linear regression analysis. Part 2. Model comparisons Introduction to Linear regression analysis Part Model comparisons 1 ANOVA for regression Total variation in Y SS Total = Variation explained by regression with X SS Regression + Residual variation SS Residual

More information

Regression and the 2-Sample t

Regression and the 2-Sample t Regression and the 2-Sample t James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) Regression and the 2-Sample t 1 / 44 Regression

More information

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data Ronald Heck Class Notes: Week 8 1 Class Notes: Week 8 Probit versus Logit Link Functions and Count Data This week we ll take up a couple of issues. The first is working with a probit link function. While

More information

Review: what is a linear model. Y = β 0 + β 1 X 1 + β 2 X 2 + A model of the following form:

Review: what is a linear model. Y = β 0 + β 1 X 1 + β 2 X 2 + A model of the following form: Outline for today What is a generalized linear model Linear predictors and link functions Example: fit a constant (the proportion) Analysis of deviance table Example: fit dose-response data using logistic

More information

WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A

WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, 2016-17 Academic Year Exam Version: A INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This

More information

Diagnostics and Transformations Part 2

Diagnostics and Transformations Part 2 Diagnostics and Transformations Part 2 Bivariate Linear Regression James H. Steiger Department of Psychology and Human Development Vanderbilt University Multilevel Regression Modeling, 2009 Diagnostics

More information

Linear Regression. Simple linear regression model determines the relationship between one dependent variable (y) and one independent variable (x).

Linear Regression. Simple linear regression model determines the relationship between one dependent variable (y) and one independent variable (x). Linear Regression Simple linear regression model determines the relationship between one dependent variable (y) and one independent variable (x). A dependent variable is a random variable whose variation

More information

Parametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami

Parametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami Parametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami Parametric Assumptions The observations must be independent. Dependent variable should be continuous

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression ST 430/514 Recall: a regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates).

More information

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X.

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X. Estimating σ 2 We can do simple prediction of Y and estimation of the mean of Y at any value of X. To perform inferences about our regression line, we must estimate σ 2, the variance of the error term.

More information

STAT 512 MidTerm I (2/21/2013) Spring 2013 INSTRUCTIONS

STAT 512 MidTerm I (2/21/2013) Spring 2013 INSTRUCTIONS STAT 512 MidTerm I (2/21/2013) Spring 2013 Name: Key INSTRUCTIONS 1. This exam is open book/open notes. All papers (but no electronic devices except for calculators) are allowed. 2. There are 5 pages in

More information

Multiple Regression. Peerapat Wongchaiwat, Ph.D.

Multiple Regression. Peerapat Wongchaiwat, Ph.D. Peerapat Wongchaiwat, Ph.D. wongchaiwat@hotmail.com The Multiple Regression Model Examine the linear relationship between 1 dependent (Y) & 2 or more independent variables (X i ) Multiple Regression Model

More information

WISE International Masters

WISE International Masters WISE International Masters ECONOMETRICS Instructor: Brett Graham INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This examination paper contains 32 questions. You are

More information

Tests of Linear Restrictions

Tests of Linear Restrictions Tests of Linear Restrictions 1. Linear Restricted in Regression Models In this tutorial, we consider tests on general linear restrictions on regression coefficients. In other tutorials, we examine some

More information

Objectives Simple linear regression. Statistical model for linear regression. Estimating the regression parameters

Objectives Simple linear regression. Statistical model for linear regression. Estimating the regression parameters Objectives 10.1 Simple linear regression Statistical model for linear regression Estimating the regression parameters Confidence interval for regression parameters Significance test for the slope Confidence

More information

Mathematical Notation Math Introduction to Applied Statistics

Mathematical Notation Math Introduction to Applied Statistics Mathematical Notation Math 113 - Introduction to Applied Statistics Name : Use Word or WordPerfect to recreate the following documents. Each article is worth 10 points and should be emailed to the instructor

More information

Inference for the Regression Coefficient

Inference for the Regression Coefficient Inference for the Regression Coefficient Recall, b 0 and b 1 are the estimates of the slope β 1 and intercept β 0 of population regression line. We can shows that b 0 and b 1 are the unbiased estimates

More information

Stat 401B Final Exam Fall 2016

Stat 401B Final Exam Fall 2016 Stat 40B Final Exam Fall 0 I have neither given nor received unauthorized assistance on this exam. Name Signed Date Name Printed ATTENTION! Incorrect numerical answers unaccompanied by supporting reasoning

More information

Inference and Regression

Inference and Regression Name Inference and Regression Final Examination, 2015 Department of IOMS This course and this examination are governed by the Stern Honor Code. Instructions Please write your name at the top of this page.

More information

y ˆ i = ˆ " T u i ( i th fitted value or i th fit)

y ˆ i = ˆ  T u i ( i th fitted value or i th fit) 1 2 INFERENCE FOR MULTIPLE LINEAR REGRESSION Recall Terminology: p predictors x 1, x 2,, x p Some might be indicator variables for categorical variables) k-1 non-constant terms u 1, u 2,, u k-1 Each u

More information

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages: Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the

More information

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F). STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis 1. Indicate whether each of the following is true (T) or false (F). (a) T In 2 2 tables, statistical independence is equivalent to a population

More information

Review of Multiple Regression

Review of Multiple Regression Ronald H. Heck 1 Let s begin with a little review of multiple regression this week. Linear models [e.g., correlation, t-tests, analysis of variance (ANOVA), multiple regression, path analysis, multivariate

More information

Unless provided with information to the contrary, assume for each question below that the Classical Linear Model assumptions hold.

Unless provided with information to the contrary, assume for each question below that the Classical Linear Model assumptions hold. Economics 345: Applied Econometrics Section A01 University of Victoria Midterm Examination #2 Version 1 SOLUTIONS Spring 2015 Instructor: Martin Farnham Unless provided with information to the contrary,

More information

K. Model Diagnostics. residuals ˆɛ ij = Y ij ˆµ i N = Y ij Ȳ i semi-studentized residuals ω ij = ˆɛ ij. studentized deleted residuals ɛ ij =

K. Model Diagnostics. residuals ˆɛ ij = Y ij ˆµ i N = Y ij Ȳ i semi-studentized residuals ω ij = ˆɛ ij. studentized deleted residuals ɛ ij = K. Model Diagnostics We ve already seen how to check model assumptions prior to fitting a one-way ANOVA. Diagnostics carried out after model fitting by using residuals are more informative for assessing

More information

Regression Analysis. BUS 735: Business Decision Making and Research

Regression Analysis. BUS 735: Business Decision Making and Research Regression Analysis BUS 735: Business Decision Making and Research 1 Goals and Agenda Goals of this section Specific goals Learn how to detect relationships between ordinal and categorical variables. Learn

More information

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box.

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. FINAL EXAM ** Two different ways to submit your answer sheet (i) Use MS-Word and place it in a drop-box. (ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. Deadline: December

More information

Simple linear regression

Simple linear regression Simple linear regression Business Statistics 41000 Fall 2015 1 Topics 1. conditional distributions, squared error, means and variances 2. linear prediction 3. signal + noise and R 2 goodness of fit 4.

More information

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F). STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis 1. Indicate whether each of the following is true (T) or false (F). (a) (b) (c) (d) (e) In 2 2 tables, statistical independence is equivalent

More information

Inference for Regression

Inference for Regression Inference for Regression Section 9.4 Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c Department of Mathematics University of Houston Lecture 13b - 3339 Cathy Poliak, Ph.D. cathy@math.uh.edu

More information

Chi-square tests. Unit 6: Simple Linear Regression Lecture 1: Introduction to SLR. Statistics 101. Poverty vs. HS graduate rate

Chi-square tests. Unit 6: Simple Linear Regression Lecture 1: Introduction to SLR. Statistics 101. Poverty vs. HS graduate rate Review and Comments Chi-square tests Unit : Simple Linear Regression Lecture 1: Introduction to SLR Statistics 1 Monika Jingchen Hu June, 20 Chi-square test of GOF k χ 2 (O E) 2 = E i=1 where k = total

More information

Problem #1 #2 #3 #4 #5 #6 Total Points /6 /8 /14 /10 /8 /10 /56

Problem #1 #2 #3 #4 #5 #6 Total Points /6 /8 /14 /10 /8 /10 /56 STAT 391 - Spring Quarter 2017 - Midterm 1 - April 27, 2017 Name: Student ID Number: Problem #1 #2 #3 #4 #5 #6 Total Points /6 /8 /14 /10 /8 /10 /56 Directions. Read directions carefully and show all your

More information

ORF 245 Fundamentals of Engineering Statistics. Final Exam

ORF 245 Fundamentals of Engineering Statistics. Final Exam Princeton University Department of Operations Research and Financial Engineering ORF 245 Fundamentals of Engineering Statistics Final Exam May 22, 2008 7:30pm-10:30pm PLEASE DO NOT TURN THIS PAGE AND START

More information

WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A

WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, 2016-17 Academic Year Exam Version: A INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This

More information

Psychology 282 Lecture #4 Outline Inferences in SLR

Psychology 282 Lecture #4 Outline Inferences in SLR Psychology 282 Lecture #4 Outline Inferences in SLR Assumptions To this point we have not had to make any distributional assumptions. Principle of least squares requires no assumptions. Can use correlations

More information

Econometrics Midterm Examination Answers

Econometrics Midterm Examination Answers Econometrics Midterm Examination Answers March 4, 204. Question (35 points) Answer the following short questions. (i) De ne what is an unbiased estimator. Show that X is an unbiased estimator for E(X i

More information