Closed book, closed notes, and no electronic devices. 10 points per correct answer, 20 points for signing your name.


Quiz 1. Name:
10 points per correct answer, 20 points for signing your name.

1. Pick the correct regression model.
   A. Y = b0 + b1X
   B. Y = b0 + b1X + e
   C. Y | X = x ~ p(y|x)
   D. X | Y = y ~ p(y|x)
2. How is prediction different from forecasting?
   A. Prediction requires the normality assumption, forecasting does not
   B. Prediction can refer to the past, forecasting always refers to the future
   C. Prediction uses probabilistic models, forecasting uses deterministic models
   D. There is no difference between the two
3. The distribution p(y|x) refers to
   A. A population of Y values
   B. A sample of Y values
   C. Potentially observable Y data
   D. Actually observed Y data
4. The experiment where a student was allowed to pick where to stand before attempting to throw the ball into the trash can was a ______ experiment.
   A. Fixed-X
   B. Random-X
   C. Randomized
   D. Pre-test / post-test
5. The data set containing students' GPAs and GMAT scores was separated into two groups. What were those groups?
   A. Males and females
   B. Masters and Ph.D. students
   C. Business and non-business majors
   D. High-GPA and low-GPA students
6. What is a statistical model?
   A. A recipe for producing random data
   B. Something you calculate from data
   C. A deterministic equation
7. When is the linearity assumption in regression valid?
   A. When there are three or more levels of the X variable
   B. When the test for linearity passes (p > .05)
   C. When Y increases as X increases
   D. None of the above
8. What does LOWESS estimate?
   A. the conditional mean function
   B. the conditional variance function
   C. the conditional distribution function
   D. the autocorrelation function
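Question 6's "recipe for producing random data" can be made concrete with a short sketch: given x, draw Y from the conditional distribution p(y|x) = N(β0 + β1x, σ²). The parameter values below are hypothetical, chosen only for illustration.

```python
import random

# A minimal sketch of "a statistical model is a recipe for producing random
# data": for each x, draw Y from p(y|x) = N(beta0 + beta1*x, sigma^2).
# beta0, beta1, and sigma are hypothetical values, not from any quiz data set.
def simulate_regression_data(xs, beta0=1.0, beta1=2.0, sigma=0.5, seed=0):
    rng = random.Random(seed)
    return [(x, beta0 + beta1 * x + rng.gauss(0.0, sigma)) for x in xs]

data = simulate_regression_data([0, 1, 2, 3, 4])
```

Running the recipe twice with different seeds gives different (x, y) data sets from the same model, which is exactly the sense in which the model is a recipe rather than a quantity calculated from data.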

Quiz 2. Name:

9. Data (xi, yi) are produced by the classic model Y = β0 + β1X + ε, where ε ~ N(0, σ²). Which line minimizes the sum of squared errors (SSE)?
   A. The line β0 + β1x, where β0 and β1 are the true values of the parameters
   B. The line β̂0 + β̂1x, where β̂0 and β̂1 are the maximum likelihood estimates assuming the classic model
   C. The line 0 + 1x
   D. The line 1 + 0x
10. When should you use maximum likelihood estimates based on the Laplace distribution?
   A. When your distributions p(y|x) are skewed
   B. When your distributions p(y|x) are outlier-prone
   C. When your distributions p(y|x) are discrete
   D. When your distributions p(y|x) are normal
11. What distribution p(y|x) is assumed in the Gauss-Markov Theorem?
   A. Gaussian
   B. Markovian
   C. Laplacian
   D. No particular distribution is assumed
12. What does the Gauss-Markov Theorem tell you?
   A. OLS is better than ML
   B. ML is better than OLS
   C. OLS is best among linear, unbiased estimators
   D. ML is best among linear, unbiased estimators

Quiz 3. Name:

1. In the discussion of β̂1 as an unbiased estimator of β1, what is true?
   A. β̂1 is random, and β1 is random
   B. β̂1 is fixed, and β1 is random
   C. β̂1 is random, and β1 is fixed
   D. β̂1 is fixed, and β1 is fixed
2. If β̂1 is an unbiased estimator of β1, then
   A. β̂1 is equal to β1
   B. β̂1 is sometimes equal to β1
   C. β̂1 is close to β1
   D. None of the above
3. Recall that the errors are εi = Yi − (β0 + β1xi), and the residuals are ei = Yi − (β̂0 + β̂1xi). Pick the true statement about these terms, as they apply to the real world (not simulation).
   A. The {εi} are observable, and the {ei} are observable
   B. The {εi} are not observable, and the {ei} are observable
   C. The {εi} are observable, and the {ei} are not observable
   D. The {εi} are not observable, and the {ei} are not observable
4. Your estimated β̂1 is equal to 3.4, with a standard error of 1.2. What is 1.2?
   A. The standard deviation of 3.4
   B. The estimated standard deviation of 3.4
   C. The standard deviation of potentially observable values of β̂1
   D. The estimated standard deviation of potentially observable values of β̂1
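The unbiasedness idea in questions 1 and 2 can be seen by simulation: β1 is a fixed constant, while β̂1 varies from sample to sample, and its average over many simulated data sets lands close to the true β1. All numeric values below are hypothetical.

```python
import random

# Sketch of "beta1-hat is an unbiased estimator of beta1": simulate many data
# sets from a known model, compute the OLS slope each time, and check that the
# average of the slopes is near the true slope. Values are hypothetical.
def ols_slope(pairs):
    n = len(pairs)
    xbar = sum(x for x, _ in pairs) / n
    ybar = sum(y for _, y in pairs) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in pairs)
    sxx = sum((x - xbar) ** 2 for x, _ in pairs)
    return sxy / sxx

rng = random.Random(1)
beta0, beta1, sigma = 1.0, 2.0, 1.0
xs = list(range(10))                       # fixed-X design, reused every sample
slopes = []
for _ in range(2000):                      # 2000 simulated data sets
    pairs = [(x, beta0 + beta1 * x + rng.gauss(0, sigma)) for x in xs]
    slopes.append(ols_slope(pairs))
mean_slope = sum(slopes) / len(slopes)     # close to the true beta1 = 2
```

Note that individual slopes in `slopes` differ from 2 (choice A in question 2 is false); only their long-run average matches it.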

Quiz 4. Name:

1. The exact 95% confidence interval for β1 in the Toluca case (lotsize, workhours) used the critical value from the T distribution with 23 degrees of freedom. That critical value was
   A.
   B. 2.0
   C.
2. Consider the Toluca case, where X = lotsize and Y = workhours. The expression E(Y | X = 20) refers to
   A. the actual workhours that will be observed on a single job where lotsize = 20
   B. the estimate of the workhours for a single job where lotsize = 20
   C. the estimate of the mean of all potentially observable workhours for jobs where lotsize = 20
   D. the true mean of all potentially observable workhours where lotsize = 20
3. Under the classic model, what happens to the prediction interval for Y | X = x when n increases?
   A. It gets wider
   B. It approaches β0 + β1x ± 0
   C. It approaches β0 + β1x ± 1.96σ
4. You get a p-value that is p = 0.57 for testing whether a particular β = 0. Then
   A. the difference between β̂ and 0 is explainable by chance alone
   B. the difference between β and 0 is explainable by chance alone
   C. the true β is close to zero
   D. the true β is equal to zero
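Question 3's limit can be checked numerically. As n grows, the t critical value (about 2.069 for 23 degrees of freedom) falls toward the normal quantile z(0.975) ≈ 1.96, and the extra 1/n terms in the prediction interval vanish, so the interval approaches β0 + β1x ± 1.96σ rather than ± 0: the noise in a single new Y never averages away. A minimal stdlib sketch:

```python
from statistics import NormalDist

# The 95% prediction interval half-width is roughly t * s * sqrt(1 + 1/n + ...).
# In the large-n limit s -> sigma, the 1/n terms -> 0, and t -> z(0.975),
# so the half-width -> 1.96 * sigma (never 0).
z = NormalDist().inv_cdf(0.975)                  # about 1.96

sigma = 1.0
def approx_half_width(n):
    # crude illustration ignoring the (x - xbar)^2 term, for a new Y at xbar
    return z * sigma * (1 + 1 / n) ** 0.5

widths = [approx_half_width(n) for n in (10, 100, 10000)]   # decreasing toward 1.96
```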

Quiz 5. Name:

5. According to the author, what should be your first step in the analysis of any regression data?
   A. Plot the (xi, yi) data
   B. Determine whether the difference between β̂ and 0 is explainable by chance alone
   C. Find the R² statistic
   D. Calculate summary statistics (means, standard deviations, etc.) for the X data and the Y data
6. When checking assumptions using a hypothesis testing method, the null hypothesis always states
   A. The assumption is precisely true
   B. The assumption is close enough to being true
   C. The assumption is false
   D. The assumption may be false, or it may be true
7. Which method is the preferred method to check assumptions?
   B. Use a test for the assumption, and conclude that the assumption is true if p > .05
   C. Use a test for the assumption, and conclude that the assumption is true if p < .05
   D. Use an appropriate graph to check the assumption
   E. Use a literature search to see what other researchers say about the assumption
8. What does the linearity assumption state?
   A. The data Y fall on a straight line, β0 + β1x
   B. The predicted values fall on a straight line, β0 + β1x
   C. The means of the potentially observable Y values fall on a straight line, β0 + β1x
   D. The means of the actually observed Y values fall on a straight line, β0 + β1x

Quiz 6. Name:

1. If there is homoscedasticity in the Toluca case (X = lotsize, Y = workhours), what is true?
   A. Var(Y | X = 20) = Var(Y | X = 120)
   B. Var(Y | X = 20) is close to Var(Y | X = 120)
   C. Var(Y | X = 20) is far from Var(Y | X = 120)
   D. Var(Y | X = 20) may be far from or close to Var(Y | X = 120); you can't tell from the given information which is true
2. To test for heteroscedasticity, you test the null hypothesis that the slope of a certain regression is zero. Which regression do you use for this test?
   A. The regression of ei on xi
   B. The regression of yi on xi
   C. The regression of ei on yi
   D. The regression of ei on ŷi
3. To test for autocorrelation, you test the null hypothesis that the correlation between two variables is zero. Which variables?
   A. Yt and Yt−1
   B. Xt and Xt−1
   C. εt and εt−1
   D. βt and βt−1
4. To test for non-normality, you apply the Shapiro-Wilk test (shapiro.test) to which data?
   A. The response variable, Y
   B. The predictor variable, X
   C. The predicted values, ŷ
   D. The residuals, e

Quiz 7. Name:

1. What problems can be lessened by transforming X only (and not Y)? Select all that apply. Ten points per correct selection/non-selection.
   A. Non-randomness
   B. Non-normality of the distributions p(y|x)
   C. Non-constant variance (heteroscedasticity)
   D. Non-linearity (curvature)
   E. Poor predictive ability (low R²)
   F. Outliers in the X data
2. Suppose U = ln(Y). Here is an estimated regression function: Û = ___ + ___ X. Back-transform this function to give the relationship between Ŷ and X.

Quiz 8. Name:

3. What was the problem with comparing likelihoods to select a better transformation of Y?
   A. Since the distribution of the back-transformed Y is non-normal, there is no likelihood.
   B. You can't compare the likelihood for Y with the likelihood for a transformed Y. You have to compare likelihoods using the same Y variable.
   C. Since one of the models will be non-linear, likelihood is meaningless.
   D. The likelihoods are grossly affected by outliers, so you should not use the likelihood for an untransformed model when there are outliers.
4. Suppose Y has measurement units "thousands of dollars per job." What are the measurement units of 1/Y?
5. Given X = x, the Box-Cox method assumes what about the transformation Y^λ?
   A. Y^λ is normally distributed
   B. Y^λ is log-normally distributed
   C. Y^(1/λ) is normally distributed
   D. Y^(1/λ) is log-normally distributed
6. Elasticity, as estimated by the regression of ln(Y) on ln(X), refers to
   A. change in Y per unit change in X
   B. change in the median of Y per unit change in X
   C. percentage change in Y per 1% change in X
   D. percentage change in the median of Y per 1% change in X
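The back-transform asked for in Quiz 7, question 2, and the elasticity in question 6 above can both be sketched numerically. If the fitted model on the log scale is ln(Y)-hat = b0 + b1x, exponentiating gives exp(b0 + b1x), which estimates the median of Y given x (not the mean, since the log-normal is skewed). The coefficients b0, b1, a, c below are hypothetical fitted values.

```python
import math

# Back-transform of a log-scale fit: ln(Y)-hat = b0 + b1*x implies the
# estimated *median* of Y given x is exp(b0 + b1*x).
b0, b1 = 0.5, 0.2          # hypothetical fitted values
def median_hat(x):
    return math.exp(b0 + b1 * x)

# Log-log (elasticity) model: ln(Y) = a + c*ln(X). The coefficient c is
# approximately the % change in the median of Y per 1% change in X.
a, c = 1.0, 0.8            # hypothetical fitted values
x0 = 5.0
ratio = math.exp(a + c * math.log(1.01 * x0)) / math.exp(a + c * math.log(x0))
# ratio equals 1.01**c, i.e. about a 0.8% increase for a 1% increase in X
```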

Quiz 9. Name:

7. Suppose E(Y | X = x) = 3 + 4x. Suppose E(X) = 10. Then E(Y) = ______
8. See problem 7. Given X = x, the best predictor of Y is ______
9. The Law of Total Variance tells us that Var(Y) = Var{E(Y|X)} + E{Var(Y|X)}. Suppose Var{E(Y|X)} = 30 and E{Var(Y|X)} = 10. Then the true R² is ______
10. The relationship between X = ice cream sales and Y = drownings was shown to be
   A. explained by chance alone
   B. explainable by chance alone
   C. predictive
   D. causal
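The arithmetic behind problems 7 and 9 is worth spelling out, using the numbers given there:

```python
# Problem 7: iterated expectation gives E(Y) = E{E(Y|X)} = 3 + 4*E(X).
E_X = 10
E_Y = 3 + 4 * E_X                # 43

# Problem 9: Law of Total Variance, Var(Y) = Var{E(Y|X)} + E{Var(Y|X)}.
# The true R^2 is the explained share, Var{E(Y|X)} / Var(Y).
var_explained = 30
var_unexplained = 10
true_r2 = var_explained / (var_explained + var_unexplained)   # 0.75
```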

Quiz 10. Name:

1. (20) Consider the model GPA | {GMAT = x1, PHD = x2} ~ N(β0 + β1x1 + β2x2, σ²) of the reading. If this model is true, then the coefficient β2 is equal to
   A. The difference between GPAs for Ph.D. and Master's students.
   B. The difference between mean GPAs for Ph.D. and Master's students.
   C. The difference between GPAs for Ph.D. and Master's students who have the same GMAT.
   D. The difference between mean GPAs for Ph.D. and Master's students who have the same GMAT.
   E. The difference between GPAs for Ph.D. and Master's students, all else held fixed.
   F. The difference between mean GPAs for Ph.D. and Master's students, all else held fixed.
2. (20) Assuming the same model as in 1 above, σ is the
   A. standard deviation of the GPAs.
   B. standard deviation of GMATs.
   C. standard deviation of PHDs.
   D. standard deviation of the GPAs for potentially observable PHD students who have GMAT score = ___.
3. (40) Let A be a 3x3 matrix. Then A⁻¹A = (give the explicit matrix.)

Quiz 11. Name:

1. With two X variables, X1 and X2, the least squares regression fit (in R, lm(y ~ X1+X2)) gives you
   A. the line that minimizes the sum of squared deviations from the X's to the line.
   B. the line that minimizes the sum of squared deviations from the Y's to the line.
   C. the plane that minimizes the sum of squared deviations from the X's to the plane.
   D. the plane that minimizes the sum of squared deviations from the Y's to the plane.
2. Which one is the matrix form of the regression model?
   A. Y = Xβ
   B. Y = Xβ + ε
   C. β̂ = (XᵀX)(XᵀY)
   D. β̂ = (XᵀX)⁻¹(XᵀY)
3. The classical regression model assumes what about the conditional covariance matrix of the potentially observable Y data? Cov(Y | X = x) =
   A. I
   B. (xᵀx)⁻¹
   C. σ²I
   D. σ²(xᵀx)⁻¹
4. In the simulation of the reading there were two models to predict a common Y, one using an actual X (called XA), and the other using a guess of X (called XG). The true models have coefficients βA and βG, respectively; their estimates were β̂A and β̂G, respectively. The conclusions were:
   A. β̂G was an unbiased estimate of βG, and β̂G was an unbiased estimate of βA
   B. β̂G was an unbiased estimate of βG, but β̂G was a biased estimate of βA
   C. β̂G was a biased estimate of βG, but β̂G was an unbiased estimate of βA
   D. β̂G was a biased estimate of βG, and β̂G was a biased estimate of βA
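The estimator in question 2, choice D, β̂ = (XᵀX)⁻¹(XᵀY), can be computed directly for a tiny hypothetical data set. The sketch below builds the design matrix with an intercept column and inverts the 2x2 matrix XᵀX explicitly; on noise-free data y = 1 + 2x it recovers the coefficients exactly.

```python
# Sketch of beta-hat = (X^T X)^{-1} (X^T Y) for a tiny hypothetical data set.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]           # exactly y = 1 + 2x, so residuals are 0

X = [[1.0, x] for x in xs]          # design matrix: intercept column + x column
XtX = [[sum(a * b for a, b in zip(col_i, col_j)) for col_j in zip(*X)]
       for col_i in zip(*X)]        # 2x2 matrix X^T X
XtY = [sum(row[k] * y for row, y in zip(X, ys)) for k in range(2)]

det = XtX[0][0] * XtX[1][1] - XtX[0][1] * XtX[1][0]
inv = [[ XtX[1][1] / det, -XtX[0][1] / det],     # explicit 2x2 inverse
       [-XtX[1][0] / det,  XtX[0][0] / det]]
beta_hat = [sum(inv[i][k] * XtY[k] for k in range(2)) for i in range(2)]
# beta_hat is [1.0, 2.0] for this noise-free example
```

Note that choice C, which omits the inverse, is dimensionally a different object entirely; only D solves the normal equations.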

Quiz 12. Name:

1. What is the point of using the adjusted R-square statistic?
   A. It is equal to the population R² statistic
   B. It is less biased than the usual R² statistic
   C. It measures causal effects of the X variables
   D. It is less sensitive to violations of the assumptions
2. A multiple regression model is Y = β0 + β1x1 + β2x2 + ε. The null hypothesis that is tested by the F test is
   A. H0: β0 = β1 = β2 = 0
   B. H0: β1 = β2 = 0
   C. H0: β0 = β1 = β2
   D. H0: β1 = β2
3. What happens to the distribution of the F statistic when the null hypothesis is false?
   A. It is shifted to the left of the usual F distribution
   B. It is shifted to the right of the usual F distribution
   C. It becomes a bimodal distribution
   D. It becomes a degenerate distribution
4. Your regression model is Y = β0 + β1x1 + β2x2 + ε. Multicollinearity in your regression model is a concern when R² is close to 1.0 for which of the following? (Select only one.)
   A. lm(y ~ X1)
   B. lm(y ~ X2)
   C. lm(y ~ X1 + X2)
   D. lm(x1 ~ X2)

Quiz 13. Name:

1. A fitted quadratic model is ŷ = 4 − 3x + 2x². The value of x that gives the minimum ŷ is
   A.
   B. 2.0
   C. 3.0
   D.
2. When graphed in three dimensions, the function f(x1, x2) = b0 + b1x1 + b2x2 + b3x1x2 looks like
   A. A plane
   B. A line
   C. A twisted plane
   D. A bell curve
3. In the model of 2 above, the coefficient b1 is
   A. the effect of X1 when X2 is held fixed.
   B. the effect of X1 when all else is held fixed.
   C. the effect of X1 regardless of X2.
   D. the effect of X1 when X2 is equal to 0.
4. In the model of 2 above, the variable inclusion principle states that you should
   A. include X1 in the model whenever you include X1X2 in the model.
   B. include X1 in the model only when X1 is statistically significant.
   C. include X1 in the model only when X1X2 is statistically significant.
   D. include X1 in the model only when the intercept is statistically significant.
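For question 1, the minimizer of a fitted quadratic ŷ = b0 + b1x + b2x² with b2 > 0 comes from setting the derivative b1 + 2·b2·x to zero, giving x* = −b1/(2·b2). A worked check for the quadratic given in the question:

```python
# Minimum of the fitted quadratic y-hat = 4 - 3x + 2x^2:
# derivative -3 + 4x = 0 gives x* = 3/4.
b0, b1, b2 = 4.0, -3.0, 2.0
x_star = -b1 / (2 * b2)                       # 0.75
y_min = b0 + b1 * x_star + b2 * x_star ** 2   # 2.875
```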

5. The model to predict Grade Point Average (GPA) from the PhD indicator variable (1 = Ph.D., 0 = Masters) was GPA = β0 + β1·PhD + ε. In this model the linearity assumption was
   A. Certainly true
   B. Certainly false
   C. True only if β1 was insignificant
   D. True only if β1 was significant
6. See 5 above. What happens when you code the PhD indicator variable as (0 = Ph.D., 1 = Masters) instead of (1 = Ph.D., 0 = Masters)?
   A. Nothing at all
   B. The β coefficients change
   C. The R² changes
   D. The constant variance assumption needs to be re-evaluated
7. In the housing price example, there were five regions (A, B, C, D, and E) and the indicator variable model was Price = β0 + β1XA + β2XB + β3XC + β4XD + ε. The parameter β1 is
   A. The price in region A
   B. The mean price in region A
   C. The price in region A, minus the price in region E
   D. The mean price in region A, minus the mean price in region E
8. In the housing price example of quiz question 7 above, you can bring square footage of the home into the model in a way that allows different effects of square footage for different regions. How do you do this?
   A. Incorporate interaction terms between square footage and region indicators
   B. Perform a simulation study to see whether square footage and region interact
   C. Test for simultaneous significance of square footage and region effects
   D. Draw a graph of the distribution of price when square footage and region are held fixed

Quiz 14. Name:

Consider a two-way ANOVA to predict price of a home as a function of region (A or B) and whether the home has a cellar (Yes or No). The model is

Price = β0 + β1·Region.A + β2·Cellar.Yes + ε

Region.A = 1 if the home is in Region A, and Region.A = 0 if the home is in Region B. Cellar.Yes = 1 if the home has a cellar, and Cellar.Yes = 0 if the home does not have a cellar. Fill in the table of mean values, as shown in the book.

Table 1. True mean values of Price, according to the model Price = β0 + β1·Region.A + β2·Cellar.Yes + ε.

              Cellar        No Cellar
Region A
Region B

Quiz 15. Name:

In the book, two models were analyzed:

fit1 = lm(CHARITY ~ INCOME + DEPS)
fit2 = lm(CHARITY ~ INCOME + as.factor(DEPS))

Recall: CHARITY is a measure of charitable contributions. INCOME is a measure of income. DEPS is the number of dependents claimed on a tax form, taking the values 0, 1, 2, 3, 4, 5, and 6 in the data set. How are the theoretical models represented by fit1 and fit2 different?

Quiz 16. Name:

1. Hans had X1 = GRE quant = 140, X2 = GRE verbal = 160, and X3 = Undergrad GPA = 2.7. Which number is likely to be closest to Hans' finishing GPA?
   A. E(GPA)
   B. E(GPA | X1 = 140)
   C. E(GPA | X1 = 140, X2 = 160)
   D. E(GPA | X1 = 140, X2 = 160, X3 = 2.7)
2. See question 1. What is E(GPA | X1 = 140)?
   A. The average of the GPAs among students in the sample who have GRE quant = 140
   B. The average of potentially observable GPAs when GRE quant = 140
   C. The predicted GPA using the estimated regression model with X1 = 140
   D. Hans' GPA, assuming all we know about him is his GRE quant score
3. Three different regression models can be estimated using data to predict Hans' GPA:

   fit1 = lm(gpa ~ GRE.quant)
   fit2 = lm(gpa ~ GRE.quant + GRE.verbal)
   fit3 = lm(gpa ~ GRE.quant + GRE.verbal + GPA.undergrad)

   Each of these models gives a prediction of Hans' GPA by plugging in the appropriate X values. Which model gives a prediction having the highest variance?
   A. fit1
   B. fit2
   C. fit3
   D. Because of homoscedasticity, all variances are the same
4. What does the variance-bias tradeoff tell you?
   A. Unbiased estimates are always best
   B. Estimates with smaller variance are always best
   C. Biased estimates can be better when their variance is small
   D. Biased estimates can be better when their variance is large

Quiz 17. Name:

1. The BIC statistic is BIC = −2(Log Likelihood) + {ln(n)}·(# of estimated parameters). When using this statistic for variable selection, you should choose a set of variables for which BIC is
   A. Close to 1.0
   B. Close to 0.0
   C. Large
   D. Small
2. Suppose the variance function is Var(Y | X = x) = γ²x². Using this function, the maximum likelihood estimates are weighted least squares estimates, with weights equal to
   A. x
   B. x²
   C. 1/x
   D. 1/x²
3. When is WLS more efficient than OLS?
   A. When all the assumptions are satisfied for the classical regression model
   B. When the distributions p(y|x) are non-normal distributions
   C. When the variance function is correctly specified
   D. When the variance function is not correctly specified
4. Why is it better to assume (i) σ(Y | X = x) = exp(γ0 + γ1x), rather than to assume (ii) σ(Y | X = x) = γ0 + γ1x?
   A. Because standard deviations cannot be negative
   B. Because standard deviations are normally distributed
   C. Because the log likelihood is higher for (i)
   D. Because (i) is true and (ii) is false
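Question 2's answer, weights 1/x², is the usual inverse-variance weighting: when Var(Y|X=x) = γ²x², maximizing the normal likelihood reduces to minimizing Σ wᵢ(yᵢ − b0 − b1xᵢ)² with wᵢ = 1/xᵢ². A direct WLS fit for a tiny hypothetical data set:

```python
# Weighted least squares via the weighted normal equations, as a sketch of
# ML estimation under Var(Y|X=x) = gamma^2 * x^2 (weights w_i = 1/x_i^2).
def wls(xs, ys, ws):
    sw   = sum(ws)
    swx  = sum(w * x for w, x in zip(ws, xs))
    swy  = sum(w * y for w, y in zip(ws, ys))
    swxx = sum(w * x * x for w, x in zip(ws, xs))
    swxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    b1 = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    b0 = (swy - b1 * swx) / sw
    return b0, b1

xs = [1.0, 2.0, 4.0, 8.0]
ys = [3.0, 5.0, 9.0, 17.0]                       # exactly y = 1 + 2x
b0, b1 = wls(xs, ys, [1 / x ** 2 for x in xs])
# on noise-free data WLS recovers b0 = 1, b1 = 2 regardless of the weights
```

With noisy data the weights matter: observations with small x (small variance) are trusted more, which is the source of WLS's efficiency gain in question 3 when the variance function is correctly specified.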

Quiz 18. Name:

1. What assumption do you make about σ(Y | X = x) when you use the heteroscedasticity-consistent standard errors?
   A. σ(Y | X = x) = γx
   B. σ(Y | X = x) = γ0 + γ1x
   C. σ(Y | X = x) = exp(γ0 + γ1x)
   D. No assumption regarding the relationship between σ(Y | X = x) and x is made at all
2. When you use the heteroscedasticity-consistent standard errors, what is the estimate of σ2² = Var(Y2 | X2 = x2)?
   A. {y2 − (β0 + β1x2)}²
   B. {y2 − (β̂0 + β̂1x2)}²
   C. RMSPE
   D. No estimate of σ2² = Var(Y2 | X2 = x2) is used at all
3. (40) What is different about Cov(Y | X = x) when the observations are correlated versus when they are uncorrelated? (Be brief.)

Quiz 19. Name:

1. What distribution do you assume for p(y|x) when you use logistic regression?
   A. Normal
   B. Logistic
   C. Multinomial
   D. Bernoulli
2. What distribution do you assume for p(y|x) when you use multinomial logistic regression?
   A. Normal
   B. Logistic
   C. Multinomial
   D. Bernoulli
3. To estimate the parameters of the logistic regression model you use ______, and to estimate the parameters of the multinomial logistic regression model you use ______.
   A. Least squares, Least squares
   B. Least squares, Maximum likelihood
   C. Maximum likelihood, Least squares
   D. Maximum likelihood, Maximum likelihood
4. If the probability is 0.25, then the odds ratio is
   A. 1/4
   B. 4.0
   C. 1/3
   D. 3.0
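The probability-to-odds conversion in question 4 is odds = p/(1 − p), so p = 0.25 gives 0.25/0.75 = 1/3. Logistic regression models the log-odds as linear in x, and the logistic function maps log-odds back to a probability; a quick round-trip check:

```python
import math

def odds(p):
    # odds corresponding to a probability p
    return p / (1 - p)

def logistic(t):
    # inverse of the log-odds (logit) transform
    return 1 / (1 + math.exp(-t))

o = odds(0.25)                # 1/3
p = logistic(math.log(o))     # recovers 0.25
```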

Quiz 20. Name:

5. What is a latent variable?
   A. A variable in your data set
   B. A skewed variable
   C. An unobserved variable
   D. A multicollinear variable
6. What distribution was assumed for the latent variable (called Z) in the reading?
   A. Normal
   B. Logistic
   C. Multinomial
   D. Bernoulli
7. The multinomial logistic regression model
   A. has the same number of parameters as the ordinal regression model
   B. has fewer parameters than the ordinal regression model
   C. has more parameters than the ordinal regression model
8. In the Boss rating example, where Y could be 1 (Low), 2 (Medium), or 3 (High), and X = salary, it was shown that Pr(Y = 1 | X = x)
   A. increases as X increases
   B. decreases as X increases
   C. does not change as X increases

Quiz 21. Name:

1. Give the complete list of possible values of Y, according to the Poisson regression model.
   A. 0, 1, 2, 3, …
   B. 1, 2, 3, …
   C. 0, 1
   D. all numbers between −∞ and ∞
2. Suppose you have a data set with a single Y and a single X variable. Suppose also that there are 10 distinct values of your Y variable in your data set. You are considering different regression models. Which one will have the most parameters?
   A. Classical
   B. Ordinal
   C. Poisson
   D. Negative binomial
3. Which is evidence of extra-Poisson variation?
   A. The mean is greater than the median
   B. The mean is less than the median
   C. The mean is greater than the variance
   D. The mean is less than the variance
4. How do data from the negative binomial (NB) distribution differ from data from the Poisson distribution, assuming both distributions have the same mean?
   A. There is a larger probability of seeing large values (outliers) when data come from NB
   B. There is a smaller probability of seeing large values (outliers) when data come from NB
   C. There is a larger probability of seeing negative numbers when data come from NB
   D. There is a smaller probability of seeing negative numbers when data come from NB
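Questions 3 and 4 turn on the variance formulas. A Poisson distribution has variance equal to its mean, while a negative binomial with the same mean μ has variance μ + μ²/θ > μ (in the common dispersion parameterization), so it puts more probability on large values. The numbers below are hypothetical:

```python
# Extra-Poisson variation: for the same mean mu, the negative binomial
# variance mu + mu^2/theta exceeds the Poisson variance mu.
# mu and theta are hypothetical; theta is the NB dispersion parameter.
mu, theta = 8.0, 2.0
poisson_var = mu                        # 8.0: variance equals the mean
nb_var = mu + mu ** 2 / theta           # 8 + 64/2 = 40.0
overdispersed = nb_var > poisson_var    # True: sample variance > sample mean
                                        # is the evidence asked for in question 3
```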

Quiz 22. Name:

Give an example of a censored observation. Your words should be clear as to why that observation is called "censored." Be brief.

Quiz 23. Name:

1. When you raise 0.7 to the 0.9 power, you get
   A.
   B.
   C. 1.6
   D.
2. Suppose, in the time-until-divorce example, that S(20 | Educ = ">15 years") = ___. Then
   A. 75% of marriages have husband education greater than 15 years
   B. 25% of marriages have husband education greater than 15 years
   C. 75% of couples whose husband has greater than 15 years education are married at least 20 years
   D. 25% of couples whose husband has greater than 15 years education are married at least 20 years
3. In the Tobit analysis of back injury data, what was the recommendation for dealing with the outlier?
   A. Remove it
   B. Since it was a mistake, replace it with the correct value
   C. Replace it with a smaller value
   D. Use a different distribution p(y|x)
4. What was the "ugly rule of thumb" for identifying an outlier, as presented in the book?
   A. An observation that is less than two standard deviations from the mean is an outlier
   B. An observation that is more than two standard deviations from the mean is an outlier
   C. An observation that is less than three standard deviations from the mean is an outlier
   D. An observation that is more than three standard deviations from the mean is an outlier

Quiz 24. Name:
Closed book, notes, and no electronic devices. Choose one answer only for each.

1. Which statistic measures "outlier in X space"?
   A. Leverage
   B. Standardized residual
   C. Cook's distance
   D. z = (yi − ȳ)/sy
2. Which statistic measures "outlier in Y | X space"?
   A. Leverage
   B. Standardized residual
   C. Cook's distance
   D. z = (yi − ȳ)/sy
3. Which statistic measures "outlier in both Y | X space and in X space"?
   A. Leverage
   B. Standardized residual
   C. Cook's distance
   D. z = (yi − ȳ)/sy
4. Which statistic measures "outlier in Y space"?
   A. Leverage
   B. Standardized residual
   C. Cook's distance
   D. z = (yi − ȳ)/sy
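Why leverage answers question 1 can be seen from its formula: in simple regression, hᵢ = 1/n + (xᵢ − x̄)²/Sxx involves only the x's, so it measures how unusual xᵢ is in X space without looking at y at all. A sketch with hypothetical data containing one X-space outlier:

```python
# Leverage in simple regression: h_i = 1/n + (x_i - xbar)^2 / Sxx.
# It depends only on the x values, so it flags outliers in X space.
xs = [1.0, 2.0, 3.0, 4.0, 20.0]        # the last x is an outlier in X space
n = len(xs)
xbar = sum(xs) / n
sxx = sum((x - xbar) ** 2 for x in xs)
leverages = [1 / n + (x - xbar) ** 2 / sxx for x in xs]
max_leverage_index = max(range(n), key=lambda i: leverages[i])   # index 4
# A useful check: the leverages always sum to the number of regression
# parameters (here 2: intercept and slope).
```

Cook's distance, by contrast, combines leverage with the standardized residual, which is why it flags observations unusual in both X space and Y | X space.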

Quiz 25. Name:
Closed book, notes, and no electronic devices.

1. Let SAE(b0) = Σ_{i=1}^{n} |yi − b0|. When is there a unique value b0 that minimizes SAE(b0)?
   A. When the second derivative of SAE(b0) is negative
   B. When the second derivative of SAE(b0) is positive
   C. When n is an even number
   D. When n is an odd number
2. Consider the classical regression model Y | X = x ~ N(β0 + β1x, σ²). Assuming this model is true, the function that relates the median of the distribution of Y to X = x is y0.5 =
   A. β0 + β1x − 0.5σ
   B. β0 + β1x
   C.
   D.
3. What is a Winsorized data value?
   A. A data value that has been deleted
   B. A data value that has been standardized
   C. A data value that has been replaced with a less extreme data value
   D. A data value that has been replaced with a more extreme data value
4. The simulation studies in the reading about Winsorization showed that
   A. Winsorization gives the most powerful tests
   B. Winsorization gives the most accurate estimates
   C. Winsorization gives the most accurate prediction intervals
   D. you should not use Winsorization
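For question 1: SAE(b0) = Σ|yᵢ − b0| is minimized at the sample median. With an odd n the minimizer is unique, while with an even n every value between the two middle order statistics ties for the minimum. A numeric sketch with hypothetical data:

```python
# Sum of absolute errors as a function of the candidate constant b0.
def sae(ys, b0):
    return sum(abs(y - b0) for y in ys)

ys_odd = [1.0, 2.0, 10.0]              # odd n: unique minimizer at the median
at_median = sae(ys_odd, 2.0)           # 9.0
nearby = sae(ys_odd, 2.5)              # 9.5: strictly worse, so 2.0 is unique

ys_even = [1.0, 2.0, 3.0, 10.0]        # even n: a flat stretch of minimizers
tie = (sae(ys_even, 2.0), sae(ys_even, 2.5), sae(ys_even, 3.0))   # all equal
```

This is also why question 2's answer is the conditional mean line itself: for a symmetric normal distribution the median of Y given x coincides with the mean β0 + β1x.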

Quiz 26. Name:
Closed book, notes, and no electronic devices.

5. (20) What does "tree" refer to in tree regression?
   A. The assumption of a "tree" distribution p(y|x)
   B. The graphical output of the analysis looks like an upside-down tree
   C. The assumed conditional mean function is continuous, like branches of a tree
   D. The person who invented it was named Tree.
6. (20) When a node of a tree produced by rpart is given as "X >= 1.5", then the left branch under the node corresponds to
   A. X >= 1.5
   B. X < 1.5
7. The neural network regression models discussed in the book are (select all that apply; eight points per correct selection/non-selection)
   A. universal approximators
   B. continuous functions
   C. polynomial functions
   D. trigonometric functions
   E. nonlinear functions

Quiz 27. Name:
Closed book, notes, and no electronic devices.

8. According to the authors, what is a regression model?
   A. A model to predict a Y variable, given values of X variable(s)
   B. A model for the conditional mean of a Y variable, given values of X variable(s)
   C. A model for the conditional distribution of a Y variable, given values of X variable(s)
   D. A model for the conditional variance of a Y variable, given values of X variable(s)
9. What is "p-hacking"?
   A. An acceptable scientific practice
   B. Trying different methods until you get a p-value that supports your theory
   C. Use of simulated data sets to clarify your theory
   D. Finding the best probability model using different logistic regression functions
10. Which methods are most similar to likelihood-based methods?
   A. Heteroscedasticity-consistent
   B. Bayesian
   C. Quantile-based
   D. Method-of-moments-based
11. What is the recommended Step 1 of regression analysis, according to the authors?
   A. Specify your conditional mean function
   B. Specify your conditional distributions
   C. Specify your conditional variance function
   D. Look at your data


Final Exam. Name: Solution:

Instructions: Answer all questions on the exam. Open books, open notes, but no electronic devices. The first 13 problems are worth 5 points each. The rest are worth 1 point each.

HW1.

More information

Instructions: Closed book, notes, and no electronic devices. Points (out of 200) in parentheses

Instructions: Closed book, notes, and no electronic devices. Points (out of 200) in parentheses ISQS 5349 Final Spring 2011 Instructions: Closed book, notes, and no electronic devices. Points (out of 200) in parentheses 1. (10) What is the definition of a regression model that we have used throughout

More information

ISQS 5349 Spring 2013 Final Exam

ISQS 5349 Spring 2013 Final Exam ISQS 5349 Spring 2013 Final Exam Name: General Instructions: Closed books, notes, no electronic devices. Points (out of 200) are in parentheses. Put written answers on separate paper; multiple choices

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression ST 430/514 Recall: A regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates)

More information

Chapter 16. Simple Linear Regression and dcorrelation

Chapter 16. Simple Linear Regression and dcorrelation Chapter 16 Simple Linear Regression and dcorrelation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will

More information

Multiple Regression. Peerapat Wongchaiwat, Ph.D.

Multiple Regression. Peerapat Wongchaiwat, Ph.D. Peerapat Wongchaiwat, Ph.D. wongchaiwat@hotmail.com The Multiple Regression Model Examine the linear relationship between 1 dependent (Y) & 2 or more independent variables (X i ) Multiple Regression Model

More information

Quiz 1. Name: Instructions: Closed book, notes, and no electronic devices.

Quiz 1. Name: Instructions: Closed book, notes, and no electronic devices. Quiz 1. Name: Instructions: Closed book, notes, and no electronic devices. 1.(10) What is usually true about a parameter of a model? A. It is a known number B. It is determined by the data C. It is an

More information

Basic Business Statistics 6 th Edition

Basic Business Statistics 6 th Edition Basic Business Statistics 6 th Edition Chapter 12 Simple Linear Regression Learning Objectives In this chapter, you learn: How to use regression analysis to predict the value of a dependent variable based

More information

Econometrics Honor s Exam Review Session. Spring 2012 Eunice Han

Econometrics Honor s Exam Review Session. Spring 2012 Eunice Han Econometrics Honor s Exam Review Session Spring 2012 Eunice Han Topics 1. OLS The Assumptions Omitted Variable Bias Conditional Mean Independence Hypothesis Testing and Confidence Intervals Homoskedasticity

More information

Chapter 16. Simple Linear Regression and Correlation

Chapter 16. Simple Linear Regression and Correlation Chapter 16 Simple Linear Regression and Correlation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will

More information

Keller: Stats for Mgmt & Econ, 7th Ed July 17, 2006

Keller: Stats for Mgmt & Econ, 7th Ed July 17, 2006 Chapter 17 Simple Linear Regression and Correlation 17.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will

More information

Quiz 1. Name: Instructions: Closed book, notes, and no electronic devices.

Quiz 1. Name: Instructions: Closed book, notes, and no electronic devices. Quiz 1. Name: Instructions: Closed book, notes, and no electronic devices. 1. What is the difference between a deterministic model and a probabilistic model? (Two or three sentences only). 2. What is the

More information

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 Lecture 2: Linear Models Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector

More information

FAQ: Linear and Multiple Regression Analysis: Coefficients

FAQ: Linear and Multiple Regression Analysis: Coefficients Question 1: How do I calculate a least squares regression line? Answer 1: Regression analysis is a statistical tool that utilizes the relation between two or more quantitative variables so that one variable

More information

Correlation Analysis

Correlation Analysis Simple Regression Correlation Analysis Correlation analysis is used to measure strength of the association (linear relationship) between two variables Correlation is only concerned with strength of the

More information

ECONOMETRICS HONOR S EXAM REVIEW SESSION

ECONOMETRICS HONOR S EXAM REVIEW SESSION ECONOMETRICS HONOR S EXAM REVIEW SESSION Eunice Han ehan@fas.harvard.edu March 26 th, 2013 Harvard University Information 2 Exam: April 3 rd 3-6pm @ Emerson 105 Bring a calculator and extra pens. Notes

Trendlines Simple Linear Regression Multiple Linear Regression Systematic Model Building Practical Issues Overfitting Categorical Variables Interaction Terms Non-linear Terms Linear Logarithmic y = a +

Chapter 13. Multiple Regression and Model Building. Multiple Regression Models: the general multiple regression model is y = β0 + β1x1 + β2x2 + ... + βkxk, where y is the dependent variable and x1, x2, ..., xk in the model are the

Understanding Regression Analysis: A Conditional Distribution Approach. Peter H. Westfall, Texas Tech University. Preface: These notes will hopefully someday be made into a book. They distill 30+ years of

Multiple Linear Regression Multiple Linear Regression Simple linear regression tries to fit a simple line between two variables Y and X. If X is linearly related to Y this explains some of the variability in Y. In most cases, there

Bivariate Regression & Correlation. Overview: The Scatter Diagram; Two Examples: Education & Prestige; Correlation Coefficient; Bivariate Linear Regression Line; SPSS Output Interpretation; Covariance. You already

Lecture 14 Simple Linear Regression. Ordinary Least Squares (OLS). Consider the following simple linear regression model where, for each unit i, Yi is the dependent variable (response). Xi is the independent

Business Statistics. Lecture 10: Correlation and Linear Regression. Scatterplot: A scatterplot shows the relationship between two quantitative variables measured on the same individuals. It displays the Form

Section 3: Simple Linear Regression Section 3: Simple Linear Regression Carlos M. Carvalho The University of Texas at Austin McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/ 1 Regression: General Introduction

Statistics for Managers using Microsoft Excel 6 th Edition Statistics for Managers using Microsoft Excel 6 th Edition Chapter 13 Simple Linear Regression 13-1 Learning Objectives In this chapter, you learn: How to use regression analysis to predict the value of

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data Ronald Heck Class Notes: Week 8 1 Class Notes: Week 8 Probit versus Logit Link Functions and Count Data This week we ll take up a couple of issues. The first is working with a probit link function. While

ECON 4230 Intermediate Econometric Theory Exam ECON 4230 Intermediate Econometric Theory Exam Multiple Choice (20 pts). Circle the best answer. 1. The Classical assumption of mean zero errors is satisfied if the regression model a) is linear in the

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X. Estimating σ 2 We can do simple prediction of Y and estimation of the mean of Y at any value of X. To perform inferences about our regression line, we must estimate σ 2, the variance of the error term.

Quoting from the document I suggested you read (http://courses.ttu.edu/isqs5349 westfall/images/5349/practiceproblems_discussion. Spring 14, ISQS 5349 Midterm 1. Instructions: Closed book, notes and no electronic devices. Put all answers on scratch paper provided. Points (out of 100) are in parentheses. 1. (20) Define regression

The Simple Linear Regression Model The Simple Linear Regression Model Lesson 3 Ryan Safner 1 1 Department of Economics Hood College ECON 480 - Econometrics Fall 2017 Ryan Safner (Hood College) ECON 480 - Lesson 3 Fall 2017 1 / 77 Bivariate

Eco 391, J. Sandford, spring 2013 April 5, Midterm 3 4/5/2013 Midterm 3 4/5/2013 Instructions: You may use a calculator, and one sheet of notes. You will never be penalized for showing work, but if what is asked for can be computed directly, points awarded will depend

Association studies and regression Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration

Simple and Multiple Linear Regression Sta. 113 Chapter 12 and 13 of Devore March 12, 2010 Table of contents 1 Simple Linear Regression 2 Model Simple Linear Regression A simple linear regression model is given by Y = β 0 + β 1 x + ɛ where

ISQS 5349 Final Exam, Spring 2017. Instructions: Put all answers on paper other than this exam. If you do not have paper, some will be provided to you. The exam is OPEN BOOKS, OPEN NOTES, but NO ELECTRONIC

Review of Econometrics Review of Econometrics Zheng Tian June 5th, 2017 1 The Essence of the OLS Estimation Multiple regression model involves the models as follows Y i = β 0 + β 1 X 1i + β 2 X 2i + + β k X ki + u i, i = 1,...,

WISE International Masters WISE International Masters ECONOMETRICS Instructor: Brett Graham INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This examination paper contains 32 questions. You are

Lecture 3: Linear Models. Bruce Walsh lecture notes, Uppsala EQG course, version 28 Jan 2012. Quick Review of the Major Points: the general linear model can be written as y = Xβ + e, where y = vector of observed

Lecture 6: Linear Regression Lecture 6: Linear Regression Reading: Sections 3.1-3 STATS 202: Data mining and analysis Jonathan Taylor, 10/5 Slide credits: Sergio Bacallado 1 / 30 Simple linear regression Model: y i = β 0 + β 1 x i

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages: Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the

Hypothesis testing Goodness of fit Multicollinearity Prediction. Applied Statistics. Lecturer: Serena Arima Applied Statistics Lecturer: Serena Arima Hypothesis testing for the linear model Under the Gauss-Markov assumptions and the normality of the error terms, we saw that β N(β, σ 2 (X X ) 1 ) and hence s

Multivariate Regression Multivariate Regression The so-called supervised learning problem is the following: we want to approximate the random variable Y with an appropriate function of the random variables X 1,..., X p with the

Simple Linear Regression Using Ordinary Least Squares Simple Linear Regression Using Ordinary Least Squares Purpose: To approximate a linear relationship with a line. Reason: We want to be able to predict Y using X. Definition: The Least Squares Regression
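The least-squares line this excerpt describes — fit a line so as to predict Y from X — has a simple closed form. A hedged pure-Python sketch with invented data:

```python
# Hedged sketch: closed-form least-squares line for predicting Y from X.
# The data points are invented for illustration.
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b1 = sxy / sxx          # slope = Sxy / Sxx
    return my - b1 * mx, b1  # intercept = ybar - b1 * xbar

b0, b1 = least_squares_line([10, 20, 30, 40, 50], [25, 41, 62, 79, 101])
y_hat_at_35 = b0 + b1 * 35  # predicted Y at a new X value
```

The fitted line always passes through the point of means (x̄, ȳ), which is what the intercept formula ȳ − b₁x̄ encodes.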

9 Correlation and Regression 9 Correlation and Regression SW, Chapter 12. Suppose we select n = 10 persons from the population of college seniors who plan to take the MCAT exam. Each takes the test, is coached, and then retakes the

Unit 10: Simple Linear Regression and Correlation Unit 10: Simple Linear Regression and Correlation Statistics 571: Statistical Methods Ramón V. León 6/28/2004 Unit 10 - Stat 571 - Ramón V. León 1 Introductory Remarks Regression analysis is a method for

Applied Econometrics (QEM) Applied Econometrics (QEM) based on Prinicples of Econometrics Jakub Mućk Department of Quantitative Economics Jakub Mućk Applied Econometrics (QEM) Meeting #3 1 / 42 Outline 1 2 3 t-test P-value Linear

WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, 2015-16 Academic Year Exam Version: A INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This

Applied Econometrics Lecture 1 Lecture 1 1 1 Università di Urbino Università di Urbino PhD Programme in Global Studies Spring 2018 Outline of this module Beyond OLS (very brief sketch) Regression and causality: sources of endogeneity

REVIEW, 8/2/2017, Chen Fang, English Department, East China Normal University. Hypothesis testing starts with a null hypothesis and a null distribution. We compare what we have to the null distribution; if the result is too extreme to belong to the null distribution (p

Chapter 4: Regression Models Sales volume of company 1 Textbook: pp. 129-164 Chapter 4: Regression Models Money spent on advertising 2 Learning Objectives After completing this chapter, students will be able to: Identify variables,

Review of Statistics Review of Statistics Topics Descriptive Statistics Mean, Variance Probability Union event, joint event Random Variables Discrete and Continuous Distributions, Moments Two Random Variables Covariance and

Unless provided with information to the contrary, assume for each question below that the Classical Linear Model assumptions hold. Economics 345: Applied Econometrics Section A01 University of Victoria Midterm Examination #2 Version 1 SOLUTIONS Spring 2015 Instructor: Martin Farnham Unless provided with information to the contrary,

Final Exam - Solutions Ecn 102 - Analysis of Economic Data University of California - Davis March 19, 2010 Instructor: John Parman Final Exam - Solutions You have until 5:30pm to complete this exam. Please remember to put your

ECON 5350 Class Notes Functional Form and Structural Change ECON 5350 Class Notes Functional Form and Structural Change 1 Introduction Although OLS is considered a linear estimator, it does not mean that the relationship between Y and X needs to be linear. In this

Exam Applied Statistical Regression. Good Luck! Dr. M. Dettling Summer 2011 Exam Applied Statistical Regression Approved: Tables: Note: Any written material, calculator (without communication facility). Attached. All tests have to be done at the 5%-level.

ECON3150/4150 Spring 2015 ECON3150/4150 Spring 2015 Lecture 3&4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo January 29, 2015 1 / 67 Chapter 4 in S&W Section 17.1 in S&W (extended OLS assumptions) 2

Ron Heck, Fall Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October 20, 2011) Ron Heck, Fall 2011 1 EDEP 768E: Seminar in Multilevel Modeling rev. January 3, 2012 (see footnote) Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October

Ref.:   Spring SOS3003 Applied data analysis for social science Lecture note SOS3003 Applied data analysis for social science Lecture note 05-2010 Erling Berge Department of sociology and political science NTNU Spring 2010 Erling Berge 2010 1 Literature Regression criticism I Hamilton

Business Statistics. Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220. Dr. Mohammad Zainal Department of Quantitative Methods & Information Systems Business Statistics Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220 Dr. Mohammad Zainal Chapter Goals After completing

Lecture 4: Multivariate Regression, Part 2. Gauss-Markov Assumptions: 1) Linear in Parameters: Yi = β0 + β1X1i + β2X2i + ... + βkXki. 2) Random Sampling: we have a random sample from the population that follows the above
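Under Gauss-Markov-style assumptions like those listed in this excerpt, the OLS slope estimator is unbiased: its average over repeated samples equals the true slope. A small Monte Carlo sketch, with true parameters invented for illustration:

```python
# Hedged Monte Carlo sketch: under Gauss-Markov-style assumptions, the OLS
# slope estimator is unbiased. True parameters below are invented.
import random

random.seed(0)
beta0, beta1 = 2.0, 0.5           # "true" (invented) parameters
xs = [float(i) for i in range(1, 21)]
mx = sum(xs) / len(xs)
sxx = sum((x - mx) ** 2 for x in xs)

estimates = []
for _ in range(2000):             # repeated samples with fresh N(0, 1) errors
    ys = [beta0 + beta1 * x + random.gauss(0.0, 1.0) for x in xs]
    my = sum(ys) / len(ys)
    b1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    estimates.append(b1)

mean_b1 = sum(estimates) / len(estimates)  # should sit close to beta1 = 0.5
```

Individual estimates scatter around 0.5; it is their long-run average, not any single fit, that unbiasedness speaks to.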

4. Nonlinear regression functions 4. Nonlinear regression functions Up to now: Population regression function was assumed to be linear The slope(s) of the population regression function is (are) constant The effect on Y of a unit-change

Linear Regression With Special Variables Linear Regression With Special Variables Junhui Qian December 21, 2014 Outline Standardized Scores Quadratic Terms Interaction Terms Binary Explanatory Variables Binary Choice Models Standardized Scores:

Econometrics Homework 1 Econometrics Homework Due Date: March, 24. by This problem set includes questions for Lecture -4 covered before midterm exam. Question Let z be a random column vector of size 3 : z = @ (a) Write out z

5. Let W follow a normal distribution with mean of μ and the variance of 1. Then, the pdf of W is Practice Final Exam Last Name:, First Name:. Please write LEGIBLY. Answer all questions on this exam in the space provided (you may use the back of any page if you need more space). Show all work but do

Applied Regression Analysis. Section 2: Multiple Linear Regression Applied Regression Analysis Section 2: Multiple Linear Regression 1 The Multiple Regression Model Many problems involve more than one independent variable or factor which affects the dependent or response

Final Review. Yang Feng.   Yang Feng (Columbia University) Final Review 1 / 58 Final Review Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Final Review 1 / 58 Outline 1 Multiple Linear Regression (Estimation, Inference) 2 Special Topics for Multiple

Business Economics BUSINESS ECONOMICS. PAPER No. : 8, FUNDAMENTALS OF ECONOMETRICS MODULE No. : 3, GAUSS MARKOV THEOREM Subject Business Economics Paper No and Title Module No and Title Module Tag 8, Fundamentals of Econometrics 3, The gauss Markov theorem BSE_P8_M3 1 TABLE OF CONTENTS 1. INTRODUCTION 2. ASSUMPTIONS OF

Multiple Linear Regression Multiple Linear Regression University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html 1 / 42 Passenger car mileage Consider the carmpg dataset taken from

Introducing Generalized Linear Models: Logistic Regression Ron Heck, Summer 2012 Seminars 1 Multilevel Regression Models and Their Applications Seminar Introducing Generalized Linear Models: Logistic Regression The generalized linear model (GLM) represents and

Midterm 2 - Solutions Ecn 102 - Analysis of Economic Data University of California - Davis February 23, 2010 Instructor: John Parman Midterm 2 - Solutions You have until 10:20am to complete this exam. Please remember to put

WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, 2016-17 Academic Year Exam Version: A INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This

Introduction to Regression Analysis. Dr. Devlina Chatterjee 11 th August, 2017 Introduction to Regression Analysis Dr. Devlina Chatterjee 11 th August, 2017 What is regression analysis? Regression analysis is a statistical technique for studying linear relationships. One dependent

* Tuesday 17 January :30-16:30 (2 hours) Recored on ESSE3 General introduction to the course. Name of the course Statistical methods and data analysis Audience The course is intended for students of the first or second year of the Graduate School in Materials Engineering. The aim of the course

Multivariate Regression (Chapter 10) This week we'll cover multivariate regression and maybe a bit of canonical correlation. Today we'll mostly review univariate multivariate regression. With multivariate

Introduction to Econometrics. Heteroskedasticity Introduction to Econometrics Introduction Heteroskedasticity When the variance of the errors changes across segments of the population, where the segments are determined by different values for the explanatory

Immigration attitudes (opposes immigration or supports it) it may seriously misestimate the magnitude of the effects of IVs Logistic Regression, Part I: Problems with the Linear Probability Model (LPM) Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 22, 2015 This handout steals

9. Linear Regression and Correlation 9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,

Problem #1 #2 #3 #4 #5 #6 Total Points /6 /8 /14 /10 /8 /10 /56 STAT 391 - Spring Quarter 2017 - Midterm 1 - April 27, 2017 Name: Student ID Number: Problem #1 #2 #3 #4 #5 #6 Total Points /6 /8 /14 /10 /8 /10 /56 Directions. Read directions carefully and show all your

Correlation & Simple Regression Chapter 11 Correlation & Simple Regression The previous chapter dealt with inference for two categorical variables. In this chapter, we would like to examine the relationship between two quantitative variables.

Correlation in Linear Regression Vrije Universiteit Amsterdam Research Paper Correlation in Linear Regression Author: Yura Perugachi-Diaz Student nr.: 2566305 Supervisor: Dr. Bartek Knapik May 29, 2017 Faculty of Sciences Research Paper

Chapter 1. Linear Regression with One Predictor Variable Chapter 1. Linear Regression with One Predictor Variable 1.1 Statistical Relation Between Two Variables To motivate statistical relationships, let us consider a mathematical relation between two mathematical

Understanding Regression Analysis: A Conditional Distribution Approach. Peter H. Westfall, Texas Tech University. Preface: This book distills 35 years of my teaching regression analysis to social scientists,

STAT5044: Regression and Anova. Inyoung Kim STAT5044: Regression and Anova Inyoung Kim 2 / 47 Outline 1 Regression 2 Simple Linear regression 3 Basic concepts in regression 4 How to estimate unknown parameters 5 Properties of Least Squares Estimators:

2. Linear regression with multiple regressors 2. Linear regression with multiple regressors Aim of this section: Introduction of the multiple regression model OLS estimation in multiple regression Measures-of-fit in multiple regression Assumptions

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis. 401 Review Major topics of the course 1. Univariate analysis 2. Bivariate analysis 3. Simple linear regression 4. Linear algebra 5. Multiple regression analysis Major analysis methods 1. Graphical analysis

Business Statistics. Lecture 9: Simple Regression Business Statistics Lecture 9: Simple Regression 1 On to Model Building! Up to now, class was about descriptive and inferential statistics Numerical and graphical summaries of data Confidence intervals

Contest Quiz 3. Question Sheet. In this quiz we will review concepts of linear regression covered in lecture 2. Updated: November 17, 2011 Lecturer: Thilo Klein Contact: tk375@cam.ac.uk Contest Quiz 3 Question Sheet In this quiz we will review concepts of linear regression covered in lecture 2. NOTE: Please round

Stat/F&W Ecol/Hort 572 Review Points, Ané, Spring 2010. 1. Linear models: Y = Xβ + ε with ε ∼ N(0, σₑ²), or Y ∼ N(Xβ, σₑ²), where the model matrix X contains the information on predictors and β includes all coefficients (intercept, slope(s) etc.). 1. Number of
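The matrix form Y = Xβ + ε reviewed here reduces, for an intercept plus one predictor, to the 2×2 normal equations (X′X)b = X′Y. An illustrative pure-Python sketch (numbers invented, with y made exactly linear in x so the solution is known):

```python
# Hedged sketch: the normal equations (X'X) b = X'y for an intercept plus
# one predictor, solved as a 2x2 system by Cramer's rule. Data invented.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]  # exactly 1 + 2 * x by construction

n = len(x)
sx, sy = sum(x), sum(y)
sxx = sum(xi * xi for xi in x)
sxy = sum(xi * yi for xi, yi in zip(x, y))

# Normal equations: [[n, sx], [sx, sxx]] @ [b0, b1] = [sy, sxy]
det = n * sxx - sx * sx
b0 = (sy * sxx - sx * sxy) / det  # intercept
b1 = (n * sxy - sx * sy) / det    # slope
```

With more predictors the same idea holds, but one solves the full k×k system (in practice via a numerically stable least-squares routine rather than an explicit inverse).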

Lecture 6: Linear Regression (continued). Reading: Sections 3.1-3.3. STATS 202: Data mining and analysis, October 6, 2017. Multiple linear regression: Y = β0 + β1X1 + ... + βpXp + ε, ε ∼ N(0, σ) i.i.d.

Econ 510, B. Brown, Spring 2014, Final Exam Answers. Answer five of the following questions. You must answer question 7. The questions are weighted equally. You have 2.5 hours. You may use a calculator. Brevity

Multicollinearity occurs when two or more predictors in the model are correlated and provide redundant information about the response. Multicollinearity Read Section 7.5 in textbook. Multicollinearity occurs when two or more predictors in the model are correlated and provide redundant information about the response. Example of multicollinear

Regression Analysis. BUS 735: Business Decision Making and Research. Learn how to detect relationships between ordinal and categorical variables. Regression Analysis BUS 735: Business Decision Making and Research 1 Goals of this section Specific goals Learn how to detect relationships between ordinal and categorical variables. Learn how to estimate

ECNS 561 Multiple Regression Analysis ECNS 561 Multiple Regression Analysis Model with Two Independent Variables Consider the following model Crime i = β 0 + β 1 Educ i + β 2 [what else would we like to control for?] + ε i Here, we are taking

Simple Linear Regression Simple Linear Regression September 24, 2008 Reading HH 8, GIll 4 Simple Linear Regression p.1/20 Problem Data: Observe pairs (Y i,x i ),i = 1,...n Response or dependent variable Y Predictor or independent

Econometrics Review questions for exam Econometrics Review questions for exam Nathaniel Higgins nhiggins@jhu.edu, 1. Suppose you have a model: y = β 0 x 1 + u You propose the model above and then estimate the model using OLS to obtain: ŷ =

Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras Lecture - 39 Regression Analysis Hello and welcome to the course on Biostatistics

Multiple Regression Analysis. Part III. Multiple Regression Analysis Part III Multiple Regression Analysis As of Sep 26, 2017 1 Multiple Regression Analysis Estimation Matrix form Goodness-of-Fit R-square Adjusted R-square Expected values of the OLS estimators Irrelevant

Econometrics -- Final Exam (Sample) Econometrics -- Final Exam (Sample) 1) The sample regression line estimated by OLS A) has an intercept that is equal to zero. B) is the same as the population regression line. C) cannot have negative and

Motivation for multiple regression Motivation for multiple regression 1. Simple regression puts all factors other than X in u, and treats them as unobserved. Effectively the simple regression does not account for other factors. 2. The slope

CHAPTER 6: SPECIFICATION VARIABLES Recall, we had the following six assumptions required for the Gauss-Markov Theorem: 1. The regression model is linear, correctly specified, and has an additive error term. 2. The error term has a zero
