STATISTICS QUESTIONS. Step by Step Solutions.


STATISTICS QUESTIONS: Step by Step Solutions. www.mathcracker.com 9//2016

Problem 1: A researcher is interested in the effects of family size on delinquency for a group of offenders and examines families with one to four children. She obtains a sample of 16 families, four of each size, and identifies the number of arrests per child for delinquency. The data are as follows:

            Group 1       Group 2       Group 3       Group 4
            4 children    3 children    2 children    1 child
            n=4           n=4           n=4           n=4
Family 1    10            8             5
Family 2    8             8             6             5
Family 3    9             6             7
Family 4    10            9             9

a) Calculate the total sum of squares.
b) Calculate the mean square (between groups).
c) Calculate the F-ratio.
d) Use the Tukey HSD (alpha = 0.05) to test for significance between groups. Which groups differed?
e) Based on your results, write a 1-2 paragraph essay that describes your observations obtained from this sample in regard to the effects of family size on delinquency for a group of offenders.

Solution: (a) The following table with descriptive statistics is obtained from the information provided:

Obs.        Group 1   Group 2   Group 3   Group 4
            10        8         5
            8         8         6         5
            9         6         7
            10        9         9
Mean        9.25      7.75      6.75      3.25
St. Dev.    0.957     1.258     1.708     1.5

We need to test

$$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 \qquad H_A: \text{not all the means are equal}$$

With the data found in the table above, we can compute the following values, which are needed to construct the ANOVA table. We have

$$SS_{Between} = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})^2$$

from which we get

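$$SS_{Between} = 4(9.25 - 6.75)^2 + 4(7.75 - 6.75)^2 + 4(6.75 - 6.75)^2 + 4(3.25 - 6.75)^2 = 78$$

Now we also see that

$$SS_{Within} = \sum_{i=1}^{k} (n_i - 1) s_i^2$$

which implies

$$SS_{Within} = (4-1)(0.957)^2 + (4-1)(1.258)^2 + (4-1)(1.708)^2 + (4-1)(1.5)^2 = 23$$

Hence, $SS_{Total} = 78 + 23 = 101$.

(b) Therefore

$$MS_{Between} = \frac{SS_{Between}}{k - 1} = \frac{78}{3} = 26$$

Also, we obtain that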

$$MS_{Within} = \frac{SS_{Within}}{N - k} = \frac{23}{12} = 1.917$$

(c) Therefore, the F-statistic is computed as

$$F = \frac{MS_{Between}}{MS_{Within}} = \frac{26}{1.917} = 13.565$$

The critical value for $\alpha = 0.05$, $df_1 = 3$ and $df_2 = 12$ is $F_C = 3.4903$, and the corresponding p-value is

$$p = \Pr(F_{3,12} > 13.565) = 0.0004$$

Since the p-value is less than the significance level 0.05, we reject $H_0$.

(d) The HSD difference is computed as follows:

$$HSD = Q^* \sqrt{\frac{MSE}{n}} = 4.20 \sqrt{\frac{1.917}{4}} = 2.91$$
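The HSD threshold and the pairwise comparisons it feeds can be checked with a few lines of Python. This is only a minimal sketch, assuming the group means 9.25, 7.75, 6.75 and 3.25, MSE = 23/12 on 12 error degrees of freedom, and a tabulated studentized-range value Q* = 4.20 for four groups; the variable names are illustrative and not part of the original solution.

```python
import math
from itertools import combinations

# Summary values from the ANOVA computations (equal group sizes assumed).
means = {"Group 1": 9.25, "Group 2": 7.75, "Group 3": 6.75, "Group 4": 3.25}
n = 4            # observations per group
mse = 23 / 12    # MS Within, about 1.917
q_star = 4.20    # assumed table value: studentized range, 4 groups, 12 d.f., alpha = 0.05

# Tukey's honestly significant difference for equal group sizes.
hsd = q_star * math.sqrt(mse / n)

# Standard error of a difference between two group means (used for t-values).
se_diff = math.sqrt(mse * 2 / n)

significant = set()
for a, b in combinations(means, 2):
    diff = means[a] - means[b]
    t_value = diff / se_diff
    is_significant = abs(diff) > hsd
    if is_significant:
        significant.add((a, b))
    print(f"{a} vs {b}: diff = {diff:5.2f}, t = {t_value:5.2f}, "
          f"significant = {is_significant}")

print(f"HSD = {hsd:.2f}")
```

Comparing each absolute mean difference with the HSD is equivalent to comparing the corresponding t-value with Q* divided by the square root of 2, which is how the critical value for the t-value table is usually obtained.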

The following table is obtained:

Post hoc analysis: Tukey simultaneous comparison t-values (d.f. = 12)

                     Group 4   Group 3   Group 2   Group 1
                     3.3       6.8       7.8       9.3
Group 4    3.3
Group 3    6.8       3.58
Group 2    7.8       4.60      1.02
Group 1    9.3       6.13      2.55      1.53

Critical values for experimentwise error rate:
0.05    2.97
0.01    3.89

(e) Based on the above results, we have enough evidence to reject the null hypothesis of equal means at the 0.05 significance level. Summarizing, we have the following ANOVA table:

Source            SS     df    MS       F        p-value   Crit. F
Between Groups    78     3     26       13.565   0.0004    3.4903
Within Groups     23     12    1.917
Total             101    15
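The entries of the ANOVA table can be reproduced from the group means and standard deviations alone. The sketch below is a rough check rather than the method used by the original author; it assumes equal group sizes of 4 and uses the rounded standard deviations from the descriptive table, so the results agree with the table only up to rounding.

```python
# Group summary statistics as reported in the descriptive table.
means = [9.25, 7.75, 6.75, 3.25]
sds = [0.957, 1.258, 1.708, 1.5]   # rounded, so SS Within is approximate
n = 4                # observations per group (equal sizes)
k = len(means)       # number of groups
N = n * k            # total sample size

# With equal group sizes the grand mean is the mean of the group means.
grand_mean = sum(means) / k

ss_between = n * sum((m - grand_mean) ** 2 for m in means)
ss_within = (n - 1) * sum(s ** 2 for s in sds)
ss_total = ss_between + ss_within

df_between, df_within = k - 1, N - k
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

# Critical value quoted in the solution for alpha = 0.05 and d.f. (3, 12).
f_crit = 3.4903

print(f"SS Between = {ss_between:.2f}, SS Within = {ss_within:.2f}, "
      f"SS Total = {ss_total:.2f}")
print(f"MS Between = {ms_between:.3f}, MS Within = {ms_within:.3f}")
print(f"F = {f_stat:.3f}, reject H0: {f_stat > f_crit}")
```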

The pairwise differences that are significant are between Group 1 and Group 4, Group 2 and Group 4, and Group 3 and Group 4. In fact, the mean for Group 4 is significantly lower than the means for Groups 1, 2 and 3.

Problem 2: Movie Success. Using the data in Table 7., make a scatter diagram for the relationship between production budget and viewer rating of movies. Estimate the correlation coefficient. Based on these data, do you think a large production budget is likely to result in a movie with a high viewer rating? Explain.

Solution: The scatter plot is shown below.

[Figure: Scatterplot of Rating vs Budget; vertical axis Rating, horizontal axis Budget]

It seems like there is a mild negative linear relationship between Budget and Rating. The actual correlation coefficient is then computed from the data.

As predicted by the visual trend, the correlation is negative, but since it is very small, the relationship is fairly weak. This means that it is not certain that a larger budget will produce a higher rating, just as it is not certain that a larger budget will produce a lower rating, but there is an inclination toward lower ratings with higher budgets.

Problem 3: Which of these models is a better representation of the relationship between students' age and starting salary? Explain your decision.

Solution: As mentioned in the previous part, the model obtained once the outlier was eliminated is relatively similar to the model with n=5 cases, as the regression coefficients don't change dramatically. Still, this relatively small difference in coefficients makes a relatively large difference in R². In fact, for the model with n = 5 we get R² = 0.33, and for the model with n = 5 we get R² = 0.7. This makes the second model (with n = ) the preferred one. The preferred model is

Predicted Starting Salary = -67,91.785 + 3,635.6857 * Age

Problem 4:

Compute an imprisonment rate per 1000 population for 2000. Introduce this incarceration rate as an independent variable into the model run in Part B. Test the hypothesis that the R squared = 0. Does this model fit the data better than the model in Part B above? Explain. Does each of the independent variables have a statistically significant effect on homicide? Explain. How strong is the effect of each of the independent variables? Explain. Which of the independent variables has the stronger effect on the homicide rate? Explain.

Solution: The new variable is computed as ImprPer1000 = Prson0/pop0 (recall that pop0 is already given in 1000s). The following is obtained with Excel:

Regression Analysis

R²             0.499      n            49
Adjusted R²    0.466      k            3
R              0.707      Dep. Var.    homrt0
Std. Error     1.639

ANOVA table
Source        SS          df    MS         F        p-value
Regression    120.4481    3     40.1494    14.95    6.87E-07
Residual      120.8453    45    2.6855
Total         241.2935    48

Regression output                                                  confidence interval
variables      coefficients   std. error   t (df=45)   p-value   95% lower   95% upper   std. coeff.
Intercept      -1.9451        1.1721       -1.660      .1040     -4.3059     0.4156      0.000
ImprPer1000    0.2430         0.1707       1.423       .1616     -0.1009     0.5869      0.175
sglmom80       24.3703        5.6975       4.277       .0001     12.8949     35.8458     0.539
unempl0        36.9631        27.2997      1.354       .1825     -18.0215    91.9476     0.150

The model is

Homicide Rate in 2000 = -1.9451 + 0.2430 * ImprPer1000 + 24.3703 * sglmom80 + 36.9631 * unempl0

Notice that the model is significant overall, since F(3, 45) = 14.95, p = 0.000000687 < 0.05, so R² is significantly greater than zero.
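The summary statistics in this output are tied together by simple identities, which makes them easy to sanity-check. The following sketch assumes R² = 0.499, n = 49 observations and k = 3 predictors, together with the coefficient estimates and standard errors as read from the regression output; because R² is rounded to three decimals, the recomputed F matches the reported 14.95 only approximately. The dictionary layout and names are illustrative.

```python
# Values as read from the regression output (rounded).
r_squared = 0.499
n = 49   # number of observations
k = 3    # number of predictors

df_resid = n - k - 1

# Adjusted R-squared penalizes R-squared for the number of predictors.
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / df_resid

# Overall F statistic expressed through R-squared.
f_stat = (r_squared / k) / ((1 - r_squared) / df_resid)

# (estimate, standard error) pairs for each term in the model.
coefficients = {
    "Intercept": (-1.9451, 1.1721),
    "ImprPer1000": (0.2430, 0.1707),
    "sglmom80": (24.3703, 5.6975),
    "unempl0": (36.9631, 27.2997),
}

# Each t statistic is the estimate divided by its standard error.
t_values = {name: est / se for name, (est, se) in coefficients.items()}

print(f"residual d.f. = {df_resid}")
print(f"adjusted R^2 = {adj_r_squared:.3f}")
print(f"F = {f_stat:.2f}")
for name, t in t_values.items():
    print(f"{name}: t = {t:.3f}")
```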

This model fits only slightly better than the previous one, since now Adj. R² = 0.466, which means that the amount of variation in the response variable explained by this model is 46.6%. Notice that in this model the variable sglmom80 is individually significant, with t = 4.277 and p = 0.0001 < 0.05, but the variable unempl0 is not individually significant, with t = 1.354 and p = 0.1825 > 0.05. The variable ImprPer1000 is not significant either, since t = 1.423 and p = 0.1616 > 0.05. The effect of ImprPer1000 and unempl0 is quite moderate, since the standardized coefficients associated with them are less than 0.2 (that is, an increase of one standard deviation in either of the variables brings a change of less than 0.2 standard deviations in the response variable). The variable with the strongest effect is sglmom80, with a standardized coefficient of 0.539.

Problem 5: Using the data below, answer the following questions using a table format.

x    4    6    3    7
y    5    2    -1   4

a. $\sum_{i=1}^{4} x_i$
b. $\sum_{i=1}^{4} y_i$
c. $\sum_{i=1}^{4} x_i y_i$
d. Show that $\sum_{i=1}^{4} x_i \cdot \sum_{i=1}^{4} y_i \neq \sum_{i=1}^{4} x_i y_i$
e. $\sum_{i=1}^{4} x_i^2$
f. $\sum_{i=1}^{4} y_i^2$
g. $\left(\sum_{i=1}^{4} x_i y_i\right)^2$
h. Show that $\left(\sum_{i=1}^{4} x_i\right)^2 \neq \sum_{i=1}^{4} x_i^2$
i. Show that $\sum_{i=1}^{4} (x_i - \bar{x}) = \sum_{i=1}^{4} (y_i - \bar{y}) = 0$

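Solution: We have:

         X     Y     X^2    Y^2    X*Y
         4     5     16     25     20
         6     2     36     4      12
         3     -1    9      1      -3
         7     4     49     16     28
Sum =    20    10    110    46     57

(a) $\sum_{i=1}^{4} x_i = 20$
(b) $\sum_{i=1}^{4} y_i = 10$
(c) $\sum_{i=1}^{4} x_i y_i = 57$
(d) Notice that $\sum_{i=1}^{4} x_i y_i = 57$ and $\sum_{i=1}^{4} x_i \cdot \sum_{i=1}^{4} y_i = 20 \cdot 10 = 200$, which means that $\sum_{i=1}^{4} x_i \cdot \sum_{i=1}^{4} y_i \neq \sum_{i=1}^{4} x_i y_i$ in this case.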

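(e) $\sum_{i=1}^{4} x_i^2 = 110$
(f) $\sum_{i=1}^{4} y_i^2 = 46$
(g) $\left(\sum_{i=1}^{4} x_i y_i\right)^2 = 57^2 = 3249$
(h) $\left(\sum_{i=1}^{4} x_i\right)^2 = 20^2 = 400$ and $\sum_{i=1}^{4} x_i^2 = 110$, so then $\left(\sum_{i=1}^{4} x_i\right)^2 \neq \sum_{i=1}^{4} x_i^2$.
(i) We get that $\bar{X} = 5$ and $\bar{Y} = 2.5$. Observe that

         X     Y     X-Xbar    Y-Ybar
         4     5     -1        2.5
         6     2     1         -0.5
         3     -1    -2        -3.5
         7     4     2         1.5
Sum =                0         0

so then

$$\sum_{i=1}^{4} (x_i - \bar{x}) = \sum_{i=1}^{4} (y_i - \bar{y}) = 0$$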
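The whole of Problem 5 is arithmetic on two short lists, so it can be verified directly. This is a small sketch using only built-in Python, assuming the data x = (4, 6, 3, 7) and y = (5, 2, -1, 4); the variable names are chosen for illustration.

```python
# Data for Problem 5.
x = [4, 6, 3, 7]
y = [5, 2, -1, 4]

sum_x = sum(x)                                   # part (a)
sum_y = sum(y)                                   # part (b)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))    # part (c)
sum_x2 = sum(xi ** 2 for xi in x)                # part (e)
sum_y2 = sum(yi ** 2 for yi in y)                # part (f)
sq_sum_xy = sum_xy ** 2                          # part (g)

# Parts (d) and (h): a product (or square) of sums is not the sum of products.
product_of_sums = sum_x * sum_y
square_of_sum_x = sum_x ** 2

# Part (i): deviations from the mean always sum to zero.
x_bar = sum_x / len(x)
y_bar = sum_y / len(y)
dev_x = sum(xi - x_bar for xi in x)
dev_y = sum(yi - y_bar for yi in y)

print(sum_x, sum_y, sum_xy, sum_x2, sum_y2, sq_sum_xy)
print(product_of_sums != sum_xy, square_of_sum_x != sum_x2)
print(x_bar, y_bar, dev_x, dev_y)
```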