PRACTICE PROBLEMS

This document contains 3 sets of practice problems.

Correlation: 3 problems
Regression: 4 problems
ANOVA: 8 problems

You should print a copy of these practice problems and bring them with you to class. Solutions will be reviewed in class, and you will have trouble keeping up if you do not have a copy of them with you.

Note: Additional problems, and real-time solutions, are provided online in the form of screencasts. The additional problems are also provided in PDF form via a link on that site. You are strongly encouraged to try working those problems before watching the screencasts. The additional problems will NOT be covered during class time.

Primary URL: http://awarnach.mathstat.dal.ca/~joeb/stats1060_webcasts/welcome.html
Alternate URL: http://web.me.com/cadair_idris/stats1060/welcome.html

CORRELATION:

Problem 1. In each of the following settings, decide which variable you would consider explanatory and which one you would consider response. Predict the type of association.
a. The amount of time spent studying for the 1060 midterm and the exam grade.
b. Your grade on the 1060 exam and your height at age 6.
c. The amount of yearly rainfall and the biomass of an ecosystem.
d. The weight and height of a 1060 student.

Problem 2. The figure shows several scatterplots. Match the following correlation coefficients to the scatterplots: 0.71, -0.55, 0.002, -0.908. Indicate for which of these datasets, if any, you might question the use of correlation.

[Figure: four scatterplots labeled A, B, C, and D]

Problem 3. Identify the errors in the following statements.
a. There was a high correlation between the occupational class of the father and the occupational class of the son.
b. We detected a high correlation (1.10) between the length and cost of a long distance phone call.
c. The strong positive correlation between the percentage of the population that own a cell phone and life expectancy means that countries that want to improve the health of their citizens should invest in more cell phone networks.
d. We found the correlation between rainfall and crop yield to be r = 0.26 bushels per inch of rainfall.

REGRESSION:

Problem 1. Variable 1 (var1) is the annual wine consumption per person, measured in liters of alcohol consumed. Variable 2 (var2) is the annual death rate, given as number per 10,000. Data for these two variables were collected from 19 countries and analyzed using Minitab. Given the following output from Minitab, compute the slope and intercept of the least squares regression line:

    MTB> corr var1, var2
    Correlation of var1 and var2 = -0.843

    MTB> desc var1 var2
            N    MEAN   MEDIAN   TRMEAN   STDEV
    var1   19   3.026    2.400    2.806   2.510
    var2   19   191.1    199.0    191.7    68.4

Problem 2. The variables (var1 and var2) are the same as in Problem 1 above. In this case, Minitab was used to conduct a linear regression analysis. Use the Minitab output below to answer the questions that follow.

    MTB> regr var2 var1
    The regression equation is
    var2 = 261 - 23.0 var1

    Predictor      coef    stdev   t-ratio     p
    Constant     260.56    13.38     18.83   0.0
    var1        -22.969    3.557     -6.46   0.0

    s = 37.88    R-sq = 71.0%

    Analysis of Variance
    SOURCE       DF      SS      MS       F      P
    Regression    1   59814   59814   41.69   0.00
    Error        17   24931    1435
    Total        18   84205

a) What is the equation for the least squares regression line?
b) State the prediction of the model in words.
c) Write out the ratio of the variance in the model predictions to the total variance in the data.
d) What is the standard deviation of the residuals?
e) If Canada and France differ by 6.7 liters of wine consumption, what would be the predicted difference in risk of heart disease?
f) If we changed the units of wine consumption from liters to ounces, would the r² value change? Would the slope change?
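For Problem 1, the slope and intercept can be recovered from the summary statistics alone, using the standard relations slope = r * s_y / s_x and intercept = ȳ - slope * x̄. The following is a minimal Python sketch of that arithmetic, not part of the original handout; the function name `least_squares_line` is a hypothetical helper introduced here for illustration.

```python
# Summary statistics copied from the Minitab output in Problem 1.
r = -0.843                     # correlation of var1 and var2
mean_x, sd_x = 3.026, 2.510    # var1: wine consumption (explanatory)
mean_y, sd_y = 191.1, 68.4     # var2: death rate (response)


def least_squares_line(r, mean_x, sd_x, mean_y, sd_y):
    """Return (slope, intercept) of the least squares line of y on x.

    Hypothetical helper: uses slope = r * s_y / s_x and the fact that
    the line passes through the point of means (x-bar, y-bar).
    """
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares_line(r, mean_x, sd_x, mean_y, sd_y)
print(f"slope = {slope:.2f}, intercept = {intercept:.1f}")
```

The result should agree, up to rounding of the inputs, with the coefficients Minitab reports in Problem 2.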

Problem 3. Data were collected from 78 seventh grade students to investigate the relationship between grade point average (GPA) and intelligence quotient score (IQ). GPA was considered the response variable, and IQ was considered the predictor variable. The following statistics were obtained from these data:

    x̄ = 108.9231    ȳ = 7.4464    r = 0.6337
    s_x = 13.1710    s_y = 2.0996

a) Is the association positive or negative?
b) What is the equation for the best-fit straight line?
c) What proportion of the variation in GPA can be accounted for by IQ?
d) What GPA would you predict for a student with an IQ of 100?
e) Given the total variation in GPA is 84342 (SST), compute the total sum of the squared errors.
f) If we wanted to predict IQ from GPA, what would be the best-fit regression line?

Problem 4. We return to the dataset used in Problem 2 above. Again, use the Minitab output below to answer the questions that follow.

    MTB> regr var2 var1
    The regression equation is
    var2 = 261 - 23.0 var1

    Predictor      coef    stdev   t-ratio     p
    Constant     260.56    13.38     18.83   0.0
    var1        -22.969    3.557     -6.46   0.0

    s = 37.88    R-sq = 71.0%

    Analysis of Variance
    SOURCE       DF      SS      MS       F      P
    Regression    1   59814   59814   41.69   0.00
    Error        17   24931    1435
    Total        18   84205

a) State the null and alternative hypotheses in terms of the slope (numerically and in words).
b) Assume that the assumptions for inference have been met and perform the t-test for a relationship between the two variables (assume α = 0.05). State your conclusion.
c) Construct a 95% confidence interval for the slope.
d) Explain the meaning of the confidence interval.
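The slope inference in Problem 4 uses only the coefficient, its standard error, and the error degrees of freedom from the Minitab output. The sketch below shows one way to lay out that arithmetic in Python. It is an illustration, not the handout's solution: the critical value 2.110 is an assumption taken from a standard t table (two-sided, α = 0.05, 17 degrees of freedom), and the variable names are hypothetical.

```python
# Values read from the Minitab regression output for var1.
coef = -22.969      # estimated slope
std_err = 3.557     # standard error of the slope ("stdev" column)
df_error = 17       # error degrees of freedom (n - 2 with n = 19)

# Assumption: two-sided critical value for alpha = 0.05 and 17 df,
# looked up in a standard t table rather than computed here.
t_crit = 2.110

# Test statistic for H0: slope = 0 versus Ha: slope != 0.
t_stat = coef / std_err

# 95% confidence interval: estimate +/- t_crit * standard error.
margin = t_crit * std_err
lower, upper = coef - margin, coef + margin

reject_null = abs(t_stat) > t_crit
print(f"t = {t_stat:.2f}, reject H0: {reject_null}")
print(f"95% CI for slope: ({lower:.2f}, {upper:.2f})")
```

The computed t statistic should match the t-ratio Minitab prints for var1, which is a useful check that the right row of the output was used.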

ANOVA

Problem 1: Will the grand mean of the sample always be equal to the true mean? If no, then explain why they could differ.

Problem 2: A psychologist is interested in investigating if the three different treatments available to her have different effects on client improvement. She randomly chose seven clients from those undergoing each of the three different treatments, and asked them to rate their level of satisfaction on a scale of 1 to 100.
(a) Assuming that she will carry out an ANOVA on these data, state the null and alternative hypotheses.
(b) What is the minimum condition that will satisfy the alternative hypothesis in this case?

Problem 3: For each of the two boxplots below (a) state the null and alternative hypotheses, and (b) comment on the possibility that the null hypothesis is false.

[Figure: two sets of boxplots, one comparing Populations 1, 2, and 3, and one comparing Treatments 1, 2, 3, and 4]

Problem 4: Calculate a-e below by using only the summary statistics for Treatments A, B and C.
a. df_1 and df_2
b. ȳ
c. SSTR
d. SSE
e. SST

    Treatment A    Treatment B    Treatment C
    ȳ_A = 10       ȳ_B = 12       ȳ_C = 8
    s_A = 1        s_B = 1        s_C = 1
    n_A = 5        n_B = 5        n_C = 5

Problem 5: A researcher has randomly sampled grades for six students for each of three available ways to take introductory biology: (i) completely online, (ii) in a traditional classroom setting, and (iii) in a hybrid system having both traditional classroom lectures and online activities. Use the data in the table below to (a) construct an ANOVA table, and (b) comment on the possibility that student grades differ among the three groups.

    Online            Traditional       Hybrid
    ȳ_A = 71.6667     ȳ_B = 74.1667     ȳ_C = 80
    s_A = 15.0555     s_B = 13.1927     s_C = 12.6491
    n_A = 6           n_B = 6           n_C = 6

Problem 6: For a-d below, find the critical value of F given values for α, df_1 and df_2.
a. α = 0.01, df_1 = 1, and df_2 = 7
b. α = 0.05, df_1 = 1, and df_2 = 7
c. α = 0.05, df_1 = 2, and df_2 = 15
d. α = 0.05, df_1 = 9, and df_2 = 16

Problem 7: Complete the elements of the partial ANOVA table for an analysis of three experimental treatments, each having six observations.

    Source of variation    DF    Sum of squares    Mean square    F statistic
    Treatment
    Error                        70
    Total                        140
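Problems 4 and 5 ask for a one-way ANOVA built from group means, standard deviations, and sample sizes rather than from raw data. A minimal Python sketch of that calculation follows, using the standard decomposition SST = SSTR + SSE. The function `anova_from_summary` and its dictionary keys are hypothetical names introduced here, and the example call uses the Problem 4 summary statistics.

```python
def anova_from_summary(means, sds, ns):
    """One-way ANOVA quantities from per-group summary statistics.

    Hypothetical helper. means, sds, ns are equal-length sequences of
    group means, group standard deviations, and group sample sizes.
    """
    k = len(means)
    n_total = sum(ns)
    # Grand mean is the sample-size-weighted average of the group means.
    grand_mean = sum(n * m for n, m in zip(ns, means)) / n_total
    # Between-group (treatment) sum of squares.
    sstr = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    # Within-group (error) sum of squares: (n_i - 1) * s_i^2 per group.
    sse = sum((n - 1) * s ** 2 for n, s in zip(ns, sds))
    df1, df2 = k - 1, n_total - k
    mstr, mse = sstr / df1, sse / df2
    return {
        "df1": df1, "df2": df2, "grand_mean": grand_mean,
        "SSTR": sstr, "SSE": sse, "SST": sstr + sse,
        "MSTR": mstr, "MSE": mse, "F": mstr / mse,
    }


# Summary statistics for Treatments A, B and C from Problem 4.
result = anova_from_summary(means=[10, 12, 8], sds=[1, 1, 1], ns=[5, 5, 5])
for name, value in result.items():
    print(f"{name}: {value}")
```

The same function can be called with the Problem 5 summary statistics to fill in that ANOVA table; the F statistic it returns still has to be compared with a critical value from an F table.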

Problem 8: Using the ANOVA table that you completed in Problem 7, complete steps a-d below to test for inequality in the group means (assume α = 0.01).
a. State the hypotheses (clearly define µ_i).
b. Find F_crit and the rejection rule.
c. Compare F_data with F_crit.
d. State your conclusion and interpretation.
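The table completion in Problem 7 and the comparison in Problem 8 come down to a few lines of arithmetic once the degrees of freedom are fixed by the design (three treatments, six observations each). The Python sketch below is one hedged way to organize it. The critical value 6.36 is an assumption read from a standard F table for α = 0.01 with 2 and 15 degrees of freedom, and all variable names are hypothetical.

```python
# Design and sums of squares given in the partial ANOVA table.
k, n_per_group = 3, 6
sse, sst = 70.0, 140.0

# Degrees of freedom follow from the design.
df_treatment = k - 1                  # numerator df
df_error = k * n_per_group - k        # denominator df
df_total = k * n_per_group - 1

# Treatment sum of squares is what remains of the total.
sstr = sst - sse

# Mean squares and the F statistic.
mstr = sstr / df_treatment
mse = sse / df_error
f_data = mstr / mse

# Assumption: critical value from a standard F table for
# alpha = 0.01, df1 = 2, df2 = 15; not computed here.
f_crit = 6.36
reject_null = f_data > f_crit

print(f"Treatment: DF={df_treatment}, SS={sstr}, MS={mstr}, F={f_data:.2f}")
print(f"Error:     DF={df_error}, SS={sse}, MS={mse:.3f}")
print(f"Total:     DF={df_total}, SS={sst}")
print(f"F_data > F_crit ({f_crit}): {reject_null}")
```

The rejection rule is to reject the null hypothesis of equal group means when F_data exceeds F_crit; the interpretation in context is still left to the reader.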