Cathy Walker March 5, 2010


Part 2: Problem Set

1. What is the level of measurement for the following variables?
a) SAT scores
b) Number of tests or quizzes in statistical course
c) Acres of land devoted to corn
d) Number of break-ins in 2004 by neighborhood
e) Social Security Number
f) Impression of a certain place selected by recipients from a scale of 1 to 5
g) Name of your birthplace
h) Year of birth

2. Consider the following set of numbers that are the result of a random process or phenomenon: 0, 0, 1, 2, 2, 3, 6, 8, 8, 9, 9, 9, 9, 10, 12, 14, 18, 23, 28, 34, 47
a) Calculate and/or identify the mean, median, mode, range, inter-quartile range, maximum, outliers, and minimum of these numbers.
b) Draw a histogram, box-plot, and stem-and-leaf plot for these numbers.

3. The heights of men in the United States are distributed N(5'10", 3"). Your two brothers are both 6'1". What percentage of men are taller than your two brothers?

4. How is the Standard Deviation different than the Mean Absolute Deviation? Define both of these terms and calculate each of them for the set of numbers in problem #2.

5. Define the idea of a sample space in terms of rolling two six-sided dice. Define the elementary outcomes for rolling two six-sided dice. Draw a probability mass function for the random experiment of rolling two dice and summing their face-up numbers. Determine the probability of at least one of the dice showing a 5 on its face in a single roll of two dice.

6. Define and provide examples of the following terms: Population, Sample, Parameter, Random Selection, and Sampling Frame.

7. You work for a recycle collection company collecting curbside recyclable material from residential neighborhoods in Tacoma, WA. Your trucks can carry 7,000 pounds of recyclable material before they must be driven back to the transfer station to be unloaded. You randomly sample 40 of your residential recycle customers and get the following values for the weight of their recyclables.
15 9 30 1 1 19 1 40 3 10 9 14 19 70 63 55 40 1 63 5 47 50 1 15 13 11 7 11 9 1 9 4 6 54 4 6 64 50 1
a) Estimate the mean and the 95% Confidence Interval for the mean amount of recyclables (in lbs) your average customer produces every two weeks. Interpret your results.
b) How many residential recycle customers should you schedule for each truck if you wanted to be 99% confident that any given truck would not exceed its weight limit on its assigned route?

8. Simple Linear Regression (Ordinary Least Squares)
a) Given the data below, plot the points to create a scatterplot, complete the tables with the values below, and calculate the values indicated.

X    Y
1    8
3    2
6    7
2    6
1    3
4    6
3    7
5    9

Fill in the following table, with one row per data point:
x - x̄    y - ȳ    (x - x̄)(y - ȳ)    (x - x̄)²    (y - ȳ)²

Calculate the regression parameters:
Slope b:
Intercept a:

Using your regression parameters, fill in this table and calculate the values for:
Ŷ    (y - ŷ)²    (ŷ - ȳ)²
SSE:
SSR:
R²:

9. Descriptive Spatial Statistics
Consider the following point locations in the scatterplot below. (Note: The coordinate locations of these points are provided in the table below along with each point's Z value.)
[Scatterplot of the point locations; both axes run from 0 to 12.]
a) What is the mean center of these points?
b) What is the weighted mean center of these points? (Using the Z values provided above as the weight.)
X Y Z 3 13 4 10 5 3 1 5 4 6 4 1 1 4 6 4 1 5 19 7 10 1 9 4 15 9 1 5 17 4 3 1 5 9 14 3 4 3 1 15 10 9 4 4 1 7

10. Given the simulated raster image below with only 5 known values, use the Inverse Distance Weighted (IDW) methodology to fill in all the blanks. (This will probably be easiest using Excel.)
[Simulated raster image: a grid that is blank except for the known values.]
a) How good do you think your IDW estimates are compared to the real unknown values? How do you think the accuracy of your estimates varies as a function of the distance from the known values provided?

Problem Set Answer Key

1. What is the level of measurement for the following variables?
a) SAT scores: INTERVAL
b) Number of tests or quizzes in statistical course: RATIO
c) Acres of land devoted to corn: RATIO
d) Number of break-ins in 2004 by neighborhood: RATIO (a count has a true zero)
e) Social Security Number: NOMINAL (the digits are only an identifier)
f) Impression of a certain place selected by recipients from a scale of 1 to 5: ORDINAL
g) Name of your birthplace: NOMINAL
h) Year of birth: INTERVAL

2. Consider the following set of numbers that are the result of a random process or phenomenon: 0, 0, 1, 2, 2, 3, 6, 8, 8, 9, 9, 9, 9, 10, 12, 14, 18, 23, 28, 34, 47
a) Calculate and/or identify the mean, median, mode, range, inter-quartile range, maximum, outliers, and minimum of these numbers.
Mean: $\bar{x} = \frac{1}{n}\sum x_i = \frac{1}{21}(0 + 0 + 1 + 2 + 2 + 3 + 6 + 8 + 8 + 9 + 9 + 9 + 9 + 10 + 12 + 14 + 18 + 23 + 28 + 34 + 47) = \frac{252}{21} = 12$
Median: 9
Mode: 9
Range: 0 to 47 (47 - 0 = 47)
Outliers: 47
Minimum: 0; Maximum: 47
Inter-quartile Range: Q1 = (3 + 2)/2 = 2.5; Q3 = (18 + 14)/2 = 16; IQR = Q3 - Q1 = 16 - 2.5 = 13.5
b) Draw a histogram, box-plot, and stem-and-leaf plot for these numbers.
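The summary statistics for problem 2 can be checked with Python's standard `statistics` module. This is a minimal sketch, not part of the original key: it assumes the data set reads 0, 0, 1, 2, 2, 3, 6, 8, 8, 9, 9, 9, 9, 10, 12, 14, 18, 23, 28, 34, 47, and it assumes the common 1.5 x IQR fence rule for flagging outliers, which the key does not state.

```python
import statistics

# Assumed data for problem 2 (21 values).
data = [0, 0, 1, 2, 2, 3, 6, 8, 8, 9, 9, 9, 9, 10, 12, 14, 18, 23, 28, 34, 47]

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
value_range = max(data) - min(data)

# The "exclusive" method interpolates at position (n + 1) * p, which for
# n = 21 averages the 5th/6th and the 16th/17th sorted values.
q1, _, q3 = statistics.quantiles(data, n=4, method="exclusive")
iqr = q3 - q1

# Assumption: a value is an outlier if it lies more than 1.5 * IQR beyond a quartile.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(mean, median, mode, value_range, q1, q3, iqr, outliers)
```

Other quartile conventions (for example `method="inclusive"`) can give slightly different Q1 and Q3 values for the same data.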

3. The heights of men in the United States are distributed N(5'10", 3"). Your two brothers are both 6'1". What percentage of men are taller than your two brothers?
$Z = \frac{X - \bar{X}}{s} = \frac{73 - 70}{3} = \frac{3}{3} = 1$
Using Appendix Table A in the book, a Z-score of 1 corresponds to a value of 0.3413. Since we want to know the percentage of men that are above the height of the two brothers, and each side of the curve totals 0.5000, we find that 0.5000 - 0.3413 = 0.1587, so approximately 15.87%, or about 16%, of men are taller than the two brothers.

4. How is the Standard Deviation different than the Mean Absolute Deviation? Define both of these terms and calculate each of them for the set of numbers in problem #2.
The standard deviation measures the spread of the numbers in a sample about the mean. It is the square root of the sum of squared deviations divided by n - 1, so large deviations count more heavily. The mean absolute deviation is the mean of the absolute deviations of a set of data about the data's mean.
0, 0, 1, 2, 2, 3, 6, 8, 8, 9, 9, 9, 9, 10, 12, 14, 18, 23, 28, 34, 47
Standard Deviation (s):
$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{(0 - 12)^2 + (0 - 12)^2 + (1 - 12)^2 + \dots + (47 - 12)^2}{21 - 1}} = \sqrt{\frac{2924}{20}} = \sqrt{146.2} \approx 12.09$
Mean Absolute Deviation (MD):
$MD = \frac{1}{N}\sum |x - \bar{x}| = \frac{1}{21}\left(|0 - 12| + |0 - 12| + |1 - 12| + \dots + |47 - 12|\right) = \frac{184}{21} \approx 8.76$
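As a cross-check on problems 3 and 4, the sketch below recomputes the upper-tail probability and both spread measures with the Python standard library. It is illustrative only: it assumes heights are expressed in inches (5'10" = 70, 6'1" = 73) and that the problem 2 data set reads 0, 0, 1, 2, 2, 3, 6, 8, 8, 9, 9, 9, 9, 10, 12, 14, 18, 23, 28, 34, 47. Every value, including the two zeros, contributes to both sums.

```python
import math
from statistics import NormalDist

# Problem 3: share of men taller than 73 inches under N(70, 3).
heights = NormalDist(mu=70, sigma=3)
p_taller = 1 - heights.cdf(73)

# Problem 4: sample standard deviation and mean absolute deviation.
# Assumed data for problem 2 (21 values).
data = [0, 0, 1, 2, 2, 3, 6, 8, 8, 9, 9, 9, 9, 10, 12, 14, 18, 23, 28, 34, 47]
n = len(data)
mean = sum(data) / n

sum_sq = sum((x - mean) ** 2 for x in data)    # sum of squared deviations
s = math.sqrt(sum_sq / (n - 1))                # sample standard deviation
sum_abs = sum(abs(x - mean) for x in data)     # sum of absolute deviations
md = sum_abs / n                               # mean absolute deviation

print(round(p_taller, 4), sum_sq, round(s, 2), sum_abs, round(md, 2))
```

The exact tail probability differs from the table lookup only by rounding in the printed table.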

5. Define the idea of a sample space in terms of rolling two six-sided dice. Define the elementary outcomes for rolling two six-sided dice. Draw a probability mass function for the random experiment of rolling two dice and summing their face-up numbers. Determine the probability of at least one of the dice showing a 5 on its face in a single roll of two dice.
The sample space is the set or collection of elementary outcomes. For the rolling of two dice the elementary outcomes are the 36 ordered pairs (i, j), where i and j are the faces shown by the first and second die and each runs from 1 to 6.
Probability Mass Function: $f(y) = P(\{e : Y(e) = y\})$

y     f(y)
2     1/36
3     2/36
4     3/36
5     4/36
6     5/36
7     6/36
8     5/36
9     4/36
10    3/36
11    2/36
12    1/36

Looking at the possible two-dice combinations, the probability of rolling at least one 5 in a single roll of two dice is 11/36, or approximately 30.6%: six outcomes have a 5 on the first die, six have a 5 on the second, and (5, 5) must be counted only once.
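The sample space for two dice is small enough to enumerate, so both the probability mass function of the sum and the "at least one 5" probability can be counted directly. The following is a small illustrative sketch using exact fractions; the variable names are not from the original.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# The 36 elementary outcomes: ordered pairs (first die, second die).
outcomes = list(product(range(1, 7), repeat=2))

# Probability mass function of the sum of the two faces.
sum_counts = Counter(a + b for a, b in outcomes)
pmf = {total: Fraction(count, len(outcomes))
       for total, count in sorted(sum_counts.items())}

# Outcomes in which at least one die shows a 5.
with_five = [pair for pair in outcomes if 5 in pair]
p_at_least_one_five = Fraction(len(with_five), len(outcomes))

for total, prob in pmf.items():
    print(total, prob)
print(len(with_five), p_at_least_one_five, float(p_at_least_one_five))
```

`Fraction` reduces automatically, so 6/36 prints as 1/6; the unreduced counts are available in `sum_counts`.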

6. Define and provide examples of the following terms: Population, Sample, Parameter, Random Selection, and Sampling Frame.
Population - the universe of all individuals from which your sample can be taken. Example: the population of the U.S. or the population of the world.
Sample - a subset or portion of the individuals selected from the population and used for detailed analysis. Example: 1,000 randomly selected college-age students (18-25 years old), used to determine the drinking habits of this age group within the U.S. population.
Parameter - a numerical characteristic of the population that the sample is used to estimate. Example: the average drinking habits of the whole age group.
Random Selection - a procedure for selecting a sample of n objects in which every object is equally likely to be chosen. Example: the selection of college students for a survey using randomly selected student ID numbers.
Sampling Frame - the practical or operational structure that contains the entire set of elements from which the sample will actually be drawn. Example: the entire list of DU student ID numbers would be the sampling frame for the above example.

7. You work for a recycle collection company collecting curbside recyclable material from residential neighborhoods in Tacoma, WA. Your trucks can carry 7,000 pounds of recyclable material before they must be driven back to the transfer station to be unloaded. You randomly sample 40 of your residential recycle customers and get the following values for the weight of their recyclables.
15 9 30 1 1 19 1 40 3 10 9 14 19 70 63 55 40 1 63 5 47 50 1 15 13 11 7 11 9 1 9 4 6 54 4 6 64 50 1
a) Estimate the mean and the 95% Confidence Interval for the mean amount of recyclables (in lbs) your average customer produces every two weeks. Interpret your results.
Mean = 33.675
Standard Dev. = 20.6405
$\bar{X} \pm Z \frac{s}{\sqrt{N}}$
95% Confidence:
Upper: $33.675 + 1.96 \cdot \frac{20.6405}{\sqrt{40}} = 33.675 + 6.3964 = 40.07$ lbs
Lower: $33.675 - 1.96 \cdot \frac{20.6405}{\sqrt{40}} = 33.675 - 6.3964 = 27.28$ lbs
For 40.07 lbs of recyclables the recycle truck can service approximately 174.7, or 174, households on a pickup run without needing to go back to unload. For 27.28 lbs of recyclables the recycle truck can service approximately 256.6, or 256, households on a pickup run without needing to go back to unload. Given the average pounds of recyclables, you can say with 95% confidence that the recycle trucks can service anywhere from 174 to 256 customers on a single recyclables pick-up route.
b) How many residential recycle customers should you schedule for each truck if you wanted to be 99% confident that any given truck would not exceed its weight limit on its assigned route?
$\bar{X} \pm Z \frac{s}{\sqrt{N}}$
99% Confidence:
Upper: $33.675 + 2.58 \cdot \frac{20.6405}{\sqrt{40}} = 33.675 + 8.4196 = 42.0949$ lbs
Lower: $33.675 - 2.58 \cdot \frac{20.6405}{\sqrt{40}} = 33.675 - 8.4196 = 25.2554$ lbs
Given the average pounds of recyclables, you can say with 99% confidence that the recycle trucks can service anywhere from 166 to 277 customers on a single recyclables pick-up route.
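The interval arithmetic in problem 7 can be reproduced from the summary statistics alone. The sketch below is illustrative: it assumes a sample mean of 33.675 lbs, a sample standard deviation of 20.6405 lbs, n = 40, a 7,000 lb truck capacity, and the critical values Z = 1.96 and Z = 2.58. The helper names `mean_interval` and `households_per_truck` are made up for this example.

```python
import math

# Assumed summary statistics for the 40 sampled customers.
sample_mean = 33.675   # lbs per customer
sample_sd = 20.6405    # lbs
n = 40
capacity = 7000        # lbs per truck

def mean_interval(z):
    """Return (lower, upper) bounds of the interval mean +/- z * s / sqrt(n)."""
    half_width = z * sample_sd / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

def households_per_truck(lbs_per_household):
    """Whole households a truck can serve at the given average load."""
    return int(capacity // lbs_per_household)

lower95, upper95 = mean_interval(1.96)
lower99, upper99 = mean_interval(2.58)

print(round(lower95, 2), round(upper95, 2),
      households_per_truck(upper95), households_per_truck(lower95))
print(round(lower99, 2), round(upper99, 2),
      households_per_truck(upper99), households_per_truck(lower99))
```

Dividing capacity by the interval bounds follows the reasoning used in the answer above; it treats every household as producing the same average load.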

8. Simple Linear Regression (Ordinary Least Squares)
a) Given the data below, plot the points to create a scatterplot, complete the tables with the values below, and calculate the values indicated.

X    Y
1    8
3    2
6    7
2    6
1    3
4    6
3    7
5    9

$\bar{x} = \frac{25}{8} = 3.125$, $\bar{y} = \frac{48}{8} = 6$

Fill in the following table:

x - x̄     y - ȳ    (x - x̄)(y - ȳ)    (x - x̄)²    (y - ȳ)²
-2.125     2        -4.25              4.51563      4
-0.125    -4         0.5               0.01563      16
 2.875     1         2.875             8.26563      1
-1.125     0         0                 1.26563      0
-2.125    -3         6.375             4.51563      9
 0.875     0         0                 0.76563      0
-0.125     1        -0.125             0.01563      1
 1.875     3         5.625             3.51563      9
Sum                  11                22.875       40 (SST)

Calculate the regression parameters:
Slope b: $b = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{11}{22.875} = 0.4809$
Intercept a: $a = \bar{y} - b\bar{x} = 6 - (0.4809)(3.125) = 4.4973$

Using your regression parameters, fill in this table and calculate the values for:

Ŷ         (y - ŷ)²    (ŷ - ȳ)²
4.9781     9.1316      1.0442
5.9399    15.5227      0.0036
7.3825     0.1463      1.9113
5.4590     0.2927      0.2927
4.9781     3.9130      1.0442
6.4208     0.1770      0.1770
5.9399     1.1238      0.0036
6.9016     4.4031      0.8130
Sum       34.7104 (SSE)  5.2896 (SSR)

SSE: 34.7104
SSR: 5.2896
R²: $R^2 = \frac{SSR}{SST} = \frac{5.2896}{40} = 0.1322$
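The least-squares quantities in problem 8 follow directly from the deviation sums, so they can be recomputed in a few lines. This is a minimal sketch rather than the original worksheet: it assumes the eight (X, Y) pairs are (1, 8), (3, 2), (6, 7), (2, 6), (1, 3), (4, 6), (3, 7), (5, 9), and it uses the usual definitions SST = sum of (y - ȳ)², SSE = sum of (y - ŷ)², SSR = sum of (ŷ - ȳ)².

```python
# Assumed data for problem 8.
xs = [1, 3, 6, 2, 1, 4, 3, 5]
ys = [8, 2, 7, 6, 3, 6, 7, 9]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Deviation sums used by the ordinary least squares formulas.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
sst = sum((y - y_bar) ** 2 for y in ys)

b = s_xy / s_xx            # slope
a = y_bar - b * x_bar      # intercept

fitted = [a + b * x for x in xs]
sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))   # unexplained variation
ssr = sum((f - y_bar) ** 2 for f in fitted)           # explained variation
r_squared = ssr / sst

print(round(b, 4), round(a, 4), round(sse, 4), round(ssr, 4), round(r_squared, 4))
```

A useful sanity check on any hand-filled table is that SSE + SSR equals SST and that every squared column is non-negative.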

9. Descriptive Spatial Statistics
Consider the following point locations in the scatterplot below. (Note: The coordinate locations of these points are provided in the table below along with each point's Z value.)
[Scatterplot of the point locations; both axes run from 0 to 12.]
a) What is the mean center of these points?
$X_c = \frac{\sum X}{n} = \frac{97}{20} = 4.85$
$Y_c = \frac{\sum Y}{n} = \frac{105}{20} = 5.25$
The mean center of these points is (4.85, 5.25).
b) What is the weighted mean center of these points? (Using the Z values provided above as the weight.)
$X_{wc} = \frac{\sum f X}{\sum f} = \frac{1026}{260} = 3.946$
$Y_{wc} = \frac{\sum f Y}{\sum f} = \frac{1029}{260} = 3.9577$
Here the weight f of each point is its Z value.
The weighted mean center of these points is (3.946, 3.9577).
X Y Z 3 13 4 10 5 3 1 5 4 6 4 1 1 4 6 4 1 5 19 7 10 1 9 4 15 9 1 5 17 4 3 1 5 9 14 3 4 3 1 15 10 9 4 4 1 7

10. Given the simulated raster image below with only 5 known values, use the Inverse Distance Weighted (IDW) methodology to fill in all the blanks. (This will probably be easiest using Excel.)
Distance between points: $d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$
IDW estimate: $z_o = \frac{\sum_s z_s \, d_s^{-k}}{\sum_s d_s^{-k}}$, with k = 1
[Excel worksheet: the known points (Point #, X, Y, Z) with their distances to each cell, the grid of IDW estimates, and the same grid rounded to whole numbers.]
a) How good do you think your IDW estimates are compared to the real unknown values? How do you think the accuracy of your estimates varies as a function of the distance from the known values provided?
Looking at the estimates situated around the known values in the table, some of the Z estimates seem to be a little lower than I would expect. This is especially true in the case of point (5, 4). With a Z value of 7, the estimated values around this point seem lower than I would expect given this relatively high Z value.
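For readers who prefer code to a spreadsheet, the IDW estimate can be sketched as a short function. Everything here is an assumption for illustration: the function name `idw_estimate`, the `known_points` list (placeholder coordinates and Z values, not the worksheet's data), the 8 x 8 grid size, and the default exponent k = 1.

```python
import math

def idw_estimate(x, y, known_points, k=1):
    """Inverse distance weighted estimate at (x, y).

    known_points is a list of (px, py, pz) tuples; each weight is 1 / d**k.
    A cell that coincides with a known point returns that point's value.
    """
    numerator = 0.0
    denominator = 0.0
    for px, py, pz in known_points:
        d = math.hypot(x - px, y - py)   # Euclidean distance to the known point
        if d == 0:
            return float(pz)
        weight = d ** -k
        numerator += weight * pz
        denominator += weight
    return numerator / denominator

# Hypothetical known points (x, y, z); placeholders only.
known_points = [(3, 5, 5), (5, 4, 7), (6, 1, 1)]

# Fill an 8 x 8 grid, printing the top row (y = 8) first.
grid = [[idw_estimate(x, y, known_points) for x in range(1, 9)]
        for y in range(8, 0, -1)]
for row in grid:
    print(" ".join(f"{value:4.2f}" for value in row))
```

Because each estimate is a weighted average, it always lies between the smallest and largest known Z value, which is one reason cells next to an unusually high known value can look lower than expected. A larger k makes the nearest known point dominate more strongly.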