Hypothesis Testing


1 Hypothesis Testing In this case, we'd be trying to form an inference about that neighborhood: Do people there shop more often than those people who are members of the larger population? To ascertain this, we can make use of the hypothesis testing approach in inferential statistics, which is a multistep process: 1. State the null hypothesis (H0) 2. State the alternative hypothesis (HA) 3. Choose α, our significance level 4. Select a statistical test, and calculate the test statistic 5. Determine the critical value at which H0 will be rejected 6. Compare the test statistic with the critical value

2 Hypothesis Testing - Tests 4. Select a statistical test, and calculate the test statistic. To test the hypothesis, we must construct a test statistic, which frequently takes the form: test statistic = (θ - θ0) / (std. error). For example, using the normal distribution, the z-test is formulated as: z = (x̄ - µ) / σx̄, where σx̄ = σ/√n when σ is known, or σx̄ ≈ s/√n when we have to estimate the standard deviation from the sample data
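The z statistic above can be sketched in a few lines of Python. This is a minimal illustration with made-up numbers (the function name and values are not from the lecture):

```python
import math

def z_test(sample_mean, pop_mean, pop_sd, n):
    """One-sample z statistic: (x-bar - mu) / (sigma / sqrt(n))."""
    std_error = pop_sd / math.sqrt(n)
    return (sample_mean - pop_mean) / std_error

# Hypothetical example: sample of n = 100 with mean 5.2,
# population mean 5.0 and known sigma = 0.8
print(round(z_test(5.2, 5.0, 0.8, 100), 2))  # 2.5
```

When σ is unknown, the same function can be called with the sample standard deviation s in place of σ, as the slide notes.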

3 Hypothesis Testing - Tests 5. Determine the critical value where H0 will be rejected (cont.) For example, suppose we are applying a z-test to compare the mean of a large sample to the mean of a population, and we choose a 95% level of significance. If we formulate our alternative hypothesis as HA: x̄ ≠ µ, we are testing whether x̄ is significantly different from µ in either direction, so the acceptance region must include the central 95% of the normal distribution's area around the mean, and the rejection region must include 2.5% of the area in each of the two tails: we reject H0 when |Z_test| > Z_crit, i.e. when Z_test falls beyond -Z_crit or +Z_crit

4 Hypothesis Testing - Tests 5. Determine the critical value where H0 will be rejected (cont.) On the other hand, suppose we formulate our alternative hypothesis as HA: x̄ > µ or HA: x̄ < µ. Then we are testing whether x̄ is significantly different from µ in a particular direction, so the rejection region must include 5% of the normal distribution's area in one tail, and the acceptance region must include the remaining 95% of the area: e.g. using HA: x̄ > µ, we reject H0 when Z_test > Z_crit

5 Hypothesis Testing - One-Sample Z-test The example we looked at in the last lecture used the one-sample z-test, which is formulated as: Z_test = (x̄ - µ) / (σ/√n), i.e. the difference between means over the standard error. We use this test statistic: 1. To compare a sample mean to the population mean 2. If the size of the sample is reasonably large (conventionally n > 30) 3. When the population standard deviation is known (although we can estimate it from the sample standard deviation), so that we can use this value to calculate the standard error in the denominator

6 Hypothesis Testing - One-Sample t-test The one-sample t-test is formulated very much like the one-sample Z-test we looked at earlier: t_test = (x̄ - µ) / (s/√n), i.e. the difference between means over the standard error. We use this test statistic: 1. To compare a sample mean to the population mean 2. If the size of the sample is somewhat small (conventionally n ≤ 30) 3. When the population standard deviation is unknown: we do not need it to calculate the standard error, although we still need to know the population mean for purposes of comparison with the sample mean
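The one-sample t statistic, with s estimated from the data, can be sketched as follows. The function name and the small sample are hypothetical, chosen only to illustrate the formula:

```python
import math

def t_test_one_sample(sample, pop_mean):
    """One-sample t statistic: (x-bar - mu) / (s / sqrt(n)),
    with s estimated from the sample itself."""
    n = len(sample)
    mean = sum(sample) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    return (mean - pop_mean) / (s / math.sqrt(n))

# Hypothetical small sample (n = 5) compared against a population mean of 5.0
print(round(t_test_one_sample([4.8, 5.1, 5.4, 4.9, 5.3], 5.0), 3))  # 0.877
```

The resulting statistic would be compared against a critical t value with n - 1 = 4 degrees of freedom.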

7 Hypothesis Testing - Two-Sample t-tests Two-sample t-tests are used to compare one sample mean with another sample mean, rather than with a population parameter. The form of the two-sample t-test that is appropriate depends on whether or not we can treat the variances of the two samples as being equal. If the variances can be assumed to be equal (a condition called homoscedasticity), the t-statistic is: t_test = (x̄1 - x̄2) / (s_p · √(1/n1 + 1/n2)), where s_p is the pooled estimate of the standard deviation: s_p = √[((n1 - 1)s1² + (n2 - 1)s2²) / (n1 + n2 - 2)]

8 Hypothesis Testing - Two-Sample t-tests Two-sample t-tests that use the equal variance assumption have degrees of freedom equal to the sum of the number of observations in the two samples, less two, since we are estimating the values of two means here: df = n1 + n2 - 2. If we cannot assume that the two samples have equal variances, the appropriate t-statistic takes a slightly different form, since we cannot produce a pooled estimate for the standard error portion of the statistic: t_test = (x̄1 - x̄2) / √(s1²/n1 + s2²/n2)
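Both two-sample forms can be sketched side by side in Python. The function names and sample data are hypothetical; the heteroscedastic version uses the conservative min(n1 - 1, n2 - 1) degrees of freedom discussed in the lecture rather than the involved exact formula:

```python
import math

def pooled_t(x1, x2):
    """Two-sample t assuming equal variances (pooled s_p), df = n1 + n2 - 2."""
    n1, n2 = len(x1), len(x2)
    m1, m2 = sum(x1) / n1, sum(x2) / n2
    s1sq = sum((v - m1) ** 2 for v in x1) / (n1 - 1)
    s2sq = sum((v - m2) ** 2 for v in x2) / (n2 - 1)
    sp = math.sqrt(((n1 - 1) * s1sq + (n2 - 1) * s2sq) / (n1 + n2 - 2))
    t = (m1 - m2) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

def unequal_var_t(x1, x2):
    """Two-sample t without the equal-variance assumption;
    conservative df = min(n1 - 1, n2 - 1)."""
    n1, n2 = len(x1), len(x2)
    m1, m2 = sum(x1) / n1, sum(x2) / n2
    s1sq = sum((v - m1) ** 2 for v in x1) / (n1 - 1)
    s2sq = sum((v - m2) ** 2 for v in x2) / (n2 - 1)
    t = (m1 - m2) / math.sqrt(s1sq / n1 + s2sq / n2)
    return t, min(n1 - 1, n2 - 1)
```

With equal sample sizes and equal sample variances, the two statistics coincide; they diverge as the variances differ.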

9 Hypothesis Testing - Two-Sample t-tests Unfortunately, in the heteroscedastic case (where the variances are unequal), calculating the degrees of freedom appropriate to use for the critical t-score uses a somewhat involved formula (equation 3.17 on p. 50) As an alternative, Rogerson suggests using the lesser value of n1 - 1 or n2 - 1: df = min[(n1 - 1), (n2 - 1)] based on the grounds that this value will always be lower than that produced by the involved calculation, and thus will produce a higher t_crit score at the selected α; this is a conservative assumption because it makes it even harder to reject the null hypothesis mistakenly and commit a type I error

10 Hypothesis Testing - F-test In order to decide whether the variances of two samples are similar enough or different enough to warrant the use of one form of the two-sample t-test or the other, we have a further statistical test that we use to compare the variances. The F-test, a.k.a. the variance ratio test, assesses whether or not the variances are equal by computing a test statistic of the form: F_test = s1² / s2². Critical values are taken from the F-distribution, which has a 2-dimensional array of degrees of freedom (i.e. n1 - 1 df in the numerator, n2 - 1 df in the denominator)
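A sketch of the variance ratio test in Python. The function name is hypothetical, and placing the larger variance in the numerator (so F ≥ 1 and a one-tailed critical value can be used) is a common convention rather than something stated on this slide:

```python
def variance_ratio_f(x1, x2):
    """F = s1^2 / s2^2, with the larger variance placed in the numerator."""
    def var(x):
        m = sum(x) / len(x)
        return sum((v - m) ** 2 for v in x) / (len(x) - 1)
    s1sq, s2sq = var(x1), var(x2)
    if s1sq < s2sq:
        # swap so the larger variance is on top
        s1sq, s2sq = s2sq, s1sq
        x1, x2 = x2, x1
    # df: (numerator sample size - 1, denominator sample size - 1)
    return s1sq / s2sq, (len(x1) - 1, len(x2) - 1)
```

The returned pair of degrees of freedom indexes into the 2-dimensional F table mentioned above.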

11 Hypothesis Testing - Matched Pairs t-tests The form of the sample statistic is based upon the calculated differences between the two samples: t_test = d̄ / (s_d/√n), where d̄ is the average of the differences and s_d = √[Σ(d_i - d̄)² / (n - 1)]. We use this test statistic: 1. To compare the sample means of paired samples 2. If the size of the samples is somewhat small, i.e. n ≤ 30 3. When the two samples contain members that were not sampled at random but represent observations of the same entities, usually at different times or after some treatment has been applied

12 ANOVA - An F-test The ANOVA F-test is formulated as: F_test = [BSS / (k - 1)] / [WSS / (N - k)], where k is the number of groups, N is the total number of observations, BSS is the between-group sum of squares, and WSS is the within-group sum of squares. The total sum of squares is the sum of the between-group and within-group sums, i.e. TSS = BSS + WSS (important because BSS can be tedious to calculate, but by calculating WSS and TSS, BSS = TSS - WSS)
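The ANOVA F statistic, including the BSS = TSS - WSS shortcut from the slide, can be sketched as follows (the function name and grouping structure are illustrative):

```python
def anova_f(groups):
    """One-way ANOVA F = (BSS/(k-1)) / (WSS/(N-k)),
    recovering BSS via the shortcut BSS = TSS - WSS."""
    all_vals = [v for g in groups for v in g]
    N, k = len(all_vals), len(groups)
    grand = sum(all_vals) / N
    # total sum of squares: deviations from the grand mean
    tss = sum((v - grand) ** 2 for v in all_vals)
    # within-group sum of squares: deviations from each group's own mean
    wss = sum(sum((v - sum(g) / len(g)) ** 2 for v in g) for g in groups)
    bss = tss - wss          # TSS = BSS + WSS
    return (bss / (k - 1)) / (wss / (N - k))
```

Passing, say, two hypothetical groups `[[1, 2, 3], [4, 5, 6]]` yields a large F because the group means differ far more than the within-group spread.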

13 Arrangement of Data for ANOVA

            Category 1  Category 2  Category 3  ...  Category k
Obs. 1      x11         x12         x13         ...  x1k
Obs. 2      x21         x22         x23         ...  x2k
Obs. 3      x31         x32         x33         ...  x3k
Obs. 4      x41         x42         x43         ...  x4k
...
Obs. i      xi1         xi2         xi3         ...  xik
No. of obs. n1          n2          n3          ...  nk
Mean        x̄+1         x̄+2         x̄+3         ...  x̄+k
Std. Dev.   s1          s2          s3          ...  sk

Overall Mean: x̄++

14 ANOVA Table A useful way to go through the process of calculating an ANOVA is to fill in an ANOVA table:

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square Variance   F-Test
Between Groups        BSS              k - 1                MS_B = BSS/(k - 1)     MS_B / MS_W
Within Groups         WSS              N - k                MS_W = WSS/(N - k)
Total Variation       TSS              N - 1

15 Covariance Formulae The covariance of variable X with respect to variable Y can be calculated using the following formula: Cov[X, Y] = (1/(n - 1)) · [Σ x_i y_i - n·x̄·ȳ], summing over i = 1 to n. The formula for covariance can be expressed in many ways. The following equation is an equivalent expression of covariance (expanding the products via the distributive property): Cov[X, Y] = (1/(n - 1)) · Σ (x_i - x̄)(y_i - ȳ), summing over i = 1 to n
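The equivalence of the two covariance expressions is easy to check numerically. A minimal sketch, with hypothetical function names and data:

```python
def cov_deviations(x, y):
    """Cov[X,Y] = sum((x_i - x-bar)(y_i - y-bar)) / (n - 1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)

def cov_products(x, y):
    """Equivalent computational form: (sum(x_i * y_i) - n * x-bar * y-bar) / (n - 1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return (sum(a * b for a, b in zip(x, y)) - n * mx * my) / (n - 1)
```

For any paired data set the two functions return the same value (up to floating-point rounding).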

16 Pearson's Correlation Coefficient A standardized measure of covariance provides a value that describes the degree to which two variables correlate with one another, expressing this using a value ranging from -1 to +1, where -1 denotes a perfect inverse relationship and +1 denotes a perfect positive relationship. One such measure is known as Pearson's Correlation Coefficient (a.k.a. Pearson's Product Moment), and it is produced by standardizing the covariance, dividing it by the product of the standard deviations of the X and Y variables: r = Cov[X, Y] / (s_X · s_Y)

17 Pearson's Correlation Coefficient As is the case with covariance, the correlation coefficient can be expressed in several equivalent ways: r = Σ (x_i - x̄)(y_i - ȳ) / [(n - 1) · s_X · s_Y], summing over i = 1 to n. It can also be expressed in terms of z-scores, which is convenient if you have already calculated them: r = Σ z_x z_y / (n - 1), summing over i = 1 to n
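Pearson's r as the standardized covariance can be sketched directly from the definitions above (function name and data are illustrative):

```python
import math

def pearson_r(x, y):
    """r = Cov[X,Y] / (s_X * s_Y)
       = sum((x_i - x-bar)(y_i - y-bar)) / ((n - 1) * s_X * s_Y)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / (n - 1))
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / (n - 1))
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    return cov / (sx * sy)
```

A perfectly linear increasing relationship gives r = +1, a perfectly linear decreasing one gives r = -1.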

18 A Significance Test for r The sampling distribution of r follows a t-distribution with (n - 2) degrees of freedom, and we can estimate the standard error of r using: SE_r = √[(1 - r²) / (n - 2)]. The test itself takes the form of the correlation coefficient divided by the standard error, thus: t_test = r / SE_r = r / √[(1 - r²)/(n - 2)] = r·√(n - 2) / √(1 - r²)

19 Spearman's Rank Correlation Coefficient We have an alternative correlation coefficient we can use with ordinal data: Spearman's Rank Correlation Coefficient (r_s): r_s = 1 - [6 · Σ d_i²] / (n³ - n), summing over i = 1 to n, where n = sample size and d_i = the difference in the rankings of each value with respect to each variable
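The rank-based formula can be sketched as follows. This minimal version assigns ranks assuming no tied values (a tie correction would be needed otherwise); the function names are hypothetical:

```python
def spearman_rs(x, y):
    """r_s = 1 - 6 * sum(d_i^2) / (n^3 - n), where d_i is the
    difference between the ranks of each pair. Assumes no ties."""
    def ranks(vals):
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0] * len(vals)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d_sq = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_sq / (n ** 3 - n)
```

Because only the rankings enter the formula, any monotone increasing relationship (however nonlinear) gives r_s = +1.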

20 A Significance Test for r_s As was the case for Pearson's Correlation Coefficient, we can test the significance of an r_s result using a t-test. The test statistic and degrees of freedom are formulated a little differently for r_s, although many of the characteristics of the distribution of r values are present here as well. In this case, r_s values follow a t-distribution with (n - 1) degrees of freedom, and their standard error can be estimated using: SE_rs = 1/√(n - 1), yielding the test statistic: t_test = r_s / SE_rs = r_s · √(n - 1)

21 Simple Linear Regression Simple linear regression models the relationship between an independent variable (x) and a dependent variable (y) using an equation that expresses y as a linear function of x, plus an error term: y = a + bx + e, where x is the independent variable, y is the dependent variable, b is the slope of the fitted line, a is the intercept of the fitted line, and e is the error term (ε)

22 Least Squares Method The least squares method operates mathematically, minimizing the error term e across all points. We can describe the line of best fit we will find using the equation ŷ = a + bx, and you'll recall from the previous slide that the formula for our linear model was expressed using y = a + bx + e. We use the value ŷ on the line to estimate the true value, y. The difference between the two is (y - ŷ) = e; this difference is positive for points above the line, and negative for points below it

23 Error Sum of Squares By squaring the differences between y and ŷ, and summing these values for all points in the data set, we calculate the error sum of squares (usually denoted by SSE): SSE = Σ (y_i - ŷ_i)², summing over i = 1 to n. The least squares method of selecting a line of best fit functions by finding the parameters of a line (intercept a and slope b) that minimize the error sum of squares, i.e. it is known as the least squares method because it finds the line that makes the SSE as small as it can possibly be, minimizing the vertical distances between the line and the points

24 Finding Regression Coefficients The equations used to find the values for the slope (b) and intercept (a) of the line of best fit using the least squares method are: b = Σ (x_i - x̄)(y_i - ȳ) / Σ (x_i - x̄)² (sums over i = 1 to n) and a = ȳ - b·x̄, where x_i is the i-th independent variable value, y_i is the i-th dependent variable value, x̄ is the mean value of all the x_i values, and ȳ is the mean value of all the y_i values
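The two coefficient equations translate directly into code. A minimal sketch with a hypothetical function name and data:

```python
def least_squares(x, y):
    """Slope b = sum((x_i - x-bar)(y_i - y-bar)) / sum((x_i - x-bar)^2);
    intercept a = y-bar - b * x-bar. Returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = sum((xi - mx) ** 2 for xi in x)
    b = num / den
    a = my - b * mx
    return a, b
```

For points lying exactly on a line, the fit recovers that line: e.g. the data (1, 3), (2, 5), (3, 7) yield a = 1 and b = 2.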

25 Regression Slope and Correlation The interpretation of the sign of the slope parameter and the correlation coefficient is identical, and this is no coincidence: the numerator of the slope expression, b = Σ (x_i - x̄)(y_i - ȳ) / Σ (x_i - x̄)², is identical to that of the correlation coefficient, r = Σ (x_i - x̄)(y_i - ȳ) / [(n - 1) · s_X · s_Y]. The regression slope can be expressed in terms of the correlation coefficient: b = r · (s_y / s_x)

26 Coefficient of Determination (r²) The regression sum of squares (SSR) expresses the improvement made in estimating y by using the regression line: SSR = Σ (ŷ_i - ȳ)², summing over i = 1 to n. The total sum of squares (SST) expresses the overall variation between the values of y and their mean ȳ: SST = Σ (y_i - ȳ)², summing over i = 1 to n. The coefficient of determination (r²) expresses the amount of variation in y explained by the regression line (the strength of the relationship): r² = SSR / SST

27 Partitioning the Total Sum of Squares We can decompose the total sum of squares into those two components: SST = Σ (y_i - ȳ)² = Σ (ŷ_i - ȳ)² + Σ (y_i - ŷ_i)², with all sums over i = 1 to n. In other words: SST = SSR + SSE, and the coefficient of determination expresses the portion of the total variation in y explained by the regression line
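The partition SST = SSR + SSE can be verified numerically on any fitted line. A sketch that fits by least squares, checks the partition, and returns r² (function name and data are illustrative):

```python
def r_squared(x, y):
    """Fit a least-squares line, verify SST = SSR + SSE, and return r^2 = SSR/SST."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    yhat = [a + b * xi for xi in x]
    ssr = sum((yh - my) ** 2 for yh in yhat)            # explained variation
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))  # residual variation
    sst = sum((yi - my) ** 2 for yi in y)               # total variation
    assert abs(sst - (ssr + sse)) < 1e-9                # the partition holds
    return ssr / sst
```

Points exactly on a line give r² = 1; scattered points give a value between 0 and 1.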

28 Regression ANOVA Table We can create an analysis of variance table that allows us to display the sums of squares, their degrees of freedom, mean square values (for the regression and error sums of squares), and an F-statistic:

Component           Sum of Squares     df      Mean Square     F
Regression (SSR)    Σ (ŷ_i - ȳ)²       1       SSR / 1         MSSR / MSSE
Error (SSE)         Σ (y_i - ŷ_i)²     n - 2   SSE / (n - 2)
Total (SST)         Σ (y_i - ȳ)²       n - 1

29 A Significance Test for r² We can test to see if the regression line has been successful in explaining a significant portion of the variation in y, by performing an F-test. This operates in a similar fashion to how we used the F-test in ANOVA, this time testing the null hypothesis that the true coefficient of determination of the population ρ² = 0, using an F-test formulated as: F_test = r²(n - 2) / (1 - r²) = MSSR / MSSE, which has an F-distribution with degrees of freedom df = (1, n - 2)

30 Significance Tests for Regression Parameters In addition to evaluating the overall significance of a regression model by testing the r² value using an F-test, we can also test the significance of individual regression parameters using t-tests. These t-tests have the regression parameter in some form in the numerator, and the standard error of the regression parameter in the denominator. First, we must calculate the standard error of the estimate, also known as the standard deviation of the residuals (s_e): s_e = √[Σ (y_i - ŷ_i)² / (n - 2)], summing over i = 1 to n

31 Significance Test for Regression Slope We can formulate a t-test to test the significance of the regression slope (b). We will be testing the null hypothesis that the true value of the slope is equal to zero, i.e. H0: β = 0, using the following t-test: t_test = b / s_b, where s_b is the standard deviation of the slope parameter: s_b = √[s_e² / ((n - 1)·s_x²)], and degrees of freedom = (n - 2)

32 Significance Test for Regression Intercept We can formulate a similar t-test to test the significance of the regression intercept (a). We will be testing the null hypothesis that the true value of the intercept is equal to zero, i.e. H0: α = 0, using the following t-test: t_test = a / s_a, where s_a is the standard deviation of the intercept: s_a = √[s_e² · Σ x_i² / (n · Σ (x_i - x̄)²)], and degrees of freedom = (n - 2)

33 Spatial Patterns We will examine methods that are used to analyze patterns in two sorts of spatial data: Point Pattern Analysis - These methods concern themselves with the location information associated with point data (not attributes associated with those locations, just where they are found) Geographic Patterns in Areal Data - These methods are used to examine the pattern of attribute values associated with polygon representations of geographic phenomena (i.e. is there a pattern in the attributes of a set of adjacent polygons?)

34 Point Pattern Analysis While being able to qualitatively describe a point pattern as being {regular, random, clustered} is useful, we want to have a rigorous, quantitative means of describing these patterns We will examine two approaches for doing so: 1. The Quadrat Method - Divide the study area into equal sections, count points per section, and derive a statistic to compare counts to expectations 2. Nearest Neighbor Analysis - Compare the distances between points to an expected distance between points

35 χ² Test in the Quadrat Method Once we have calculated the mean number of points per quadrat and the variance of points per quadrat, we can calculate the χ² test statistic using: χ² = (m - 1)·s²/x̄ = (m - 1) · VMR, where m is the number of quadrats, s² is the variance of the points per quadrat, and x̄ is the mean of the points per quadrat. This χ² test statistic has (m - 1) degrees of freedom, and is compared to a critical value from the χ² distribution, yet another probability distribution for which we have tables of values (Table A.6, p. 221)

36 Summary of the Quadrat Method 1. Divide a study region into m cells of equal size 2. Find the mean number of points per cell (x̄), which is equal to the total number of points divided by the total number of cells 3. Find the variance of the number of points per cell (s²) using: s² = Σ (x_i - x̄)² / (m - 1), summing over i = 1 to m, where x_i is the number of points in cell i

37 Summary of the Quadrat Method 4. Calculate the variance to mean ratio (VMR): VMR = s²/x̄ 5. Interpret the variance to mean ratio (VMR), and if a hypothesis test is desired, calculate the χ² statistic for quadrat analysis: χ² = (m - 1)·s²/x̄ = (m - 1) · VMR, comparing the test statistic to critical values from the χ² distribution with df = (m - 1)
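The whole quadrat procedure reduces to a few lines once the per-cell counts are in hand. A sketch with a hypothetical function name, taking the list of counts per quadrat:

```python
def quadrat_chi2(counts):
    """Quadrat analysis: VMR = s^2 / x-bar, chi^2 = (m - 1) * VMR, df = m - 1."""
    m = len(counts)
    mean = sum(counts) / m
    var = sum((c - mean) ** 2 for c in counts) / (m - 1)
    vmr = var / mean
    return (m - 1) * vmr, m - 1
```

A perfectly even pattern (every quadrat holding the same count) gives VMR = 0, while clustering inflates the variance and pushes VMR, and hence χ², upward.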

38 2. Nearest Neighbor Analysis An alternative approach to assessing a point pattern can be formulated that examines the distances between points in the pattern in terms of the distance between any given point and its nearest neighbor. If we define d_i as the distance between a point and its nearest neighbor, the average distance between neighboring points (R_O) can be written as: R_O = Σ d_i / n, summing over i = 1 to n

39 The Nearest Neighbor Statistic We can also calculate an expected distance between nearest neighbors (R_E) in a point pattern (where the expected pattern conforms to our usual null hypothesis of a random point pattern): R_E = 1 / (2√λ), where λ is the number of points per unit area. The ratio between the observed and expected distances is the nearest neighbor statistic (R): R = R_O / R_E = d̄ / [1 / (2√λ)], where d̄ is the average observed distance between nearest neighbors

40 Interpreting the Nearest Neighbor Statistic Values of R can range from: 0, when all points are coincident and the distances between them are thus 0, up to a theoretical maximum of approximately 2.149, for a perfectly uniform pattern of points spread out on an infinitely large 2-dimensional plane. Through the examination of many random point patterns, the variance of the mean distances between neighbors has been found to be: V[R_E] = (4 - π) / (4πλn), where n is the number of points

41 Interpreting the Nearest Neighbor Statistic Since we have a means of estimating the variance of R_E, we can calculate a standard error for R_E and formulate a test statistic to test the null hypothesis that the pattern is random: Z_test = (R_O - R_E) / √V[R_E] = (R_O - R_E) / √[(4 - π)/(4πλn)]. This test statistic is normally distributed with mean 0 and variance 1, thus we can use the standard normal distribution to assess its significance
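Given the list of nearest-neighbor distances and the study-area size, the statistic and its Z-test can be sketched as follows (function name and inputs are hypothetical):

```python
import math

def nearest_neighbor_test(distances, area):
    """R = R_O / R_E with R_E = 1/(2*sqrt(lambda));
    Z = (R_O - R_E) / sqrt((4 - pi) / (4*pi*lambda*n))."""
    n = len(distances)
    lam = n / area                       # lambda: points per unit area
    r_obs = sum(distances) / n           # observed mean nearest-neighbor distance
    r_exp = 1 / (2 * math.sqrt(lam))     # expected distance under randomness
    se = math.sqrt((4 - math.pi) / (4 * math.pi * lam * n))
    return r_obs / r_exp, (r_obs - r_exp) / se
```

If the observed mean distance happens to equal the random-pattern expectation, R = 1 and Z = 0; clustered patterns give R < 1 (negative Z), dispersed patterns R > 1 (positive Z).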

42 Contingency Tables and the χ² Test Once we have observed and expected frequencies for each cell in the contingency table, we can use those values to calculate the χ² test statistic: χ² = Σ (O - E)² / E, summing over the n cells, where O is the observed frequency, E is the expected frequency, and n is the number of cells. This χ² test statistic has (r - 1)·(c - 1) degrees of freedom, where r and c are the number of rows and columns in the contingency table. If the observed frequencies are very different from the expected frequencies, the χ² test statistic will be larger than the 1-tailed critical value it will be compared to, thus detecting the presence of a spatial pattern

43 The Joint Count Statistic The first step in this method is to enumerate all of the pairs of polygons that share a boundary by creating a binary connectivity table (a.k.a. a spatial matrix). For example, using a five region system with regions A, B, C, D, and E: 1. Label the regions 2. Create a table with the same row and column labels (A through E on both the rows and the columns) 3. Fill in the table with 1s and 0s to indicate which regions share a boundary

44 The Joint Count Statistic We can now take the sum of all the 1s in the binary connectivity table and divide by 2 to calculate the total number of shared boundaries in the system (J): J = (Σ x_i) / 2, where the x_i are the entries of the connectivity table (each shared boundary appears twice in the symmetric table). Next, we are ready to look at the attribute information associated with the polygons to determine if each pair of polygons that shares a boundary has the same values or different values. The joint count statistic is designed to be used with binary nominal attributes, i.e. the attribute values need to be reduced to some 2-class (e.g. + and -) description for use in this statistic

45 The Joint Count Statistic The expected number of +- boundaries is calculated as: E[+-] = 2JPM / (N(N - 1)), where J is the total number of shared boundaries, P is the number of + polygons, M is the number of - polygons, and N is the total number of polygons. For our example, E[+-] is calculated as: E[+-] = 2JPM / (N(N - 1)) = (2·7·3·2) / (5·(5 - 1)) = 84/20 = 4.2. We will form a statistic by comparing the expected number of +- boundaries to the observed number of +- boundaries, which we obtain by simply counting the number of shared boundaries with this characteristic (being careful not to double count)

46 The Joint Count Statistic For our example five region system, the observed number of shared +- boundaries is 5. The last ingredient we need to be able to build a test statistic is an estimate of the variance in E[+-], and unfortunately, calculating this quantity requires a somewhat involved expression: V[+-] = E[+-] + Σ L_i(L_i - 1)PM / (N(N - 1)) + 4·[J(J - 1) - Σ L_i(L_i - 1)]·P(P - 1)M(M - 1) / (N(N - 1)(N - 2)(N - 3)) - E[+-]², where L_i is the total number of boundaries shared by region i. In our example, V[+-] = 0.56

47 The Joint Count Statistic We can now calculate a test statistic to compare the observed number of +- boundaries to the expected number of +- boundaries as a Z-statistic: Z_test = [(Obs. +-) - E[+-]] / √V[+-]. This test statistic is normally distributed with mean 0 and variance 1, thus we can use the standard normal distribution to assess its significance. An exceptional Z-statistic value would indicate a level of spatial autocorrelation that exceeds the expected amount for our system
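The counting stages of the joint count method (J, the observed +- joins, and E[+-]) can be sketched from a connectivity matrix and a list of +/- labels. The function name and the small four-region adjacency below are hypothetical, and the involved variance expression is deliberately omitted for brevity:

```python
def joint_count_summary(conn, labels):
    """Joint count building blocks: J (shared boundaries), the observed
    number of +- joins, and E[+-] = 2JPM / (N(N - 1)).
    conn: symmetric 0/1 connectivity matrix; labels: '+' or '-' per region."""
    N = len(labels)
    J = sum(sum(row) for row in conn) // 2   # each boundary counted twice
    P = labels.count('+')
    M = labels.count('-')
    # count each adjacent pair once (i < j) whose labels differ
    obs = sum(1 for i in range(N) for j in range(i + 1, N)
              if conn[i][j] and labels[i] != labels[j])
    expected = 2 * J * P * M / (N * (N - 1))
    return J, obs, expected
```

The observed and expected counts returned here would then feed the Z-statistic once V[+-] has been computed from the variance formula.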

48 Moran's I Statistic Thus, for each and every pair of polygons in the system, a weight expresses the degree to which they are spatially related (close to each other, connected, etc.). This weight term is multiplied by an expression that compares the attribute values of each and every pair of polygons, by calculating the mean and standard deviation for the whole data set, and then comparing the z-scores of the variable values for each polygon to those of the others: Moran's I = n · ΣΣ w_ij z_i z_j / [(n - 1) · ΣΣ w_ij], where n is the number of polygons, w_ij is the weight for the combination of the polygon in column i and the polygon in row j of the connectivity matrix, and z_i and z_j are z-scores
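Moran's I as formulated on this slide can be sketched directly, using a binary connectivity matrix as the weights (function name and example data are hypothetical):

```python
def morans_i(weights, values):
    """Moran's I = n * sum_ij(w_ij * z_i * z_j) / ((n - 1) * sum_ij(w_ij)),
    with z-scores computed from the whole data set."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / (n - 1)) ** 0.5
    z = [(v - mean) / sd for v in values]
    num = sum(weights[i][j] * z[i] * z[j]
              for i in range(n) for j in range(n))
    wsum = sum(sum(row) for row in weights)
    return n * num / ((n - 1) * wsum)
```

On a hypothetical 2×2 grid of regions, a checkerboard of attribute values (every neighbor pair holding opposite values) gives I = -1, the extreme of negative spatial autocorrelation; positive autocorrelation (like values adjacent) pushes I toward +1.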


More information

9. Linear Regression and Correlation

9. Linear Regression and Correlation 9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,

More information

Correlation and Linear Regression

Correlation and Linear Regression Correlation and Linear Regression Correlation: Relationships between Variables So far, nearly all of our discussion of inferential statistics has focused on testing for differences between group means

More information

CS 5014: Research Methods in Computer Science

CS 5014: Research Methods in Computer Science Computer Science Clifford A. Shaffer Department of Computer Science Virginia Tech Blacksburg, Virginia Fall 2010 Copyright c 2010 by Clifford A. Shaffer Computer Science Fall 2010 1 / 207 Correlation and

More information

Variance. Standard deviation VAR = = value. Unbiased SD = SD = 10/23/2011. Functional Connectivity Correlation and Regression.

Variance. Standard deviation VAR = = value. Unbiased SD = SD = 10/23/2011. Functional Connectivity Correlation and Regression. 10/3/011 Functional Connectivity Correlation and Regression Variance VAR = Standard deviation Standard deviation SD = Unbiased SD = 1 10/3/011 Standard error Confidence interval SE = CI = = t value for

More information

T.I.H.E. IT 233 Statistics and Probability: Sem. 1: 2013 ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS

T.I.H.E. IT 233 Statistics and Probability: Sem. 1: 2013 ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS In our work on hypothesis testing, we used the value of a sample statistic to challenge an accepted value of a population parameter. We focused only

More information

Regression Analysis. BUS 735: Business Decision Making and Research. Learn how to detect relationships between ordinal and categorical variables.

Regression Analysis. BUS 735: Business Decision Making and Research. Learn how to detect relationships between ordinal and categorical variables. Regression Analysis BUS 735: Business Decision Making and Research 1 Goals of this section Specific goals Learn how to detect relationships between ordinal and categorical variables. Learn how to estimate

More information

ECO220Y Simple Regression: Testing the Slope

ECO220Y Simple Regression: Testing the Slope ECO220Y Simple Regression: Testing the Slope Readings: Chapter 18 (Sections 18.3-18.5) Winter 2012 Lecture 19 (Winter 2012) Simple Regression Lecture 19 1 / 32 Simple Regression Model y i = β 0 + β 1 x

More information

Finding Relationships Among Variables

Finding Relationships Among Variables Finding Relationships Among Variables BUS 230: Business and Economic Research and Communication 1 Goals Specific goals: Re-familiarize ourselves with basic statistics ideas: sampling distributions, hypothesis

More information

Lecture 3: Inference in SLR

Lecture 3: Inference in SLR Lecture 3: Inference in SLR STAT 51 Spring 011 Background Reading KNNL:.1.6 3-1 Topic Overview This topic will cover: Review of hypothesis testing Inference about 1 Inference about 0 Confidence Intervals

More information

OHSU OGI Class ECE-580-DOE :Design of Experiments Steve Brainerd

OHSU OGI Class ECE-580-DOE :Design of Experiments Steve Brainerd Why We Use Analysis of Variance to Compare Group Means and How it Works The question of how to compare the population means of more than two groups is an important one to researchers. Let us suppose that

More information

Simple Linear Regression: One Quantitative IV

Simple Linear Regression: One Quantitative IV Simple Linear Regression: One Quantitative IV Linear regression is frequently used to explain variation observed in a dependent variable (DV) with theoretically linked independent variables (IV). For example,

More information

Statistics for Managers using Microsoft Excel 6 th Edition

Statistics for Managers using Microsoft Excel 6 th Edition Statistics for Managers using Microsoft Excel 6 th Edition Chapter 13 Simple Linear Regression 13-1 Learning Objectives In this chapter, you learn: How to use regression analysis to predict the value of

More information

Lecture 15 Multiple regression I Chapter 6 Set 2 Least Square Estimation The quadratic form to be minimized is

Lecture 15 Multiple regression I Chapter 6 Set 2 Least Square Estimation The quadratic form to be minimized is Lecture 15 Multiple regression I Chapter 6 Set 2 Least Square Estimation The quadratic form to be minimized is Q = (Y i β 0 β 1 X i1 β 2 X i2 β p 1 X i.p 1 ) 2, which in matrix notation is Q = (Y Xβ) (Y

More information

BNAD 276 Lecture 10 Simple Linear Regression Model

BNAD 276 Lecture 10 Simple Linear Regression Model 1 / 27 BNAD 276 Lecture 10 Simple Linear Regression Model Phuong Ho May 30, 2017 2 / 27 Outline 1 Introduction 2 3 / 27 Outline 1 Introduction 2 4 / 27 Simple Linear Regression Model Managerial decisions

More information

Much of the material we will be covering for a while has to do with designing an experimental study that concerns some phenomenon of interest.

Much of the material we will be covering for a while has to do with designing an experimental study that concerns some phenomenon of interest. Experimental Design: Much of the material we will be covering for a while has to do with designing an experimental study that concerns some phenomenon of interest We wish to use our subjects in the best

More information

Chapte The McGraw-Hill Companies, Inc. All rights reserved.

Chapte The McGraw-Hill Companies, Inc. All rights reserved. 12er12 Chapte Bivariate i Regression (Part 1) Bivariate Regression Visual Displays Begin the analysis of bivariate data (i.e., two variables) with a scatter plot. A scatter plot - displays each observed

More information

Regression Analysis II

Regression Analysis II Regression Analysis II Measures of Goodness of fit Two measures of Goodness of fit Measure of the absolute fit of the sample points to the sample regression line Standard error of the estimate An index

More information

Keller: Stats for Mgmt & Econ, 7th Ed July 17, 2006

Keller: Stats for Mgmt & Econ, 7th Ed July 17, 2006 Chapter 17 Simple Linear Regression and Correlation 17.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will

More information

STATISTICS 174: APPLIED STATISTICS FINAL EXAM DECEMBER 10, 2002

STATISTICS 174: APPLIED STATISTICS FINAL EXAM DECEMBER 10, 2002 Time allowed: 3 HOURS. STATISTICS 174: APPLIED STATISTICS FINAL EXAM DECEMBER 10, 2002 This is an open book exam: all course notes and the text are allowed, and you are expected to use your own calculator.

More information

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X.

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X. Estimating σ 2 We can do simple prediction of Y and estimation of the mean of Y at any value of X. To perform inferences about our regression line, we must estimate σ 2, the variance of the error term.

More information

Applied Regression Modeling: A Business Approach Chapter 2: Simple Linear Regression Sections

Applied Regression Modeling: A Business Approach Chapter 2: Simple Linear Regression Sections Applied Regression Modeling: A Business Approach Chapter 2: Simple Linear Regression Sections 2.1 2.3 by Iain Pardoe 2.1 Probability model for and 2 Simple linear regression model for and....................................

More information

Ø Set of mutually exclusive categories. Ø Classify or categorize subject. Ø No meaningful order to categorization.

Ø Set of mutually exclusive categories. Ø Classify or categorize subject. Ø No meaningful order to categorization. Statistical Tools in Evaluation HPS 41 Dr. Joe G. Schmalfeldt Types of Scores Continuous Scores scores with a potentially infinite number of values. Discrete Scores scores limited to a specific number

More information

PubH 7405: REGRESSION ANALYSIS. MLR: INFERENCES, Part I

PubH 7405: REGRESSION ANALYSIS. MLR: INFERENCES, Part I PubH 7405: REGRESSION ANALYSIS MLR: INFERENCES, Part I TESTING HYPOTHESES Once we have fitted a multiple linear regression model and obtained estimates for the various parameters of interest, we want to

More information

Chapter 4: Regression Models

Chapter 4: Regression Models Sales volume of company 1 Textbook: pp. 129-164 Chapter 4: Regression Models Money spent on advertising 2 Learning Objectives After completing this chapter, students will be able to: Identify variables,

More information

Statistics Introductory Correlation

Statistics Introductory Correlation Statistics Introductory Correlation Session 10 oscardavid.barrerarodriguez@sciencespo.fr April 9, 2018 Outline 1 Statistics are not used only to describe central tendency and variability for a single variable.

More information

Chapter Eight: Assessment of Relationships 1/42

Chapter Eight: Assessment of Relationships 1/42 Chapter Eight: Assessment of Relationships 1/42 8.1 Introduction 2/42 Background This chapter deals, primarily, with two topics. The Pearson product-moment correlation coefficient. The chi-square test

More information

Psychology 282 Lecture #4 Outline Inferences in SLR

Psychology 282 Lecture #4 Outline Inferences in SLR Psychology 282 Lecture #4 Outline Inferences in SLR Assumptions To this point we have not had to make any distributional assumptions. Principle of least squares requires no assumptions. Can use correlations

More information

We like to capture and represent the relationship between a set of possible causes and their response, by using a statistical predictive model.

We like to capture and represent the relationship between a set of possible causes and their response, by using a statistical predictive model. Statistical Methods in Business Lecture 5. Linear Regression We like to capture and represent the relationship between a set of possible causes and their response, by using a statistical predictive model.

More information

Correlation: Relationships between Variables

Correlation: Relationships between Variables Correlation Correlation: Relationships between Variables So far, nearly all of our discussion of inferential statistics has focused on testing for differences between group means However, researchers are

More information

Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z).

Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z). Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z). For example P(X 1.04) =.8508. For z < 0 subtract the value from

More information

Categorical Predictor Variables

Categorical Predictor Variables Categorical Predictor Variables We often wish to use categorical (or qualitative) variables as covariates in a regression model. For binary variables (taking on only 2 values, e.g. sex), it is relatively

More information

Chapter 14 Simple Linear Regression (A)

Chapter 14 Simple Linear Regression (A) Chapter 14 Simple Linear Regression (A) 1. Characteristics Managerial decisions often are based on the relationship between two or more variables. can be used to develop an equation showing how the variables

More information

Correlation. A statistics method to measure the relationship between two variables. Three characteristics

Correlation. A statistics method to measure the relationship between two variables. Three characteristics Correlation Correlation A statistics method to measure the relationship between two variables Three characteristics Direction of the relationship Form of the relationship Strength/Consistency Direction

More information

Master s Written Examination - Solution

Master s Written Examination - Solution Master s Written Examination - Solution Spring 204 Problem Stat 40 Suppose X and X 2 have the joint pdf f X,X 2 (x, x 2 ) = 2e (x +x 2 ), 0 < x < x 2

More information

Interactions. Interactions. Lectures 1 & 2. Linear Relationships. y = a + bx. Slope. Intercept

Interactions. Interactions. Lectures 1 & 2. Linear Relationships. y = a + bx. Slope. Intercept Interactions Lectures 1 & Regression Sometimes two variables appear related: > smoking and lung cancers > height and weight > years of education and income > engine size and gas mileage > GMAT scores and

More information

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box.

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. FINAL EXAM ** Two different ways to submit your answer sheet (i) Use MS-Word and place it in a drop-box. (ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. Deadline: December

More information

Chapter 14 Student Lecture Notes Department of Quantitative Methods & Information Systems. Business Statistics. Chapter 14 Multiple Regression

Chapter 14 Student Lecture Notes Department of Quantitative Methods & Information Systems. Business Statistics. Chapter 14 Multiple Regression Chapter 14 Student Lecture Notes 14-1 Department of Quantitative Methods & Information Systems Business Statistics Chapter 14 Multiple Regression QMIS 0 Dr. Mohammad Zainal Chapter Goals After completing

More information

Statistics 203: Introduction to Regression and Analysis of Variance Course review

Statistics 203: Introduction to Regression and Analysis of Variance Course review Statistics 203: Introduction to Regression and Analysis of Variance Course review Jonathan Taylor - p. 1/?? Today Review / overview of what we learned. - p. 2/?? General themes in regression models Specifying

More information

The Multiple Regression Model

The Multiple Regression Model Multiple Regression The Multiple Regression Model Idea: Examine the linear relationship between 1 dependent (Y) & or more independent variables (X i ) Multiple Regression Model with k Independent Variables:

More information

Degrees of freedom df=1. Limitations OR in SPSS LIM: Knowing σ and µ is unlikely in large

Degrees of freedom df=1. Limitations OR in SPSS LIM: Knowing σ and µ is unlikely in large Z Test Comparing a group mean to a hypothesis T test (about 1 mean) T test (about 2 means) Comparing mean to sample mean. Similar means = will have same response to treatment Two unknown means are different

More information

In a one-way ANOVA, the total sums of squares among observations is partitioned into two components: Sums of squares represent:

In a one-way ANOVA, the total sums of squares among observations is partitioned into two components: Sums of squares represent: Activity #10: AxS ANOVA (Repeated subjects design) Resources: optimism.sav So far in MATH 300 and 301, we have studied the following hypothesis testing procedures: 1) Binomial test, sign-test, Fisher s

More information

AMS 7 Correlation and Regression Lecture 8

AMS 7 Correlation and Regression Lecture 8 AMS 7 Correlation and Regression Lecture 8 Department of Applied Mathematics and Statistics, University of California, Santa Cruz Suumer 2014 1 / 18 Correlation pairs of continuous observations. Correlation

More information

Ch. 16: Correlation and Regression

Ch. 16: Correlation and Regression Ch. 1: Correlation and Regression With the shift to correlational analyses, we change the very nature of the question we are asking of our data. Heretofore, we were asking if a difference was likely to

More information

Psych 230. Psychological Measurement and Statistics

Psych 230. Psychological Measurement and Statistics Psych 230 Psychological Measurement and Statistics Pedro Wolf December 9, 2009 This Time. Non-Parametric statistics Chi-Square test One-way Two-way Statistical Testing 1. Decide which test to use 2. State

More information

Business Statistics. Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220. Dr. Mohammad Zainal

Business Statistics. Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220. Dr. Mohammad Zainal Department of Quantitative Methods & Information Systems Business Statistics Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220 Dr. Mohammad Zainal Chapter Goals After completing

More information

Chapter Learning Objectives. Regression Analysis. Correlation. Simple Linear Regression. Chapter 12. Simple Linear Regression

Chapter Learning Objectives. Regression Analysis. Correlation. Simple Linear Regression. Chapter 12. Simple Linear Regression Chapter 12 12-1 North Seattle Community College BUS21 Business Statistics Chapter 12 Learning Objectives In this chapter, you learn:! How to use regression analysis to predict the value of a dependent

More information

Lecture 3: Linear Models. Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012

Lecture 3: Linear Models. Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012 Lecture 3: Linear Models Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector of observed

More information

Section 9.4. Notation. Requirements. Definition. Inferences About Two Means (Matched Pairs) Examples

Section 9.4. Notation. Requirements. Definition. Inferences About Two Means (Matched Pairs) Examples Objective Section 9.4 Inferences About Two Means (Matched Pairs) Compare of two matched-paired means using two samples from each population. Hypothesis Tests and Confidence Intervals of two dependent means

More information

The One-Way Repeated-Measures ANOVA. (For Within-Subjects Designs)

The One-Way Repeated-Measures ANOVA. (For Within-Subjects Designs) The One-Way Repeated-Measures ANOVA (For Within-Subjects Designs) Logic of the Repeated-Measures ANOVA The repeated-measures ANOVA extends the analysis of variance to research situations using repeated-measures

More information

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 Lecture 2: Linear Models Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector

More information

Chapter 14 Student Lecture Notes 14-1

Chapter 14 Student Lecture Notes 14-1 Chapter 14 Student Lecture Notes 14-1 Business Statistics: A Decision-Making Approach 6 th Edition Chapter 14 Multiple Regression Analysis and Model Building Chap 14-1 Chapter Goals After completing this

More information

4/22/2010. Test 3 Review ANOVA

4/22/2010. Test 3 Review ANOVA Test 3 Review ANOVA 1 School recruiter wants to examine if there are difference between students at different class ranks in their reported intensity of school spirit. What is the factor? How many levels

More information

Lecture 10 Multiple Linear Regression

Lecture 10 Multiple Linear Regression Lecture 10 Multiple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: 6.1-6.5 10-1 Topic Overview Multiple Linear Regression Model 10-2 Data for Multiple Regression Y i is the response variable

More information

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire

More information

Week 12 Hypothesis Testing, Part II Comparing Two Populations

Week 12 Hypothesis Testing, Part II Comparing Two Populations Week 12 Hypothesis Testing, Part II Week 12 Hypothesis Testing, Part II Week 12 Objectives 1 The principle of Analysis of Variance is introduced and used to derive the F-test for testing the model utility

More information

Regression Analysis. Table Relationship between muscle contractile force (mj) and stimulus intensity (mv).

Regression Analysis. Table Relationship between muscle contractile force (mj) and stimulus intensity (mv). Regression Analysis Two variables may be related in such a way that the magnitude of one, the dependent variable, is assumed to be a function of the magnitude of the second, the independent variable; however,

More information

Lecture 6 Multiple Linear Regression, cont.

Lecture 6 Multiple Linear Regression, cont. Lecture 6 Multiple Linear Regression, cont. BIOST 515 January 22, 2004 BIOST 515, Lecture 6 Testing general linear hypotheses Suppose we are interested in testing linear combinations of the regression

More information

Lectures on Simple Linear Regression Stat 431, Summer 2012

Lectures on Simple Linear Regression Stat 431, Summer 2012 Lectures on Simple Linear Regression Stat 43, Summer 0 Hyunseung Kang July 6-8, 0 Last Updated: July 8, 0 :59PM Introduction Previously, we have been investigating various properties of the population

More information

DETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics

DETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics DETAILED CONTENTS About the Author Preface to the Instructor To the Student How to Use SPSS With This Book PART I INTRODUCTION AND DESCRIPTIVE STATISTICS 1. Introduction to Statistics 1.1 Descriptive and

More information

Simple Linear Regression

Simple Linear Regression 9-1 l Chapter 9 l Simple Linear Regression 9.1 Simple Linear Regression 9.2 Scatter Diagram 9.3 Graphical Method for Determining Regression 9.4 Least Square Method 9.5 Correlation Coefficient and Coefficient

More information

This gives us an upper and lower bound that capture our population mean.

This gives us an upper and lower bound that capture our population mean. Confidence Intervals Critical Values Practice Problems 1 Estimation 1.1 Confidence Intervals Definition 1.1 Margin of error. The margin of error of a distribution is the amount of error we predict when

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression ST 430/514 Recall: a regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates).

More information

Reminder: Student Instructional Rating Surveys

Reminder: Student Instructional Rating Surveys Reminder: Student Instructional Rating Surveys You have until May 7 th to fill out the student instructional rating surveys at https://sakai.rutgers.edu/portal/site/sirs The survey should be available

More information

SMA 6304 / MIT / MIT Manufacturing Systems. Lecture 10: Data and Regression Analysis. Lecturer: Prof. Duane S. Boning

SMA 6304 / MIT / MIT Manufacturing Systems. Lecture 10: Data and Regression Analysis. Lecturer: Prof. Duane S. Boning SMA 6304 / MIT 2.853 / MIT 2.854 Manufacturing Systems Lecture 10: Data and Regression Analysis Lecturer: Prof. Duane S. Boning 1 Agenda 1. Comparison of Treatments (One Variable) Analysis of Variance

More information

Chapter 14. Linear least squares

Chapter 14. Linear least squares Serik Sagitov, Chalmers and GU, March 5, 2018 Chapter 14 Linear least squares 1 Simple linear regression model A linear model for the random response Y = Y (x) to an independent variable X = x For a given

More information

Can you tell the relationship between students SAT scores and their college grades?

Can you tell the relationship between students SAT scores and their college grades? Correlation One Challenge Can you tell the relationship between students SAT scores and their college grades? A: The higher SAT scores are, the better GPA may be. B: The higher SAT scores are, the lower

More information

Econ 3790: Business and Economics Statistics. Instructor: Yogesh Uppal

Econ 3790: Business and Economics Statistics. Instructor: Yogesh Uppal Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu Sampling Distribution of b 1 Expected value of b 1 : Variance of b 1 : E(b 1 ) = 1 Var(b 1 ) = σ 2 /SS x Estimate of

More information

Spatial inference. Spatial inference. Accounting for spatial correlation. Multivariate normal distributions

Spatial inference. Spatial inference. Accounting for spatial correlation. Multivariate normal distributions Spatial inference I will start with a simple model, using species diversity data Strong spatial dependence, Î = 0.79 what is the mean diversity? How precise is our estimate? Sampling discussion: The 64

More information
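The z-test procedure described earlier (state H0 and HA, choose α, compute z = (x̄ − µ)/σx̄, compare against the critical value) can be sketched in code. This is a minimal illustration, not part of the original text: the function name `z_test` and the example numbers (n = 100, x̄ = 42.8, µ = 42, σ = 3) are hypothetical, and the population standard deviation is assumed known.

```python
from statistics import NormalDist  # standard normal CDF/inverse CDF (Python 3.8+)

def z_test(sample_mean, mu, sigma, n, alpha=0.05, two_tailed=True):
    """Large-sample z-test for a mean, assuming sigma is known.

    Returns (z, z_crit, reject): the test statistic, the critical
    value, and whether H0 is rejected at significance level alpha.
    """
    se = sigma / n ** 0.5              # standard error of the mean, sigma/sqrt(n)
    z = (sample_mean - mu) / se        # test statistic
    if two_tailed:
        # HA: x-bar != mu -> alpha/2 in each tail
        z_crit = NormalDist().inv_cdf(1 - alpha / 2)
        reject = abs(z) > z_crit
    else:
        # HA: x-bar > mu -> all of alpha in the upper tail
        z_crit = NormalDist().inv_cdf(1 - alpha)
        reject = z > z_crit
    return z, z_crit, reject

# Hypothetical example: n = 100, sample mean 42.8, H0: mu = 42, sigma = 3
z, z_crit, reject = z_test(42.8, 42, 3, 100)
# z = 0.8 / 0.3 ~ 2.667 > z_crit ~ 1.96, so H0 is rejected
```

Note that the one-tailed branch uses all of α in a single tail, which is why a one-tailed test rejects with a smaller critical value (about 1.645 at α = 0.05) than the two-tailed test (about 1.96), matching the acceptance/rejection regions described above.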