
Sociology 405/805
Revised February 4, 2004

Summary of Formulae for Bivariate Regression and Correlation

Let X be an independent variable and Y a dependent variable, with n observations for each of the values of these two variables. Preferably, both X and Y are measured at the interval or ratio level, although it is also common to estimate correlation coefficients and regression lines when one or both variables are measured at only the ordinal level.

The first stage in obtaining the estimates of correlation and regression statistics is to compute $\Sigma X$, $\Sigma Y$, $\Sigma X^2$, $\Sigma Y^2$ and $\Sigma XY$. Each summation is across all n values of X and Y. Then use these summations to calculate the following expressions:

$$S_{XX} = \Sigma (X - \bar{X})^2 = \Sigma X^2 - \frac{(\Sigma X)^2}{n}$$

$$S_{XY} = \Sigma XY - \frac{(\Sigma X)(\Sigma Y)}{n}$$

$$S_{YY} = \Sigma Y^2 - \frac{(\Sigma Y)^2}{n}$$

Correlation coefficient

Using the above expressions, the correlation coefficient is

$$r = \frac{S_{XY}}{\sqrt{S_{XX}\, S_{YY}}}$$

Regression line

The slope b and the intercept a of the regression line are

$$b = \frac{S_{XY}}{S_{XX}}, \qquad a = \bar{Y} - b\bar{X}$$
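The computation described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `bivariate_summary` and the five data points are hypothetical and do not come from the course notes.

```python
import math

def bivariate_summary(xs, ys):
    """Compute S_XX, S_XY, S_YY, r, b and a from the five raw summations."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    # Corrected sums of squares and cross-products.
    s_xx = sum_x2 - sum_x ** 2 / n
    s_xy = sum_xy - sum_x * sum_y / n
    s_yy = sum_y2 - sum_y ** 2 / n

    r = s_xy / math.sqrt(s_xx * s_yy)   # correlation coefficient
    b = s_xy / s_xx                     # slope
    a = sum_y / n - b * (sum_x / n)     # intercept: Y-bar minus b times X-bar
    return {"S_XX": s_xx, "S_XY": s_xy, "S_YY": s_yy, "r": r, "b": b, "a": a}

# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
summary = bivariate_summary(xs, ys)
print(summary)
```

For these five points the sums are ΣX = 15, ΣY = 20, ΣX² = 55, ΣY² = 86 and ΣXY = 66, which give S_XX = 10, S_XY = 6 and S_YY = 6, so that b = 0.6, a = 2.2 and r is about 0.775.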

Sociology 405/805, Winter 2004: Correlation and regression formulae

In the intercept formula, $\bar{Y} = \Sigma Y / n$ and $\bar{X} = \Sigma X / n$ are the sample means. The estimate of the regression line expressing the relationship between the dependent variable Y and the independent variable X is

$$\hat{Y} = a + bX$$

Standard errors

For this regression line, the standard error of estimate is

$$s_e = \sqrt{\frac{\Sigma Y^2 - a\,\Sigma Y - b\,\Sigma XY}{n - 2}}$$

and the standard deviation of b is

$$s_b = \frac{s_e}{\sqrt{S_{XX}}}$$

The standard deviation of the mean predicted value $\hat{Y}$ is

$$s_{\hat{Y}} = s_e \sqrt{\frac{1}{n} + \frac{(X - \bar{X})^2}{S_{XX}}}$$

and the standard deviation for an individual predicted value, $\hat{Y}_i$, is

$$s_{\hat{Y}_i} = s_e \sqrt{1 + \frac{1}{n} + \frac{(X - \bar{X})^2}{S_{XX}}}$$

Components of the Variation in the Dependent Variable Y

The total variation in the dependent variable is

$$SS_t = \Sigma (Y_i - \bar{Y})^2 = S_{YY}$$

This total variation can be broken into two components: the explained variation, or regression sum of squares, and the unexplained variation, or residual sum of squares.
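As a rough illustration of the standard-error formulas given earlier, the sketch below evaluates s_e, s_b and the two prediction standard deviations at a chosen value of X. The function name `standard_errors`, the argument `x0` and the data are assumptions made for this example only.

```python
import math

def standard_errors(xs, ys, x0):
    """Return (s_e, s_b, s_mean, s_indiv) for a prediction at X = x0."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    x_bar = sum_x / n

    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_xy = sum_xy - sum_x * sum_y / n
    b = s_xy / s_xx
    a = sum_y / n - b * x_bar

    # Residual sum of squares via the shortcut formula; clamp at zero so that
    # floating-point error on a perfect fit cannot produce a negative value.
    ss_e = max(sum_y2 - a * sum_y - b * sum_xy, 0.0)
    s_e = math.sqrt(ss_e / (n - 2))      # standard error of estimate
    s_b = s_e / math.sqrt(s_xx)          # standard deviation of the slope

    spread_term = 1 / n + (x0 - x_bar) ** 2 / s_xx
    s_mean = s_e * math.sqrt(spread_term)        # mean predicted value
    s_indiv = s_e * math.sqrt(1 + spread_term)   # individual predicted value
    return s_e, s_b, s_mean, s_indiv

# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
s_e, s_b, s_mean, s_indiv = standard_errors(xs, ys, x0=3)
print(s_e, s_b, s_mean, s_indiv)
```

Both prediction standard deviations are smallest at X = X̄ (here x0 = 3) and grow as x0 moves away from the mean, because the (X − X̄)² term increases.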

The regression sum of squares is

$$SS_r = \Sigma (\hat{Y}_i - \bar{Y})^2 = b^2 S_{XX}$$

The unexplained variation, or the residual or error sum of squares, is

$$SS_e = \Sigma (Y_i - \hat{Y}_i)^2 = \Sigma Y^2 - a\,\Sigma Y - b\,\Sigma XY$$

These two components of the total variation can be used to determine $R^2$, the goodness of fit of the regression equation:

$$R^2 = \frac{SS_r}{SS_t}$$

Tests of Statistical Significance

There are various ways of testing for the statistical significance of the regression line. For each test, the null hypothesis is that there is no relationship between X and Y. The alternative hypothesis can be constructed as either a one- or two-directional statement. These can be stated in general as:

$H_0$: No relationship between X and Y
$H_1$: Some relationship between X and Y

Alternatively, the research hypothesis can be stated as a one-directional relationship, either a positive or a negative relationship between X and Y.

For a hypothesis test about the goodness of fit $R^2$, the hypotheses are:

$H_0$: $R^2 = 0$
$H_1$: $R^2 \neq 0$

The test for $R^2$ is an F test with 1 and (n − 2) degrees of freedom and can be written as:

$$F = \frac{SS_r}{SS_e/(n-2)} = \frac{R^2 (n-2)}{1 - R^2}$$

The tests of significance for the correlation coefficient r, and for the slope of the line b, are usually constructed as one-directional tests. The null hypothesis is that there is no relationship between X and Y, and the research hypothesis is either a positive relationship between X and Y, or a negative relationship between the two variables.

If the test is to determine whether there is a positive relationship between the two variables, the hypotheses for the test of significance on the Pearson correlation coefficient r would be as follows. Let ρ be the true correlation between X and Y.

$H_0$: $\rho = 0$
$H_1$: $\rho > 0$

The following t-test with (n − 2) degrees of freedom tests these hypotheses:

$$t = r \sqrt{\frac{n-2}{1 - r^2}}$$

To test for a positive slope for the regression line, b, the hypotheses are:

$H_0$: $\beta = 0$
$H_1$: $\beta > 0$

where β is the slope of the true regression line when Y is regressed on X. This test is usually written as a t-test with n − 2 degrees of freedom, where

$$t = \frac{b - \beta}{s_b}$$

If the null hypothesis is that β = 0, then this test is simply

$$t = \frac{b}{s_b}$$

Note that for a bivariate relationship, involving only two variables, each of the above three tests is really the same test, so not more than one of these tests need be reported. That is,

$$t = \frac{b}{s_b} = r \sqrt{\frac{n-2}{1 - r^2}} \qquad \text{and} \qquad \frac{R^2}{1 - R^2}\,(n-2) = t^2$$
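The equivalence of the three tests can be checked numerically. The sketch below is illustrative only: the function name `significance_tests` and the data are hypothetical. It also obtains SS_e as SS_t − SS_r, relying on the two-component decomposition stated above, rather than through the shortcut formula.

```python
import math

def significance_tests(xs, ys):
    """Return R^2, F, and the t statistic computed from b and from r."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n

    b = s_xy / s_xx
    r = s_xy / math.sqrt(s_xx * s_yy)

    ss_t = s_yy                 # total variation
    ss_r = b ** 2 * s_xx        # regression (explained) sum of squares
    ss_e = ss_t - ss_r          # residual sum of squares, by the decomposition
    r_squared = ss_r / ss_t

    # F test with 1 and n - 2 degrees of freedom.
    f_stat = ss_r / (ss_e / (n - 2))

    # t test for the slope under H0: beta = 0.
    s_e = math.sqrt(ss_e / (n - 2))
    s_b = s_e / math.sqrt(s_xx)
    t_from_b = b / s_b

    # t test for the correlation coefficient (undefined when r is exactly +1 or -1).
    t_from_r = r * math.sqrt((n - 2) / (1 - r ** 2))
    return r_squared, f_stat, t_from_b, t_from_r

# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r_squared, f_stat, t_from_b, t_from_r = significance_tests(xs, ys)
print(r_squared, f_stat, t_from_b, t_from_r)
```

With these data R² = 0.6 and F = 4.5, while both t statistics are about 2.121, whose square is 4.5. The sketch stops at the test statistics; comparing them with critical values of the t or F distribution is left to tables or a statistics library.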

Analysis of variance

The decomposition of the variation of Y is presented as an analysis of variance table in Table 1. Recall that

$$SS_r = \Sigma (\hat{Y}_i - \bar{Y})^2 = b^2 S_{XX}$$

and

$$SS_e = \Sigma (Y_i - \hat{Y}_i)^2 = \Sigma Y^2 - a\,\Sigma Y - b\,\Sigma XY,$$

and the F test is

$$F = \frac{SS_r}{SS_e/(n-2)} = \frac{R^2 (n-2)}{1 - R^2}$$

Table 1: Analysis of Variance Table

| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | F |
|---|---|---|---|---|
| Regression | $SS_r = \Sigma (\hat{Y}_i - \bar{Y})^2$ | 1 | $SS_r$ | $R^2 (n-2)/(1 - R^2)$ |
| Residual | $SS_e = \Sigma (Y_i - \hat{Y}_i)^2$ | $n - 2$ | $SS_e/(n-2)$ | |
| Total | $SS_t = \Sigma (Y_i - \bar{Y})^2$ | $n - 1$ | | |

Last edited February 4, 2004