Correlation and Covariance

Tom Ilvento, FREC 9

What is Next? Correlation and Regression

Regression
- We specify a dependent variable as a linear function of one or more independent variables, based on co-variance
- Regression provides estimates of the relationship between the dependent variable and the independent variable(s) via an equation of a line
- The estimates, called coefficients, can be based on a sample and can be tested via a hypothesis test or confidence interval

Correlation and Regression

Correlation
- A measure of association between two variables
- Expressed as a linear relationship
- Based on the co-variance: how two variables vary about their means together
- Can be shown in a visual way via a scatterplot

[Figure: Bivariate Fit scatterplot annotated with r]

Correlation and Regression
- A focus on the variance:
  $\sum (X - \bar{X})^2 = TSS$ (Total Sum of Squares Deviations)
  $\frac{\sum (X - \bar{X})^2}{n - 1} = MS$ (Mean Squared Deviation)
- A focus on the co-variance:
  $Cov_{XY} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n}$
- A focus on the equation of a line: Y = a + b*X, where a is the intercept and b is the slope

Let's revisit the Variance
- In statistics we are interested in how a variable varies about its mean
- We represented this as the Variance, the Mean Squared Deviation:
  $\sum (X - \bar{X})^2 = TSS$ (Total Sum of Squares Deviations)
  $\frac{\sum (X - \bar{X})^2}{n - 1} = MS$ (Mean Squared Deviation)

Basics of Co-Variance
- Let's start with a basic graph of a Y-variable vs. an X-variable. I will dissect the graph with the mean of X and the mean of Y

  Quadrant   Position relative to Y-mean   Position relative to X-mean
  I          Above Y-mean                  Above X-mean
  II         Above Y-mean                  Below X-mean
  III        Below Y-mean                  Below X-mean
  IV         Below Y-mean                  Above X-mean

The Co-Variance
- The covariance looks at how two variables, X and Y, vary about their means together
- We express it as an average, divided by n (not n-1):
  $Cov_{XY} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n} = \frac{SS_{XY}}{n}$
- The covariance is a basic building block of correlation, regression, and the General Linear Model

Basics of Co-Variance
- Values that tend to fall in quadrants II and IV reflect negative covariance
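As a minimal sketch of the covariance calculation described above, the following Python sums the cross-products of deviations, divides by n, and labels each point by its quadrant relative to the two means. The data values and the function names `covariance` and `quadrant` are hypothetical, chosen only for illustration.

```python
def covariance(xs, ys):
    """Average cross-product of deviations about the means, dividing by n."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return ss_xy / n


def quadrant(x, y, x_bar, y_bar):
    """Quadrant of a point relative to the means (None if it lies on a mean line)."""
    if x == x_bar or y == y_bar:
        return None
    if x > x_bar:
        return "I" if y > y_bar else "IV"
    return "II" if y > y_bar else "III"


# Hypothetical data, not taken from the slides.
xs = [2, 4, 6, 8]
ys = [1, 5, 3, 7]
x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)

cov_xy = covariance(xs, ys)
quadrants = [quadrant(x, y, x_bar, y_bar) for x, y in zip(xs, ys)]

print(cov_xy)     # positive when quadrants I and III dominate the cross-products
print(quadrants)
```

Points in quadrants I and III contribute positive cross-products and points in II and IV contribute negative ones, so the sign of the result reflects where most of the data fall.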

Basics of Co-Variance
- Values that tend to fall in quadrants I and III reflect positive covariance

States with the Smartest Kids
- Here are distribution statistics for th and th grade math

[JMP Distributions output for the two math variables: quantiles, moments, and stem-and-leaf plots]

States with the Smartest Kids
- This is some data on states plus Washington D.C. and army bases overseas
- The key variables are the percent of students in 9 who scored at an advanced level or higher for th and th grade math, and th and th grade reading
- Any thoughts on this data?

Smartest Kids Data: Covariance of th Math and th Math
- Most of the data points fall into quadrants I and III
- Positive co-variance
- As th grade percent increases, so does th grade percent

[Figure: Bivariate Fit scatterplot with Fit Mean lines; Covariance Matrix]
- The numbers on the diagonal are the variances: the covariance of a variable with itself is the variance
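The point that the diagonal of a covariance matrix holds the variances can be checked with a short sketch. The `ddof` argument below is an assumption added for illustration (0 divides by n, 1 divides by n-1), and the data are hypothetical.

```python
import statistics


def covariance(xs, ys, ddof=0):
    """Covariance with divisor n - ddof (ddof=0 divides by n, ddof=1 by n-1)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return ss_xy / (n - ddof)


# Hypothetical data, not taken from the slides.
xs = [2, 4, 6, 8]

# The covariance of a variable with itself is its variance,
# as long as both use the same divisor.
pop_var = covariance(xs, xs, ddof=0)
sample_var = covariance(xs, xs, ddof=1)

print(pop_var, statistics.pvariance(xs))
print(sample_var, statistics.variance(xs))
```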

Shortcomings of Co-Variance
- The covariance between two variables is a useful concept: it is the building block for regression and other multivariate techniques
- But as a measure of association it has limits
  - It is symmetrical (not a bad thing)
  - It is unbounded: unknown high or low
  - It is difficult to determine what the value represents: a lot? a little? just how much?
  - Expressed in awkward cross-product units

[Covariance Matrix output]

Smartest Kids Data

[Figure: Bivariate Fit scatterplot with Fit Mean lines; Covariance Matrix]

Pearson Correlation Coefficient - r
- The correlation coefficient (r) is the co-variance adjusted for the standard deviations of both variables
- The adjustment is simple, and it makes r much easier to interpret

  $r = \frac{Cov_{XY}}{s_X s_Y}$

  $r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}}$

  $r = \frac{SS_{XY}}{\sqrt{SS_X SS_Y}}$

Properties of r
- Based on a linear measure of association
- Bounded between -1 and 1
- Symmetrical relationship: $r_{XY} = r_{YX}$
- Easier to interpret
- Invariant to linear scaling: adding or subtracting a constant, or multiplying or dividing by a positive constant, does not change the value of r between two variables
- Example: The correlation between the respondent's education and income does not change whether you express income in total dollars or in multiples of a fixed dollar amount
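A minimal sketch of the correlation coefficient in its sum-of-squares form, together with a check of the symmetry and scaling properties, might look like the following. The function name `pearson_r` and the education and income figures are hypothetical; the rescaling by 1,000 is just one illustrative choice of a positive constant.

```python
import math


def pearson_r(xs, ys):
    """Pearson r computed as SS_XY / sqrt(SS_X * SS_Y)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_x = sum((x - x_bar) ** 2 for x in xs)
    ss_y = sum((y - y_bar) ** 2 for y in ys)
    return ss_xy / math.sqrt(ss_x * ss_y)


# Hypothetical data, not taken from the slides.
education = [10, 12, 14, 16]                    # years of schooling
income_dollars = [20000, 45000, 35000, 60000]   # income in total dollars
income_thousands = [v / 1000 for v in income_dollars]
income_shifted = [v + 500 for v in income_dollars]

r_dollars = pearson_r(education, income_dollars)
r_thousands = pearson_r(education, income_thousands)
r_shifted = pearson_r(education, income_shifted)
r_swapped = pearson_r(income_dollars, education)

# All four values agree up to floating-point rounding.
print(r_dollars, r_thousands, r_shifted, r_swapped)
```

Multiplying one variable by a negative constant would flip the sign of r while leaving its magnitude unchanged, which is why the sketch uses a positive rescaling.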

Interpretation of r
- The closer the correlation is to 1, the more perfect the positive linear relationship
  - If r = 1 then all values would fall on a straight line with an upward slope
- The closer the correlation is to -1, the more perfect the negative linear relationship
  - If r = -1 then all values would fall on a straight line with a downward slope
- The scatterplot is a visual depiction of the correlation coefficient

Interpretation of r
- r = 0 means no linear relationship

[Figure: Non-Linear Relationship with a Near-Zero r]

[Figure: Scatter Plot of Crab Force by Height (Bivariate Fit)]
- The correlation is ., a strong positive correlation

Interpretation of r
- One other interesting interpretation of r
- The square of r is equal to R-square, a measure of association in Regression
  - Only in the case of a bivariate regression (one independent variable)
  - And it moves us toward defining one variable as explaining the other
- This means that $r^2$ can be interpreted as the percent of variability in a variable that is explained by the other variable
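Two of the claims above can be checked numerically with a short sketch: that the square of r matches R-square from a one-predictor least-squares fit, and that a clearly non-linear relationship can still give r = 0. The data and the helper names `pearson_r` and `r_square_from_fit` are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson r computed as SS_XY / sqrt(SS_X * SS_Y)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_x = sum((x - x_bar) ** 2 for x in xs)
    ss_y = sum((y - y_bar) ** 2 for y in ys)
    return ss_xy / math.sqrt(ss_x * ss_y)


def r_square_from_fit(xs, ys):
    """R-square of the least-squares line Y = a + b*X (one independent variable)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))          # slope
    a = y_bar - b * x_bar                              # intercept
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - y_bar) ** 2 for y in ys)
    return 1 - sse / sst


# Hypothetical linear-ish data.
xs = [2, 4, 6, 8]
ys = [1, 5, 3, 7]
r = pearson_r(xs, ys)
r_square = r_square_from_fit(xs, ys)
print(r ** 2, r_square)   # the two agree up to rounding

# A perfect but non-linear (quadratic) relationship.
curve_x = [-2, -1, 0, 1, 2]
curve_y = [x ** 2 for x in curve_x]
r_curve = pearson_r(curve_x, curve_y)
print(r_curve)            # zero linear association despite an exact relationship
```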

Interpreting a correlation coefficient: Rules of Thumb for Narratives
- The following is a table giving guidelines for narratives involving correlations
- For simplicity's sake, the table is based on the absolute value of the correlation (|r|)
- And the exact description depends upon the subject and discipline

  Correlation Range   Percent Variability Explained ($r^2$)   Description
  . to .              to %                                    Weak
  . to .9             % to %                                  Moderate
  . to .7             % to %                                  Moderately Strong
  .7 to .             7% to %                                 Strong

Some pointers in correlation and covariance
- Correlation and co-variance require the number of observations for all variables to be the same: cases with missing values are excluded. With Excel, this is even more of a problem
- Try to put the variable you are most interested in (i.e., the Dependent Variable) in the first column. Then you read down the first column to find the relationship of the dependent variable with the other variables
- Reading the correlation between other variables requires you to move across rows and down columns
- To get the covariance and correlation in Excel it is easy:
  Tools > Data Analysis > Covariance or Correlation > Input Range (click to the right and grab the data in all four columns including labels) > Grouped by Columns > Labels in first row (Yes)
- In JMP you need to go to Multivariate Methods > Multivariate
  - List the variables
  - Click the Hot Point to get correlations or covariances

Omni-Bar Study
- You are the marketing manager for OmniFoods and you are planning a nation-wide introduction of an energy bar, OmniPower
- The bar was first marketed to high-end athletes and mountain climbers, but now is more popular with the general public
- The company wants to test market the bars and determine the effect of price and in-store promotions on the sales of the bars
- They design a study and test OmniPower in a sample of stores in a supermarket chain
- The dependent variable is Sales in dollars
- The independent variables are price and promotion. Whole values have been carefully chosen for the study
  - Price in three levels: $.9, $.79, and $.99
  - Promotion in store in three levels: $, $, $
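The pointers above (exclude cases with missing values, put the dependent variable first, then read down the first column) can be sketched in plain Python. This is an illustration under stated assumptions, not the Excel or JMP procedure: the variable names, the use of `None` for a missing value, and all data values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson r computed as SS_XY / sqrt(SS_X * SS_Y)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_x = sum((x - x_bar) ** 2 for x in xs)
    ss_y = sum((y - y_bar) ** 2 for y in ys)
    return ss_xy / math.sqrt(ss_x * ss_y)


def complete_cases(columns):
    """Keep only the rows where every variable has a value."""
    names = list(columns)
    rows = zip(*(columns[name] for name in names))
    kept = [row for row in rows if all(value is not None for value in row)]
    return {name: [row[i] for row in kept] for i, name in enumerate(names)}


def correlation_matrix(columns):
    """Square matrix of pairwise correlations, in the order the columns are given."""
    names = list(columns)
    return [[pearson_r(columns[a], columns[b]) for b in names] for a in names]


# Hypothetical data with the dependent variable listed first; None marks a missing value.
data = {
    "sales":     [10, 8, None, 6, 4, 3],
    "price":     [1, 2, 2, 3, 4, 5],
    "promotion": [5, None, 3, 4, 2, 1],
}

clean = complete_cases(data)      # rows with any missing value are dropped
corr = correlation_matrix(clean)

# Reading down the first column gives each variable's correlation with sales.
for name, row in zip(clean, corr):
    print(name, [round(value, 3) for value in row])
```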

A closer look at sales

[JMP Distributions output for sales: quantiles, moments, and stem-and-leaf plot]
- Mean level of sales is $,9
- The median is considerably higher at $,
- A fair amount of spread in the data: CV is . and Std. Dev is $,

Covariance and Correlation

[Covariance Matrix and Correlations tables including PRICE and PROMOTION]
- The correlations are estimated by REML method.

Note:
- Covariance is difficult to interpret
- Correlations are relatively straightforward
- Little correlation between Price and Promotion

Let's look at the relationship of Sales with Price and Sales with Promotion
- Sales by Price has a downward sloping relationship
  - As Price goes up, sales go down: negative covariance
  - It looks linear and moderately strong
- Sales by Promotion has an upward sloping relationship
  - As Promotion goes up, sales go up: positive covariance
  - It looks linear and moderately strong

[Figures: Bivariate Fit by PRICE; Bivariate Fit by PROMOTION]

It is an Easy step to Regression

[Figure: Bivariate Fit by PRICE]

Real Life Correlation Example
- Client: Nicholas Hidell, Quip Laboratories
- The company had two ways to measure how clean the labs were: CFU and RLU
- One was more expensive and preferred by the company
- The other was cheaper and preferred by the client
- They wanted to show the client that the two measures were not the same

Let's look at an example of correlation and covariance

[JMP Distributions output: quantiles and moments for RATING, SALARY, and YEARS; frequencies for ORIGIN (Inside Company, Outside Company)]

- The following is some data about mid-level managers in a company. The variables are:
  - RATING, a rating scale of the managers from to
  - SALARY, the salary of the manager in $,
  - YEARS, years of service at the company
  - ORIGIN, a dummy variable indicating whether they were promoted inside the company (coded as ) or were recruited from outside the company (coded as )
- At this point we won't worry about a dependent or independent variable

The Covariance Matrix for Manager Ratings Data
- The covariance matrix has the variances on the diagonal (population variance based on n) and the co-variances on the off-diagonal
- It is a symmetric matrix

[Covariance Matrix for RATING, SALARY, YEARS, ORIGIN]

The Correlation Matrix for Manager Ratings Data
- The covariances are standardized to lie between -1 and 1
- The diagonal is now 1: a variable is perfectly correlated with itself
- It is a symmetrical matrix

[Correlations matrix for RATING, SALARY, YEARS, ORIGIN]

Interpretation of Manager Ratings Data
- There is a moderately strong positive relationship between SALARY and RATING: those that get higher salaries tend to have higher ratings
- Almost no relationship between YEARS in the company and the RATING (r =.77), but there is a negative relationship between YEARS and SALARY

[Figure: Bivariate Fit of SALARY By RATING; Correlations matrix for RATING, SALARY, YEARS, ORIGIN]
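The step from a covariance matrix to a correlation matrix can be sketched as dividing each entry by the product of the two standard deviations, which are the square roots of the diagonal entries. The matrix below is hypothetical and is not the manager ratings data; `cov_to_corr` is an illustrative name.

```python
import math


def cov_to_corr(cov):
    """Standardize a covariance matrix: corr[i][j] = cov[i][j] / (s_i * s_j)."""
    size = len(cov)
    sds = [math.sqrt(cov[i][i]) for i in range(size)]   # diagonal holds the variances
    return [[cov[i][j] / (sds[i] * sds[j]) for j in range(size)]
            for i in range(size)]


# Hypothetical symmetric covariance matrix for three variables.
cov = [
    [4.0, 2.0, -1.0],
    [2.0, 9.0, 0.0],
    [-1.0, 0.0, 1.0],
]

corr = cov_to_corr(cov)
for row in corr:
    print([round(value, 3) for value in row])
```

The result keeps the symmetry of the covariance matrix, has ones on the diagonal, and every off-diagonal entry falls between -1 and 1.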

Regression, Inference, and Model Building

Regression, Inference, and Model Building Regressio, Iferece, ad Model Buildig Scatter Plots ad Correlatio Correlatio coefficiet, r -1 r 1 If r is positive, the the scatter plot has a positive slope ad variables are said to have a positive relatioship

More information

II. Descriptive Statistics D. Linear Correlation and Regression. 1. Linear Correlation

II. Descriptive Statistics D. Linear Correlation and Regression. 1. Linear Correlation II. Descriptive Statistics D. Liear Correlatio ad Regressio I this sectio Liear Correlatio Cause ad Effect Liear Regressio 1. Liear Correlatio Quatifyig Liear Correlatio The Pearso product-momet correlatio

More information

ST 305: Exam 3 ( ) = P(A)P(B A) ( ) = P(A) + P(B) ( ) = 1 P( A) ( ) = P(A) P(B) σ X 2 = σ a+bx. σ ˆp. σ X +Y. σ X Y. σ X. σ Y. σ n.

ST 305: Exam 3 ( ) = P(A)P(B A) ( ) = P(A) + P(B) ( ) = 1 P( A) ( ) = P(A) P(B) σ X 2 = σ a+bx. σ ˆp. σ X +Y. σ X Y. σ X. σ Y. σ n. ST 305: Exam 3 By hadig i this completed exam, I state that I have either give or received assistace from aother perso durig the exam period. I have used o resources other tha the exam itself ad the basic

More information

1 Inferential Methods for Correlation and Regression Analysis

1 Inferential Methods for Correlation and Regression Analysis 1 Iferetial Methods for Correlatio ad Regressio Aalysis I the chapter o Correlatio ad Regressio Aalysis tools for describig bivariate cotiuous data were itroduced. The sample Pearso Correlatio Coefficiet

More information

Response Variable denoted by y it is the variable that is to be predicted measure of the outcome of an experiment also called the dependent variable

Response Variable denoted by y it is the variable that is to be predicted measure of the outcome of an experiment also called the dependent variable Statistics Chapter 4 Correlatio ad Regressio If we have two (or more) variables we are usually iterested i the relatioship betwee the variables. Associatio betwee Variables Two variables are associated

More information

Simple Linear Regression

Simple Linear Regression Chapter 2 Simple Liear Regressio 2.1 Simple liear model The simple liear regressio model shows how oe kow depedet variable is determied by a sigle explaatory variable (regressor). Is is writte as: Y i

More information

multiplies all measures of center and the standard deviation and range by k, while the variance is multiplied by k 2.

multiplies all measures of center and the standard deviation and range by k, while the variance is multiplied by k 2. Lesso 3- Lesso 3- Scale Chages of Data Vocabulary scale chage of a data set scale factor scale image BIG IDEA Multiplyig every umber i a data set by k multiplies all measures of ceter ad the stadard deviatio

More information

REGRESSION (Physics 1210 Notes, Partial Modified Appendix A)

REGRESSION (Physics 1210 Notes, Partial Modified Appendix A) REGRESSION (Physics 0 Notes, Partial Modified Appedix A) HOW TO PERFORM A LINEAR REGRESSION Cosider the followig data poits ad their graph (Table I ad Figure ): X Y 0 3 5 3 7 4 9 5 Table : Example Data

More information

Simple Regression. Acknowledgement. These slides are based on presentations created and copyrighted by Prof. Daniel Menasce (GMU) CS 700

Simple Regression. Acknowledgement. These slides are based on presentations created and copyrighted by Prof. Daniel Menasce (GMU) CS 700 Simple Regressio CS 7 Ackowledgemet These slides are based o presetatios created ad copyrighted by Prof. Daiel Measce (GMU) Basics Purpose of regressio aalysis: predict the value of a depedet or respose

More information

S Y Y = ΣY 2 n. Using the above expressions, the correlation coefficient is. r = SXX S Y Y

S Y Y = ΣY 2 n. Using the above expressions, the correlation coefficient is. r = SXX S Y Y 1 Sociology 405/805 Revised February 4, 004 Summary of Formulae for Bivariate Regressio ad Correlatio Let X be a idepedet variable ad Y a depedet variable, with observatios for each of the values of these

More information

11 Correlation and Regression

11 Correlation and Regression 11 Correlatio Regressio 11.1 Multivariate Data Ofte we look at data where several variables are recorded for the same idividuals or samplig uits. For example, at a coastal weather statio, we might record

More information

3/3/2014. CDS M Phil Econometrics. Types of Relationships. Types of Relationships. Types of Relationships. Vijayamohanan Pillai N.

3/3/2014. CDS M Phil Econometrics. Types of Relationships. Types of Relationships. Types of Relationships. Vijayamohanan Pillai N. 3/3/04 CDS M Phil Old Least Squares (OLS) Vijayamohaa Pillai N CDS M Phil Vijayamoha CDS M Phil Vijayamoha Types of Relatioships Oly oe idepedet variable, Relatioship betwee ad is Liear relatioships Curviliear

More information

Continuous Data that can take on any real number (time/length) based on sample data. Categorical data can only be named or categorised

Continuous Data that can take on any real number (time/length) based on sample data. Categorical data can only be named or categorised Questio 1. (Topics 1-3) A populatio cosists of all the members of a group about which you wat to draw a coclusio (Greek letters (μ, σ, Ν) are used) A sample is the portio of the populatio selected for

More information

STP 226 ELEMENTARY STATISTICS

STP 226 ELEMENTARY STATISTICS TP 6 TP 6 ELEMENTARY TATITIC CHAPTER 4 DECRIPTIVE MEAURE IN REGREION AND CORRELATION Liear Regressio ad correlatio allows us to examie the relatioship betwee two or more quatitative variables. 4.1 Liear

More information

Properties and Hypothesis Testing

Properties and Hypothesis Testing Chapter 3 Properties ad Hypothesis Testig 3.1 Types of data The regressio techiques developed i previous chapters ca be applied to three differet kids of data. 1. Cross-sectioal data. 2. Time series data.

More information

Lecture 11 Simple Linear Regression

Lecture 11 Simple Linear Regression Lecture 11 Simple Liear Regressio Fall 2013 Prof. Yao Xie, yao.xie@isye.gatech.edu H. Milto Stewart School of Idustrial Systems & Egieerig Georgia Tech Midterm 2 mea: 91.2 media: 93.75 std: 6.5 2 Meddicorp

More information

Linear Regression Analysis. Analysis of paired data and using a given value of one variable to predict the value of the other

Linear Regression Analysis. Analysis of paired data and using a given value of one variable to predict the value of the other Liear Regressio Aalysis Aalysis of paired data ad usig a give value of oe variable to predict the value of the other 5 5 15 15 1 1 5 5 1 3 4 5 6 7 8 1 3 4 5 6 7 8 Liear Regressio Aalysis E: The chirp rate

More information

Chapters 5 and 13: REGRESSION AND CORRELATION. Univariate data: x, Bivariate data (x,y).

Chapters 5 and 13: REGRESSION AND CORRELATION. Univariate data: x, Bivariate data (x,y). Chapters 5 ad 13: REGREION AND CORRELATION (ectios 5.5 ad 13.5 are omitted) Uivariate data: x, Bivariate data (x,y). Example: x: umber of years studets studied paish y: score o a proficiecy test For each

More information

Correlation Regression

Correlation Regression Correlatio Regressio While correlatio methods measure the stregth of a liear relatioship betwee two variables, we might wish to go a little further: How much does oe variable chage for a give chage i aother

More information

TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics

TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics Explorig Data: Distributios Look for overall patter (shape, ceter, spread) ad deviatios (outliers). Mea (use a calculator): x = x 1 + x 2 + +

More information

Final Examination Solutions 17/6/2010

Final Examination Solutions 17/6/2010 The Islamic Uiversity of Gaza Faculty of Commerce epartmet of Ecoomics ad Political Scieces A Itroductio to Statistics Course (ECOE 30) Sprig Semester 009-00 Fial Eamiatio Solutios 7/6/00 Name: I: Istructor:

More information

Linear Regression Models

Linear Regression Models Liear Regressio Models Dr. Joh Mellor-Crummey Departmet of Computer Sciece Rice Uiversity johmc@cs.rice.edu COMP 528 Lecture 9 15 February 2005 Goals for Today Uderstad how to Use scatter diagrams to ispect

More information

Chapter 4 - Summarizing Numerical Data

Chapter 4 - Summarizing Numerical Data Chapter 4 - Summarizig Numerical Data 15.075 Cythia Rudi Here are some ways we ca summarize data umerically. Sample Mea: i=1 x i x :=. Note: i this class we will work with both the populatio mea µ ad the

More information

Introduction to Econometrics (3 rd Updated Edition) Solutions to Odd- Numbered End- of- Chapter Exercises: Chapter 4

Introduction to Econometrics (3 rd Updated Edition) Solutions to Odd- Numbered End- of- Chapter Exercises: Chapter 4 Itroductio to Ecoometrics (3 rd Updated Editio) by James H. Stock ad Mark W. Watso Solutios to Odd- Numbered Ed- of- Chapter Exercises: Chapter 4 (This versio August 7, 204) 205 Pearso Educatio, Ic. Stock/Watso

More information

TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics

TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics Explorig Data: Distributios Look for overall patter (shape, ceter, spread) ad deviatios (outliers). Mea (use a calculator): x = x 1 + x 2 + +

More information

Solutions to Odd Numbered End of Chapter Exercises: Chapter 4

Solutions to Odd Numbered End of Chapter Exercises: Chapter 4 Itroductio to Ecoometrics (3 rd Updated Editio) by James H. Stock ad Mark W. Watso Solutios to Odd Numbered Ed of Chapter Exercises: Chapter 4 (This versio July 2, 24) Stock/Watso - Itroductio to Ecoometrics

More information

Least-Squares Regression

Least-Squares Regression MATH 482 Least-Squares Regressio Dr. Neal, WKU As well as fidig the correlatio of paired sample data {{ x 1, y 1 }, { x 2, y 2 },..., { x, y }}, we also ca plot the data with a scatterplot ad fid the least

More information

STP 226 EXAMPLE EXAM #1

STP 226 EXAMPLE EXAM #1 STP 226 EXAMPLE EXAM #1 Istructor: Hoor Statemet: I have either give or received iformatio regardig this exam, ad I will ot do so util all exams have bee graded ad retured. PRINTED NAME: Siged Date: DIRECTIONS:

More information

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE IN STATISTICS, 017 MODULE 4 : Liear models Time allowed: Oe ad a half hours Cadidates should aswer THREE questios. Each questio carries

More information

Parameter, Statistic and Random Samples

Parameter, Statistic and Random Samples Parameter, Statistic ad Radom Samples A parameter is a umber that describes the populatio. It is a fixed umber, but i practice we do ot kow its value. A statistic is a fuctio of the sample data, i.e.,

More information

Chapter If n is odd, the median is the exact middle number If n is even, the median is the average of the two middle numbers

Chapter If n is odd, the median is the exact middle number If n is even, the median is the average of the two middle numbers Chapter 4 4-1 orth Seattle Commuity College BUS10 Busiess Statistics Chapter 4 Descriptive Statistics Summary Defiitios Cetral tedecy: The extet to which the data values group aroud a cetral value. Variatio:

More information

Stat 139 Homework 7 Solutions, Fall 2015

Stat 139 Homework 7 Solutions, Fall 2015 Stat 139 Homework 7 Solutios, Fall 2015 Problem 1. I class we leared that the classical simple liear regressio model assumes the followig distributio of resposes: Y i = β 0 + β 1 X i + ɛ i, i = 1,...,,

More information

Read through these prior to coming to the test and follow them when you take your test.

Read through these prior to coming to the test and follow them when you take your test. Math 143 Sprig 2012 Test 2 Iformatio 1 Test 2 will be give i class o Thursday April 5. Material Covered The test is cummulative, but will emphasize the recet material (Chapters 6 8, 10 11, ad Sectios 12.1

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Aalysis ad Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasii/teachig.html Suhasii Subba Rao Review of testig: Example The admistrator of a ursig home wats to do a time ad motio

More information

a is some real number (called the coefficient) other

a is some real number (called the coefficient) other Precalculus Notes for Sectio.1 Liear/Quadratic Fuctios ad Modelig http://www.schooltube.com/video/77e0a939a3344194bb4f Defiitios: A moomial is a term of the form tha zero ad is a oegative iteger. a where

More information

UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL/MAY 2009 EXAMINATIONS ECO220Y1Y PART 1 OF 2 SOLUTIONS

UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL/MAY 2009 EXAMINATIONS ECO220Y1Y PART 1 OF 2 SOLUTIONS PART of UNIVERSITY OF TORONTO Faculty of Arts ad Sciece APRIL/MAY 009 EAMINATIONS ECO0YY PART OF () The sample media is greater tha the sample mea whe there is. (B) () A radom variable is ormally distributed

More information

Bivariate Sample Statistics Geog 210C Introduction to Spatial Data Analysis. Chris Funk. Lecture 7

Bivariate Sample Statistics Geog 210C Introduction to Spatial Data Analysis. Chris Funk. Lecture 7 Bivariate Sample Statistics Geog 210C Itroductio to Spatial Data Aalysis Chris Fuk Lecture 7 Overview Real statistical applicatio: Remote moitorig of east Africa log rais Lead up to Lab 5-6 Review of bivariate/multivariate

More information

INSTRUCTIONS (A) 1.22 (B) 0.74 (C) 4.93 (D) 1.18 (E) 2.43

INSTRUCTIONS (A) 1.22 (B) 0.74 (C) 4.93 (D) 1.18 (E) 2.43 PAPER NO.: 444, 445 PAGE NO.: Page 1 of 1 INSTRUCTIONS I. You have bee provided with: a) the examiatio paper i two parts (PART A ad PART B), b) a multiple choice aswer sheet (for PART A), c) selected formulae

More information

Assessment and Modeling of Forests. FR 4218 Spring Assignment 1 Solutions

Assessment and Modeling of Forests. FR 4218 Spring Assignment 1 Solutions Assessmet ad Modelig of Forests FR 48 Sprig Assigmet Solutios. The first part of the questio asked that you calculate the average, stadard deviatio, coefficiet of variatio, ad 9% cofidece iterval of the

More information

Exam II Covers. STA 291 Lecture 19. Exam II Next Tuesday 5-7pm Memorial Hall (Same place as exam I) Makeup Exam 7:15pm 9:15pm Location CB 234

Exam II Covers. STA 291 Lecture 19. Exam II Next Tuesday 5-7pm Memorial Hall (Same place as exam I) Makeup Exam 7:15pm 9:15pm Location CB 234 STA 291 Lecture 19 Exam II Next Tuesday 5-7pm Memorial Hall (Same place as exam I) Makeup Exam 7:15pm 9:15pm Locatio CB 234 STA 291 - Lecture 19 1 Exam II Covers Chapter 9 10.1; 10.2; 10.3; 10.4; 10.6

More information

t distribution [34] : used to test a mean against an hypothesized value (H 0 : µ = µ 0 ) or the difference

t distribution [34] : used to test a mean against an hypothesized value (H 0 : µ = µ 0 ) or the difference EXST30 Backgroud material Page From the textbook The Statistical Sleuth Mea [0]: I your text the word mea deotes a populatio mea (µ) while the work average deotes a sample average ( ). Variace [0]: The

More information

SIMPLE LINEAR REGRESSION AND CORRELATION ANALYSIS

SIMPLE LINEAR REGRESSION AND CORRELATION ANALYSIS SIMPLE LINEAR REGRESSION AND CORRELATION ANALSIS INTRODUCTION There are lot of statistical ivestigatio to kow whether there is a relatioship amog variables Two aalyses: (1) regressio aalysis; () correlatio

More information

NCSS Statistical Software. Tolerance Intervals

NCSS Statistical Software. Tolerance Intervals Chapter 585 Itroductio This procedure calculates oe-, ad two-, sided tolerace itervals based o either a distributio-free (oparametric) method or a method based o a ormality assumptio (parametric). A two-sided

More information

ECON 3150/4150, Spring term Lecture 3

ECON 3150/4150, Spring term Lecture 3 Itroductio Fidig the best fit by regressio Residuals ad R-sq Regressio ad causality Summary ad ext step ECON 3150/4150, Sprig term 2014. Lecture 3 Ragar Nymoe Uiversity of Oslo 21 Jauary 2014 1 / 30 Itroductio

More information

CEE 522 Autumn Uncertainty Concepts for Geotechnical Engineering

CEE 522 Autumn Uncertainty Concepts for Geotechnical Engineering CEE 5 Autum 005 Ucertaity Cocepts for Geotechical Egieerig Basic Termiology Set A set is a collectio of (mutually exclusive) objects or evets. The sample space is the (collectively exhaustive) collectio

More information

University of California, Los Angeles Department of Statistics. Simple regression analysis

University of California, Los Angeles Department of Statistics. Simple regression analysis Uiversity of Califoria, Los Ageles Departmet of Statistics Statistics 100C Istructor: Nicolas Christou Simple regressio aalysis Itroductio: Regressio aalysis is a statistical method aimig at discoverig

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 9 Multicolliearity Dr Shalabh Departmet of Mathematics ad Statistics Idia Istitute of Techology Kapur Multicolliearity diagostics A importat questio that

More information

Linear Regression Demystified

Linear Regression Demystified Liear Regressio Demystified Liear regressio is a importat subject i statistics. I elemetary statistics courses, formulae related to liear regressio are ofte stated without derivatio. This ote iteds to

More information

(all terms are scalars).the minimization is clearer in sum notation:

(all terms are scalars).the minimization is clearer in sum notation: 7 Multiple liear regressio: with predictors) Depedet data set: y i i = 1, oe predictad, predictors x i,k i = 1,, k = 1, ' The forecast equatio is ŷ i = b + Use matrix otatio: k =1 b k x ik Y = y 1 y 1

More information

Lecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting

Lecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting Lecture 6 Chi Square Distributio (χ ) ad Least Squares Fittig Chi Square Distributio (χ ) Suppose: We have a set of measuremets {x 1, x, x }. We kow the true value of each x i (x t1, x t, x t ). We would

More information

Regression, Part I. A) Correlation describes the relationship between two variables, where neither is independent or a predictor.

Regression, Part I. A) Correlation describes the relationship between two variables, where neither is independent or a predictor. Regressio, Part I I. Differece from correlatio. II. Basic idea: A) Correlatio describes the relatioship betwee two variables, where either is idepedet or a predictor. - I correlatio, it would be irrelevat

More information

INTRODUCTORY MATHEMATICS AND STATISTICS FOR ECONOMISTS

INTRODUCTORY MATHEMATICS AND STATISTICS FOR ECONOMISTS UNIVERSITY OF EAST ANGLIA School of Ecoomics Mai Series UG Examiatio 04-5 INTRODUCTORY MATHEMATICS AND STATISTICS FOR ECONOMISTS ECO-400Y Time allowed: 3 hours Aswer ALL questios. Show all workig icludig

More information

Chapter 1 (Definitions)

Chapter 1 (Definitions) FINAL EXAM REVIEW Chapter 1 (Defiitios) Qualitative: Nomial: Ordial: Quatitative: Ordial: Iterval: Ratio: Observatioal Study: Desiged Experimet: Samplig: Cluster: Stratified: Systematic: Coveiece: Simple

More information

Simple Linear Regression
