Functional Connectivity: Correlation and Regression (10/23/2011)

Transcription:

Variance:

  VAR = Σ(x_i - x̄)² / n

Standard deviation:

  SD = √VAR = √( Σ(x_i - x̄)² / n )

Unbiased SD:

  SD = √( Σ(x_i - x̄)² / (n - 1) )
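These definitions are easy to sanity-check numerically. A minimal sketch (the data array is an arbitrary example; NumPy's ddof argument switches between the biased and unbiased estimators):

```python
import numpy as np

x = np.array([0.0, 2.0, 3.0, 4.0, 6.0])  # illustrative sample (reused in the covariance example below)
n = x.size

var_biased = np.sum((x - x.mean()) ** 2) / n                   # VAR = Σ(x_i - x̄)² / n
sd_biased = np.sqrt(var_biased)                                # SD = √VAR
sd_unbiased = np.sqrt(np.sum((x - x.mean()) ** 2) / (n - 1))   # unbiased SD: divide by n - 1

# NumPy equivalents: ddof=0 gives the biased estimator, ddof=1 the unbiased one
assert np.isclose(sd_biased, np.std(x, ddof=0))
assert np.isclose(sd_unbiased, np.std(x, ddof=1))
```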

Standard error:

  SE = SD / √n

Confidence interval:

  CI = x̄ ± t · SE

where t = the t value for 100(1 - α)% confidence, taken from the t distribution, and SD = the standard deviation.

Descriptive statistics recap: VAR, SD, SE, and CI as defined above.

Correlation and Regression
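A sketch of SE and CI in code, assuming SciPy for the t quantile (the sample and the 95% level are illustrative):

```python
import numpy as np
from scipy import stats

x = np.array([3.0, 2.0, 4.0, 0.0, 6.0])  # example sample
n = x.size

sd = np.std(x, ddof=1)        # unbiased SD
se = sd / np.sqrt(n)          # SE = SD / √n

alpha = 0.05                  # 95% confidence
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)  # t value for 100(1 - α)% confidence
ci = (x.mean() - t_crit * se, x.mean() + t_crit * se)
print(f"mean = {x.mean():.2f}, SE = {se:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```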

Is there a relationship between x and y? What is the strength of this relationship? Pearson's r. Can we describe this relationship and use it to predict y from x? Regression. Is the relationship we have described statistically significant? F- and t-tests. Relevance to SPM: the GLM.

Relationship between x and y. Correlation describes the strength and direction of a linear relationship between two variables. Regression tells you how well a certain independent variable predicts a dependent variable. CORRELATION and CAUSATION: in order to infer causality, manipulate the independent variable and observe the effect on the dependent variable.

Scattergrams. [Figure: scatter plots illustrating positive correlation, negative correlation, and no correlation.]

Variance vs. covariance. Do two variables change together? Variance is built from Δx · Δx; covariance replaces one factor with Δy:

  variance ~ Σ(x_i - x̄)² / (n - 1)
  cov(x, y) = Σ(x_i - x̄)(y_i - ȳ) / (n - 1)

When X increases and Y increases: cov(x, y) is positive. When X increases and Y decreases: cov(x, y) is negative. When there is no constant relationship: cov(x, y) = 0.

Covariance example:

  x   y   x_i - x̄   y_i - ȳ   (x_i - x̄)(y_i - ȳ)
  0   3     -3         0              0
  2   2     -1        -1              1
  3   4      0         1              0
  4   0      1        -3             -3
  6   6      3         3              9

  x̄ = 3, ȳ = 3, Σ(x_i - x̄)(y_i - ȳ) = 7

What does this number tell us?
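The table's arithmetic, reproduced in code (same five (x, y) pairs; cov(x, y) = 7/4 = 1.75 here):

```python
import numpy as np

x = np.array([0.0, 2.0, 3.0, 4.0, 6.0])
y = np.array([3.0, 2.0, 4.0, 0.0, 6.0])

cross_products = (x - x.mean()) * (y - y.mean())   # (x_i - x̄)(y_i - ȳ), the last column
print(cross_products)          # [ 0.  1.  0. -3.  9.]
print(cross_products.sum())    # 7.0

cov = cross_products.sum() / (x.size - 1)          # 7 / 4 = 1.75
assert np.isclose(cov, np.cov(x, y)[0, 1])         # matches NumPy's covariance matrix entry
```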

Example of how the covariance value relies on variance:

  High variance data                Low variance data
  Subject   x     y    x_err*y_err    x    y    x_err*y_err
  1        101   100      2500       54   53        9
  2         81    80       900       53   52        4
  3         61    60       100       52   51        1
  4         51    50         0       51   50        0
  5         41    40       100       50   49        1
  6         21    20       900       49   48        4
  7          1     0      2500       48   47        9
  Mean      51    50                 51   50

  Sum of x_err * y_err: 7000        Sum of x_err * y_err: 28
  Covariance: 1166.67               Covariance: 4.67

Covariance alone does not really tell us anything. Solution: standardise this measure. Pearson's r standardises by adding the standard deviations to the equation:

  r = cov(x, y) / (s_x · s_y)

Pearson's r measures the degree of linear dependence.

Basic assumptions:
- Normal distributions
- Variances are constant and not zero
- Independent sampling, no autocorrelations
- No errors in the values of the independent variable
- All causation in the model is one way (not necessary mathematically, but essential for prediction)

Limitations of r. r is actually an estimate: ρ is the true r of the whole population, and r is an estimate of ρ based on the sample data. r is very sensitive to extreme values. In the real world, r is never 1 or -1.

Interpretations for correlations in psychological research (Cohen):

  Correlation   Negative          Positive
  Small         -0.29 to -0.10    0.10 to 0.29
  Medium        -0.49 to -0.30    0.30 to 0.49
  Large         -1.00 to -0.50    0.50 to 1.00
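A sketch showing why standardising matters: both example datasets are perfectly linear, so r = 1 for each, even though their covariances differ by orders of magnitude (the pearson_r helper is illustrative):

```python
import numpy as np

def pearson_r(x, y):
    """r = cov(x, y) / (s_x * s_y): covariance standardised by the two SDs."""
    cov = np.sum((x - x.mean()) * (y - y.mean())) / (x.size - 1)
    return cov / (np.std(x, ddof=1) * np.std(y, ddof=1))

high_x = np.array([101., 81., 61., 51., 41., 21., 1.])
high_y = np.array([100., 80., 60., 50., 40., 20., 0.])
low_x = np.array([54., 53., 52., 51., 50., 49., 48.])
low_y = np.array([53., 52., 51., 50., 49., 48., 47.])

# Covariances differ hugely (1166.67 vs 4.67), but both datasets
# lie exactly on a line, so Pearson's r is 1 for each.
print(pearson_r(high_x, high_y), pearson_r(low_x, low_y))  # both ≈ 1.0
assert np.isclose(pearson_r(high_x, high_y), np.corrcoef(high_x, high_y)[0, 1])
```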

Regression. Correlation tells you if there is an association between x and y, but it doesn't describe the relationship or allow you to predict one variable from the other. To do this we need REGRESSION!

Best-fit line. The aim of linear regression is to fit a straight line, ŷ = ax + b, to data that gives the best prediction of y for any value of x. This will be the line that minimises the distance between the data and the fitted line, i.e. the residuals. In ŷ = ax + b: a is the slope and b is the intercept; ŷ is the predicted value, y_i is the true value, and ε = y_i - ŷ is the residual error.

Least squares regression. To find the best line we must minimise the sum of the squares of the residuals (the vertical distances from the data points to our line). Model line: ŷ = ax + b (a = slope, b = intercept); residual ε = y - ŷ. Sum of squares of residuals = Σ(y - ŷ)². We must find the values of a and b that minimise Σ(y - ŷ)².

Finding b. First we find the value of b that gives the minimum sum of squares. Trying different values of b is equivalent to shifting the line up and down the scatter plot.

Finding a. Now we find the value of a that gives the minimum sum of squares. Trying out different values of a is equivalent to changing the slope of the line, while b stays constant.

Minimizing sums of squares. We need to minimise Σ(y - ŷ)². Since ŷ = ax + b, we need to minimise Σ(y - ax - b)². If we plot the sums of squares S for all different values of a and b we get a parabola, because it is a squared term. So the minimum sum of squares is at the bottom of the curve, where the gradient is zero.
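One way to see the "gradient = 0 at the bottom of the parabola" idea numerically: minimise Σ(y - ax - b)² directly and compare against the closed-form solution derived on the next slide (a sketch, assuming SciPy; data as in the covariance example):

```python
import numpy as np
from scipy.optimize import minimize

x = np.array([0.0, 2.0, 3.0, 4.0, 6.0])
y = np.array([3.0, 2.0, 4.0, 0.0, 6.0])

def sum_sq_residuals(params):
    a, b = params
    return np.sum((y - (a * x + b)) ** 2)   # Σ(y - ax - b)²

res = minimize(sum_sq_residuals, x0=[0.0, 0.0])  # finds where the gradient is zero
a_hat, b_hat = res.x

# Closed form (next slide): a = r * s_y / s_x, b = ȳ - a * x̄
r = np.corrcoef(x, y)[0, 1]
a = r * np.std(y, ddof=1) / np.std(x, ddof=1)
b = y.mean() - a * x.mean()
assert np.allclose([a_hat, b_hat], [a, b], atol=1e-4)
print(a, b)  # 0.35, 1.95 for this example
```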

Computation. So we can find the a and b that give the minimum sum of squares by taking partial derivatives of Σ(y - ax - b)² with respect to a and b separately. Then we solve these for 0 to give us the values of a and b that give the minimum sum of squares.

The solution. Doing this gives the following equation for a:

  a = r · s_y / s_x

where r = the correlation coefficient of x and y, s_y = the standard deviation of y, and s_x = the standard deviation of x. You can see that: a low correlation coefficient gives a flatter slope (small value of a); a large spread of y, i.e. high standard deviation, results in a steeper slope (high value of a); a large spread of x, i.e. high standard deviation, results in a flatter slope (small value of a).

The solution, cont. Our model equation is ŷ = ax + b. This line must pass through the mean, so ȳ = a x̄ + b, i.e. b = ȳ - a x̄. Substituting our equation for a into this gives:

  b = ȳ - r · (s_y / s_x) · x̄

The smaller the correlation, the closer the intercept is to the mean of y.

Back to the model. We can calculate the regression line for any data, but the important question is: how well does this line fit the data, or how good is it at predicting y from x?

How good is our model? Total variance of y:

  s_y² = Σ(y - ȳ)² / (n - 1) = SS_y / df_y

Variance of the predicted y values (ŷ):

  s_ŷ² = Σ(ŷ - ȳ)² / (n - 1) = SS_pred / df_ŷ

This is the variance explained by our regression model. Error variance:

  s_error² = Σ(y - ŷ)² / (n - 2) = SS_er / df_er

This is the variance of the error between our predicted y values and the actual y values, and thus the variance in y that is NOT explained by the regression model.

How good is our model, cont. Total variance = predicted variance + error variance:

  s_y² = s_ŷ² + s_er²

Conveniently, via some complicated rearranging, s_ŷ² = r² s_y², so r² = s_ŷ² / s_y²: r² is the proportion of the variance in y that is explained by our regression model.
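A sketch verifying the variance partition and r² on the running example, stated in sum-of-squares form (where the decomposition is exact):

```python
import numpy as np

x = np.array([0.0, 2.0, 3.0, 4.0, 6.0])
y = np.array([3.0, 2.0, 4.0, 0.0, 6.0])

r = np.corrcoef(x, y)[0, 1]
a = r * np.std(y, ddof=1) / np.std(x, ddof=1)   # a = r * s_y / s_x
b = y.mean() - a * x.mean()                      # b = ȳ - a * x̄
y_hat = a * x + b

ss_y = np.sum((y - y.mean()) ** 2)        # total sum of squares
ss_pred = np.sum((y_hat - y.mean()) ** 2) # explained (predicted) sum of squares
ss_er = np.sum((y - y_hat) ** 2)          # residual (error) sum of squares

assert np.isclose(ss_y, ss_pred + ss_er)   # total = predicted + error
assert np.isclose(ss_pred / ss_y, r ** 2)  # r² = proportion of variance explained
```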

How good is our model, cont. Inserting s_ŷ² = r² s_y² into s_y² = s_ŷ² + s_er² and rearranging gives:

  s_er² = s_y² - r² s_y² = s_y² (1 - r²)

From this we can see that the greater the correlation, the smaller the error variance, so the better our prediction.

Is the model significant? I.e. do we get a significantly better prediction of y from our regression equation than by just predicting the mean? F-statistic:

  F(df_ŷ, df_er) = s_ŷ² / s_er² = (complicated rearranging) = ... = r² (n - 2) / (1 - r²)

And it follows (because F = t²) that:

  t(n-2) = r √(n - 2) / √(1 - r²)

So all we need to know are r and n!
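Since F and t depend only on r and n, the significance test is a one-liner. A closing sketch (assuming SciPy; the cross-check against linregress is illustrative):

```python
import numpy as np
from scipy import stats

x = np.array([0.0, 2.0, 3.0, 4.0, 6.0])
y = np.array([3.0, 2.0, 4.0, 0.0, 6.0])
n = x.size

r = np.corrcoef(x, y)[0, 1]
F = r**2 * (n - 2) / (1 - r**2)              # F with df = (1, n - 2)
t = r * np.sqrt(n - 2) / np.sqrt(1 - r**2)   # t with df = n - 2; F = t²
assert np.isclose(F, t**2)

p = 2 * stats.t.sf(abs(t), df=n - 2)         # two-sided p-value
res = stats.linregress(x, y)                  # SciPy's slope test gives the same p
assert np.isclose(p, res.pvalue)
print(f"r = {r:.2f}, F = {F:.2f}, t = {t:.2f}, p = {p:.3f}")
```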