Correlation and Regression

Similar documents
Chapter 10. Correlation and Regression. McGraw-Hill, Bluman, 7th ed., Chapter 10 1

Lecture 14. Outline. Outline. Analysis of Variance * Correlation and Regression Analysis of Variance (ANOVA)

Lecture 14. Analysis of Variance * Correlation and Regression. The McGraw-Hill Companies, Inc., 2000

Chapter 10. Correlation and Regression. McGraw-Hill, Bluman, 7th ed., Chapter 10 1

Example: Can an increase in non-exercise activity (e.g. fidgeting) help people gain less weight?

Introduction to Statistics

Correlation and Regression

Chs. 15 & 16: Correlation & Regression

12.7. Scattergrams and Correlation

Chs. 16 & 17: Correlation & Regression

11 Correlation and Regression

Sociology 593 Exam 2 Answer Key March 28, 2002

This document contains 3 sets of practice problems.

1 Descriptive statistics. 2 Scores and probability distributions. 3 Hypothesis testing and one-sample t-test. 4 More on t-tests

Data Analysis and Statistical Methods Statistics 651

Last two weeks: Sample, population and sampling distributions finished with estimation & confidence intervals

Multiple Linear Regression

CORRELATION. suppose you get r 0. Does that mean there is no correlation between the data sets? many aspects of the data may a ect the value of r

Do not copy, post, or distribute

Lecture 48 Sections Mon, Nov 16, 2009

Correlation: Relationships between Variables

HOMEWORK (due Wed, Jan 23): Chapter 3: #42, 48, 74

Simple Linear Regression: One Quantitative IV

Chapter 9. Hypothesis testing. 9.1 Introduction

WELCOME! Lecture 14: Factor Analysis, part I Måns Thulin

Sociology 593 Exam 2 March 28, 2002

MAT 171. August 22, S1.4 Equations of Lines and Modeling. Section 1.4 Equations of Lines and Modeling

Ch. 16: Correlation and Regression

Correlation and Linear Regression

Unit 6 - Introduction to linear regression

15.8 MULTIPLE REGRESSION WITH MANY EXPLANATORY VARIABLES

Created by T. Madas. Candidates may use any calculator allowed by the regulations of this examination.

If we have high correlation, we d like to determine causation.

This gives us an upper and lower bound that capture our population mean.

Sampling Distributions: Central Limit Theorem

Final Exam - Solutions

Last week: Sample, population and sampling distributions finished with estimation & confidence intervals

Recall, Positive/Negative Association:

DE CHAZAL DU MEE BUSINESS SCHOOL AUGUST 2003 MOCK EXAMINATIONS IOP 201-Q (INDUSTRIAL PSYCHOLOGICAL RESEARCH)

From statistics to data science. BAE 815 (Fall 2017) Dr. Zifei Liu

Section 2.3. Function Notation and Making Predictions

a. Yes, it is consistent. a. Positive c. Near Zero

Reminder: Student Instructional Rating Surveys

STAT 512 MidTerm I (2/21/2013) Spring 2013 INSTRUCTIONS

Key Concepts. Correlation (Pearson & Spearman) & Linear Regression. Assumptions. Correlation parametric & non-para. Correlation

Math 116 Practice for Exam 3

79 Wyner Math Academy I Spring 2016

Identify the scale of measurement most appropriate for each of the following variables. (Use A = nominal, B = ordinal, C = interval, D = ratio.

1. Regressions and Regression Models. 2. Model Example. EEP/IAS Introductory Applied Econometrics Fall Erin Kelley Section Handout 1

Causal Inference. Prediction and causation are very different. Typical questions are:

STATISTICS Relationships between variables: Correlation

Exam Applied Statistical Regression. Good Luck!

Lecture 5: ANOVA and Correlation

Learning Objectives. IQ Scores of Students. Correlation. Correlation and Simple Linear Regression

Can you tell the relationship between students SAT scores and their college grades?

Relationships Regression

Statistics and Sampling distributions

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing

CORRELATION ANALYSIS. Dr. Anulawathie Menike Dept. of Economics

WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A

Correlation. Patrick Breheny. November 15. Descriptive statistics Inference Summary

Hypothesis testing: Steps

Ridge Regression. Summary. Sample StatFolio: ridge reg.sgp. STATGRAPHICS Rev. 10/1/2014

Review. Number of variables. Standard Scores. Anecdotal / Clinical. Bivariate relationships. Ch. 3: Correlation & Linear Regression

9 Correlation and Regression

Measuring Associations : Pearson s correlation

AMS7: WEEK 7. CLASS 1. More on Hypothesis Testing Monday May 11th, 2015

I used college textbooks because they were the only resource available to evaluate measurement uncertainty calculations.

Statistics 100 Exam 2 March 8, 2017

Hypothesis testing: Steps

Causality II: How does causal inference fit into public health and what it is the role of statistics?

Important note: Transcripts are not substitutes for textbook assignments. 1

Online supplement. Absolute Value of Lung Function (FEV 1 or FVC) Explains the Sex Difference in. Breathlessness in the General Population

not to be republished NCERT Correlation CHAPTER 1. INTRODUCTION

You may use a calculator. Translation: Show all of your work; use a calculator only to do final calculations and/or to check your work.

Chapter 14. Statistical versus Deterministic Relationships. Distance versus Speed. Describing Relationships: Scatterplots and Correlation

Suppose that we are concerned about the effects of smoking. How could we deal with this?

CORRELATION & REGRESSION

STAT Checking Model Assumptions

Chapter 11. Correlation and Regression

Learning Causality. Sargur N. Srihari. University at Buffalo, The State University of New York USA

Objectives for Linear Activity. Calculate average rate of change/slope Interpret intercepts and slope of linear function Linear regression

Draft Proof - Do not copy, post, or distribute

Inference in Regression Model

STATISTICS SYLLABUS UNIT I

Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z).

Chapter 10 Relationships Among Variables

MGEC11H3Y L01 Introduction to Regression Analysis Term Test Friday July 5, PM Instructor: Victor Yu

EM375 STATISTICS AND MEASUREMENT UNCERTAINTY CORRELATION OF EXPERIMENTAL DATA

Quantitative Bivariate Data

UGRC 120 Numeracy Skills

Chapter 18. Sampling Distribution Models. Bin Zou STAT 141 University of Alberta Winter / 10

EC2001 Econometrics 1 Dr. Jose Olmo Room D309

Name: Exam: In-term Two Page: 1 of 8 Date: 12/07/2018. University of Texas at Austin, Department of Mathematics M358K - Applied Statistics TRUE/FALSE

Department of Statistics & Operations Research College of Science King Saud University. STAT 324 Supplementary Examination Second Semester

Chapter 12: Estimation

Relationships between variables. Visualizing Bivariate Distributions: Scatter Plots

Exponential function review and practice Day 1

Computational Biology Course Descriptions 12-14

Challenges (& Some Solutions) and Making Connections

Transcription:

Correlation and Regression Linear Correlation: Does one variable increase or decrease linearly with another? Is there a linear relationship between two or more variables? Types of linear relationships: Positive linear Negative linear No relationship None or weak

Scattergrams Weak linear Strong Linear Other relationships: Nonlinear or Curvilinear Linear or Exponential?

Correlation Pearson Product Moment Correlation Coefficient: Simply called correlation coefficient, PPMC, or r-value. Linear correlation between two variables. Examples: Weight increases with height. IQ with brain size?! Used for calibrating instruments, e.g., force transducers, accelerometers, electrogoniometers (measure joint angles). Multiple Correlation: Used when several independent variables influence a dependent variable Called an R-value Defined as: Y = A + B 1 X 1 + B 2 X 2 + B 3 X 3 +... + B n Xn Examples: Heart disease is affected by family history, obesity, smoking, diet, etc. Academic performance is affected by intelligence, economics, experience, memory, etc. Lean body mass is predicted by a combination of body mass, thigh, triceps, and abdominal skinfold measures.

Significance of Correlation Coefficient Method 1 Step 1: H 0: 0 Step 2: Look up r crit for n 2 degrees of freedom (Table I or A-6). Step 3: Compute sample r (as above). Step 4: Sample r is significant if it is greater than r crit. Step 5: If significance occurs data are linearly correlated otherwise they are not. If table of significant correlation coefficients is not available or significance level ( ) is not 0.05 or 0.01 use Method 2. Method 2 Step 1: H 0: 0 Step 2: Look up t crit for n 2 degrees of freedom. Step 3: Compute sample r then t. Step 4: Sample t is significant if it is greater than t crit. Step 5: If significance occurs, data are linearly correlated otherwise they are not.

Regression Regression: Can only be done when a significant correlation exists. Equation of line or curve that defines the relationship between variables. The line of best fit. Mathematical technique is called least squares method. This technique computes the line that minimizes the squares of the deviations of the data from the line.

Coefficient of Determination and Standard Error of Estimate Coefficient of Determination Measures the strength of the relationship between the two variables. Equal to the explained variation divided by the total 2 variation = r. Usually given as a percentage, i.e., 2 coefficient of determination = r 100%. For example, an r of 0.90 has 81% of total variation explained but an r of 0.60 has only 36% of its variation. A correlation may be significant but explain very little. Standard Error Of Estimate Measure of the variability of the observed values about the regression line. Can be used to compute a confidence interval for a predicted value. standard error of estimate:

Possible Reasons for a Significant Correlation 1. There is a direct cause-and-effect relationship between the variables. That is, x causes y. For example, positive reinforcement improves learning; smoking causes lung cancer; and heat causes ice to melt. 2. There is a reverse cause-and-effect relationship between the variables. That is, y causes x. For example, suppose a researcher believes excessive coffee consumption causes nervousness, but the researcher fails to consider that the reverse situation may occur. That is, it may be that an nervous people crave coffee. 3. The relationship between the variables may be caused by a third variable. That is z causes x and y. For example, if a statistician correlated the number of deaths due to drowning and the number of cans of soft drinks consumed during the summer, he or she would probably find a significant relationship. However, the soft drink is not necessarily responsible for the deaths, since both variables may be related to heat and humidity. 4. There may be a complexity of interrelationships among many variables. That is y is caused by x 1, x 2, x 3,, x n. For example, a researcher may find a significant relationship between students high school grades and college grades. But there probably are many other variables involved, such as IQ, hours of study, influence of parents, motivation, age and instructors. 5. The relationship may be coincidental. For example, a researcher may be able to find a significant relationship between the increase in the number of people who are exercising and the increase in the number of people who are committing crimes. But common sense dictates that any relationship between these two variables must be due to coincidence.