BIOSTATISTICS NURS 3324

Simple Linear Regression and Correlation

Introduction

Previously, our attention has been focused on one variable, which we designated by x. Frequently, it is desirable to learn something about the relationship between two (or more) variables. For example, we might be interested in studying the relationship between

o cholesterol level and age,
o blood pressure and age,
o height and weight,
o the amount of exercise and heart rate,
o the concentration of an injected drug and heart rate,
o the consumption level of some nutrient and weight gain.

The nature and strength of the relationship between two variables may be examined by regression and correlation analyses, two related statistical techniques that serve different purposes.

Regression is used to discover the probable form of the relationship between two variables x and y by finding an appropriate equation. The ultimate objective of this method is usually to predict or estimate the value of one variable corresponding to a given value of another, i.e. to predict or estimate the value of y for a given value of x.

Correlation analysis, on the other hand, is concerned with measuring how strong the relationship between two variables x and y is, i.e. the degree of correlation between the two variables.

SIMPLE LINEAR REGRESSION

In simple linear regression, the variable x is usually referred to as the explanatory or independent variable, and the other variable, y, is called the predicted or dependent variable; we speak of the regression of y on x. In the above examples, the investigator could predict cholesterol level and blood pressure from age, weight from height, heart rate from the concentration of injected drug, and so on. Thus cholesterol level, blood pressure, weight and heart rate would be the predicted or dependent variables, and age, height and the concentration of injected drug would be the explanatory or independent variables.
We assume that for each value of x there is a whole population of y values which is normally distributed, and that all of the y populations have equal variances. In simple linear regression the object of the researcher's interest is the regression equation that describes the true relationship between the dependent variable y and the independent variable x.

Scatter diagram

A useful first step in studying the relationship between two variables is to prepare a scatter diagram of the data. The points are plotted by assigning values of the independent variable x to the horizontal axis and values of the dependent variable y to the vertical axis. The pattern made by the points on the scatter diagram usually suggests the basic nature and the strength of the relationship between the two variables.

Example: relationship between x and optical density (y)

x       Optical density (y)
3       0.10
4       0.20
4.5     0.25
5       0.32
5.5     0.33
6       0.35
6.5     0.47
7       0.49
7.5     0.53

[Figure: scatter diagram of optical density (vertical axis) against x (horizontal axis).]

In our example we can see that, in general, as x increases the optical density also increases, so the two variables have a positive relationship.

The least-square line

We can also see that the points seem to be scattered around an invisible line which would describe the relationship between x and y. These impressions suggest that the relationship between the two variables may be described by a straight line crossing the y-axis near the origin and making approximately a 45 degree angle with the x-axis.

Thinking Challenge

It looks as though this line would be easy to draw by hand, but it is doubtful that the lines drawn by any two people would be exactly the same. In other words, for every person drawing such a line by eye, or freehand, we would expect a different line. Which line best describes the relationship between the variables? What is needed to obtain the desired line?

Answer

If the scatter diagram has a linear trend, we need a mathematical way to obtain the best line through the data. We employ a method known as the method of least squares, and the resulting line is called the least-square line. The reason for calling the method by this name will be explained in the discussion that follows.

Equation for straight line (Linear Equation)

Recall from algebra that the general equation for a straight line is given by

y = a + bx

where y is a value on the vertical axis, x is a value on the horizontal axis, a is the point where the line crosses the vertical axis (referred to as the y-intercept), and b shows the amount by which y changes for each unit change in x (referred to as the slope of the line).

[Figure: the line y = a + bx, with a marked as the y-intercept and the slope b shown as (change in y) / (change in x).]

To draw a line based on the equation, we need the numerical values of the constants a and b. Given these constants, we may substitute various values of x into the equation y = a + bx to obtain the corresponding values of y. The resulting points may then be plotted.

Computation

Finding the b-value

\[ b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2} \]

\[ b = \frac{9(18.2) - (49)(3.04)}{9(284) - (49)^2} = \frac{163.8 - 148.96}{2556 - 2401} = \frac{14.84}{155} \approx 0.0957 \]
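As a rough check on the slope formula, it can be evaluated in a few lines of Python. This is only a sketch: the function name `least_squares_slope` is made up for illustration, and the nine (x, y) pairs are the values as read from the computation table in these notes.

```python
# Data pairs as read from the computation table in these notes (assumed values).
x = [3, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5]
y = [0.10, 0.20, 0.25, 0.32, 0.33, 0.35, 0.47, 0.49, 0.53]


def least_squares_slope(xs, ys):
    """Slope b = (n*Sxy - Sx*Sy) / (n*Sxx - Sx**2), computed from raw sums."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(xi * yi for xi, yi in zip(xs, ys))
    sum_x2 = sum(xi * xi for xi in xs)
    return (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)


b = least_squares_slope(x, y)
print(f"b = {b:.4f}")
```

Working from the raw sums mirrors the hand calculation; for larger data sets a statistics library would normally be used instead.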

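Finding the y-intercept

\[ a = \bar{y} - b\bar{x} \]

where \bar{y} is the mean of the y values and \bar{x} is the mean of the x values:

\[ \bar{y} = \frac{3.04}{9} = 0.3378, \qquad \bar{x} = \frac{49}{9} = 5.444 \]

Carrying extra decimal places for b (b = 0.095742) to avoid rounding error,

\[ a = 0.33778 - (0.095742)(5.44444) \approx -0.1835 \]

Computation table

x       y       x^2      y^2      xy
3       0.10    9        0.0100   0.300
4       0.20    16       0.0400   0.800
4.5     0.25    20.25    0.0625   1.125
5       0.32    25       0.1024   1.600
5.5     0.33    30.25    0.1089   1.815
6       0.35    36       0.1225   2.100
6.5     0.47    42.25    0.2209   3.055
7       0.49    49       0.2401   3.430
7.5     0.53    56.25    0.2809   3.975

Total:  Σx = 49, Σy = 3.04, Σx^2 = 284, Σy^2 = 1.1882, Σxy = 18.2
Mean:   \bar{x} = 5.444, \bar{y} = 0.3378

Alternatively,

\[ a = \frac{\sum y - b\sum x}{n} = \frac{3.04 - (0.095742)(49)}{9} \approx -0.1835 \]

The equation for the least squares line is:

\[ \hat{y} = a + bx = -0.1835 + 0.0957x \]

or, equivalently,

\[ \hat{y} = 0.0957x - 0.1835 \]

Note that we use the symbol \hat{y} because this value is computed from the equation and is not an observed value of y. Now we can substitute various values of x into the equation to obtain the corresponding values of \hat{y}. The resulting points may be plotted.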
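The two ways of obtaining the intercept can be compared directly in code. The sketch below assumes the data pairs as read from the computation table; `fit_line` is a hypothetical helper name, not something defined in these notes.

```python
# Data pairs as read from the computation table in these notes (assumed values).
x = [3, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5]
y = [0.10, 0.20, 0.25, 0.32, 0.33, 0.35, 0.47, 0.49, 0.53]


def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y_hat = a + b*x."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(xi * yi for xi, yi in zip(xs, ys))
    sum_x2 = sum(xi * xi for xi in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept from the means: a = y_bar - b * x_bar
    a_from_means = sum_y / n - b * (sum_x / n)
    # Intercept from the sums: a = (sum_y - b * sum_x) / n
    a_from_sums = (sum_y - b * sum_x) / n
    # Both expressions are algebraically identical.
    assert abs(a_from_means - a_from_sums) < 1e-12
    return a_from_means, b


a, b = fit_line(x, y)
print(f"y_hat = {a:.4f} + {b:.4f} x")
```

Because the slope is kept at full precision here, the intercept does not pick up the small rounding error that appears when b is rounded before it is multiplied by the mean of x.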

Example: Predicting y for a given x using the regression equation

1. Choose a value for x (within the range of the x values): x = 6.8.
2. Substitute the selected x in the regression equation: \hat{y} = 0.0957(6.8) - 0.1835.
3. Determine the corresponding value of \hat{y}: \hat{y} = 0.6508 - 0.1835 = 0.4673.

According to the equation, an x value of 6.8 corresponds to a predicted optical density of about 0.467.

Drawing the least-squares line

Since any two points determine a straight line, we may select any two values in the range of x, compute the two corresponding \hat{y} values, locate the points on a graph, and connect them with a straight line to obtain the line corresponding to the equation.

The following point will always be on the least squares line: (\bar{x}, \bar{y}). Use 5.444 and 0.3378, the averages of the x's and the y's, respectively.

Try x = 4. Compute: \hat{y} = 0.0957(4) - 0.1835 = 0.3828 - 0.1835 = 0.1993.

Sketching the line using the points (5.444, 0.3378) and (4, 0.1993)

[Figure: scatter diagram of optical density against x with the fitted line \hat{y} = 0.0957x - 0.1835 drawn through it.]

What we have now obtained is what is called the best line for describing the relationship between our two variables. By what criterion is it considered best? Before the criterion is stated, let us examine the figure obtained. Note that the least squares line does not pass through most of the observed points plotted on the scatter diagram. In other words, the observed points deviate from the line by varying amounts.
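The prediction step and the two points used for sketching the line can be illustrated with a short Python sketch. The helper names `fit_line` and `predict` are hypothetical, the data pairs are the values as read from the computation table, and the range check is an assumption that reflects the advice to choose x within the range of the observed x values. The coefficients are kept unrounded, so results can differ in the fourth decimal place from a hand calculation with rounded coefficients.

```python
# Data pairs as read from the computation table in these notes (assumed values).
x = [3, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5]
y = [0.10, 0.20, 0.25, 0.32, 0.33, 0.35, 0.47, 0.49, 0.53]


def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y_hat = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(xi * yi for xi, yi in zip(xs, ys))
    sum_x2 = sum(xi * xi for xi in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = sum_y / n - b * sum_x / n
    return a, b


def predict(x_new, a, b, x_min, x_max):
    """Predicted y for x_new; refuses values outside the observed x range."""
    if not (x_min <= x_new <= x_max):
        raise ValueError("x is outside the range of the observed data")
    return a + b * x_new


a, b = fit_line(x, y)
y_hat_68 = predict(6.8, a, b, min(x), max(x))

# Two points that determine the fitted line: the point of means and x = 4.
mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)
point_of_means = (mean_x, predict(mean_x, a, b, min(x), max(x)))
second_point = (4, predict(4, a, b, min(x), max(x)))

print(f"predicted y at x = 6.8: {y_hat_68:.3f}")
print(f"line passes through ({point_of_means[0]:.3f}, {point_of_means[1]:.4f}) "
      f"and ({second_point[0]}, {second_point[1]:.4f})")
```

The point of means reproduces the mean of y exactly (up to floating-point error), which is why it is a convenient anchor when drawing the line.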

[Figure: the fitted line on the scatter diagram, with the vertical deviations (y_i - \hat{y}_i) of the observed points from the line marked.]

The line that we have drawn is best in this sense: the sum of the squared vertical deviations of the observed data points (y_i) from the least square line is smaller than the sum of the squared vertical deviations of the observed data points from any other line.

CORRELATION

Pearson's correlation coefficient r

1. Pearson's correlation coefficient measures the strength of the relationship between two numerical variables, represented as x and y.
2. The correlation coefficient is denoted by r and is calculated using the formula:

\[ r = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{\sqrt{\left[n\sum x_i^2 - \left(\sum x_i\right)^2\right]\left[n\sum y_i^2 - \left(\sum y_i\right)^2\right]}} \]

Computation Table

x       Optical density (y)   xy       x^2      y^2
3       0.10                  0.300    9        0.0100
4       0.20                  0.800    16       0.0400
4.5     0.25                  1.125    20.25    0.0625
5       0.32                  1.600    25       0.1024
5.5     0.33                  1.815    30.25    0.1089
6       0.35                  2.100    36       0.1225
6.5     0.47                  3.055    42.25    0.2209
7       0.49                  3.430    49       0.2401
7.5     0.53                  3.975    56.25    0.2809

Σx = 49, Σy = 3.04, Σxy = 18.2, Σx^2 = 284, Σy^2 = 1.1882

\[ r = \frac{9(18.2) - (49)(3.04)}{\sqrt{\left[9(284) - (49)^2\right]\left[9(1.1882) - (3.04)^2\right]}} = \frac{14.84}{\sqrt{(155)(1.4522)}} = 0.9891 \approx 0.99 \]
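The computational formula for r translates almost line for line into Python. The following is a minimal sketch using only the standard library; `pearson_r` is a hypothetical function name and the data pairs are the values as read from the computation table.

```python
import math

# Data pairs as read from the computation table in these notes (assumed values).
x = [3, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5]
y = [0.10, 0.20, 0.25, 0.32, 0.33, 0.35, 0.47, 0.49, 0.53]


def pearson_r(xs, ys):
    """Pearson's r from raw sums, following the computational formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(xi * yi for xi, yi in zip(xs, ys))
    sum_x2 = sum(xi * xi for xi in xs)
    sum_y2 = sum(yi * yi for yi in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


r = pearson_r(x, y)
print(f"r = {r:.4f}")
```

The numerator is the same quantity that appears in the slope formula, which is why r and b always have the same sign.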

Coefficient of Correlation Values

The statistic r has the following properties:

1. r measures the extent of linear association between two variables.
2. r has a value between -1 and +1.
3. r = +1 if and only if all the observations are on a straight line with positive slope.
4. r = -1 if and only if all the observations are on a straight line with negative slope.
5. r tends to be close to zero if there is no linear association between x and y.
6. Although there is no fixed rule for interpreting the strength of a correlation, we will say that the correlation is
   Strong if |r| ≥ 0.8
   Moderate if [...] < |r| < 0.8
   Weak if |r| ≤ [...]

Coefficient of determination or r-squared (r^2)

Sometimes the correlation is squared (r^2) to form a useful statistic called the coefficient of determination, or r-squared.

r^2 = 1.0 means that a given value of one variable can perfectly predict the value of the other variable.
r^2 = 0 means that knowing either variable does not predict the other variable.

The higher the r^2 value, the stronger the correlation between the two variables.

The coefficient of determination expresses the proportion of the variance in one variable that is accounted for, or explained, by the variance in the other variable. So, if a study finds a correlation (r) of 0.40 between salt intake and blood pressure, it could be concluded that 0.40 × 0.40 = 0.16, or 16%, of the variance in blood pressure in this study is accounted for by variance in salt intake.

In the above example, approximately 98 percent (0.9891 × 0.9891 = 0.978) of the variation in optical density is accounted for by variation in x, and about 2% is explained by other causes.
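One way to see what "proportion of variance accounted for" means is to compute r^2 twice: once by squaring r, and once as 1 - SSE/SST, where SSE is the sum of squared vertical deviations from the fitted line and SST is the sum of squared deviations of y from its mean. That second identity is a standard result for a least-squares line with an intercept; it is not stated in these notes and is used here as an assumption. The names `fit_line` and `r_squared_two_ways` are hypothetical, and the data pairs are the values as read from the computation table.

```python
import math

# Data pairs as read from the computation table in these notes (assumed values).
x = [3, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5]
y = [0.10, 0.20, 0.25, 0.32, 0.33, 0.35, 0.47, 0.49, 0.53]


def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y_hat = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(xi * yi for xi, yi in zip(xs, ys))
    sum_x2 = sum(xi * xi for xi in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    return sum_y / n - b * sum_x / n, b


def r_squared_two_ways(xs, ys):
    """Return (r**2, 1 - SSE/SST); the two should agree for a least-squares fit."""
    n = len(xs)
    a, b = fit_line(xs, ys)
    mean_y = sum(ys) / n
    # SSE: squared vertical deviations of observed y from the fitted line.
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(xs, ys))
    # SST: squared deviations of observed y from the mean of y.
    sst = sum((yi - mean_y) ** 2 for yi in ys)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(xi * yi for xi, yi in zip(xs, ys))
    sum_x2 = sum(xi * xi for xi in xs)
    sum_y2 = sum(yi * yi for yi in ys)
    r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return r ** 2, 1 - sse / sst


r2_from_r, r2_from_deviations = r_squared_two_ways(x, y)
print(f"r^2 = {r2_from_r:.3f} (from r), {r2_from_deviations:.3f} (from deviations)")
```

The unexplained share, SSE/SST, corresponds to the roughly 2% of variation attributed to other causes in the example.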

Figure: Scatter plots illustrating how the correlation coefficient, r, is a measure of the linear association between two variables.