Chapter 3: Linear Regression and Correlation


Descriptive Analysis & Presentation of Two Quantitative Data

Chapter Objectives
To be able to present two-variable data in tabular and graphic form:
- Display the relationship between two quantitative variables graphically using a scatter diagram.
- Calculate and interpret the linear correlation coefficient.
- Discuss the basic idea of fitting the scatter diagram with a best-fitted line, called a linear regression line.
- Create and interpret the linear regression line.

Terminology
- Data for a single variable is univariate data.
- Many or most real-world models have more than one variable: multivariate data.
- In this chapter we study the relations between two variables: bivariate data.

Bivariate Data
In many studies, we measure more than one variable for each individual. Some examples are:
- Rainfall amounts and plant growth
- Exercise and cholesterol levels for a group of people
- Height and weight for a group of people

Types of Relations
When we have two variables, they could be related in one of several different ways:
- They could be unrelated.
- One variable (the input, explanatory, or predictor variable) could be used to explain the other (the output, response, or dependent variable).
- One variable could be thought of as causing the other variable to change.

Lurking Variable
- Sometimes it is not clear which variable is the explanatory variable and which is the response variable.
- Sometimes the two variables are related without either one being an explanatory variable.
- Sometimes the two variables are both affected by a third variable, a lurking variable, that was not included in the study.
Note: When two variables are related to each other, one variable may not cause the change in the other. Relation does not always mean causation.

Example 1
An example of a lurking variable: a researcher studies a group of elementary school children.
- Y = the student's height
- X = the student's shoe size
It is not reasonable to claim that shoe size causes height to change. The lurking variable of age affects both of these variables.

More Examples
Rainfall amounts and plant growth
- Explanatory variable: rainfall
- Response variable: plant growth
- Possible lurking variable: amount of sunlight
Exercise and cholesterol levels
- Explanatory variable: amount of exercise
- Response variable: cholesterol level
- Possible lurking variable: diet

Types of Bivariate Data
Three combinations of variable types:
1. Both variables are qualitative (attribute).
2. One variable is qualitative (attribute) and the other is quantitative (numerical).
3. Both variables are quantitative (both numerical).

Two Qualitative Variables
When bivariate data results from two qualitative (attribute or categorical) variables, the data is often arranged in a cross-tabulation or contingency table.
Example: A survey was conducted to investigate the relationship between preferences for television, radio, or newspaper (NP) for national news, and gender. The results are given in the table below:

          TV    Radio   NP
Male      280   175     305
Female    115   275     170

Marginal Totals
This table may be extended to display the marginal totals (or marginals). The total of the marginal totals is the grand total:

              TV    Radio   NP    Row Totals
Male          280   175     305   760
Female        115   275     170   560
Col. Totals   395   450     475   1320

Note: Contingency tables often show percentages (relative frequencies). These percentages are based on the entire sample or on the subsample (row or column) classifications.

Percentages Based on the Grand Total (Entire Sample)
The previous contingency table may be converted to percentages of the grand total by dividing each frequency by the grand total and multiplying by 100. For example, 175 becomes 13.3%:

$\frac{175}{1320} \times 100 = 13.3$

              TV     Radio   NP     Row Totals
Male          21.2   13.3    23.1   57.6
Female        8.7    20.8    12.9   42.4
Col. Totals   29.9   34.1    36.0   100.0
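The conversion to percentages of the grand total can be sketched in a few lines of Python. This is only an illustration: the counts are the survey counts from the contingency table, while the dictionary layout and the function name `grand_total_percentages` are assumptions made for this sketch.

```python
# Observed counts from the news-preference survey
# (rows: gender, columns: preferred medium; NP = newspaper).
counts = {
    "Male":   {"TV": 280, "Radio": 175, "NP": 305},
    "Female": {"TV": 115, "Radio": 275, "NP": 170},
}


def grand_total_percentages(table):
    """Express every cell as a percentage of the grand total (one decimal)."""
    grand_total = sum(sum(row.values()) for row in table.values())
    return {
        label: {col: round(100 * n / grand_total, 1) for col, n in row.items()}
        for label, row in table.items()
    }


percentages = grand_total_percentages(counts)
for label, row in percentages.items():
    print(label, row)
```

Dividing by a row total or a column total instead of the grand total gives the row or column percentages in the same way.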

Illustration
These same statistics (numerical values describing sample results) can be shown in a side-by-side bar graph.
[Figure: side-by-side bar graph, "Percentages Based on Grand Total"; vertical axis Percent, horizontal axis Media (TV, Radio, NP), separate bars for Male and Female.]

Percentages Based on Row (Column) Totals
The entries in a contingency table may also be expressed as percentages of the row (column) totals by dividing each row (column) entry by that row's (column's) total and multiplying by 100. The entries in the contingency table below are expressed as percentages of the column totals:

              TV      Radio   NP      Row Totals
Male          70.9    38.9    64.2    57.6
Female        29.1    61.1    35.8    42.4
Col. Totals   100.0   100.0   100.0   100.0

Note: These statistics may also be displayed in a side-by-side bar graph.

One Qualitative & One Quantitative Variable
1. When bivariate data results from one qualitative and one quantitative variable, the quantitative values are viewed as separate samples.
2. Each set is identified by levels of the qualitative variable.
3. Each sample is described using summary statistics, and the results are displayed for side-by-side comparison.
4. Statistics for comparison: measures of central tendency, measures of variation, 5-number summary.
5. Graphs for comparison: side-by-side stemplot and boxplot.

Example
A random sample of households from three different parts of the country was obtained and their electric bill for June was recorded. The data is given in the table below:

Northeast   23.75  40.50  33.65  31.25  42.55  50.60  37.70  31.55  38.85  21.25
Midwest     34.38  34.35  39.15  37.12  36.71  34.39  35.12  35.80  37.24  40.01
West        54.54  65.60  59.78  45.12  60.35  61.53  52.79  47.37  59.64  37.40

The part of the country is a qualitative variable with three levels of response. The electric bill is a quantitative variable. The electric bills may be compared with numerical and graphical techniques.

Comparison Using Box-and-Whisker Plots
[Figure: box-and-whisker plots, "The Monthly Electric Bill"; vertical axis Electric Bill, one box each for Northeast, Midwest, and West.]
The electric bills in the Northeast tend to be more spread out than those in the Midwest. The bills in the West tend to be higher than both those in the Northeast and Midwest.

Descriptive Statistics for Two Quantitative Variables
Scatter Diagrams and Correlation Coefficient

Two Quantitative Variables
- The most useful graph to show the relationship between two quantitative variables is the scatter diagram.
- Each individual is represented by a point in the diagram.
- The explanatory (X) variable is plotted on the horizontal scale.
- The response (Y) variable is plotted on the vertical scale.

Example
In a study involving children's fear related to being hospitalized, the age and the score each child made on the Child Medical Fear Scale (CMFS) are given in the table below:

Age (x)     8   9   9  10  11   9   8   9   8  11
CMFS (y)   31  25  40  27  35  29  25  34  44  19

Age (x)     7   6   6   8   9  12  15  13  10  10
CMFS (y)   28  47  42  37  35  16  12  23  26  36

Construct a scatter diagram for this data.

Solution
age = input variable, CMFS = output variable
[Figure: scatter diagram, "Child Medical Fear Scale"; horizontal axis Age, vertical axis CMFS.]

Another Example
An example of a scatter diagram.
Note: the vertical scale is truncated to show the relation in detail.

Types of Relations
There are several different types of relations between two variables:
- A relationship is linear when, plotted on a scatter diagram, the points follow the general pattern of a line.
- A relationship is nonlinear when, plotted on a scatter diagram, the points follow a general pattern, but it is not a line.
- A relationship has no correlation when, plotted on a scatter diagram, the points do not show any pattern.

Linear Correlations
- Linear relations or linear correlations have points that cluster around a line.
- Linear relations can be either positive (the points slant upwards to the right) or negative (the points slant downwards to the right).

Positive Correlations
For positive (linear) correlation:
- Above-average values of one variable are associated with above-average values of the other (above/above, the points trend right and upwards).
- Below-average values of one variable are associated with below-average values of the other (below/below, the points trend left and downwards).

Example: Positive Correlation
As x increases, y also increases.
[Figure: scatter diagram, Input on the horizontal axis and Output on the vertical axis, points rising to the right.]

Negative Correlations
For negative (linear) correlation:
- Above-average values of one variable are associated with below-average values of the other (above/below, the points trend right and downwards).
- Below-average values of one variable are associated with above-average values of the other (below/above, the points trend left and upwards).

Example: Negative Correlation
As x increases, y decreases.
[Figure: scatter diagram, Input on the horizontal axis and Output on the vertical axis, points falling to the right.]

Nonlinear Correlations
- Nonlinear relations have points that have a trend, but not around a line.
- The trend has some bend in it.

No Correlations
When two variables are not related:
- There is no linear trend.
- There is no nonlinear trend.
- Changes in values for one variable do not seem to have any relation with changes in the other.

Example: No Correlation
As x increases, there is no definite shift in y.
[Figure: scatter diagram, Input on the horizontal axis and Output on the vertical axis, points showing no pattern.]

Distinction between Nonlinear & No Correlation
- Nonlinear relations and no relations are very different.
- Nonlinear relations are definitely patterns, just not patterns that look like lines.
- No relations are when no patterns appear at all.

Example
Examples of nonlinear relations:
- Age and height for people (including both children and adults)
- Temperature and comfort level for people
Examples of no relations:
- Temperature and closing price of the Dow Jones Industrials Index (probably)
- Age and last digit of telephone number for adults

Please Note
- Perfect positive correlation: all the points lie along a line with positive slope.
- Perfect negative correlation: all the points lie along a line with negative slope.
- If the points lie along a horizontal or vertical line: no correlation.
- If the points exhibit some other nonlinear pattern: nonlinear relationship.
- We need some way to measure the strength of correlation.

Measure of Linear Correlation

Linear Correlation Coefficient
The linear correlation coefficient is a measure of the strength of the linear relation between two quantitative variables. The sample correlation coefficient r is

$r = \frac{\sum \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right)}{n - 1}$

Note: $\bar{x}$, $\bar{y}$, $s_x$, $s_y$ are the sample means and sample standard deviations of the two variables X and Y.

Properties of Linear Correlation Coefficients
Some properties of the linear correlation coefficient:
- r is a unitless measure (so r would be the same for a data set whether x and y are measured in feet, inches, meters, etc.).
- r is always between -1 and +1.
  r = -1: perfect negative correlation
  r = +1: perfect positive correlation
- Positive values of r correspond to positive relations.
- Negative values of r correspond to negative relations.

Various Expressions for r
There are other equivalent expressions for the linear correlation r, as shown below:

$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{(n - 1)\, s_x s_y}$

$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}$

However, it is much easier to compute r using the short-cut formula.

Short-Cut Formula for r

$r = \frac{SS(xy)}{\sqrt{SS(x)\, SS(y)}}$

$SS(x) = \text{sum of squares for } x = \sum x^2 - \frac{(\sum x)^2}{n}$

$SS(y) = \text{sum of squares for } y = \sum y^2 - \frac{(\sum y)^2}{n}$

$SS(xy) = \text{sum of squares for } xy = \sum xy - \frac{\sum x \sum y}{n}$

Example
The table below presents the weight (in thousands of pounds) x and the gasoline mileage (miles per gallon) y for ten different automobiles. Find the linear correlation coefficient:

        x      y      x²       y²       xy
        2.5    40     6.25     1600     100.0
        3.0    43     9.00     1849     129.0
        4.0    30     16.00    900      120.0
        3.5    35     12.25    1225     122.5
        2.7    42     7.29     1764     113.4
        4.5    19     20.25    361      85.5
        3.8    32     14.44    1024     121.6
        2.9    39     8.41     1521     113.1
        5.0    15     25.00    225      75.0
        2.2    14     4.84     196      30.8
Sum     34.1   309    123.73   10665    1010.9

Completing the Calculation for r

$SS(x) = \sum x^2 - \frac{(\sum x)^2}{n} = 123.73 - \frac{(34.1)^2}{10} = 7.449$

$SS(y) = \sum y^2 - \frac{(\sum y)^2}{n} = 10665 - \frac{(309)^2}{10} = 1116.9$

$SS(xy) = \sum xy - \frac{\sum x \sum y}{n} = 1010.9 - \frac{(34.1)(309)}{10} = -42.79$

$r = \frac{SS(xy)}{\sqrt{SS(x)\, SS(y)}} = \frac{-42.79}{\sqrt{(7.449)(1116.9)}} = -0.47$

Please Note
- r is usually rounded to the nearest hundredth.
- r close to 0: little or no linear correlation.
- As the magnitude of r increases, towards -1 or +1, there is an increasingly stronger linear correlation between the two variables.
- We'll also learn to obtain the linear correlation coefficient from the graphing calculator.
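As a cross-check of the hand calculation, the short-cut formula can be sketched in Python using the ten weight and mileage pairs from the example. The function names `ss` and `correlation` are illustrative choices, not part of the original slides.

```python
from math import sqrt

# Weight (thousands of pounds) and gasoline mileage (mpg) for the ten automobiles.
weights = [2.5, 3.0, 4.0, 3.5, 2.7, 4.5, 3.8, 2.9, 5.0, 2.2]
mileage = [40, 43, 30, 35, 42, 19, 32, 39, 15, 14]


def ss(xs, ys):
    """Short-cut sum: sum(xy) - sum(x) * sum(y) / n.

    Calling ss(xs, xs) gives SS(x); calling ss(xs, ys) gives SS(xy).
    """
    n = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n


def correlation(xs, ys):
    """Linear correlation coefficient r = SS(xy) / sqrt(SS(x) * SS(y))."""
    return ss(xs, ys) / sqrt(ss(xs, xs) * ss(ys, ys))


r = correlation(weights, mileage)
print(round(ss(weights, weights), 3))   # SS(x)
print(round(ss(mileage, mileage), 1))   # SS(y)
print(round(ss(weights, mileage), 2))   # SS(xy)
print(round(r, 2))                      # r
```

Reusing one helper for SS(x), SS(y), and SS(xy) mirrors the fact that all three short-cut sums have the same form.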

Positive Correlation Coefficients
Examples of positive correlation:
- Strong positive: r = .8
- Moderate positive: r = .5
- Very weak: r = .1

Negative Correlation Coefficients
Examples of negative correlation:
- Strong negative: r = -.8
- Moderate negative: r = -.5
- Very weak: r = -.1

In general, if the correlation is visible to the eye, then it is likely to be strong.

Nonlinear versus No Correlation
[Figure: two scatter diagrams, "Nonlinear Relation" and "No Relation".]
Both sets of variables have r = 0.1, but the difference is that the nonlinear relation shows a clear pattern.

Interpret the Linear Correlation Coefficients
Correlation is not causation! Just because two variables are correlated does not mean that one causes the other to change.
- There is a strong correlation between shoe sizes and vocabulary sizes for grade school children.
- Clearly larger shoe sizes do not cause larger vocabularies.
- Clearly larger vocabularies do not cause larger shoe sizes.
Often lurking variables result in confounding.

How to Determine a Linear Correlation?
How large does the correlation coefficient have to be before we can say that there is a relation? We're not quite ready to answer that question.

Summary
Correlation between two variables can be described with both visual and numeric methods.
- Visual methods: scatter diagrams, analogous to histograms for single variables.
- Numeric methods: the linear correlation coefficient, analogous to the mean and variance for single variables.
Care should be taken in the interpretation of linear correlation (nonlinearity and causation).
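The point that a clear nonlinear pattern can still have a correlation coefficient near zero can be illustrated with a small sketch. The data here are hypothetical (a symmetric parabola chosen for this illustration, not the data behind the slide's figures), and the function name `correlation` is an assumption.

```python
from math import sqrt


def correlation(xs, ys):
    """Linear correlation coefficient from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical data: y is completely determined by x, but not linearly.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]

r = correlation(xs, ys)
print(r)  # the positive and negative products cancel
```

Because r only measures linear association, a value near zero should prompt a look at the scatter diagram before concluding that the variables are unrelated.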

Linear Regression Line

Learning Objectives
- Find the regression line to fit the data and use the line to make predictions.
- Interpret the slope and the y-intercept of the regression line.
- Compute the sum of squared residuals.

Regression Analysis
- Regression analysis finds the equation of the line that best describes the relationship between two variables.
- One use of this equation: to make predictions.

Best Fitted Line
If we have two variables X and Y which tend to be linearly correlated, we often would like to model the relation with a line that best fits the data.
- Draw a line through the scatter diagram.
- We want to find the line that best describes the linear relationship: the regression line.

Residuals
One difference between math and stat is that statistics assumes that the measurements are not exact, that there is an error or residual. The formula for the residual is always

Residual = Observed - Predicted

This relationship is not just for this chapter. It is the general way of defining error in statistics.

What is a Residual?
[Figure: a residual shown on the scatter diagram, with labels for the regression line, the observed value y, the predicted value ŷ, the x value of interest, and the residual.]

Example
For example, say that we want to predict a value of y for a specific value of x.
- Assume that we are using y = 10x + 25 as our model.
- To predict the value of y when x = 3, the model gives us y = 10 · 3 + 25 = 55, or a predicted value of 55.
- Assume the actual value of y for x = 3 is equal to 50.
- The actual value is 50, the predicted value is 55, so the residual (or error) is 50 - 55 = -5.

Method of Least Squares
We want to minimize the prediction errors or residuals, but we need to define what this means. We use the method of least squares, which involves the following 3 steps:
1. We consider a possible linear model to fit the data.
2. We calculate the residual for each point.
3. We add up the squares of the residuals. (We square all of the residuals to avoid the cancellation of positive and negative residuals, since some observed values are underpredicted and some are overpredicted by the proposed linear model.)
The line that has the smallest overall residuals (i.e., the smallest sum of the squares of the residuals) is called the least-squares regression line, or simply the regression line. It is the best-fitted line to the data.

Method of Least Squares
Assume the equation of the best-fitting line is

$\hat{y} = b_0 + b_1 x$

where $\hat{y}$ (called "y hat") denotes the predicted value of y.
Least squares method: find the constants $b_0$ and $b_1$ such that the sum of the overall prediction errors

$\sum (y - \hat{y})^2 = \sum \left( y - (b_0 + b_1 x) \right)^2$

is as small as possible.

Illustration
Observed and predicted values of y.
[Figure: the line $\hat{y} = b_0 + b_1 x$ with an observed point (x, y), the predicted point (x, ŷ), and the vertical distance y - ŷ between them.]

Linear Regression Line
The equation for the regression line is given by

$\hat{y} = b_0 + b_1 x$

- $\hat{y}$ denotes the predicted value for the response variable.
- $b_1$ is the slope of the least-squares regression line.
- $b_0$ is the y-intercept of the least-squares regression line.
Note: Different textbooks may use different notations for the slope and the intercept.

Find the Equation of a Linear Regression Line
The equation is determined by $b_0$ (the y-intercept) and $b_1$ (the slope). The values that satisfy the least squares criterion are:

$b_1 = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{SS(xy)}{SS(x)}$

$b_0 = \frac{\sum y - b_1 \sum x}{n} = \bar{y} - b_1 \bar{x}$
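The least-squares criterion can be checked numerically. The sketch below uses a small hypothetical data set (not from the slides) and illustrative function names; it fits the line with the formulas for the slope and intercept and then confirms that shifting or tilting the line increases the sum of squared residuals.

```python
def sum_squared_residuals(xs, ys, b0, b1):
    """Sum of (observed - predicted)^2 for the line y_hat = b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


def least_squares_line(xs, ys):
    """Return (b0, b1) using b1 = SS(xy) / SS(x) and b0 = y_bar - b1 * x_bar."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    b1 = ss_xy / ss_x
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical data, chosen only to illustrate the criterion.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = least_squares_line(xs, ys)
best = sum_squared_residuals(xs, ys, b0, b1)
shifted = sum_squared_residuals(xs, ys, b0 + 0.5, b1)  # same slope, moved up
tilted = sum_squared_residuals(xs, ys, b0, b1 + 0.1)   # same intercept, steeper

print(b0, b1)
print(best, shifted, tilted)
```

Any other candidate line could be substituted for the shifted and tilted ones; the fitted line should always give the smallest sum.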

Example
A recent article measured the job satisfaction of subjects with a 14-question survey. The data below represents the job satisfaction scores, y, and the salaries, x, for a sample of similar individuals:

x   31  33  22  24  35  29  23  37
y   17  20  13  15  18  17  12  21

1) Draw a scatter diagram for this data.
2) Find the equation of the line of best fit (i.e., the regression line).

Finding b1 & b0
Preliminary calculations needed to find $b_1$ and $b_0$:

        x      y      x²      xy
        23     12     529     276
        31     17     961     527
        33     20     1089    660
        22     13     484     286
        24     15     576     360
        35     18     1225    630
        29     17     841     493
        37     21     1369    777
Sum     234    133    7074    4009

Linear Regression Line

$SS(x) = \sum x^2 - \frac{(\sum x)^2}{n} = 7074 - \frac{(234)^2}{8} = 229.5$

$SS(xy) = \sum xy - \frac{\sum x \sum y}{n} = 4009 - \frac{(234)(133)}{8} = 118.75$

$b_1 = \frac{SS(xy)}{SS(x)} = \frac{118.75}{229.5} = 0.5174$

$b_0 = \frac{\sum y - b_1 \sum x}{n} = \frac{133 - (0.5174)(234)}{8} = 1.4902$

Solution: Equation of the Line of Best Fit

$\hat{y} = 1.49 + 0.517x$

Solution: Scatter Diagram
[Figure: scatter diagram, "Job Satisfaction Survey"; horizontal axis Salary, vertical axis Job Satisfaction.]

Please Note
- Keep at least three extra decimal places while doing the calculations to ensure an accurate answer.
- When rounding off the calculated values of $b_0$ and $b_1$, always keep at least two significant digits in the final answer.
- The slope $b_1$ represents the predicted change in y per unit increase in x.
- The y-intercept is the value of y where the line of best fit intersects the y-axis. That is, it is the predicted value of y when x is zero.
- The line of best fit will always pass through the point $(\bar{x}, \bar{y})$.

Please Note
- Finding the values of $b_1$ and $b_0$ is a very tedious process. We should also know how to use a graphing calculator for this.
- Finding the coefficients $b_1$ and $b_0$ is only the first step of a regression analysis.
  We need to interpret the slope $b_1$.
  We need to interpret the y-intercept $b_0$.
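The preliminary sums and the coefficients for the job satisfaction example can be reproduced with a short Python sketch. The variable names are illustrative; the data are the eight (salary, satisfaction) pairs from the example.

```python
# Salaries (x) and job satisfaction scores (y) from the example.
salaries = [31, 33, 22, 24, 35, 29, 23, 37]
satisfaction = [17, 20, 13, 15, 18, 17, 12, 21]

n = len(salaries)
sum_x = sum(salaries)
sum_y = sum(satisfaction)
sum_x2 = sum(x * x for x in salaries)
sum_xy = sum(x * y for x, y in zip(salaries, satisfaction))

# Short-cut sums of squares.
ss_x = sum_x2 - sum_x ** 2 / n
ss_xy = sum_xy - sum_x * sum_y / n

# Slope and y-intercept of the least-squares line.
b1 = ss_xy / ss_x
b0 = (sum_y - b1 * sum_x) / n

print(sum_x, sum_y, sum_x2, sum_xy)
print(ss_x, ss_xy)
print(round(b1, 4), round(b0, 4))
```

Carrying the unrounded slope into the intercept calculation follows the advice to keep extra decimal places until the final answer.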

Making Predictions
1. One of the main purposes for obtaining a regression equation is for making predictions.
2. For a given value of x, we can predict a value of $\hat{y}$.
3. The regression equation should be used only to cover the sample domain of the input variable. You can estimate values outside the domain interval, but use caution and use values close to the domain interval.
4. Use current data. A sample taken in 1987 should not be used to make predictions in 1999.

Interpret the Slope
Interpreting the slope $b_1$:
- The slope is sometimes defined as $\frac{\text{Rise}}{\text{Run}}$.
- The slope is also sometimes defined as $\frac{\text{Change in } y}{\text{Change in } x}$.
- The slope relates changes in y to changes in x.

Interpret the Slope
For example, if $b_1 = 4$:
- If x increases by 1, then y will increase by 4.
- If x decreases by 1, then y will decrease by 4.
- A positive linear relationship.
For example, if $b_1 = -7$:
- If x increases by 1, then y will decrease by 7.
- If x decreases by 1, then y will increase by 7.
- A negative linear relationship.

Example
For example, say that a researcher studies the population in a town (which is the y or response variable) in each year (which is the x or predictor variable).
- To simplify the calculations, years are measured from 1900 (i.e., x = 55 is the year 1955).
- The model used is y = 300x + 1,000.
- A slope of 300 means that the model predicts that, on average, the population increases by 300 per year.
- An intercept of 1,000 means that the model predicts that the town had a population of 1,000 in the year 1900 (i.e., when x = 0).

Interpret the y-intercept
Interpreting the y-intercept $b_0$:
- Sometimes $b_0$ has an interpretation, and sometimes not.
- If 0 is a reasonable value for x, then $b_0$ can be interpreted as the value of y when x is 0.
- If 0 is not a reasonable value for x, then $b_0$ does not have an interpretation.
In general, we should not use the model for values of x that are much larger or much smaller than the observed values of x (that is, it may be invalid to predict y for x values lying outside the range of the observed x).

Summary
Summarize two quantitative data:
- Scatter diagrams
- Correlation coefficients
Linear models of correlation:
- Least-squares regression line
- Prediction
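The advice about predicting only within the sample domain can be sketched as a small helper. This is an illustration under stated assumptions: it uses the fitted line from the job satisfaction example (ŷ = 1.49 + 0.517x) and takes the smallest and largest observed salaries (22 and 37) as the domain; the function name `predict` and its return shape are hypothetical.

```python
# Fitted line from the job satisfaction example (rounded coefficients).
B0 = 1.49
B1 = 0.517

# Smallest and largest observed x values in that sample.
X_MIN = 22
X_MAX = 37


def predict(x):
    """Return (y_hat, in_domain).

    in_domain is False when x lies outside the observed range, in which
    case the prediction is an extrapolation and should be used with caution.
    """
    y_hat = B0 + B1 * x
    return y_hat, X_MIN <= x <= X_MAX


for x in (30, 50):
    y_hat, in_domain = predict(x)
    note = "within sample domain" if in_domain else "extrapolation, use caution"
    print(x, round(y_hat, 2), note)
```

Returning a flag rather than refusing to predict matches the guideline that values just outside the domain may still be estimated, with caution.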

Obtain Linear Correlation Coefficient and Regression Line Equation from TI Calculator
1. Turn on the diagnostic tool: CATALOG [2nd 0], DiagnosticOn, ENTER, ENTER.
2. Enter the data: STAT, EDIT. Enter the x-variable data into L1 and the corresponding y-variable data into L2.
3. Obtain the regression line and the linear correlation r: STAT, CALC, 4:LinReg(ax+b), ENTER, L1, L2, Y1. (Notice: to enter Y1, use VARS, Y-VARS, 1:Function, 1:Y1, ENTER.) (The screen will also show r². Just ignore it.)
4. Display the scatter diagram and the fitted regression line: ZOOM, 9:ZoomStat, TRACE. (Press the up or down arrow keys to move the cursor to the regression line. Now you can trace the points along the line by pressing the right or left arrow keys. While the cursor is on the regression line, you can also enter a number, and the screen will show the predicted value of y for the x value you just entered.)