STAB22 Statistics I. Lecture 7


Example
Newborn babies' weights follow a Normal distribution with mean 3500 grams and SD 500 grams. A baby is defined as high birth weight if it is in the top 2% of birth weights. What weight would make a baby high birth weight?
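One way to check the answer numerically is sketched below. It assumes only the Normal model stated in the example (mean 3500 g, SD 500 g) and uses Python's standard-library `statistics.NormalDist`; the variable names are illustrative. The cutoff is the 98th percentile, i.e. the weight with 2% of the distribution above it.

```python
from statistics import NormalDist

# Normal model from the example: mean 3500 g, SD 500 g
birth_weight = NormalDist(mu=3500, sigma=500)

# Top 2% of weights means 98% of weights lie below the cutoff
z = NormalDist().inv_cdf(0.98)          # z-score of the 98th percentile
cutoff = birth_weight.inv_cdf(0.98)     # same percentile on the gram scale

# The two routes agree: cutoff = mean + z * SD
print(f"z = {z:.2f}, cutoff = {cutoff:.0f} g")
```

The z-score is a little above 2, so the cutoff sits a little more than two SDs above the mean, at roughly 4.5 kg.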

Checking Normality
- The Normal density is a theoretical model; when should we use it to describe real data?
- Can check a histogram: look for a bell shape (unimodal & symmetric)
- A better check is a Normal probability plot

Normal Probability Plot
- Plot data values against their theoretical Normal z-scores (a.k.a. Normal quantile plot)
- StatCrunch: Graphics > QQplot
- If the points lie close to a straight line, the data are well described by a Normal model
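The construction of the plot can be sketched as follows. This is a minimal illustration, not StatCrunch's method: the plotting position (i − 0.5)/n is one common convention and is an assumption here, and the sample values are made up for the example.

```python
from statistics import NormalDist

def normal_quantile_points(data):
    """Pair each sorted data value with its theoretical Normal z-score."""
    n = len(data)
    std_normal = NormalDist()
    ordered = sorted(data)
    # Assumed plotting position: the i-th smallest value is matched with
    # the z-score that has a fraction (i - 0.5) / n of the distribution below it
    z_scores = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(z_scores, ordered))

# Hypothetical sample (made-up values, for illustration only)
sample = [2.9, 3.1, 3.3, 3.4, 3.5, 3.5, 3.6, 3.8, 3.9, 4.2]
points = normal_quantile_points(sample)

for z, value in points:
    print(f"{z:6.2f}  {value:.1f}")
```

Plotting the printed pairs (z-score on one axis, data value on the other) gives the Normal probability plot; a roughly straight pattern suggests the Normal model is reasonable.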

Example (non-normal plots)
- Right-skewed distribution: convex plot (U-shaped)
- Left-skewed distribution: concave plot (∩-shaped)

Relationship Between Two Quantitative Variables
Consider the following student data, with two quantitative variables: Weight (in kg) and Height (in cm).

Name    Weight  Height
Aubrey  77      188
Ron     75      173
Carl    70      178

What is the relationship between weight & height? The first step is to examine the relationship visually using a scatterplot.

Scatterplot
- Variables are measured along the horizontal (x-) and vertical (y-) axes; each dot represents the combination of the corresponding individual's values, e.g. (Height=170, Weight=61)
- StatCrunch: Graphics > Scatterplot

Role of Variables
- Usually there is a variable of interest, called the response / dependent variable, and a variable whose effect on the response we want to examine, called the explanatory / independent variable
- The response goes on the vertical axis (a.k.a. y-variable) and the explanatory variable goes on the horizontal axis (a.k.a. x-variable)
- E.g. we want to study whether Blood Pressure increases with Age; how would you classify the variables?
  Response variable:
  Explanatory variable:

Types of Relationships
The overall pattern of a scatterplot describes the form, direction & strength of the relationship.
Form: linear relationship vs. non-linear relationship
[Figure: two scatterplots, one showing a linear and one a non-linear relationship]

Types of Relationships
Direction:
- Positive relationship: Y-var. increases with X-var. and vice-versa
- Negative relationship: Y-var. decreases as X-var. increases and vice-versa
[Figure: two scatterplots, one showing a positive and one a negative relationship]

Scatterplot Pattern
Strength:
- Strong relationship: data tightly clustered around the pattern
- Weak relationship: data loosely spread around the pattern, forming a vague cloud
[Figure: two scatterplots, one showing a strong and one a weak relationship]

Outliers
Scatterplots also help identify outliers, i.e. extreme deviations from the overall pattern.
[Figure: scatterplot with an outlier]

Example
Describe the relationship of the variables based on the scatterplot & identify any outliers.
[Figure: scatterplot]
Form:
Direction:
Strength:

Correlation
- Correlation coefficient (r): numerical measure of the linear relationship between 2 variables
- For data (x_1, …, x_n) & (y_1, …, y_n), r is given by

$$ r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \, \sum_i (y_i - \bar{y})^2}} $$

- StatCrunch: Stat > Summary Stats > Correlation
- r is always a number between −1 and +1
- r describes the strength & direction of linear relationships only
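The definition of r can be sketched directly in code: the sum of products of deviations from the means, divided by the square root of the product of the two sums of squared deviations. The function name `correlation` is illustrative, and the data are only the three students (Aubrey, Ron, Carl) listed earlier in the lecture, so the result is not the correlation for the full class data.

```python
from math import sqrt

def correlation(xs, ys):
    """Correlation coefficient r computed from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator: sum of products of deviations
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the sums of squared deviations
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Weight (kg) and height (cm) of the three students shown in the lecture
weights = [77, 75, 70]
heights = [188, 173, 178]

r = correlation(weights, heights)
print(round(r, 3))  # about 0.454
```

With only three points the value is a rough illustration at best; the sign (positive) is the part worth reading off.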

Interpretation of Correlation Coefficient

Correlation Properties
- r > 0: positive relationship; r < 0: negative relationship
- The magnitude of r, i.e. its distance from 0, describes the strength of the relationship, i.e. how close the data are to a line
- r does not change when the x and/or y variables are shifted or rescaled by a positive factor (e.g. a change of units)
- r is symmetric: it doesn't matter which variable is on the x- or y-axis, r is the same in both cases
- r is sensitive to outliers
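These properties can be checked on a small example. The sketch below uses made-up data and an illustrative helper `correlation`; the shift, scale factor and outlier are arbitrary choices for demonstration.

```python
from math import sqrt

def correlation(xs, ys):
    """Correlation coefficient r computed from deviations about the means."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data with a strong positive linear pattern
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]

r_xy = correlation(x, y)
r_yx = correlation(y, x)                      # symmetry: swap the roles of x and y

x_rescaled = [10 * v + 3 for v in x]          # rescale by 10 and shift by 3
r_rescaled = correlation(x_rescaled, y)       # unchanged by shift / positive rescaling

# Sensitivity to outliers: add one point far from the pattern
r_outlier = correlation(x + [20], y + [0.0])

print(f"r(x, y)          = {r_xy:.3f}")
print(f"r(y, x)          = {r_yx:.3f}")
print(f"r(10x + 3, y)    = {r_rescaled:.3f}")
print(f"r with 1 outlier = {r_outlier:.3f}")
```

The first three values agree, while the single added point is enough to turn a strong positive correlation into a negative one.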

Example
Choose the corresponding r for each scatterplot.
[Figure: three scatterplots]
Options for each plot: −0.8, −0.4, 0.0, +0.4, +0.8

Correlation vs Causation
- If two variables are correlated, this does not necessarily imply that x causes y to change
- E.g. Height & Weight are positively correlated (r = +0.8762), but Weight does not cause Height, i.e. putting on more weight will not make you taller
- Generally, Correlation ≠ Causation

Correlation vs Causation
- An observed correlation/association between two variables can be the result of a third, hidden or lurking variable
- E.g. ice-cream sales are correlated with drowning, but both variables are driven by the weather
- When the weather is hot, people eat more ice-cream and do more swimming (& therefore drowning)!
[Diagram: weather → ice-cream sales; weather → # people drowning]
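The lurking-variable effect can be illustrated with a small simulation. Everything in this sketch is synthetic and assumed for illustration: temperatures are drawn at random, and both "ice-cream sales" and a "drowning index" are generated from temperature plus independent noise, with no direct link between them.

```python
import random
from math import sqrt

def correlation(xs, ys):
    """Correlation coefficient r computed from deviations about the means."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

random.seed(0)  # fixed seed so the run is repeatable
days = 2000

# Lurking variable: daily temperature (synthetic, in degrees C)
temps = [random.uniform(10, 35) for _ in range(days)]

# Both outcomes depend on temperature only, never on each other
ice_cream = [2.0 * t + random.gauss(0, 5) for t in temps]
drownings = [0.5 * t + random.gauss(0, 2) for t in temps]

r_all = correlation(ice_cream, drownings)

# Compare only days with nearly the same temperature (24 to 26 degrees)
similar = [(i, d) for t, i, d in zip(temps, ice_cream, drownings) if 24 <= t <= 26]
r_similar = correlation([i for i, _ in similar], [d for _, d in similar])

print(f"all days:              r = {r_all:.2f}")
print(f"similar-weather days:  r = {r_similar:.2f}")
```

Across all days the two outcomes are strongly correlated, but among days with similar weather the correlation largely disappears, which is what one would expect if weather alone drives both.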