STATISTICS Relationships between variables: Correlation
- Hester Malone
STATISTICS 16
Relationships between variables: Correlation

The gentleman pictured above is Sir Francis Galton. Galton invented the statistical concept of correlation and the use of the regression line. Blame him, not me.
1. Quantitative variables can be associated

A famous study found an association between the amount of soluble fiber in one's diet and the risk of cardiovascular disease. Specifically, higher dietary soluble fiber is associated with reduced disease risk (and slower disease progression in high-risk individuals). The results of such studies can have an impact on society; e.g., the above results led the American Heart Association to recommend a diet high in soluble fiber.

This example illustrates the outcome of a study of the relationship between two variables (e.g., soluble fiber and risk of heart disease). In such studies both variables are measured on the same individual. If the value of one variable tends to be related to the value of the second variable when we look at a large sample of individuals, we say that the two variables are associated.

[Table: Country | Wine consumption (x) | Heart disease (y). Countries: Australia, Austria, Belgium, Canada, Denmark, Finland, France, Iceland, Ireland, Italy, Netherlands, New Zealand, Norway, Spain, Sweden, Switzerland, U.K., U.S., W. Germany.]
Wine consumption: liters of alcohol via wine per person per year. Heart disease: deaths per 1, per year.

One of the most common reasons to collect data in the first place is to look for associations. Below are some examples of questions of association that have medical implications:

- Does alcohol consumption increase or decrease one's risk of death due to heart disease? (See the above table for pairs of variables relevant to this question.)
- Has the incidence of breast cancer been increasing over the last years?
- Does a high fiber diet result in a reduced risk of heart disease?
- Does the concentration of a new drug affect the severity of its side-effects?

A common theme in the above examples is that one variable might be used to explain or predict another variable. For example, do our data allow us to predict a country's death rate due to heart disease if we have information about wine consumption?

In such cases we distinguish between the RESPONSE VARIABLE (the outcome of a study, or an event you wish to predict or explain) and the EXPLANATORY VARIABLE (the variable you hypothesize to cause a change in the response variable). Practice problem 1 (available on-line) illustrates how to think about the x and y variables.
Unfortunately the terminology in this setting can be quite variable, so a table is presented below to help clarify the situation.

Variable measuring the outcome | Variable explaining the change in outcome
Response variable | Explanatory variable
Response variable | Predictor variable
Dependent variable | Independent variable
y-variable (because it is plotted on the y-axis) | x-variable (because it is plotted on the x-axis)

2. SCATTERPLOTS allow visual interpretation of association

The first step in the process of studying an association between two variables is to make a scatterplot. A SCATTERPLOT is a graphical display of the relationship between two quantitative variables measured on the same individual, where each individual is represented by a single point in the plot.

On the right is an example scatterplot of the relationship between wine consumption and risk of heart disease. Since we are interested in predicting one's risk of heart disease from the amount of wine consumed, we plot the risk of heart disease (the response variable) on the y-axis.

Figure 1: Scatter plot of association between risk of heart disease (in deaths per 1, per year) and wine consumption (in liters of alcohol per year) for 19 countries.

Now what? Well, it's quite simple; just look at the overall pattern in the plot. This is best done by looking at the following different aspects of the plot:

- Trend: Is there a positive or negative association?
- Outliers: Are there any striking deviations from the overall pattern?
- Form: Is the trend linear, curved, clustered, or something else?
- Strength: Is the trend strong or weak?
- Variance: Does variance in y change with x?
Lastly, it is sometimes the case that the individuals sampled from a population can be further classified by a CATEGORICAL VARIABLE. In such cases this information can also be visualized in a scatter plot by using a different color or symbol to indicate each category (e.g., Figure 2 illustrates that southern European countries [France, Italy & Spain] have the lowest deaths due to heart disease and the highest rates of wine consumption).

Figure 2: The same dataset as in Figure 1, but with the category Southern European country indicated by an open box, and Canada indicated by a plus (+).

3. When the association is linear, think in terms of CORRELATION

If two variables (1) are quantitative and (2) have a linear association, then the strength and direction of the association can be measured by using the CORRELATION COEFFICIENT (abbreviated as r). The DIRECTION of the association is indicated by the sign of r, and the magnitude of r measures the STRENGTH (illustrated below in Figure 3).

Figure 3: Illustration of how the correlation coefficient (r) provides a measure of the strength and direction of an association. The value of r is always between +1 and -1.
[Figure 3 panels, left to right: r = -0.9, r = -0.7, r = 0, r = 0.7, r = 0.9]

Value of r | Relationship among variables
r = 1 | The two variables have perfect correlation with no scatter
r > 0 (positive) | The two variables tend to increase or decrease together
r = 0 | The two variables do not vary together in any linear way
r < 0 (negative) | The two variables are inversely related

Quantifying the Strength of Linear Association: The Correlation Coefficient, r

- Measures the strength of linear association between x and y
- Variables x and y are quantitative
- Always lies between -1 and 1: -1 ≤ r ≤ 1
- Unchanged if we replace x by ax+b with a > 0 (e.g., in to cm, °F to °C)
- Unchanged by switching x and y: corr(x,y) = corr(y,x)
- Influenced by outliers
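The properties listed above can be checked numerically. The following is a minimal sketch, assuming invented height and weight values (they are not from the wine dataset) and a hypothetical helper `pearson_r` that computes r as the sum of z-score products divided by n - 1.

```python
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Correlation coefficient: sum of z-score products divided by n - 1."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)  # sample standard deviations
    return sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)) / (n - 1)


# Hypothetical measurements, invented for illustration only.
height_in = [60.0, 62.0, 65.0, 68.0, 70.0, 73.0]
weight_lb = [115.0, 120.0, 140.0, 155.0, 160.0, 185.0]

r_original = pearson_r(height_in, weight_lb)

# Changing units (inches to centimetres) is a transformation ax + b with a > 0.
height_cm = [2.54 * h for h in height_in]
r_rescaled = pearson_r(height_cm, weight_lb)

# Switching the roles of x and y.
r_swapped = pearson_r(weight_lb, height_in)

# A negative multiplier flips the sign of r but keeps its magnitude.
r_flipped = pearson_r([-h for h in height_in], weight_lb)

print(r_original, r_rescaled, r_swapped, r_flipped)
```

Up to floating-point rounding, the first three values agree and the fourth is their negative, which is why the rescaling property is stated for a > 0.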
4. The correlation coefficient (r): how it works

In this section we take a mathematical look at how the correlation coefficient works. Don't worry, it's not too complicated. The essential task is the standardization of both the x and y values:

$$ z_{x_i} = \frac{x_i - \bar{x}}{s_x} \quad \text{and} \quad z_{y_i} = \frac{y_i - \bar{y}}{s_y} $$

Now each point is expressed in how many SDs above or below the mean it lies; i.e., the familiar Z-SCORE from earlier in this course! There is nothing new here; recall that you learned about standardizing values when you learned about the normal distribution. Thus you know that a positive z-score indicates the original value lies above the mean and a negative z-score that the value lies below the mean. Standardization also means that the units have been removed from the correlation coefficient.

The next task is to take the product of $z_{x_i}$ and $z_{y_i}$ for each pair. Each product will be either positive or negative. The products are then summed over all n pairs of samples and divided by n - 1. This is the step that summarizes the strength and direction of the association for all data (see Box 1 for additional details).

$$ r = \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right) = \frac{\sum_{i=1}^{n} z_{x_i} z_{y_i}}{n-1} $$

Other formulas exist that are more convenient for doing the calculation by hand, but everyone uses a computer these days, so we will just stop here.
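The two steps described above (standardize, then average the products) can be sketched in a few lines of Python. The five data pairs are made up for illustration, and the function names `z_scores` and `correlation` are hypothetical.

```python
from statistics import mean, stdev


def z_scores(values):
    """Express each value as the number of SDs above or below the mean."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def correlation(xs, ys):
    """Sum the products of paired z-scores and divide by n - 1."""
    zx, zy = z_scores(xs), z_scores(ys)
    products = [a * b for a, b in zip(zx, zy)]
    return sum(products) / (len(xs) - 1)


# Invented data, not taken from the notes.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

r = correlation(x, y)
print(round(r, 3))  # prints 0.775
```

Because the z-scores carry no units, the result is the same whatever units x and y were originally measured in.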
BOX 1: MATHEMATICAL BASIS OF A POSITIVE CORRELATION COEFFICIENT (r)

[Figure: scatter plot divided into 4 quadrants by dashed lines at Z_x = 0 and Z_y = 0; each axis runs from negative Z (-) through 0 to positive Z (+).]

Taking the product of standardized scores (Z_y and Z_x) defines 4 quadrants within the scatter plot. The dashed lines show the locations of the standardized means of y and x (Z_y = 0 and Z_x = 0). If a pair of values for y and x agree (i.e., they are both above the mean, or they are both below the mean), then their product is positive (blue). If they disagree (one is above its mean and the other below), then their product is negative (red). In the above example the value of r is positive because the blue points (positive products) outweigh the red points (negative products).

Caution: You can compute the correlation coefficient for any set of data; however, it will not be possible to interpret it correctly if the following ASSUMPTIONS are not met.

- Individuals must be selected at random from the population. For example, you can't choose individuals based on the value of one or both of the variables!
- The data must consist of independent observations. Examples of non-independent samples include siblings, or cases where sampling one member of the population somehow influences the chance of sampling another member of the population.
- Measurement of the x- and y-values must be independent of each other. For example, scores for mid-term exams and the final grades are not measured independently of each other.
- The x- and y-values must be sampled from populations that follow (approximately) a normal distribution. Outliers can strongly influence the estimate of r.
- All co-variation between variables must be linear. Correlation coefficients can be misleading when the data relationship is not linear!
- The explanatory variable, x, was not experimentally controlled. Experimenters often systematically control the explanatory variable (e.g., dose, time, flow, etc.); in such cases the setting is LINEAR REGRESSION. The confidence intervals on r will not be correct.
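To see why the linearity assumption matters, here is a small sketch with constructed data in which y is completely determined by x (y = x squared), yet the correlation coefficient is essentially zero. The helper `pearson_r` is a hypothetical name and the data are invented for the purpose.

```python
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Correlation coefficient: sum of z-score products divided by n - 1."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    return sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)) / (n - 1)


# Constructed example: a perfect but curved (U-shaped) relationship.
x = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
y = [v ** 2 for v in x]

r_curved = pearson_r(x, y)
print(abs(r_curved) < 1e-12)  # True: no linear association despite a perfect curve
```

The positive products on the right half of the plot cancel the negative products on the left half, so r says nothing about the strong curved pattern. A scatter plot would reveal it immediately.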
Caution: OUTLIERS can impact all statistical calculations, but correlation is especially sensitive. The following figure illustrates how the presence of a single point can have a large impact on the correlation coefficient (r).

Figure 4: The impact of an outlier on the correlation coefficient (r). Panel A shows the plot of the data in a case where a data entry error has led to an outlier (upper right corner). Panel B shows the change in the results that is obtained when the outlier is corrected. (x-axis: wine consumption, liters of alcohol/person/year.)

In practice, these issues require that you always check a scatter plot before you consider summarizing your data with a correlation coefficient. Furthermore, it is always good to look at the scatter plot (if possible) when interpreting someone else's correlation coefficient!

5. The value of r requires careful interpretation

Let's return to the relationship between wine consumption and death rate due to heart disease. First let's switch the x- and y-variables; the result is that the correlation coefficient is unaffected (Figure 5 below).

Figure 5: The x- and y-variables have been switched in the above two panels. It is clear that the association is exactly the same, and only the orientation of the data has changed. Note that the value of the correlation coefficient is unaffected by switching the x- and y-variables.
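The sensitivity to a single outlier can be reproduced with a small sketch. The numbers below are invented (they are not the wine data); the "data entry error" is simulated by typing 9.0 instead of 0.9 for the last y-value, which places that point in the upper right corner as in Figure 4. The helper `pearson_r` is a hypothetical name.

```python
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Correlation coefficient: sum of z-score products divided by n - 1."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    return sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)) / (n - 1)


# Invented data with a strong negative trend.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y_correct = [8.1, 6.9, 6.2, 4.8, 4.1, 3.0, 2.2, 0.9]

# Simulated data entry error: the last value 0.9 was typed as 9.0.
y_with_error = y_correct[:-1] + [9.0]

r_with_error = pearson_r(x, y_with_error)
r_corrected = pearson_r(x, y_correct)

print(f"with outlier: r = {r_with_error:.2f}")
print(f"corrected:    r = {r_corrected:.2f}")
```

One mistyped value out of eight is enough to turn a very strong negative correlation into a weak one, which is the reason for checking the scatter plot first.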
Clearly the designation of the EXPLANATORY and RESPONSE VARIABLES impacts the way we think about the association, but it does not impact the correlation. So, how should we think about the data in the example? The correlation coefficient is -0.843, indicating a negative relationship between the two variables. There are four possible explanations:

1. Alcohol consumption decreases the risk of death due to heart disease.
2. The death rate due to heart disease directly affects average alcohol consumption.
3. Both the level of alcohol consumption and the death rate due to heart disease are under the control of some other variable.
4. The two variables are unrelated, and the observation of a correlation was due to random chance.

[Figure 6 panels: A: causation; B: common response; C: confounding]

Figure 6: The solid arrows show the true cause-and-effect relationship. The dashed arrows show the observed associations. (A) Simple causal relationship between the x and y variables. (B) A common response of the x and y variables to a lurking variable, called w, results in an observed association. (C) The true cause-and-effect relationships can result in confounding associations. (Adapted from Moore and McCabe.)

You cannot decide between the first three possibilities above without having more information (perhaps obtained by further data collection or experimentation). The last possibility might be rejected by using a statistical framework that you will learn later in the course.

The main point is that correlation does not prove causality. An excellent example of this is the well-known positive correlation between the rate of drowning and the rate of ice cream consumption. Why do you think such a correlation might exist?

Practice problems 2 and 3 will be worked in class to illustrate the proper use of scatter plots.
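One possible answer to the ice cream question is the common-response case (panel B of Figure 6). The sketch below simulates it under stated assumptions: a lurking variable w (daily temperature) drives both x (ice cream sales) and y (drownings), and x has no effect on y at all. Every number, coefficient, and name here is invented for illustration; nothing is real data.

```python
import random
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Correlation coefficient: sum of z-score products divided by n - 1."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    return sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)) / (n - 1)


rng = random.Random(42)  # fixed seed so the simulation is repeatable

# Lurking variable w: hypothetical daily temperatures.
w = [rng.uniform(10.0, 35.0) for _ in range(200)]

# Both variables respond to w plus independent noise; x never influences y.
noise_x = [rng.gauss(0.0, 5.0) for _ in w]
noise_y = [rng.gauss(0.0, 1.0) for _ in w]
ice_cream = [2.0 * t + e for t, e in zip(w, noise_x)]
drownings = [0.3 * t + e for t, e in zip(w, noise_y)]

# Observed association between x and y.
r_observed = pearson_r(ice_cream, drownings)

# With the effect of w removed, only the independent noise remains.
r_without_w = pearson_r(noise_x, noise_y)

print(f"observed r = {r_observed:.2f}, r after removing w = {r_without_w:.2f}")
```

The observed correlation is strongly positive even though neither variable causes the other, and it largely disappears once the shared dependence on w is taken out. In a real study w is usually not measured, which is why the first three explanations cannot be separated from the correlation alone.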
Scatterplots STAT22000 Autumn 2013 Lecture 4 Yibi Huang October 7, 2013 21 Scatterplots 22 Correlation (x 1, y 1 ) (x 2, y 2 ) (x 3, y 3 ) (x n, y n ) A scatter plot shows the relationship between two
More informationIntroduction and Single Predictor Regression. Correlation
Introduction and Single Predictor Regression Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning Correlation A correlation
More informationExample: Can an increase in non-exercise activity (e.g. fidgeting) help people gain less weight?
Example: Can an increase in non-exercise activity (e.g. fidgeting) help people gain less weight? 16 subjects overfed for 8 weeks Explanatory: change in energy use from non-exercise activity (calories)
More informationRelationships between variables. Visualizing Bivariate Distributions: Scatter Plots
SFBS Course Notes Part 7: Correlation Bivariate relationships (p. 1) Linear transformations (p. 3) Pearson r : Measuring a relationship (p. 5) Interpretation of correlations (p. 10) Relationships between
More informationLinear Regression. Chapter 3
Chapter 3 Linear Regression Once we ve acquired data with multiple variables, one very important question is how the variables are related. For example, we could ask for the relationship between people
More informationAnnotated Exam of Statistics 6C - Prof. M. Romanazzi
1 Università di Venezia - Corso di Laurea Economics & Management Annotated Exam of Statistics 6C - Prof. M. Romanazzi March 17th, 2015 Full Name Matricola Total (nominal) score: 30/30 (2/30 for each question).
More informationSTA Module 5 Regression and Correlation. Learning Objectives. Learning Objectives (Cont.) Upon completing this module, you should be able to:
STA 2023 Module 5 Regression and Correlation Learning Objectives Upon completing this module, you should be able to: 1. Define and apply the concepts related to linear equations with one independent variable.
More informationEvaluating sensitivity of parameters of interest to measurement invariance using the EPC-interest
Evaluating sensitivity of parameters of interest to measurement invariance using the EPC-interest Department of methodology and statistics, Tilburg University WorkingGroupStructuralEquationModeling26-27.02.2015,
More informationPart A: Salmonella prevalence estimates. (Question N EFSA-Q ) Adopted by The Task Force on 28 March 2007
The EFSA Journal (2007) 98, 1-85 Report of the Task Force on Zoonoses Data Collection on the Analysis of the baseline survey on the prevalence of Salmonella in broiler flocks of Gallus gallus, in the EU,
More informationChapter 7 Summary Scatterplots, Association, and Correlation
Chapter 7 Summary Scatterplots, Association, and Correlation What have we learned? We examine scatterplots for direction, form, strength, and unusual features. Although not every relationship is linear,
More informationSTAT Chapter 11: Regression
STAT 515 -- Chapter 11: Regression Mostly we have studied the behavior of a single random variable. Often, however, we gather data on two random variables. We wish to determine: Is there a relationship
More informationPh.D. course: Regression models. Introduction. 19 April 2012
Ph.D. course: Regression models Introduction PKA & LTS Sect. 1.1, 1.2, 1.4 19 April 2012 www.biostat.ku.dk/~pka/regrmodels12 Per Kragh Andersen 1 Regression models The distribution of one outcome variable
More informationExam Applied Statistical Regression. Good Luck!
Dr. M. Dettling Summer 2011 Exam Applied Statistical Regression Approved: Tables: Note: Any written material, calculator (without communication facility). Attached. All tests have to be done at the 5%-level.
More informationProbabilistic Causal Models
Probabilistic Causal Models A Short Introduction Robin J. Evans www.stat.washington.edu/ rje42 ACMS Seminar, University of Washington 24th February 2011 1/26 Acknowledgements This work is joint with Thomas
More informationAP STATISTICS Name: Period: Review Unit IV Scatterplots & Regressions
AP STATISTICS Name: Period: Review Unit IV Scatterplots & Regressions Know the definitions of the following words: bivariate data, regression analysis, scatter diagram, correlation coefficient, independent
More informationMachine Learning. Module 3-4: Regression and Survival Analysis Day 2, Asst. Prof. Dr. Santitham Prom-on
Machine Learning Module 3-4: Regression and Survival Analysis Day 2, 9.00 16.00 Asst. Prof. Dr. Santitham Prom-on Department of Computer Engineering, Faculty of Engineering King Mongkut s University of
More informationTrends in Human Development Index of European Union
Trends in Human Development Index of European Union Department of Statistics, Hacettepe University, Beytepe, Ankara, Turkey spxl@hacettepe.edu.tr, deryacal@hacettepe.edu.tr Abstract: The Human Development
More informationChapter 8. Linear Regression /71
Chapter 8 Linear Regression 1 /71 Homework p192 1, 2, 3, 5, 7, 13, 15, 21, 27, 28, 29, 32, 35, 37 2 /71 3 /71 Objectives Determine Least Squares Regression Line (LSRL) describing the association of two
More informationPh.D. course: Regression models. Regression models. Explanatory variables. Example 1.1: Body mass index and vitamin D status
Ph.D. course: Regression models Introduction PKA & LTS Sect. 1.1, 1.2, 1.4 25 April 2013 www.biostat.ku.dk/~pka/regrmodels13 Per Kragh Andersen Regression models The distribution of one outcome variable
More informationOnline Appendix for Cultural Biases in Economic Exchange? Luigi Guiso Paola Sapienza Luigi Zingales
Online Appendix for Cultural Biases in Economic Exchange? Luigi Guiso Paola Sapienza Luigi Zingales 1 Table A.1 The Eurobarometer Surveys The Eurobarometer surveys are the products of a unique program
More informationShortfalls of Panel Unit Root Testing. Jack Strauss Saint Louis University. And. Taner Yigit Bilkent University. Abstract
Shortfalls of Panel Unit Root Testing Jack Strauss Saint Louis University And Taner Yigit Bilkent University Abstract This paper shows that (i) magnitude and variation of contemporaneous correlation are
More informationAP Statistics Unit 2 (Chapters 7-10) Warm-Ups: Part 1
AP Statistics Unit 2 (Chapters 7-10) Warm-Ups: Part 1 2. A researcher is interested in determining if one could predict the score on a statistics exam from the amount of time spent studying for the exam.
More informationSCATTERPLOTS. We can talk about the correlation or relationship or association between two variables and mean the same thing.
SCATTERPLOTS When we want to know if there is some sort of relationship between 2 numerical variables, we can use a scatterplot. It gives a visual display of the relationship between the 2 variables. Graphing
More informationBIOSTATISTICS NURS 3324
Simple Linear Regression and Correlation Introduction Previously, our attention has been focused on one variable which we designated by x. Frequently, it is desirable to learn something about the relationship
More informationUncertainty, Error, and Precision in Quantitative Measurements an Introduction 4.4 cm Experimental error
Uncertainty, Error, and Precision in Quantitative Measurements an Introduction Much of the work in any chemistry laboratory involves the measurement of numerical quantities. A quantitative measurement
More information