Data Analysis and Statistical Methods Statistics 651
1 Data Analysis and Statistical Methods, Statistics 651. Lecture 31 (MWF): Review of the test for independence and starting with linear regression. Suhasini Subba Rao
2 Review: Test for independence
In many situations we observe two variables on an individual, for example gender and favourite colour. Often we want to see whether there is dependence between the two observations (does gender influence colour preference?). If there is no dependence, then the proportions within each subpopulation should be the same as the proportions over the entire population. If there is a dependence, this is no longer true. The principle of the test for independence is to calculate the numbers we would expect if the two variables were independent and compare them with what we observe.
3 Example I: Test for independence
Psychologists wanted to investigate whether there was dependence between height and how bossy someone was (that is, do short men have a Napoleon complex?). They gathered the following data.
[Table: rows bossy, not bossy; columns short, medium, large, totals]
Test the hypothesis that there is no dependence between height and bossiness against the alternative that there is.
4 Solution I
Recall that independence means that if you randomly select someone, the probability that they are bossy is the same as if you were to restrict the population to tall people (or short people, or medium-sized people) and randomly select someone from this subpopulation. If this is the case, then bossiness does not depend on size. In reality we cannot calculate these probabilities, because we do not observe the entire population of people, but we do have a sample from the population. In this case we have a sample of 1000 people. First look at the data. We see that the proportion of short men who are bossy is larger than the proportion of medium and large men who are bossy.
5 So from looking at the data, there appears to be a dependence. But this difference could be due to random variation. So we want to test whether the difference is significant or not. Our objective is to test:
$H_0$: There is no dependence between height and bossiness.
$H_A$: There is a dependence between height and bossiness.
We first have to make a table of expected values under the null that there is no dependence between height and bossiness.
6 Motivation
We observe that, of all the men in the sample, 30% = 300/1000 are bossy and 70% = 700/1000 are not bossy. We transfer these percentages to the subgroups of short, medium and large men: in each column, the expected number of bossy men is 30% of that column's total and the expected number of not bossy men is 70% of that column's total. This gives:
7 For each of the short, medium and large columns:
bossy = 0.3 × (column total), not bossy = 0.7 × (column total),
which is the same as:
bossy = 300 × (column total) / 1000, not bossy = 700 × (column total) / 1000.
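8 In summary, what you need to do...
For every cell: expected count = (row total) × (column total) / (overall total), for example 300 × (column total) / 1000 in the bossy row and 700 × (column total) / 1000 in the not bossy row. So basically you just need to multiply each column total by the row total and divide by the overall total to obtain each entry of the table. We can now evaluate the test statistic, by first taking the difference between the two counts in each cell.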
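The rule above can be sketched in a few lines of Python. This is only an illustration, not part of the lecture: the function name `expected_counts` is made up, and the column totals 200, 600 and 200 are assumed purely for the example (only the row totals 300 and 700 and the overall total 1000 come from the slides).

```python
def expected_counts(row_totals, col_totals):
    """Expected cell counts under independence:
    row total * column total / overall total."""
    total = sum(row_totals)
    if total != sum(col_totals):
        raise ValueError("row and column totals must add up to the same number")
    return [[r * c / total for c in col_totals] for r in row_totals]


# Row totals (bossy, not bossy) are from the slides.
row_totals = [300, 700]
# Column totals (short, medium, large) are ASSUMED for illustration only.
col_totals = [200, 600, 200]

expected = expected_counts(row_totals, col_totals)
for label, row in zip(["bossy", "not bossy"], expected):
    print(label, row)
```

Each row of the result is the corresponding row percentage (30% or 70%) applied to every column total, which is exactly the "transfer the percentages" idea from the motivation.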
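9 [Table: rows bossy, not bossy; columns short, medium, large; each cell holds the squared difference between the two counts divided by the expected count, for example $(60-90)^2/60$ in the short/bossy cell and a term in $(60-55)^2$ in the large/bossy cell]
The test statistic is the sum of the six cell entries:
$T = \frac{(60-90)^2}{60} + \cdots + (60-55)^2/\cdots + \cdots = 26$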
10 Now because there are $3 \times 2$ cells (it is a 3 by 2 table), under the null $T$ has a $\chi^2$ distribution with $(3-1) \times (2-1) = 2$ degrees of freedom. Look up Table 7: $\chi^2_2(0.05) = 5.99$. The p-value is $P(\chi^2_2 > 26) \approx 0.000002$. Since $T = 26 > 5.99$, there is enough evidence to reject the null. Equivalently, the p-value is very small. That is, based on the data there appears to be a dependence between size and bossiness.
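For readers without the table to hand, the lookup can be checked numerically. The sketch below relies on a known special case: a $\chi^2$ distribution with exactly 2 degrees of freedom has tail probability $P(\chi^2_2 > t) = e^{-t/2}$. The function names are made up for this illustration, and the shortcut does not apply to any other number of degrees of freedom.

```python
import math


def chi2_tail_2df(t):
    """P(chi-squared with 2 degrees of freedom > t). Valid ONLY for 2 df."""
    if t < 0:
        return 1.0
    return math.exp(-t / 2.0)


def chi2_critical_2df(alpha):
    """Value c such that P(chi-squared_2 > c) = alpha. Valid ONLY for 2 df."""
    return -2.0 * math.log(alpha)


T = 26.0                               # test statistic from the slide
critical = chi2_critical_2df(0.05)     # about 5.99, as in the table
p_value = chi2_tail_2df(T)             # tail probability beyond T

print(round(critical, 2), p_value)
print("reject H0" if T > critical else "do not reject H0")
```

Comparing $T$ with the critical value and comparing the p-value with 0.05 always lead to the same decision; they are two ways of reading the same tail area.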
11 Example 2
A group of space explorers have discovered a planet which is inhabited by alien creatures. They notice that there are three main groups of aliens: the Pink aliens, the Blue aliens and the Green aliens. One of the explorers happens to be a statistician. She notices that the size of the aliens tends to differ amongst the population. So she sets out to determine whether there is any dependence between the size of an alien and its colour. She randomly selects 160 aliens and notes their colour and size (grouped as either big or little). This is the data she collected:
            Pink   Blue   Green   Subtotal
Big           30     20      50        100
Little        10     20      30         60
Subtotal      40     40      80        160
State the null and alternative. What do you think were the conclusions of the statistician's research (use $\alpha = 0.05$)?
12 Solution 2
$H_0$: There is no dependence between size and colour.
$H_A$: There is a dependence between size and colour.
We do a chi-squared test for independence, so we need to make a table of what we expect to observe if there is no dependence between size and colour.
            Pink                   Blue                   Green                  Subtotal
Big         100 × 40 / 160 = 25    100 × 40 / 160 = 25    100 × 80 / 160 = 50    100
Little      60 × 40 / 160 = 15     60 × 40 / 160 = 15     60 × 80 / 160 = 30     60
Subtotal    40                     40                     80                     160
13 We now construct the $T$ statistic:
$T = \frac{(30-25)^2}{25} + \frac{(20-25)^2}{25} + \frac{(50-50)^2}{50} + \frac{(10-15)^2}{15} + \frac{(20-15)^2}{15} + \frac{(30-30)^2}{30} = 5.33$
Under the null, $T$ has a $\chi^2$ distribution with $(3-1) \times (2-1) = 2$ degrees of freedom. Looking into the tables we see that $\chi^2_2(0.05) = 5.99$. The p-value for 5.33 is about 0.07 (which is greater than 0.05). Since $5.33 < 5.99$ there is not enough evidence to reject the null. Therefore we cannot conclude from the data that there is clear evidence of dependence between colour and size.
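The whole calculation for this example can be reproduced with a short Python sketch. The observed counts are the ones that appear in the statistic above; the variable names are made up for the illustration, and the p-value line uses the closed form $e^{-T/2}$, which is valid only because there are exactly 2 degrees of freedom here.

```python
import math

# Observed counts taken from the terms of the T statistic.
observed = [
    [30, 20, 50],  # Big:    Pink, Blue, Green
    [10, 20, 30],  # Little: Pink, Blue, Green
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected count in each cell: row total * column total / overall total.
expected = [[r * c / total for c in col_totals] for r in row_totals]

# Chi-squared statistic: sum over cells of (observed - expected)^2 / expected.
T = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = math.exp(-T / 2.0)  # closed form that holds only when df == 2

print(expected)
print(round(T, 2), df, round(p_value, 2))
```

The printed statistic and p-value round to 5.33 and 0.07, in line with the values quoted on the slide.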
14 Linear regression
Suppose I randomly pick an adult and ask you to guess their height. You would probably give me an interval of, say, 4.5 to 6.5 feet (this can be considered as a CI). Suppose I gave you the additional information that they have size 5 feet; would you reassess your previous estimate? You would probably change your estimate and give a narrower interval. The size 5 gives us additional information about that person. It allows us to narrow down our estimate and make a more precise estimate of their height.
15 Put into statistical terms, without knowledge of their shoe size the standard deviation is quite large. Recall that the standard deviation is a measure of error. Once we know their shoe size, the standard deviation (amount of error) decreases. Often we believe that one variable may have an influence on another variable. For example, the variable X (the shoe size of a person) may influence the variable Y (the height of that person). We call X the independent variable. We call Y the dependent variable. To see if X has an influence on Y we often draw a scatter plot with X on the x-axis and Y on the y-axis. We look for a relationship between the two.
16 Sometimes it is not clear what influences what (for example, does shoe size have an influence on height, or height on shoe size?). In that case, you let the dependent variable Y be the variable of interest.
17 Smoking and lung cancer
The independent variable is the number of cigarettes smoked per capita in a state and the dependent variable is the incidence of lung cancer per 100K people.
18 Smoking and leukemia
The independent variable is the number of cigarettes smoked per capita in a state and the dependent variable is the incidence of leukemia per 100K people.
19 None of the plots follows a straight line exactly. To check if x has an effect on Y in a linear way we could fit a line through the points. We can use the line to predict the average value of Y given x, for example the average height of a person with size 5 feet. Which line is the best line to use? How can we check whether this line has any meaning at all (after all, we can put a line through any scatterplot)?
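20 Recall the equation of a line, $y = mx + c$, where the slope $m = \Delta y / \Delta x$ is the change in $y$ per unit change in $x$ and $c$ is the intercept (the value of $y$ at $x = 0$).
[Figure: a straight line on X and Y axes with the slope $m$ and intercept $c$ marked]
In linear regression we fit this line through the data.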
21 Least squares - the line of best fit
We fit the line $\beta_0 + \beta_1 x$ through the data; the way we choose $\beta_0$ and $\beta_1$ is the method of least squares. We have the observations $\{(y_1, x_1), \ldots, (y_n, x_n)\}$, and believe that $y_i$ depends linearly on $x_i$. We use $x_i$ to predict $y_i$. The predictor is $\hat{y}_i$, where $\hat{y}_i = \hat\beta_0 + \hat\beta_1 x_i$. We want $\hat{y}_i$ to be as close as possible to $y_i$, hence we choose the $\hat\beta_0$ and $\hat\beta_1$ that minimise the quantity
$\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_i)^2.$
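To make the criterion concrete, here is a small Python sketch that evaluates the sum of squares for a few candidate lines. Everything in it is assumed for illustration: the data are made up, the function name `sum_of_squares` is hypothetical, and the candidate lines are arbitrary apart from the second one, which was chosen to be the least squares solution for this made-up data.

```python
def sum_of_squares(beta0, beta1, xs, ys):
    """Sum over i of (y_i - beta0 - beta1 * x_i)^2 for a candidate line."""
    return sum((y - (beta0 + beta1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Candidate (intercept, slope) pairs.
candidates = [(2.0, 0.6), (2.2, 0.6), (2.2, 0.7), (3.0, 0.5)]

scores = {c: sum_of_squares(c[0], c[1], xs, ys) for c in candidates}
for (b0, b1), score in scores.items():
    print(f"intercept={b0}, slope={b1}: sum of squares = {score:.2f}")

best = min(scores, key=scores.get)
print("smallest sum of squares:", best)
```

Trying lines by hand like this is only for intuition; the method of least squares finds the minimising intercept and slope directly.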
22 A graphical representation
[Figure: five points $(x_1, y_1), \ldots, (x_5, y_5)$ scattered around a fitted line, with the vertical distance $y_i - \hat{y}_i$ between each point and the line marked]
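23 Quantities required
We need the average of the x's: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$. And the average of the y's: $\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$. We need to calculate:
$S_{xy} = (y_1 - \bar{y})(x_1 - \bar{x}) + \cdots + (y_n - \bar{y})(x_n - \bar{x}) = \sum_{i=1}^{n} (y_i - \bar{y})(x_i - \bar{x})$
$S_{xx} = (x_1 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2$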
24 The equations for the parameter estimators
The least squares estimator minimises the sum of the squares of all these vertical distances. Basically it gives the line of best fit through the observations. The slope $\hat\beta_1$ and intercept $\hat\beta_0$ can be evaluated using the formulas:
$\hat\beta_1 = \frac{S_{xy}}{S_{xx}}$, where $S_{xy} = \sum_{i=1}^{n} (y_i - \bar{y})(x_i - \bar{x})$ and $S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2$,
with $\bar{x}$ and $\bar{y}$ the sample means of X and Y: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$ and $\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$. And
$\hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}.$
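These formulas translate directly into code. The following Python sketch is one possible way to write them down; the function name `fit_line` and the data are made up for illustration and are not from the lecture.

```python
def fit_line(xs, ys):
    """Least squares intercept and slope via beta1 = S_xy / S_xx
    and beta0 = y_bar - beta1 * x_bar."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((y - y_bar) * (x - x_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all x values are equal, so the slope is undefined")
    beta1 = s_xy / s_xx
    beta0 = y_bar - beta1 * x_bar
    return beta0, beta1


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

beta0_hat, beta1_hat = fit_line(xs, ys)
print(f"fitted line: y = {beta0_hat:.2f} + {beta1_hat:.2f} x")
```

Note that the fitted line always passes through the point of sample means, since $\hat\beta_0 + \hat\beta_1 \bar{x} = \bar{y}$ by construction.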
25 Therefore, given $\hat\beta_0$ and $\hat\beta_1$ and any regressor (explanatory variable) $x_i$, we can predict $y_i$ using the predictor $\hat{y}_i = \hat\beta_0 + \hat\beta_1 x_i$.
26 What $S_{xy}$ and $S_{xx}$ mean
The $S_{xy}$ and $S_{xx}$ just fall out when trying to minimise the least squares criterion. However, they do have a useful interpretation. We start by centring the data, i.e. we work with $Y_i - \bar{Y}$ and $X_i - \bar{X}$; this does not change the slope. Let us suppose that $X_i$ exerts a positive influence on $Y_i$. This means that large negative values of $X_i - \bar{X}$ are likely to result in large negative values of $Y_i - \bar{Y}$, and large positive values of $X_i - \bar{X}$ are likely to result in large positive values of $Y_i - \bar{Y}$. What this means is that the product $(X_i - \bar{X})(Y_i - \bar{Y})$ is likely to be positive, and thus $\sum_i (X_i - \bar{X})(Y_i - \bar{Y})$ is highly likely to be positive (highly likely because, remember, the data are random, so we can never be sure that an effect is seen in the data). Using a similar argument, if $X_i$ exerts a negative influence on $Y_i$, then it is highly likely that $\sum_i (X_i - \bar{X})(Y_i - \bar{Y})$ will be negative.
27 On the other hand, if $X_i$ does not exert any linear influence on $Y_i$, then the product $(X_i - \bar{X})(Y_i - \bar{Y})$ can be either negative or positive, so the negative and positive terms in the sum $\sum_i (X_i - \bar{X})(Y_i - \bar{Y})$ tend to cancel and the sum is likely to be close to zero. $S_{xx}$ is simply the sample variance of the $x$'s before dividing by $n - 1$, and measures the amount of variation in the independent variable. The value of the coefficient $\hat\beta_1$ will vary according to the units you use. For example, suppose you want to measure the effect temperature has on the volume of ice on a lake. If you measure the temperature in Celsius, the slope will be different from the slope you get if you measure the temperature in Fahrenheit. Thus the slope (like the mean) is sensitive to the units used.
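The effect of units on the slope can be seen in a small Python sketch. The temperatures and ice volumes below are invented purely for illustration, and `slope` is a hypothetical helper that computes $S_{xy}/S_{xx}$. Because Fahrenheit = 1.8 × Celsius + 32, a change of one degree Fahrenheit is only 5/9 of a degree Celsius, so the Fahrenheit slope should be 5/9 of the Celsius slope.

```python
def slope(xs, ys):
    """Least squares slope S_xy / S_xx."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((y - y_bar) * (x - x_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    return s_xy / s_xx


# Hypothetical measurements, for illustration only.
temps_celsius = [-10, -5, 0, 5, 10]
ice_volume = [50, 44, 41, 33, 30]

# The same temperatures expressed in Fahrenheit.
temps_fahrenheit = [1.8 * c + 32 for c in temps_celsius]

slope_c = slope(temps_celsius, ice_volume)
slope_f = slope(temps_fahrenheit, ice_volume)

print(f"slope per degree Celsius:    {slope_c:.4f}")
print(f"slope per degree Fahrenheit: {slope_f:.4f}")
print(f"ratio: {slope_f / slope_c:.4f}")  # close to 5/9
```

The data and the fitted relationship are the same in both cases; only the number attached to the slope changes, which is why the size of the slope on its own says little about the strength of a relationship.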
28 Toy Example: height of a person and their shoe size
This shows the mechanics of how the slope and intercept are calculated. You do not have to learn the precise details. However, it does give you some idea of what exactly $S_{xx}$ and $S_{xy}$ are. Let $x_i$ be the shoe size and $y_i$ the height. We observe the height and shoe size of 5 people:
[Table: height $y_i$ and feet size $x_i$ for each person]
It is natural to believe there is a possible linear dependence between shoe size and height. Summary statistics: $\bar{y} = 14$ and $\bar{x} = 4$.
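29 [Table: for each person, height $y_i$, $y_i - \bar{y}$, feet size $x_i$, $x_i - \bar{x}$, $(x_i - \bar{x})^2$ and $(y_i - \bar{y})(x_i - \bar{x})$, with $\bar{y} = 14$ and $\bar{x} = 4$. For example, the person with height $y_i = 6$ has $y_i - \bar{y} = -8$ and $x_i - \bar{x} = -3$, giving the product $(-8)(-3) = 24$.]
$S_{xy} = \sum_{i} (y_i - \bar{y})(x_i - \bar{x}) = 60$ and $S_{xx} = \sum_{i} (x_i - \bar{x})^2 = 20$.
Then we have $\hat\beta_1 = \frac{60}{20} = 3$ and $\hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x} = 14 - 3 \times 4 = 2$.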
30 The line of best fit is $\hat{Y} = 2 + 3x$.
[Figure: scatter plot of $y$ against $x$; the points are the observations and the line is the line of best fit]
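The arithmetic of the toy example can be checked from the summary statistics alone ($S_{xy} = 60$, $S_{xx} = 20$, $\bar{y} = 14$, $\bar{x} = 4$). The Python sketch below does this; the variable names and the helper `predict` are made up for the illustration, and the shoe size 5 used in the prediction is just an example input.

```python
# Summary statistics quoted in the toy example.
s_xy = 60.0
s_xx = 20.0
y_bar = 14.0
x_bar = 4.0

beta1_hat = s_xy / s_xx                 # slope
beta0_hat = y_bar - beta1_hat * x_bar   # intercept


def predict(x):
    """Predicted height for shoe size x from the fitted line."""
    return beta0_hat + beta1_hat * x


print(f"line of best fit: Y = {beta0_hat} + {beta1_hat} x")
print("prediction at x = 5:", predict(5))
print("prediction at x = x_bar:", predict(x_bar))  # equals y_bar
```

The last line illustrates that the fitted line passes through the point of means $(\bar{x}, \bar{y}) = (4, 14)$.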
31 Interpreting the slope
What does the slope in $\hat{Y} = 2 + 3x$ (where $x$ is the shoe size and $\hat{Y}$ is the predicted height) tell us about the relationship between shoe size and height? At face value, the fact that 3 is large and positive may make you think that there is a positive relationship (since the slope is not zero, and a zero slope indicates no relationship). DO NOT be fooled by this! We have estimated the slope from a sample of only 5 people; a slope of 3 could easily arise by chance even when there is no relationship at all. Recall that three of the terms $(X_i - \bar{X})(Y_i - \bar{Y})$ ... The slope $\hat\beta_1 = 3$ is an estimate of the true slope (which we define later).
32 Therefore our objectives are:
(i) Is the slope (estimator) significant? That is, is there really evidence of a relationship? Here we need to use statistical techniques since, as usual, we do not observe the entire population (in this example it is just 5 people!).
(ii) If there is a relationship, how strong is it? Again, the size of the slope alone does not mean anything; the strength of a relationship is determined by how well the line fits the points.
Basic medical statistics for clinical and experimental research Analysing data: regression and correlation S6 and S7 K. Jozwiak k.jozwiak@nki.nl 2 / 49 Correlation So far we have looked at the association
More informationREVIEW 8/2/2017 陈芳华东师大英语系
REVIEW Hypothesis testing starts with a null hypothesis and a null distribution. We compare what we have to the null distribution, if the result is too extreme to belong to the null distribution (p
More informationHOLLOMAN S AP STATISTICS BVD CHAPTER 08, PAGE 1 OF 11. Figure 1 - Variation in the Response Variable
Chapter 08: Linear Regression There are lots of ways to model the relationships between variables. It is important that you not think that what we do is the way. There are many paths to the summit We are
More informationFinal Exam - Solutions
Ecn 102 - Analysis of Economic Data University of California - Davis March 19, 2010 Instructor: John Parman Final Exam - Solutions You have until 5:30pm to complete this exam. Please remember to put your
More informationExtra Exam Empirical Methods VU University Amsterdam, Faculty of Exact Sciences , July 2, 2015
Extra Exam Empirical Methods VU University Amsterdam, Faculty of Exact Sciences 12.00 14.45, July 2, 2015 Also hand in this exam and your scrap paper. Always motivate your answers. Write your answers in
More informationThe scatterplot is the basic tool for graphically displaying bivariate quantitative data.
Bivariate Data: Graphical Display The scatterplot is the basic tool for graphically displaying bivariate quantitative data. Example: Some investors think that the performance of the stock market in January