11. Regression and Least Squares

Size: px
Start display at page:

Download "11. Regression and Least Squares"

Transcription

1 11. Regression and Least Squares Prof. Tesler Math 186 Winter 2016 Prof. Tesler Ch. 11: Linear Regression Math 186 / Winter / 23

2 Regression Given n points ( 1, 1 ), ( 2, 2 ),..., we want to determine a function = f () that is close to them. Scatter plot of data (,) Prof. Tesler Ch. 11: Linear Regression Math 186 / Winter / 23

3 Regression Based on knowledge of the underling problem or on plotting the data, ou have an idea of the general form of the function, such as: Line = β 0 β 1 Polnomial = β + β 1 β β " = " Eponential = Ae (B) Deca = Ae B Logistic Curve = A (1 = + BA/(1 C ) + B/C ) Goal: Compute the parameters (β 0, β 1,... or A, B, C,...) that give a best fit to the data. Prof. Tesler Ch. 11: Linear Regression Math 186 / Winter / 23

4 Regression The methods we consider require the parameters to occur linearl. It is fine if (, ) do not occur linearl. E.g., plugging (, ) = (2, 3) into = β 0 + β 1 + β β 3 3 gives 3 = β 0 + 2β 1 + 4β 2 + 8β 3. For eponential deca, = Ae B, parameter B does not occur linearl. Transform the equation to: ln = ln(a) B = A B When we plug in (, ) values, the parameters A, B occur linearl. Transform the logistic curve = A/(1 + B/C ) to: ( ) A ln 1 = ln(b) ln(c) = B + C where A is determined from A = lim (). Now B, C occur linearl. Prof. Tesler Ch. 11: Linear Regression Math 186 / Winter / 23

5 Least squares fit to a line = " Given n points ( 1, 1 ), ( 2, 2 ),..., we will fit them to a line ŷ = β 0 + β 1 : Independent variable:. We assume the s are known eactl or have negligible measurement errors. Dependent variable:. We assume the s depend on the s but fluctuate due to a random process. We do not have = f (), but instead, = f () + error. Prof. Tesler Ch. 11: Linear Regression Math 186 / Winter / 23

6 Least squares fit to a line = " Given n points ( 1, 1 ), ( 2, 2 ),..., we will fit them to a line ŷ = β 0 + β 1 : Predicted value (on the line): ŷ i = β 0 + β 1 i Actual data ( ): i = β 0 + β 1 i + ɛ i Residual (actual minus prediction): ɛ i = i ŷ i = i (β 0 + β 1 i ) Prof. Tesler Ch. 11: Linear Regression Math 186 / Winter / 23

7 Least squares fit to a line = " We will use the least squares method: pick parameters β 0, β 1 that minimize the sum of squares of the residuals. n L = ( i (β 0 + β 1 i )) 2 Prof. Tesler Ch. 11: Linear Regression Math 186 / Winter / 23

8 Least squares fit to a line L = n ( i (β 0 + β 1 i )) 2 To find β 0, β 1 that minimize this, solve L = ( L β 0, L β 1 ) = (0, 0): L β 0 = 2 L β 1 = 2 n ( i (β 0 + β 1 i )) = 0 nβ 0 + n ( i (β 0 + β 1 i )) i = 0 ( n ) i β 0 + ( n ) i β 1 = ( n i 2 ) β 1 = n i n i i which has solution (all sums are i = 1 to n) β 1 = n ( i i i ) ( i i) ( i i) n ( i i 2 ) ( i i) 2 = i ( i )( i ȳ) i ( i ) 2 β 0 = ȳ β 1 Not shown: use 2nd derivatives to confirm it s a minimum rather than a maimum or saddle point. Prof. Tesler Ch. 11: Linear Regression Math 186 / Winter / 23

9 Best fitting line = β 0 + β 1 + ε = α 0 + α 1 + ε = slope = = slope = The best fit for = β 0 + β 1 + error or = α 0 + α 1 + error give different lines = β 0 + β 1 + error assumes the s are known eactl with no errors, while the s have errors. = α 0 + α 1 + error is the other wa around. Prof. Tesler Ch. 11: Linear Regression Math 186 / Winter / 23

10 Total Least Squares / Principal Components Analsis = " = # 0 + # 1 + " = slope = = slope = First principal component of centered data All three slope = = = In man eperiments, both and have measurement errors. Use Total Least Squares or Principal Components Analsis, in which the residuals are measured perpendicular to the line. Details require advanced linear algebra, beond Math 20F. Prof. Tesler Ch. 11: Linear Regression Math 186 / Winter / 23

11 Confidence intervals = " true line sample data best fit line 95% prediction interval r 2 = The best fit line is different than the true line. We found point estimates of β 0 and β 1. Assuming errors are independent of and normall distributed gives Confidence intervals for β 0, β 1. A prediction interval to etrapolate = f () at other s. Warning: it ma diverge from the true line when we go out too far. Not shown: one can also do hpothesis tests on the values of β 0 and β 1, and on whether two samples give the same line. Prof. Tesler Ch. 11: Linear Regression Math 186 / Winter / 23

12 Confidence intervals The method of least squares gave point estimates of β 0 and β 1 : ˆβ 1 = n i i i ( i i) ( i i) n ( i i 2 ) ( i i i) 2 = ( i )( i ȳ) i ( ˆβ i ) 2 0 = ȳ ˆβ 1 The sample variance of the residuals is s 2 = 1 n ( i ( ˆβ 0 + ˆβ 1 i )) 2 (with df = n 2). n 2 100(1 α)% confidence intervals: ( s i β 0 : ˆβ 0 t i 2 α/2,n 2, ˆβ 0 + t n i ( α/2,n 2 i ) β 1 : ( ˆβ 1 t α/2,n 2 s, ˆβ 1 + t i ( α/2,n 2 i ) s i i 2 n i ( i ) s i ( i ) ) ) at new : (ŷ w, ŷ + w) with ŷ = β 0 + β 1 and w = t α/2,n 2 s n + ( )2 i ( i ) 2 Prof. Tesler Ch. 11: Linear Regression Math 186 / Winter / 23

13 Covariance Let X and Y be random variables, possibl dependent. Let µ X = E(X), µ Y = E(Y) ( ((X Var(X + Y) = E((X + Y µ X µ Y ) 2 ) ( )) ) 2 ) = E µx + Y µy ( (X ) ) ( 2 (Y ) ) 2 = E µx + E µy + 2E ( (X µ X )(Y µ Y ) ) = Var(X) + Var(Y) + 2 Cov(X, Y) where the covariance of X and Y is defined as Cov(X, Y) = E ( (X µ X )(Y µ Y ) ) = E(XY) E(X)E(Y) Independent variables have E(XY) = E(X)E(Y), so Cov(X, Y) = 0. But Cov(X, Y) = 0 does not guarantee X and Y are independent. Prof. Tesler Ch. 11: Linear Regression Math 186 / Winter / 23

14 Covariance and independence Independent variables have E(XY) = E(X)E(Y), so Cov(X, Y) = 0. But Cov(X, Y) = 0 does not guarantee X and Y are independent. Consider the standard normal distribution, Z. Z and Z 2 are dependent. Cov(Z, Z 2 ) = E(Z 3 ) E(Z)E(Z 2 ). The standard normal distribution has mean 0: E(Z) = 0. E(Z 3 ) = 0 since Z 3 is an odd function and the pdf of Z is smmetric around Z = 0. So Cov(Z, Z 2 ) = 0. Prof. Tesler Ch. 11: Linear Regression Math 186 / Winter / 23

15 Covariance properties Covariance properties Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y) Cov(X, X) = Var(X) Cov(X, Y) = Cov(Y, X) Cov(aX + b, cy + d) = ac Cov(X, Y) Sign of covariance Cov(X, Y) = E((X µ X )(Y µ Y )) When Cov(X, Y) is positive: There is a tendenc to have X > µ X when Y > µ Y and vice-versa, and X < µ X when Y < µ Y and vice-versa. When Cov(X, Y) is negative: There is a tendenc to have X > µ X when Y < µ Y and vice-versa, and X < µ X when Y > µ Y and vice-versa. When Cov(X, Y) = 0: a) X and Y might be independent, but it s not guaranteed. b) Var(X + Y) = Var(X) + Var(Y) Prof. Tesler Ch. 11: Linear Regression Math 186 / Winter / 23

16 Sample variance and covariance Variance of a random variable: σ 2 = Var(X) = E((X µ X ) 2 ) = E(X 2 ) (E(X)) 2 Sample variance from data 1,..., n to estimate σ 2 : ( s 2 = var() = 1 n n ) ( i ) 2 = 1 2 i n 1 n 1 n n 1 2 Covariance between random variables X, Y: σ XY = Cov(X, Y) = E((X µ X )(Y µ Y )) = E(XY) E(X)E(Y) Sample covariance from data ( 1, 1 ),..., ( n, n ) to estimate σ XY : ( s XY = cov(, ) = 1 n n ) ( i )( i ȳ) = 1 i i n n 1 n 1 n 1 ȳ Prof. Tesler Ch. 11: Linear Regression Math 186 / Winter / 23

17 Correlation coefficient Let X and Y be two random variables. Their correlation coefficient is ρ(x, Y) = Cov(X, Y) Var(X) Var(Y) This is a normalized version of covariance, and is between ±1. For a line Y = ax + b with a, b constants (a 0), ρ(x, Y) = a Var(X) Var(X) Var(aX) = aσ2 σ a σ = a a = ±1 (sign of a) ρ(x, Y) = ±1 iff Y = ax + b with a, b constants (a 0). Closer to ±1: more linear. Closer to 0: less linear. If X and Y are independent then ρ(x, Y)=0. The converse is not valid: dependent variables can have ρ(x, Y)=0. Prof. Tesler Ch. 11: Linear Regression Math 186 / Winter / 23

18 Correlation coefficient ρ(x,y) is estimated from data b the sample correlation coefficient (a.k.a. Pearson product-moment correlation coefficient): cov(, ) i r(, ) = = ( i )( i ȳ) var() var() i ( i ) 2 i ( i ȳ) 2 = n i i i ( i i)( i i) n i i 2 ( i i) 2 n i i 2 ( i i) 2 People often report r 2 (between 0 and 1) instead of r. Prof. Tesler Ch. 11: Linear Regression Math 186 / Winter / 23

19 Sample correlation coefficient r Middle row: Perfect linear relation Y = ax + b gives r = 1 for lines with positive slope (a > 0) r = 1 for lines with negative slope (a < 0) r undefined for horizontal line (Y = b) Other rows: coming up Prof. Tesler Ch. 11: Linear Regression Math 186 / Winter / 23

20 Interpretation of r 2 Let ŷ i = ˆβ 1 i + ˆβ 0 be the predicted -value for i based on the least squares line. Write the deviation of i from ȳ as i ȳ = ( i ŷ i ) + (ŷ i ȳ) Total Uneplained Eplained deviation b line b line It can be shown that the sum of squared deviations for all s is i ( i ȳ) 2 = i ( i ŷ i ) 2 + i (ŷ i ȳ) i ( i ŷ i )(ŷ i ȳ) Total Uneplained Eplained = 0 b a miracle variation variation variation (Tedious algebra not shown) and that i r 2 = (ŷ i ȳ) 2 Eplained variation i ( = i ȳ) 2 Total variation r = 1: 100% of the variation is eplained b the line and 0% is due to other factors, and the slope is positive. r =.8: 64% of the variation is eplained b the line and 36% is due to other factors, and the slope is negative. Prof. Tesler Ch. 11: Linear Regression Math 186 / Winter / 23

21 Sample correlation coefficient r Top row: Linear relations with varing r. Bottom: r = 0, et X and Y are dependent in all of these (ecept possibl the last). Each plot shows a nonlinear relationship eists. Prof. Tesler Ch. 11: Linear Regression Math 186 / Winter / 23

22 Correlation does not impl causation High correlation between X and Y doesn t mean X causes Y or vice-versa. It could be a coincidence. Or the could both be caused b a third variable. Website tlervigen.com plots man data sets (various quantities b ear) against each other to find spurious correlations: Prof. Tesler Ch. 11: Linear Regression Math 186 / Winter / 23

23 More about interpretation of correlation Low r 2 does NOT guarantee independence; it just means that a line = β 0 + β 1 is not a good fit to the data. r is an estimate of ρ. The estimate improves with higher n. With additional assumptions on the underling joint distribution of X, Y, we can use r to test H 0 : ρ = 0 vs. H 1 : ρ 0 (or other values). Best fits and correlation generalize to other models, including Polnomial regression Multiple linear regression Weighted versions = β 0 + β 1 + β β p p = β 0 + β 1 t + β 2 u + + β p w t, u,..., w: multiple independent variables : dependent variable When the variance is different at each value of the independent variables Prof. Tesler Ch. 11: Linear Regression Math 186 / Winter / 23

Introduction to Computational Finance and Financial Econometrics Probability Review - Part 2

Introduction to Computational Finance and Financial Econometrics Probability Review - Part 2 You can t see this text! Introduction to Computational Finance and Financial Econometrics Probability Review - Part 2 Eric Zivot Spring 2015 Eric Zivot (Copyright 2015) Probability Review - Part 2 1 /

More information

f X, Y (x, y)dx (x), where f(x,y) is the joint pdf of X and Y. (x) dx

f X, Y (x, y)dx (x), where f(x,y) is the joint pdf of X and Y. (x) dx INDEPENDENCE, COVARIANCE AND CORRELATION Independence: Intuitive idea of "Y is independent of X": The distribution of Y doesn't depend on the value of X. In terms of the conditional pdf's: "f(y x doesn't

More information

Covariance and Correlation Class 7, Jeremy Orloff and Jonathan Bloom

Covariance and Correlation Class 7, Jeremy Orloff and Jonathan Bloom 1 Learning Goals Covariance and Correlation Class 7, 18.05 Jerem Orloff and Jonathan Bloom 1. Understand the meaning of covariance and correlation. 2. Be able to compute the covariance and correlation

More information

Problem Solving. Correlation and Covariance. Yi Lu. Problem Solving. Yi Lu ECE 313 2/51

Problem Solving. Correlation and Covariance. Yi Lu. Problem Solving. Yi Lu ECE 313 2/51 Yi Lu Correlation and Covariance Yi Lu ECE 313 2/51 Definition Let X and Y be random variables with finite second moments. the correlation: E[XY ] Yi Lu ECE 313 3/51 Definition Let X and Y be random variables

More information

Jointly Distributed Random Variables

Jointly Distributed Random Variables Jointly Distributed Random Variables CE 311S What if there is more than one random variable we are interested in? How should you invest the extra money from your summer internship? To simplify matters,

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression In simple linear regression we are concerned about the relationship between two variables, X and Y. There are two components to such a relationship. 1. The strength of the relationship.

More information

Variance. Standard deviation VAR = = value. Unbiased SD = SD = 10/23/2011. Functional Connectivity Correlation and Regression.

Variance. Standard deviation VAR = = value. Unbiased SD = SD = 10/23/2011. Functional Connectivity Correlation and Regression. 10/3/011 Functional Connectivity Correlation and Regression Variance VAR = Standard deviation Standard deviation SD = Unbiased SD = 1 10/3/011 Standard error Confidence interval SE = CI = = t value for

More information

The number of occurrences of a word (5.7) and motif (5.9) in a DNA sequence, allowing overlaps Covariance (2.4) and indicators (2.

The number of occurrences of a word (5.7) and motif (5.9) in a DNA sequence, allowing overlaps Covariance (2.4) and indicators (2. The number of occurrences of a word (5.7) and motif (5.9) in a DNA sequence, allowing overlaps Covariance (2.4) and indicators (2.9) Prof. Tesler Math 283 Fall 2016 Prof. Tesler # occurrences of a word

More information

Covariance and Correlation

Covariance and Correlation Covariance and Correlation ST 370 The probability distribution of a random variable gives complete information about its behavior, but its mean and variance are useful summaries. Similarly, the joint probability

More information

Principal Components Analysis (PCA) and Singular Value Decomposition (SVD) with applications to Microarrays

Principal Components Analysis (PCA) and Singular Value Decomposition (SVD) with applications to Microarrays Principal Components Analysis (PCA) and Singular Value Decomposition (SVD) with applications to Microarrays Prof. Tesler Math 283 Fall 2015 Prof. Tesler Principal Components Analysis Math 283 / Fall 2015

More information

Chapter 4 continued. Chapter 4 sections

Chapter 4 continued. Chapter 4 sections Chapter 4 sections Chapter 4 continued 4.1 Expectation 4.2 Properties of Expectations 4.3 Variance 4.4 Moments 4.5 The Mean and the Median 4.6 Covariance and Correlation 4.7 Conditional Expectation SKIP:

More information

STOR Lecture 16. Properties of Expectation - I

STOR Lecture 16. Properties of Expectation - I STOR 435.001 Lecture 16 Properties of Expectation - I Jan Hannig UNC Chapel Hill 1 / 22 Motivation Recall we found joint distributions to be pretty complicated objects. Need various tools from combinatorics

More information

Variance reduction. Michel Bierlaire. Transport and Mobility Laboratory. Variance reduction p. 1/18

Variance reduction. Michel Bierlaire. Transport and Mobility Laboratory. Variance reduction p. 1/18 Variance reduction p. 1/18 Variance reduction Michel Bierlaire michel.bierlaire@epfl.ch Transport and Mobility Laboratory Variance reduction p. 2/18 Example Use simulation to compute I = 1 0 e x dx We

More information

Joint probability distributions: Discrete Variables. Two Discrete Random Variables. Example 1. Example 1

Joint probability distributions: Discrete Variables. Two Discrete Random Variables. Example 1. Example 1 Joint probability distributions: Discrete Variables Two Discrete Random Variables Probability mass function (pmf) of a single discrete random variable X specifies how much probability mass is placed on

More information

MATHEMATICS 154, SPRING 2009 PROBABILITY THEORY Outline #11 (Tail-Sum Theorem, Conditional distribution and expectation)

MATHEMATICS 154, SPRING 2009 PROBABILITY THEORY Outline #11 (Tail-Sum Theorem, Conditional distribution and expectation) MATHEMATICS 154, SPRING 2009 PROBABILITY THEORY Outline #11 (Tail-Sum Theorem, Conditional distribution and expectation) Last modified: March 7, 2009 Reference: PRP, Sections 3.6 and 3.7. 1. Tail-Sum Theorem

More information

Ch 2: Simple Linear Regression

Ch 2: Simple Linear Regression Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component

More information

1 x 2 and 1 y 2. 0 otherwise. c) Estimate by eye the value of x for which F(x) = x + y 0 x 1 and 0 y 1. 0 otherwise

1 x 2 and 1 y 2. 0 otherwise. c) Estimate by eye the value of x for which F(x) = x + y 0 x 1 and 0 y 1. 0 otherwise Eample 5 EX: Which of the following joint density functions have (correlation) ρ XY = 0? (Remember that ρ XY = 0 is possible with symmetry even if X and Y are not independent.) a) b) f (, y) = π ( ) 2

More information

Linear regression Class 25, Jeremy Orloff and Jonathan Bloom

Linear regression Class 25, Jeremy Orloff and Jonathan Bloom 1 Learning Goals Linear regression Class 25, 18.05 Jerem Orloff and Jonathan Bloom 1. Be able to use the method of least squares to fit a line to bivariate data. 2. Be able to give a formula for the total

More information

Glossary. Also available at BigIdeasMath.com: multi-language glossary vocabulary flash cards. An equation that contains an absolute value expression

Glossary. Also available at BigIdeasMath.com: multi-language glossary vocabulary flash cards. An equation that contains an absolute value expression Glossar This student friendl glossar is designed to be a reference for ke vocabular, properties, and mathematical terms. Several of the entries include a short eample to aid our understanding of important

More information

ECON Fundamentals of Probability

ECON Fundamentals of Probability ECON 351 - Fundamentals of Probability Maggie Jones 1 / 32 Random Variables A random variable is one that takes on numerical values, i.e. numerical summary of a random outcome e.g., prices, total GDP,

More information

Inference about the Slope and Intercept

Inference about the Slope and Intercept Inference about the Slope and Intercept Recall, we have established that the least square estimates and 0 are linear combinations of the Y i s. Further, we have showed that the are unbiased and have the

More information

Math 510 midterm 3 answers

Math 510 midterm 3 answers Math 51 midterm 3 answers Problem 1 (1 pts) Suppose X and Y are independent exponential random variables both with parameter λ 1. Find the probability that Y < 7X. P (Y < 7X) 7x 7x f(x, y) dy dx e x e

More information

Chapter 13 Student Lecture Notes Department of Quantitative Methods & Information Systems. Business Statistics

Chapter 13 Student Lecture Notes Department of Quantitative Methods & Information Systems. Business Statistics Chapter 13 Student Lecture Notes 13-1 Department of Quantitative Methods & Information Sstems Business Statistics Chapter 14 Introduction to Linear Regression and Correlation Analsis QMIS 0 Dr. Mohammad

More information

Dependence and scatter-plots. MVE-495: Lecture 4 Correlation and Regression

Dependence and scatter-plots. MVE-495: Lecture 4 Correlation and Regression Dependence and scatter-plots MVE-495: Lecture 4 Correlation and Regression It is common for two or more quantitative variables to be measured on the same individuals. Then it is useful to consider what

More information

EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix)

EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix) 1 EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix) Taisuke Otsu London School of Economics Summer 2018 A.1. Summation operator (Wooldridge, App. A.1) 2 3 Summation operator For

More information

Lecture 16 - Correlation and Regression

Lecture 16 - Correlation and Regression Lecture 16 - Correlation and Regression Statistics 102 Colin Rundel April 1, 2013 Modeling numerical variables Modeling numerical variables So far we have worked with single numerical and categorical variables,

More information

Lecture 25: Review. Statistics 104. April 23, Colin Rundel

Lecture 25: Review. Statistics 104. April 23, Colin Rundel Lecture 25: Review Statistics 104 Colin Rundel April 23, 2012 Joint CDF F (x, y) = P [X x, Y y] = P [(X, Y ) lies south-west of the point (x, y)] Y (x,y) X Statistics 104 (Colin Rundel) Lecture 25 April

More information

Covariance. Lecture 20: Covariance / Correlation & General Bivariate Normal. Covariance, cont. Properties of Covariance

Covariance. Lecture 20: Covariance / Correlation & General Bivariate Normal. Covariance, cont. Properties of Covariance Covariance Lecture 0: Covariance / Correlation & General Bivariate Normal Sta30 / Mth 30 We have previously discussed Covariance in relation to the variance of the sum of two random variables Review Lecture

More information

Chapter 12 - Lecture 2 Inferences about regression coefficient

Chapter 12 - Lecture 2 Inferences about regression coefficient Chapter 12 - Lecture 2 Inferences about regression coefficient April 19th, 2010 Facts about slope Test Statistic Confidence interval Hypothesis testing Test using ANOVA Table Facts about slope In previous

More information

Estimation of uncertainties using the Guide to the expression of uncertainty (GUM)

Estimation of uncertainties using the Guide to the expression of uncertainty (GUM) Estimation of uncertainties using the Guide to the expression of uncertainty (GUM) Alexandr Malusek Division of Radiological Sciences Department of Medical and Health Sciences Linköping University 2014-04-15

More information

Vectors and Matrices Statistics with Vectors and Matrices

Vectors and Matrices Statistics with Vectors and Matrices Vectors and Matrices Statistics with Vectors and Matrices Lecture 3 September 7, 005 Analysis Lecture #3-9/7/005 Slide 1 of 55 Today s Lecture Vectors and Matrices (Supplement A - augmented with SAS proc

More information

Simple Linear Regression Estimation and Properties

Simple Linear Regression Estimation and Properties Simple Linear Regression Estimation and Properties Outline Review of the Reading Estimate parameters using OLS Other features of OLS Numerical Properties of OLS Assumptions of OLS Goodness of Fit Checking

More information

Linear models and their mathematical foundations: Simple linear regression

Linear models and their mathematical foundations: Simple linear regression Linear models and their mathematical foundations: Simple linear regression Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/21 Introduction

More information

Joint Distribution of Two or More Random Variables

Joint Distribution of Two or More Random Variables Joint Distribution of Two or More Random Variables Sometimes more than one measurement in the form of random variable is taken on each member of the sample space. In cases like this there will be a few

More information

Final Exam. Economics 835: Econometrics. Fall 2010

Final Exam. Economics 835: Econometrics. Fall 2010 Final Exam Economics 835: Econometrics Fall 2010 Please answer the question I ask - no more and no less - and remember that the correct answer is often short and simple. 1 Some short questions a) For each

More information

Applied Regression. Applied Regression. Chapter 2 Simple Linear Regression. Hongcheng Li. April, 6, 2013

Applied Regression. Applied Regression. Chapter 2 Simple Linear Regression. Hongcheng Li. April, 6, 2013 Applied Regression Chapter 2 Simple Linear Regression Hongcheng Li April, 6, 2013 Outline 1 Introduction of simple linear regression 2 Scatter plot 3 Simple linear regression model 4 Test of Hypothesis

More information

Correlation analysis 2: Measures of correlation

Correlation analysis 2: Measures of correlation Correlation analsis 2: Measures of correlation Ran Tibshirani Data Mining: 36-462/36-662 Februar 19 2013 1 Review: correlation Pearson s correlation is a measure of linear association In the population:

More information

Outline Properties of Covariance Quantifying Dependence Models for Joint Distributions Lab 4. Week 8 Jointly Distributed Random Variables Part II

Outline Properties of Covariance Quantifying Dependence Models for Joint Distributions Lab 4. Week 8 Jointly Distributed Random Variables Part II Week 8 Jointly Distributed Random Variables Part II Week 8 Objectives 1 The connection between the covariance of two variables and the nature of their dependence is given. 2 Pearson s correlation coefficient

More information

STATISTICAL DATA ANALYSIS IN EXCEL

STATISTICAL DATA ANALYSIS IN EXCEL Microarra Center STATISTICAL DATA ANALYSIS IN EXCEL Lecture 5 Linear Regression dr. Petr Nazarov 14-1-213 petr.nazarov@crp-sante.lu Statistical data analsis in Ecel. 5. Linear regression OUTLINE Lecture

More information

Summer School in Statistics for Astronomers V June 1 - June 6, Regression. Mosuk Chow Statistics Department Penn State University.

Summer School in Statistics for Astronomers V June 1 - June 6, Regression. Mosuk Chow Statistics Department Penn State University. Summer School in Statistics for Astronomers V June 1 - June 6, 2009 Regression Mosuk Chow Statistics Department Penn State University. Adapted from notes prepared by RL Karandikar Mean and variance Recall

More information

MA 575 Linear Models: Cedric E. Ginestet, Boston University Revision: Probability and Linear Algebra Week 1, Lecture 2

MA 575 Linear Models: Cedric E. Ginestet, Boston University Revision: Probability and Linear Algebra Week 1, Lecture 2 MA 575 Linear Models: Cedric E Ginestet, Boston University Revision: Probability and Linear Algebra Week 1, Lecture 2 1 Revision: Probability Theory 11 Random Variables A real-valued random variable is

More information

Scatter plot of data from the study. Linear Regression

Scatter plot of data from the study. Linear Regression 1 2 Linear Regression Scatter plot of data from the study. Consider a study to relate birthweight to the estriol level of pregnant women. The data is below. i Weight (g / 100) i Weight (g / 100) 1 7 25

More information

Bivariate Distributions. Discrete Bivariate Distribution Example

Bivariate Distributions. Discrete Bivariate Distribution Example Spring 7 Geog C: Phaedon C. Kyriakidis Bivariate Distributions Definition: class of multivariate probability distributions describing joint variation of outcomes of two random variables (discrete or continuous),

More information

ECON 3150/4150, Spring term Lecture 6

ECON 3150/4150, Spring term Lecture 6 ECON 3150/4150, Spring term 2013. Lecture 6 Review of theoretical statistics for econometric modelling (II) Ragnar Nymoen University of Oslo 31 January 2013 1 / 25 References to Lecture 3 and 6 Lecture

More information

CHAPTER 4 MATHEMATICAL EXPECTATION. 4.1 Mean of a Random Variable

CHAPTER 4 MATHEMATICAL EXPECTATION. 4.1 Mean of a Random Variable CHAPTER 4 MATHEMATICAL EXPECTATION 4.1 Mean of a Random Variable The expected value, or mathematical expectation E(X) of a random variable X is the long-run average value of X that would emerge after a

More information

Notes for Math 324, Part 19

Notes for Math 324, Part 19 48 Notes for Math 324, Part 9 Chapter 9 Multivariate distributions, covariance Often, we need to consider several random variables at the same time. We have a sample space S and r.v. s X, Y,..., which

More information

Optimization and Simulation

Optimization and Simulation Optimization and Simulation Variance reduction Michel Bierlaire Transport and Mobility Laboratory School of Architecture, Civil and Environmental Engineering Ecole Polytechnique Fédérale de Lausanne M.

More information

Scatter plot of data from the study. Linear Regression

Scatter plot of data from the study. Linear Regression 1 2 Linear Regression Scatter plot of data from the study. Consider a study to relate birthweight to the estriol level of pregnant women. The data is below. i Weight (g / 100) i Weight (g / 100) 1 7 25

More information

Chapter 2. Some Basic Probability Concepts. 2.1 Experiments, Outcomes and Random Variables

Chapter 2. Some Basic Probability Concepts. 2.1 Experiments, Outcomes and Random Variables Chapter 2 Some Basic Probability Concepts 2.1 Experiments, Outcomes and Random Variables A random variable is a variable whose value is unknown until it is observed. The value of a random variable results

More information

3 Multiple Linear Regression

3 Multiple Linear Regression 3 Multiple Linear Regression 3.1 The Model Essentially, all models are wrong, but some are useful. Quote by George E.P. Box. Models are supposed to be exact descriptions of the population, but that is

More information

Regression Analysis. Ordinary Least Squares. The Linear Model

Regression Analysis. Ordinary Least Squares. The Linear Model Regression Analysis Linear regression is one of the most widely used tools in statistics. Suppose we were jobless college students interested in finding out how big (or small) our salaries would be 20

More information

Ch. 7. One sample hypothesis tests for µ and σ

Ch. 7. One sample hypothesis tests for µ and σ Ch. 7. One sample hypothesis tests for µ and σ Prof. Tesler Math 18 Winter 2019 Prof. Tesler Ch. 7: One sample hypoth. tests for µ, σ Math 18 / Winter 2019 1 / 23 Introduction Data Consider the SAT math

More information

Gaussian random variables inr n

Gaussian random variables inr n Gaussian vectors Lecture 5 Gaussian random variables inr n One-dimensional case One-dimensional Gaussian density with mean and standard deviation (called N, ): fx x exp. Proposition If X N,, then ax b

More information

Random Variables. Cumulative Distribution Function (CDF) Amappingthattransformstheeventstotherealline.

Random Variables. Cumulative Distribution Function (CDF) Amappingthattransformstheeventstotherealline. Random Variables Amappingthattransformstheeventstotherealline. Example 1. Toss a fair coin. Define a random variable X where X is 1 if head appears and X is if tail appears. P (X =)=1/2 P (X =1)=1/2 Example

More information

Properties of the least squares estimates

Properties of the least squares estimates Properties of the least squares estimates 2019-01-18 Warmup Let a and b be scalar constants, and X be a scalar random variable. Fill in the blanks E ax + b) = Var ax + b) = Goal Recall that the least squares

More information

18.440: Lecture 28 Lectures Review

18.440: Lecture 28 Lectures Review 18.440: Lecture 28 Lectures 18-27 Review Scott Sheffield MIT Outline Outline It s the coins, stupid Much of what we have done in this course can be motivated by the i.i.d. sequence X i where each X i is

More information

Chapter 10. Simple Linear Regression and Correlation

Chapter 10. Simple Linear Regression and Correlation Chapter 10. Simple Linear Regression and Correlation In the two sample problems discussed in Ch. 9, we were interested in comparing values of parameters for two distributions. Regression analysis is the

More information

MULTIVARIATE PROBABILITY DISTRIBUTIONS

MULTIVARIATE PROBABILITY DISTRIBUTIONS MULTIVARIATE PROBABILITY DISTRIBUTIONS. PRELIMINARIES.. Example. Consider an experiment that consists of tossing a die and a coin at the same time. We can consider a number of random variables defined

More information

For a stochastic process {Y t : t = 0, ±1, ±2, ±3, }, the mean function is defined by (2.2.1) ± 2..., γ t,

For a stochastic process {Y t : t = 0, ±1, ±2, ±3, }, the mean function is defined by (2.2.1) ± 2..., γ t, CHAPTER 2 FUNDAMENTAL CONCEPTS This chapter describes the fundamental concepts in the theory of time series models. In particular, we introduce the concepts of stochastic processes, mean and covariance

More information

Linear correlation. Contents. 1 Linear correlation. 1.1 Introduction. Anthony Tanbakuchi Department of Mathematics Pima Community College

Linear correlation. Contents. 1 Linear correlation. 1.1 Introduction. Anthony Tanbakuchi Department of Mathematics Pima Community College Introductor Statistics Lectures Linear correlation Testing two variables for a linear relationship Anthon Tanbakuchi Department of Mathematics Pima Communit College Redistribution of this material is prohibited

More information

Ch. 5 Joint Probability Distributions and Random Samples

Ch. 5 Joint Probability Distributions and Random Samples Ch. 5 Joint Probability Distributions and Random Samples 5. 1 Jointly Distributed Random Variables In chapters 3 and 4, we learned about probability distributions for a single random variable. However,

More information

Topic - 12 Linear Regression and Correlation

Topic - 12 Linear Regression and Correlation Topic 1 Linear Regression and Correlation Correlation & Regression Univariate & Bivariate tatistics U: frequenc distribution, mean, mode, range, standard deviation B: correlation two variables Correlation

More information

The data can be downloaded as an Excel file under Econ2130 at

The data can be downloaded as an Excel file under Econ2130 at 1 HG Revised Sept. 018 Supplement to lecture 9 (Tuesda 18 Sept) On the bivariate normal model Eample: daughter s height (Y) vs. mother s height (). Data collected on Econ 130 lectures 010-01. The data

More information

Properties of Summation Operator

Properties of Summation Operator Econ 325 Section 003/004 Notes on Variance, Covariance, and Summation Operator By Hiro Kasahara Properties of Summation Operator For a sequence of the values {x 1, x 2,..., x n, we write the sum of x 1,

More information

Correlation and simple linear regression S5

Correlation and simple linear regression S5 Basic medical statistics for clinical and eperimental research Correlation and simple linear regression S5 Katarzyna Jóźwiak k.jozwiak@nki.nl November 15, 2017 1/41 Introduction Eample: Brain size and

More information

STAT/MATH 395 PROBABILITY II

STAT/MATH 395 PROBABILITY II STAT/MATH 395 PROBABILITY II Bivariate Distributions Néhémy Lim University of Washington Winter 2017 Outline Distributions of Two Random Variables Distributions of Two Discrete Random Variables Distributions

More information

Multivariate Random Variable

Multivariate Random Variable Multivariate Random Variable Author: Author: Andrés Hincapié and Linyi Cao This Version: August 7, 2016 Multivariate Random Variable 3 Now we consider models with more than one r.v. These are called multivariate

More information

matrix-free Elements of Probability Theory 1 Random Variables and Distributions Contents Elements of Probability Theory 2

matrix-free Elements of Probability Theory 1 Random Variables and Distributions Contents Elements of Probability Theory 2 Short Guides to Microeconometrics Fall 2018 Kurt Schmidheiny Unversität Basel Elements of Probability Theory 2 1 Random Variables and Distributions Contents Elements of Probability Theory matrix-free 1

More information

Business Statistics. Lecture 10: Correlation and Linear Regression

Business Statistics. Lecture 10: Correlation and Linear Regression Business Statistics Lecture 10: Correlation and Linear Regression Scatterplot A scatterplot shows the relationship between two quantitative variables measured on the same individuals. It displays the Form

More information

Problem Set #6: OLS. Economics 835: Econometrics. Fall 2012

Problem Set #6: OLS. Economics 835: Econometrics. Fall 2012 Problem Set #6: OLS Economics 835: Econometrics Fall 202 A preliminary result Suppose we have a random sample of size n on the scalar random variables (x, y) with finite means, variances, and covariance.

More information

Research Design - - Topic 15a Introduction to Multivariate Analyses 2009 R.C. Gardner, Ph.D.

Research Design - - Topic 15a Introduction to Multivariate Analyses 2009 R.C. Gardner, Ph.D. Research Design - - Topic 15a Introduction to Multivariate Analses 009 R.C. Gardner, Ph.D. Major Characteristics of Multivariate Procedures Overview of Multivariate Techniques Bivariate Regression and

More information

Correlation 1. December 4, HMS, 2017, v1.1

Correlation 1. December 4, HMS, 2017, v1.1 Correlation 1 December 4, 2017 1 HMS, 2017, v1.1 Chapter References Diez: Chapter 7 Navidi, Chapter 7 I don t expect you to learn the proofs what will follow. Chapter References 2 Correlation The sample

More information

REVIEW OF MAIN CONCEPTS AND FORMULAS A B = Ā B. Pr(A B C) = Pr(A) Pr(A B C) =Pr(A) Pr(B A) Pr(C A B)

REVIEW OF MAIN CONCEPTS AND FORMULAS A B = Ā B. Pr(A B C) = Pr(A) Pr(A B C) =Pr(A) Pr(B A) Pr(C A B) REVIEW OF MAIN CONCEPTS AND FORMULAS Boolean algebra of events (subsets of a sample space) DeMorgan s formula: A B = Ā B A B = Ā B The notion of conditional probability, and of mutual independence of two

More information

Bivariate distributions

Bivariate distributions Bivariate distributions 3 th October 017 lecture based on Hogg Tanis Zimmerman: Probability and Statistical Inference (9th ed.) Bivariate Distributions of the Discrete Type The Correlation Coefficient

More information

Study Sheet. December 10, The course PDF has been updated (6/11). Read the new one.

Study Sheet. December 10, The course PDF has been updated (6/11). Read the new one. Study Sheet December 10, 2017 The course PDF has been updated (6/11). Read the new one. 1 Definitions to know The mode:= the class or center of the class with the highest frequency. The median : Q 2 is

More information

18.600: Lecture 24 Covariance and some conditional expectation exercises

18.600: Lecture 24 Covariance and some conditional expectation exercises 18.600: Lecture 24 Covariance and some conditional expectation exercises Scott Sheffield MIT Outline Covariance and correlation Paradoxes: getting ready to think about conditional expectation Outline Covariance

More information

Nonparametric hypothesis tests and permutation tests

Nonparametric hypothesis tests and permutation tests Nonparametric hypothesis tests and permutation tests 1.7 & 2.3. Probability Generating Functions 3.8.3. Wilcoxon Signed Rank Test 3.8.2. Mann-Whitney Test Prof. Tesler Math 283 Fall 2018 Prof. Tesler Wilcoxon

More information

MAS113 Introduction to Probability and Statistics. Proofs of theorems

MAS113 Introduction to Probability and Statistics. Proofs of theorems MAS113 Introduction to Probability and Statistics Proofs of theorems Theorem 1 De Morgan s Laws) See MAS110 Theorem 2 M1 By definition, B and A \ B are disjoint, and their union is A So, because m is a

More information

UC Berkeley Department of Electrical Engineering and Computer Science. EE 126: Probablity and Random Processes. Problem Set 8 Fall 2007

UC Berkeley Department of Electrical Engineering and Computer Science. EE 126: Probablity and Random Processes. Problem Set 8 Fall 2007 UC Berkeley Department of Electrical Engineering and Computer Science EE 6: Probablity and Random Processes Problem Set 8 Fall 007 Issued: Thursday, October 5, 007 Due: Friday, November, 007 Reading: Bertsekas

More information

Regression and Covariance

Regression and Covariance Regression and Covariance James K. Peterson Department of Biological ciences and Department of Mathematical ciences Clemson University April 16, 2014 Outline A Review of Regression Regression and Covariance

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression MATH 282A Introduction to Computational Statistics University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/ eariasca/math282a.html MATH 282A University

More information

Introduction to Computational Finance and Financial Econometrics Matrix Algebra Review

Introduction to Computational Finance and Financial Econometrics Matrix Algebra Review You can t see this text! Introduction to Computational Finance and Financial Econometrics Matrix Algebra Review Eric Zivot Spring 2015 Eric Zivot (Copyright 2015) Matrix Algebra Review 1 / 54 Outline 1

More information

Copyright, 2008, R.E. Kass, E.N. Brown, and U. Eden REPRODUCTION OR CIRCULATION REQUIRES PERMISSION OF THE AUTHORS

Copyright, 2008, R.E. Kass, E.N. Brown, and U. Eden REPRODUCTION OR CIRCULATION REQUIRES PERMISSION OF THE AUTHORS Copright, 8, RE Kass, EN Brown, and U Eden REPRODUCTION OR CIRCULATION REQUIRES PERMISSION OF THE AUTHORS Chapter 6 Random Vectors and Multivariate Distributions 6 Random Vectors In Section?? we etended

More information

Correlation and Linear Regression

Correlation and Linear Regression Correlation and Linear Regression Correlation: Relationships between Variables So far, nearly all of our discussion of inferential statistics has focused on testing for differences between group means

More information

HW5 Solutions. (a) (8 pts.) Show that if two random variables X and Y are independent, then E[XY ] = E[X]E[Y ] xy p X,Y (x, y)

HW5 Solutions. (a) (8 pts.) Show that if two random variables X and Y are independent, then E[XY ] = E[X]E[Y ] xy p X,Y (x, y) HW5 Solutions 1. (50 pts.) Random homeworks again (a) (8 pts.) Show that if two random variables X and Y are independent, then E[XY ] = E[X]E[Y ] Answer: Applying the definition of expectation we have

More information

Notes on Random Vectors and Multivariate Normal

Notes on Random Vectors and Multivariate Normal MATH 590 Spring 06 Notes on Random Vectors and Multivariate Normal Properties of Random Vectors If X,, X n are random variables, then X = X,, X n ) is a random vector, with the cumulative distribution

More information

Econometrics I. Professor William Greene Stern School of Business Department of Economics 3-1/29. Part 3: Least Squares Algebra

Econometrics I. Professor William Greene Stern School of Business Department of Economics 3-1/29. Part 3: Least Squares Algebra Econometrics I Professor William Greene Stern School of Business Department of Economics 3-1/29 Econometrics I Part 3 Least Squares Algebra 3-2/29 Vocabulary Some terms to be used in the discussion. Population

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression ST 430/514 Recall: A regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates)

More information

EEL 5544 Noise in Linear Systems Lecture 30. X (s) = E [ e sx] f X (x)e sx dx. Moments can be found from the Laplace transform as

EEL 5544 Noise in Linear Systems Lecture 30. X (s) = E [ e sx] f X (x)e sx dx. Moments can be found from the Laplace transform as L30-1 EEL 5544 Noise in Linear Systems Lecture 30 OTHER TRANSFORMS For a continuous, nonnegative RV X, the Laplace transform of X is X (s) = E [ e sx] = 0 f X (x)e sx dx. For a nonnegative RV, the Laplace

More information

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression AMS 315/576 Lecture Notes Chapter 11. Simple Linear Regression 11.1 Motivation A restaurant opening on a reservations-only basis would like to use the number of advance reservations x to predict the number

More information

The Multivariate Normal Distribution. In this case according to our theorem

The Multivariate Normal Distribution. In this case according to our theorem The Multivariate Normal Distribution Defn: Z R 1 N(0, 1) iff f Z (z) = 1 2π e z2 /2. Defn: Z R p MV N p (0, I) if and only if Z = (Z 1,..., Z p ) T with the Z i independent and each Z i N(0, 1). In this

More information

We introduce methods that are useful in:

We introduce methods that are useful in: Instructor: Shengyu Zhang Content Derived Distributions Covariance and Correlation Conditional Expectation and Variance Revisited Transforms Sum of a Random Number of Independent Random Variables more

More information

Random Variables. Random variables. A numerically valued map X of an outcome ω from a sample space Ω to the real line R

Random Variables. Random variables. A numerically valued map X of an outcome ω from a sample space Ω to the real line R In probabilistic models, a random variable is a variable whose possible values are numerical outcomes of a random phenomenon. As a function or a map, it maps from an element (or an outcome) of a sample

More information

Random Variables and Expectations

Random Variables and Expectations Inside ECOOMICS Random Variables Introduction to Econometrics Random Variables and Expectations A random variable has an outcome that is determined by an experiment and takes on a numerical value. A procedure

More information

Problems. Suppose both models are fitted to the same data. Show that SS Res, A SS Res, B

Problems. Suppose both models are fitted to the same data. Show that SS Res, A SS Res, B Simple Linear Regression 35 Problems 1 Consider a set of data (x i, y i ), i =1, 2,,n, and the following two regression models: y i = β 0 + β 1 x i + ε, (i =1, 2,,n), Model A y i = γ 0 + γ 1 x i + γ 2

More information

Lecture 2: Review of Probability

Lecture 2: Review of Probability Lecture 2: Review of Probability Zheng Tian Contents 1 Random Variables and Probability Distributions 2 1.1 Defining probabilities and random variables..................... 2 1.2 Probability distributions................................

More information

Lecture 4: Proofs for Expectation, Variance, and Covariance Formula

Lecture 4: Proofs for Expectation, Variance, and Covariance Formula Lecture 4: Proofs for Expectation, Variance, and Covariance Formula by Hiro Kasahara Vancouver School of Economics University of British Columbia Hiro Kasahara (UBC) Econ 325 1 / 28 Discrete Random Variables:

More information

STA 2101/442 Assignment 3 1

STA 2101/442 Assignment 3 1 STA 2101/442 Assignment 3 1 These questions are practice for the midterm and final exam, and are not to be handed in. 1. Suppose X 1,..., X n are a random sample from a distribution with mean µ and variance

More information

Class 11 Maths Chapter 15. Statistics

Class 11 Maths Chapter 15. Statistics 1 P a g e Class 11 Maths Chapter 15. Statistics Statistics is the Science of collection, organization, presentation, analysis and interpretation of the numerical data. Useful Terms 1. Limit of the Class

More information