Advanced Engineering Statistics - Section 5 - Jay Liu Dept. Chemical Engineering PKNU


1 Advanced Engineering Statistics - Section 5 - Jay Liu Dept. Chemical Engineering PKNU

2 Least squares regression - What we will cover
Reading: Box, G.E.P., "Use and abuse of regression," Technometrics, 8(4), 1966.

3 [FYI] Least squares vs. interpolation
Given data points (x1, y1) and (x2, y2), there are two choices when we want to know the value of y at x = (x1 + x2)/2: least squares, or interpolation?
Interpolation is recommended when the data are subject to negligible experimental error (or noise), e.g., when using steam tables. Otherwise, least squares is recommended.

4 Least squares - usage examples
Quantify the relationship between 2 variables (or 2 sets of variables):
Manager: How does yield from the lactic acid batch fermentation relate to the purity of sucrose?
Engineer: The yield can be predicted from sucrose purity with an error of plus/minus 8%.
Manager: And how about the relationship between yield and glucose purity?
Engineer: Over the range of our historical data, there is no discernible relationship.

5 Least squares - usage examples
Two general applications:
Predictive modeling - usually when an exact model form is unknown; modeling data trends in order to predict future y values.
Simulation - usually when parameters in the model are unknown; getting parameter values in the known model form (e.g., calculate activation energy from reaction data).
Terminology:
y: response variables, output variables, dependent variables, ...
x: input variables, regressor variables, independent variables, ...

6 Review: covariance
Consider measurements from a gas cylinder: temperature (K) and pressure (kPa). The ideal gas law applies under moderate conditions: pV = nRT.
Fixed volume: V = 0.02 m^3 = 20 L
Moles of gas: n = 14.1 mol of chlorine gas (1 kg gas)
Gas constant: R = 8.314 J/(mol.K)
Simplify the ideal gas law to p = b1*T, where b1 = nR/V.

7 Review: covariance (cont.)

8 Review: covariance (cont.)
Formal definition: cov(x, y) = E[(x - x̄)(y - ȳ)], where E(z) is the mean of z.
1. Calculate deviation variables: T - T̄ and p - p̄ (subtracting off the mean centers the vector at zero).
2. Multiply the centered values.
3. Calculate the expected value (mean) of the product.
Covariance has units, here [K.kPa]; c.f. the covariance between temperature and humidity is 202 [K.%].
Covariance with itself is the variance: cov(x, x) = V(x) = E[(x - x̄)(x - x̄)].

9 Review: correlation
Q: Which one (pressure or humidity) has the stronger relationship with temperature?
Covariance depends on units: e.g., a different covariance for grams vs. kilograms. Correlation removes the scaling effect by dividing by the standard deviations of x and y, giving a dimensionless result:
corr(x, y) = cov(x, y) / (σx σy) = E[(x - x̄)(y - ȳ)] / (σx σy),  with -1 <= corr(x, y) <= 1
Gas cylinder example: compare corr(temperature, pressure) with corr(temperature, humidity).
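As a quick illustration of the two definitions above, here is a minimal Python (numpy) sketch; the temperature/pressure values are made up for illustration, not the cylinder data on the slides.

import numpy as np

# Hypothetical gas-cylinder measurements (illustrative values only)
T = np.array([273.0, 285.0, 297.0, 309.0, 321.0, 333.0])        # temperature [K]
p = np.array([1600.0, 1670.0, 1740.0, 1810.0, 1880.0, 1950.0])  # pressure [kPa]

# Deviation (centered) variables
Tc = T - T.mean()
pc = p - p.mean()

# Covariance: expected value of the product of the centered variables [K.kPa]
cov_Tp = np.mean(Tc * pc)

# Correlation: covariance divided by the standard deviations (dimensionless, between -1 and 1)
corr_Tp = cov_Tp / (T.std() * p.std())
print(cov_Tp, corr_Tp)   # corr_Tp is close to +1 because p rises linearly with T here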

10 Review: correlation (cont.)
Want to find a relationship y = f(x) other than the above?

11 Review: correlation (cont.)
Remember!?

12 Least squares? Least squares regression?
Regression is the act of choosing the best values for the unknown parameters in a model on the basis of a set of measured data. Linear regression is the special case where the model is linear in the parameters. A straight line has the form:
y = a0 + a1*x (+ e)
There are many possible ways to define the best fit; however, the most commonly used measure of goodness is the sum of squared residuals: the least sum of squares of errors, or least squares in short.
Important: the error is in y, not in x.

13 [FYI] Why minimize the sum of squares?
The least squares model:
has the lowest possible variance for a0 and a1 when certain assumptions are met (more later)
is computationally tractable by hand
makes it easy to prove various mathematical properties
is intuitive: deviations are penalized quadratically.
Other forms: multiple solutions, unstable, high-variance solutions, and mathematical proofs are difficult.

14 Least squares (regression)
It is the basis for:
DOE (Design of Experiments)
Latent variable methods
We consider only 2 (sets of) variables, x and y (or x's and y):
Simple least squares
Multiple least squares
Generalized least squares

15 Simple least squares - Wind tunnel example
How can we find the best line that describes the following data?
Data from wind tunnel experiments: drag force (F) at various wind velocities.

16 Wind tunnel example (cont.)
From the plot, a straight line seems adequate: y = a0 + a1*x.
At a data point (xi, yi), the error between the line and the point is (see the figure on the right):
ei = yi - ŷi = yi - a0 - a1*xi
Earlier, least squares meant the least sum of squares of errors. For all data points, the sum of squares of errors is:
Sr = Σ ei^2 = Σ (yi - a0 - a1*xi)^2
We need to find the model parameters a0 and a1 that minimize Sr: least squares.

17 Wind tunnel example (cont.)
How do we find the model parameters? Take a look at Sr: it is a parabolic function w.r.t. a0 and a1, and the coefficients of a0^2 and a1^2 are positive. Sr therefore becomes minimum where
∂Sr/∂a0 = 0 and ∂Sr/∂a1 = 0.
Rearranging and solving for a0 and a1 gives the least squares estimates.

18 Wind tunnel example (cont.)
Calculations

19 Wind tunnel example (cont.)
Calculations: this is called simple least squares.
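A minimal Python sketch of the simple least squares calculation, using the closed-form solution of the normal equations; the velocity/force numbers below are stand-ins for the wind-tunnel data, which are not reproduced in this transcription.

import numpy as np

# Stand-in data: wind velocity x and drag force y (illustrative values only)
x = np.array([10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0])
y = np.array([25.0, 70.0, 380.0, 550.0, 610.0, 1220.0, 830.0, 1450.0])

n = len(x)
# Solution of dSr/da0 = 0 and dSr/da1 = 0 for y = a0 + a1*x
a1 = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x**2) - np.sum(x)**2)
a0 = y.mean() - a1 * x.mean()

y_hat = a0 + a1 * x
Sr = np.sum((y - y_hat)**2)   # sum of squares of errors being minimized
print(a0, a1, Sr)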

20 Wind tunnel example (cont.)
Results. Is this OK with you?

21 General modeling procedure
1. Define the modeling objective.
2. Variable selection: identify the response variables (i.e., y variables) and the regressor variables (i.e., x variables) that are to be considered.
3. Design of experiment: design an experiment and use it to generate the data that will be used to fit the model.
4. Define the model: choose an appropriate form for the model.
5. Fit the model: estimate values for the parameters in the model.
6. Does the model fit? If yes, use the model; if no, revisit the earlier steps using statistical tools and prior knowledge.

22 Simple least squares - Summary
Model form: y = a0 + a1*x + e
The parameters a0 and a1 minimize Sr = Σ (yi - a0 - a1*xi)^2 and are found where ∂Sr/∂a0 = 0 and ∂Sr/∂a1 = 0. Rearranging and solving gives
a1 = (n Σ xi*yi - Σ xi Σ yi) / (n Σ xi^2 - (Σ xi)^2),  a0 = ȳ - a1*x̄

23 Simple least squares (cont.)
Properties: the residuals sum to zero, i.e., Σ (yi - ŷi) = 0.

24 Simple least squares (cont.)
Question: what if the model we want to fit is non-linear?
Ex. Activation energy in the rate constant: k = k0*exp(-E/(R*T))
Linearize!

25 Linearization
We want to model non-linear relationships between the independent (x) and dependent (y) variables.
1. Make a simple linear model through a suitable transformation: y = f(x) + e  ->  y = a0 + a1*x + e (in transformed variables).
2. Use the previous results (simple least squares).
Caution: a nonlinear transformation also changes the P.D.F. of the variables (and errors). We will discuss this in model assessment.
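A short Python sketch of the linearization idea for the Arrhenius example: taking logarithms gives ln k = ln k0 - E/(R*T), a straight line in 1/T, which can be fitted by simple least squares. The rate-constant values are made up for illustration.

import numpy as np

R = 8.314  # gas constant, J/(mol.K)
# Hypothetical rate-constant measurements (illustrative values only)
T = np.array([300.0, 320.0, 340.0, 360.0, 380.0])       # temperature [K]
k = np.array([1.2e-4, 5.9e-4, 2.3e-3, 7.8e-3, 2.2e-2])  # rate constant

# Transformation: ln k = ln k0 - (E/R)*(1/T)  ->  y = a0 + a1*x
yt = np.log(k)
xt = 1.0 / T

a1, a0 = np.polyfit(xt, yt, 1)   # slope and intercept of the linearized model
k0 = np.exp(a0)                  # pre-exponential factor
E = -a1 * R                      # activation energy [J/mol]
print(k0, E)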

26 Linearization (cont.)

27 Polynomial regression
For the quadratic form y = a0 + a1*x + a2*x^2 + e, the sum of squares is
Sr = Σ (yi - a0 - a1*xi - a2*xi^2)^2
Again, Sr has a parabolic shape w.r.t. a0, a1, and a2, with positive coefficients on a0^2, a1^2, and a2^2. Setting the partial derivatives to zero:
∂Sr/∂a0 = -2 Σ (yi - a0 - a1*xi - a2*xi^2) = 0
∂Sr/∂a1 = -2 Σ xi*(yi - a0 - a1*xi - a2*xi^2) = 0
∂Sr/∂a2 = -2 Σ xi^2*(yi - a0 - a1*xi - a2*xi^2) = 0

28 Polynomial regression (cont.)
Rearranging the previous equations gives the normal equations:
[ n        Σ xi      Σ xi^2 ] [a0]   [ Σ yi      ]
[ Σ xi     Σ xi^2    Σ xi^3 ] [a1] = [ Σ xi*yi   ]
[ Σ xi^2   Σ xi^3    Σ xi^4 ] [a2]   [ Σ xi^2*yi ]
The above equations can be solved easily (three unknowns and three equations). For general polynomials, from the results of the two cases (y = a0 + a1*x and y = a0 + a1*x + a2*x^2), setting
∂Sr/∂a0 = ∂Sr/∂a1 = ... = ∂Sr/∂am = 0
means we need to solve (m+1) linear algebraic equations for the (m+1) parameters.
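A Python sketch of building and solving the 3x3 normal equations above for a quadratic fit; the x/y values are illustrative.

import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 7.7, 13.6, 27.2, 40.9, 61.1])   # illustrative data

# Columns of the design matrix: 1, x, x^2
Z = np.column_stack([np.ones_like(x), x, x**2])

# Normal equations (Z^T Z) a = Z^T y: a 3x3 linear system for a0, a1, a2
A = Z.T @ Z
b = Z.T @ y
a = np.linalg.solve(A, b)
print(a)   # [a0, a1, a2]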

29 Multiple least squares
Consider the case with two or more independent variables x1, x2, ..., xm: a regression plane
y = a0 + a1*x1 + a2*x2 + ... + am*xm + e
For the 2-D case, y = a0 + a1*x1 + a2*x2. Again, Sr has a parabolic shape w.r.t. a0, a1, and a2:
Sr = Σ (yi - a0 - a1*x1,i - a2*x2,i)^2
∂Sr/∂a0 = -2 Σ (yi - a0 - a1*x1,i - a2*x2,i) = 0
∂Sr/∂a1 = -2 Σ x1,i*(yi - a0 - a1*x1,i - a2*x2,i) = 0
∂Sr/∂a2 = -2 Σ x2,i*(yi - a0 - a1*x1,i - a2*x2,i) = 0

30 Multiple least squares (cont.)
Rearranging and solving for a0, a1, and a2 gives three linear equations. For an m-dimensional plane,
y = a0 + a1*x1 + a2*x2 + ... + am*xm + e
Same as in general polynomials, setting ∂Sr/∂a0 = ∂Sr/∂a1 = ... = ∂Sr/∂am = 0 means we need to solve (m+1) linear algebraic equations for the (m+1) parameters.

31 General least squares
The following form includes all of the cases above (simple least squares, polynomial regression, multiple regression):
y = a0*z0 + a1*z1 + ... + am*zm + e,  where the zj are known functions of the regressor variables.
Ex. Simple and multiple least squares: z0 = 1, z1 = x1, ..., zm = xm. Polynomial regression: z0 = 1, z1 = x, z2 = x^2, ...
Same as before, setting ∂Sr/∂a0 = ∂Sr/∂a1 = ... = ∂Sr/∂am = 0 means we need to solve (m+1) linear algebraic equations for the (m+1) parameters.

32 Quantification of errors
St = Σ (yi - ȳ)^2 : total sum of squares around the mean for the response variable, y.
Sr = Σ ei^2 = Σ (yi - a0*z0,i - a1*z1,i - ... - am*zm,i)^2 : sum of squares of residuals around the regression line.

33 Quantification of errors (cont.)
sy = sqrt(St / (n - 1)) = sqrt(Σ (yi - ȳ)^2 / (n - 1)) : standard deviation of y.
sy/x = sqrt(Sr / (n - (m + 1))) : standard error of the predicted y (the standard error of the estimate, SE).
These quantify the appropriateness of the regression.

34 Quantification of errors (cont.)
Coefficient of determination, R^2 = (St - Sr) / St : the amount of variability in the data explained by the regression model.
R^2 = 1 when Sr = 0: perfect fit (the regression curve passes through the data points).
R^2 = 0 when Sr = St: as bad as doing nothing.
It is evident from the figures that a parabola is adequate: R^2 of (b) is higher than that of (a).
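A Python sketch of the error measures above (St, Sr, sy, sy/x, and R^2) for a straight-line fit; the data are the same stand-in wind-tunnel values used earlier.

import numpy as np

x = np.array([10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0])
y = np.array([25.0, 70.0, 380.0, 550.0, 610.0, 1220.0, 830.0, 1450.0])

# Straight-line fit (m = 1)
a1, a0 = np.polyfit(x, y, 1)
y_hat = a0 + a1 * x

n, m = len(x), 1
St = np.sum((y - y.mean())**2)       # total sum of squares around the mean
Sr = np.sum((y - y_hat)**2)          # sum of squares of residuals
sy = np.sqrt(St / (n - 1))           # standard deviation of y
syx = np.sqrt(Sr / (n - (m + 1)))    # standard error of the estimate
R2 = (St - Sr) / St                  # coefficient of determination
print(sy, syx, R2)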

35 Quantification of errors (cont.)
Warning! R^2 ≈ 1 does not guarantee that the model is adequate, nor that the model will predict new data well. It is possible to force R^2 to one by adding as many terms as there are observations. Sr can be large when the variance of the random error is large. (The usual assumption on the error is that it is random, i.e., unpredictable.)
Practice using Excel: (1) the wind tunnel example with higher-order polynomials; (2) simple regression with increasing random noise.

36 Confidence intervals - coefficients
Coefficients in the regression model have confidence intervals. Why? They are also statistics, like x̄ and s: numerical quantities calculated from a sample (not the entire population). They are estimated values of parameters.
A confidence interval has the form: statistic ± (value that depends on the P.D.F. of the statistic and the confidence level α) × (standard error of the statistic).
Ex. for the mean: x̄ ± z(α/2)*σx/sqrt(n) or x̄ ± t(n-1, α/2)*sx/sqrt(n).
The standard error of a statistic is the standard deviation of the sampling distribution of that statistic.

37 Confidence intervals - coefficients (cont.)
Matrix representation of GLS: y = Z*a + e, where
y = [y1, ..., yn]^T, a = [a0, ..., am]^T, e = [e1, ..., en]^T, and Z is the n x (m+1) matrix of regressor values.
m+1: number of coefficients; n: number of data points.

38 Confidence intervals - coefficients (cont.)
Example: fitting a quadratic polynomial y = a0 + a1*x + a2*x^2 + e to five data points (xi, yi). In matrix form, y = Z*a + e, with Z having rows [1, xi, xi^2], a = [a0, a1, a2]^T, and e = [e1, ..., e5]^T.
Can you solve this problem? Three unknowns, five equations.

39 Confidence intervals - coefficients (cont.)
Solutions: with y = Z*a + e, the sum of squares of errors is
Sr = Σ ei^2 = e^T*e = (y - Z*a)^T (y - Z*a)
Setting ∂Sr/∂a = 0 gives Z^T*Z*a = Z^T*y, called the normal equations. Two ways to solve them:
1. LU decomposition or other methods for linear algebraic equations ("Ax = b" with A = Z^T*Z and b = Z^T*y).
2. Matrix inversion: a = (Z^T*Z)^-1 * Z^T*y (computationally not efficient, but statistically useful).
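A Python sketch of the matrix solution of the general least squares problem; solving the linear system Z^T*Z*a = Z^T*y is preferred numerically, while the explicit inverse (Z^T*Z)^-1 is also shown because it is needed for the confidence intervals below. The data and basis are illustrative.

import numpy as np

# Illustrative data with a quadratic basis: z0 = 1, z1 = x, z2 = x^2
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 7.2, 12.8, 21.3])
Z = np.column_stack([np.ones_like(x), x, x**2])

# Normal equations: (Z^T Z) a = Z^T y
a = np.linalg.solve(Z.T @ Z, Z.T @ y)

# Equivalent form via the explicit inverse (less efficient, statistically useful)
ZtZ_inv = np.linalg.inv(Z.T @ Z)
a_inv = ZtZ_inv @ Z.T @ y
print(a, a_inv)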

40 Confidence intervals - coefficients (cont.)
Matrix inversion approach: a = (Z^T*Z)^-1 * Z^T*y. Denote Zii^-1 as the i-th diagonal element of (Z^T*Z)^-1. The confidence interval of an estimated coefficient is
a(i-1) ± t(n-(m+1), α/2) * sy/x * sqrt(Zii^-1)
where t(n-(m+1), α/2) is the Student t statistic and sy/x = sqrt(Sr / (n - (m+1))) is the standard error of the estimate.
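A Python sketch of these coefficient confidence intervals, using the diagonal of (Z^T*Z)^-1 and the Student t quantile from scipy; the data are illustrative.

import numpy as np
from scipy import stats

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.8, 3.1, 4.9, 7.2, 8.8, 11.1])
Z = np.column_stack([np.ones_like(x), x])      # straight-line model, m + 1 = 2 coefficients

n, p = Z.shape                                 # p = m + 1
ZtZ_inv = np.linalg.inv(Z.T @ Z)
a = ZtZ_inv @ Z.T @ y

Sr = np.sum((y - Z @ a)**2)
syx = np.sqrt(Sr / (n - p))                    # standard error of the estimate
se_a = syx * np.sqrt(np.diag(ZtZ_inv))         # standard errors of the coefficients

alpha = 0.05
t_crit = stats.t.ppf(1 - alpha / 2, df=n - p)  # two-sided t quantile with n-(m+1) d.o.f.
ci = np.column_stack([a - t_crit * se_a, a + t_crit * se_a])
print(ci)   # check whether any interval contains zero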

41 Confidence intervals - coefficients (cont.)
For a linear model:
C.I. for a1 (slope): a1 ± t(n-2, α/2) * sy/x / sqrt(Σ (xi - x̄)^2)
C.I. for a0 (intercept): a0 ± t(n-2, α/2) * sy/x * sqrt(1/n + x̄^2 / Σ (xi - x̄)^2)
What if a confidence interval contains zero?

42 Confidence intervals - prediction
C.I. for a predicted y, ŷ0 at x0:
ŷ0 ± t(n-(m+1), α/2) * sy/x * sqrt(1/n + (x0 - x̄)^2 / Σ (xi - x̄)^2)

43 Model assessment
When we do not know the model form, we have to assess the model after fitting it and before using it. In order to assess the model and make inferences about the parameters and predictions from the model, we have to employ statistics and make some assumptions about the nature of the disturbance.
Tools for model assessment:
sy/x, R^2 (quantitative) (do not rely on these alone)
Residual plots (qualitative)
Normal probability chart (qualitative or quantitative)
Test for lack of fit (quantitative): used when the dataset includes replicates; based on the analysis of variance (ANOVA).

44 Model assessment - assumptions
What are the most desirable properties of the errors in regression? Assumptions on the error:
The error is unpredictable (random).
The error is additive: y = a0 + a1*x + e, i.e., y = (a0 + a1*x1) + e.
The variance of the error is constant and is not related to the values of the response or the values of the regressor variables.
There is no error associated with the values of the regressor variables.
The error is a random variable with Gaussian distribution N(0, σ^2) (σ^2 usually unknown).

45 Model assessment - residual plots
Recall the assumptions on the error: the error is not related to the values of the response or regressor variables. These assumptions will not be valid if the model is wrong, and the following residual plots will reveal this:
Residuals vs. regressor variables
Residuals vs. fitted y values (ŷi)
Residuals vs. lurking variables (e.g., time or order)
These plots will show patterns when a model is inadequate.

46 Model assessment - residual plots (cont.)
Examples: residual plots (e vs. x).

47 Model assessment - residual plots (cont.)
Examples of residual plots.

48 Model assessment - normal probability plot
Recall the assumption that the error is a random variable with Gaussian distribution N(0, σ^2) (σ^2 usually unknown). If this holds, the errors will fall onto a straight line in a normal probability plot (especially useful when the number of data points is large).
Alternatively, a normality test can be used.

49 Model assessment - ANOVA (test for lack of fit)
The variance breakdown:
St = Σ (yi - ȳ)^2,  Sr = Σ (yi - ŷi)^2,  SReg = Σ (ŷi - ȳ)^2
St = SReg + Sr. Really? Prove it to yourself.

50 Model assessment - ANOVA (test for lack of fit)
The variance breakdown: St = SReg + Sr, with St = Σ (yi - ȳ)^2, Sr = Σ (yi - ŷi)^2, and SReg = Σ (ŷi - ȳ)^2.
The ratio SReg/Sr follows an F distribution when corrected with the degrees of freedom. If the regression is not meaningful, the ratio is small and St ≈ Sr.

51 Model assessment - ANOVA (test for lack of fit)
[Figure: geometric illustration of St, SReg, and Sr for a fitted line y = a0 + a1*x]

52 Model assessment - ANOVA (test for lack of fit)
ANOVA table:
Source of variation | Sum of squares | Degrees of freedom | Mean square       | F0
Regression          | SReg           | p                  | MSReg = SReg/p    | MSReg/MSE
Residual error      | Sr             | n - p              | MSE = Sr/(n - p)  |
Total               | St             | n - 1              |                   |
Compare F0 to the critical value F(p, n-p; α). What we are doing is a test of hypothesis. We are testing:
H0: b1 = b2 = ... = bp = 0
H1: at least one parameter is not equal to zero.
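A Python sketch of the regression F-test based on the breakdown St = SReg + Sr; the data are illustrative, and the residual degrees of freedom follow the usual convention for a straight-line model fitted with an intercept (an assumption here, since the table above leaves them partly implicit).

import numpy as np
from scipy import stats

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
y = np.array([1.2, 2.8, 5.1, 6.9, 9.2, 11.1, 12.8, 15.3])

# Straight-line fit: one non-constant term (p = 1) plus an intercept
a1, a0 = np.polyfit(x, y, 1)
y_hat = a0 + a1 * x

n, p = len(y), 1
St = np.sum((y - y.mean())**2)
Sr = np.sum((y - y_hat)**2)
SReg = np.sum((y_hat - y.mean())**2)      # St = SReg + Sr

MSReg = SReg / p
MSE = Sr / (n - p - 1)
F0 = MSReg / MSE
p_value = stats.f.sf(F0, p, n - p - 1)    # P(F > F0)
print(F0, p_value)   # a small p-value is evidence that the regression is meaningful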

53 [FYI] Meaning of a p-value in a hypothesis test
The p-value is a measure of how much evidence we have against the null hypothesis. The null hypothesis (H0) represents the hypothesis of no change or no effect. Much research involves making a hypothesis and then collecting data to test it; researchers then measure the consistency of the data with the null hypothesis.
A small p-value is evidence against the null hypothesis, while a large p-value means little or no evidence against it. Traditionally, researchers reject a null hypothesis if the p-value is less than 0.05 (α = 0.05).
The p-value can be interpreted as the probability of being wrong when you reject the null hypothesis.

54 Integer variables in the model
Integer variables 0 and 1 can represent qualitative variables. Example: raw material from Spain, India, or Vietnam:
y = a0 + a1*x1 + ... + ak*xk + r1*d1 + r2*d2 + r3*d3
d1 = 1, d2 = 0, d3 = 0 for Spain
d1 = 0, d2 = 1, d3 = 0 for India
d1 = 0, d2 = 0, d3 = 1 for Vietnam
Often called indicator variables for this reason.

55 Integer variables in the model - Example
We want to predict yield when two different impellers are used: yield = f(temperature, impeller type). Two options:
Build two different models (one for axial, one for radial): y = a0 + a1*T for each.
Build one model using an indicator variable: y = a0 + a1*T + r*di, with di = 0 for axial and di = 1 for radial.
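A Python sketch of the single-model option with an indicator variable d (0 = axial, 1 = radial); the temperature/yield numbers are made up for illustration.

import numpy as np

# Hypothetical yield data for the two impeller types
T = np.array([300.0, 310.0, 320.0, 330.0, 300.0, 310.0, 320.0, 330.0])   # temperature
d = np.array([0.0,   0.0,   0.0,   0.0,   1.0,   1.0,   1.0,   1.0  ])   # 0 = axial, 1 = radial
yield_ = np.array([51.0, 54.2, 57.1, 60.3, 55.8, 59.1, 62.2, 65.0])      # yield

# One model with an indicator variable: y = a0 + a1*T + r*d
Z = np.column_stack([np.ones_like(T), T, d])
coef, *_ = np.linalg.lstsq(Z, yield_, rcond=None)
a0, a1, r = coef
print(a0, a1, r)   # r estimates the yield offset of the radial impeller at the same temperature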

56 Leverage effect
Unusual observations influence the model parameters and our interpretation. Outliers have an over-proportional effect on the resulting regression curves. To avoid the leverage effect:
Remove outliers before regression (but do not delete them without investigation).
Use a different objective than Sr (no longer least squares).

57 Causal relation and correlation
Causal relation: a cause-and-effect relation; has physical/chemical/engineering meaning; x and y are not interchangeable; a direction exists.
Correlation: a (linear) relationship between two variables; no physical/chemical/engineering meaning is implied (e.g., average height of men in their 20s vs. year); x and y are interchangeable.

58 Advanced topics
Testing of least-squares models

59 Advanced topics - Testing of least-squares models

60 Advanced topics - Correlated x's
MLR solution: y = Z*a + e,  Sr = Σ ei^2 = e^T*e = (y - Z*a)^T (y - Z*a)
Setting ∂Sr/∂a = 0 gives Z^T*Z*a = Z^T*y and a = (Z^T*Z)^-1 * Z^T*y.
When two or more x's are correlated, Z^T*Z becomes nearly singular, i.e., ill-conditioned.

61 Advanced topics - Correlated x's
Compare Z^T*Z and (Z^T*Z)^-1 for high correlation vs. no correlation between x1 and x2. What if very small (measurement) noises are added to the x's? What will happen to your MLR model?

62 Advanced topics - Correlated x's
If there is high correlation among the x's: unstable solutions for a, and the predictions are also uncertain.
Geometrically speaking: [figure illustrating high correlation between x1 and x2]

63 Advanced topics - Correlated x's
Remedies?
Use selected x variables: stepwise regression
Use ridge regression
Use multivariate methods (will not be covered in this lecture)

64 Advanced topics
What we want to know: How do we select the form of the model? Which variables should be included? Should we include transformations of the regressor variables?
What we want: We would like to build the best regression model. We would like to include as many regressor variables as are necessary to adequately describe the behaviour of y; at the same time, we want to keep the model as simple as possible.
Stepwise regression: start off by choosing an equation containing the single best x variable, and then attempt to build up the model with subsequent additions of x's, one at a time, as long as these additions are worthwhile.

65 Advanced topics - stepwise regression
Procedure:
1. Add an x variable to the model (the variable that is most highly correlated with y).
2. Check whether or not this has significantly improved the model. One way is to see whether or not the confidence interval for the parameter includes zero (of course, you can also use a hypothesis test). If the new term or terms are not significant, remove them from the model.
3. Find one of the remaining x variables that is highly correlated with the residuals and repeat the procedure.

66 Advanced topics - Ridge regression (Hoerl, 1962; Hoerl & Kennard, 1970a,b)
A modified regression method specifically for ill-conditioned datasets that allows all variables to be kept in the model. This is possible by adding additional information to the problem to remove the ill-conditioning. The objective function to minimize is
(y - X*b)^T (y - X*b) + k*b^T*b
Least squares estimates have no bias but large variance, while ridge regression estimates have a small bias and small variance.

67 Advanced topics - ridge regression
Procedure:
1. Mean-center and scale all x's to unit variance: fi = (xi - x̄i)/sxi
2. Rewrite the model as
y - ȳ = a1*sx1*(x1 - x̄1)/sx1 + ... + ap*sxp*(xp - x̄p)/sxp = b1*f1 + ... + bp*fp,  i.e., Y = F*b + ε

68 Advanced topics - ridge regression
3. The objective function to use for the optimization is to minimize
J = (Y - F*b)^T (Y - F*b) + k*b^T*b
Therefore, b* = (F^T*F + k*I)^-1 * F^T*Y.
4. Solve the optimization problem in Step 3 for several values of k between 0 and 1 and choose the value of k at which the estimates of b seem to stabilize. Otherwise, choose k by validation on new data.
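A Python sketch of the ridge estimate b* = (F^T*F + k*I)^-1 * F^T*Y on mean-centered, scaled variables, scanning several k values as in step 4; the strongly correlated data are simulated for illustration.

import numpy as np

rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)            # x2 almost identical to x1: ill-conditioned
y = 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

# Mean-center and scale the x's to unit variance; center y
X = np.column_stack([x1, x2])
F = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
Y = y - y.mean()

for k in [0.0, 0.01, 0.1, 0.5, 1.0]:
    b = np.linalg.solve(F.T @ F + k * np.eye(F.shape[1]), F.T @ Y)
    print(k, b)   # watch where the estimates of b stabilize as k grows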

69 Advanced topics - Non-linear regression
General form of a non-linear regression model: y = f(x, a) + e. In a linear model, f(x, a) = x^T*a. In a non-linear model, f() could have any form, e.g., an exponential-type model such as y = b1*(1 - exp(-b2*x)).
Remember that a nonlinear transformation also changes the P.D.F. of the variables (and errors)? What does this mean?

70 Advanced topics - non-linear regression
The approach is exactly the same as for linear models. We use the same objective function:
Sr = ε^T*ε = Σ (yi - ŷi)^2 = Σ (yi - f(xi, a))^2   (sum over i = 1, ..., n)
All we need is to minimize Sr over a. But how?

71 Advanced topics - non-linear regression
The big difference between linear and nonlinear regression is that, in general, the optimization problem for a nonlinear model does not have an exact analytical solution. Therefore, we have to use a numerical optimization algorithm such as:
Gauss-Newton
Steepest descent
Conjugate gradients
Any other optimization algorithm
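A Python sketch of nonlinear least squares using scipy.optimize.curve_fit (a Levenberg-Marquardt-type routine), taking the exponential-type model y = b1*(1 - exp(-b2*x)) as an illustrative expectation function; the data are simulated from known parameters plus noise, and starting guesses must be supplied.

import numpy as np
from scipy.optimize import curve_fit

def model(x, b1, b2):
    # Expectation function: y = b1 * (1 - exp(-b2 * x))
    return b1 * (1.0 - np.exp(-b2 * x))

# Simulated data: known parameters plus random noise
rng = np.random.default_rng(1)
x = np.linspace(0.1, 10.0, 30)
y = model(x, 5.0, 0.8) + 0.1 * rng.normal(size=x.size)

# Numerical optimization needs starting guesses and may find only a local minimum
b_hat, b_cov = curve_fit(model, x, y, p0=[1.0, 1.0])
print(b_hat)   # estimated [b1, b2]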

72 Advanced topics - non-linear regression
When using an optimization algorithm to solve nonlinear regression problems, one needs to be able to specify:
1. an expectation function (i.e., the form of the model)
2. data
3. starting guesses for a
4. stopping criteria
5. possibly other tuning parameters associated with the optimization algorithm

73 Advanced topics - non-linear regression
Problems with numerical optimization:
Failure to converge
Finding only a local minimum and not the global minimum
Requires good starting guesses for the parameters
Can be sensitive to the choice of convergence criteria and other tuning parameters of the algorithm
Sometimes requires specification of the derivatives of the model with respect to the parameters
