1 Transformations

1.1 Introduction

Diagnostics can identify two possible areas of failure of assumptions when fitting linear models:

(i) lack of Normality;
(ii) heterogeneity of variances.

It is important to detect and correct a non-constant error variance. If this problem is not eliminated the least squares estimators will still be unbiased, but they will no longer have the minimum variance property. This means that the regression coefficient estimators will have larger standard errors than necessary. Both problems (i) and (ii) can be remedied by taking an appropriate transformation of the response variable Y. There may be a third reason for taking transformations, namely

(iii) to simplify the relationship between the response variable and the explanatory variables; fitting a polynomial of high degree should be avoided, if possible, since interpretation of such models is difficult, and transforming variables may simplify the relationship sufficiently to avoid the need for a high degree polynomial.

Transformations are also used

(iv) to turn non-linear models into linear models.

1.2 Ad-hoc methods

Consider the four curved relationships, shown below, between a response variable Y and an explanatory variable x. Notice that these curves have only one bend in them and that the bulge of the bend points either towards larger or towards smaller values of x (or Y). For example, in the first figure the bulge points towards larger values of x and smaller values of Y, whereas in the second figure the bulge points towards smaller values of both Y and x.

[Figure: the four one-bend relationships between Y and x, with the bulge of the bend pointing in different directions.]

A one-bend relationship can be made into a straight-line relationship by transforming either the Y-values or the x-values using a power transformation, i.e. by taking either Y* = Y^k or x* = x^k for some k. The value of k depends on the direction of the bulge. If the bulge points towards lower values of the variable you are transforming, take k < 1 for the relationship to be transformed into a straight line (the case k = 0 represents a logarithmic transformation). If the bulge points towards larger values of the variable you are transforming, take k > 1 to linearise the relationship. The exact value of k is determined by trial and error. It must be remembered, however, that transforming the response variable Y affects the Normality and homogeneity of variance assumptions. The usual values of k that are considered are

k = ..., -1, -3/4, -1/2, -1/4, 0, 1/4, 1/2, 3/4, 1.25, 1.5, ...
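This trial-and-error search can be automated by trying each candidate power on the explanatory variable and checking how close the transformed relationship is to a straight line, for example through the correlation between the transformed x and Y. The sketch below is only illustrative: the function name, the candidate grid and the simulated data are my own choices, not part of the notes.

import numpy as np

def best_power(x, y, ks=(-1, -0.75, -0.5, -0.25, 0, 0.25, 0.5, 0.75, 1.25, 1.5)):
    # Return the power k for which x^k (or log x when k = 0) is most nearly
    # linearly related to y, judged by the absolute correlation.
    def transform(x, k):
        return np.log(x) if k == 0 else x ** k
    scores = {k: abs(np.corrcoef(transform(x, k), y)[0, 1]) for k in ks}
    return max(scores, key=scores.get)

# Hypothetical one-bend data with Y roughly proportional to the square root of x
rng = np.random.default_rng(0)
x = rng.uniform(1, 10, size=50)
y = np.sqrt(x) + rng.normal(0, 0.05, size=50)
print(best_power(x, y))        # expected to pick a power near 0.5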

Two-bend relationships, an example of which is displayed below, cannot be made into a straight-line relationship by a power transformation. In toxicity studies the proportion Y of a fixed number of organisms surviving a toxic substance is related to the dose x of the toxin; this relationship is S-shaped, i.e. a two-bend relationship.

Common transformations that change such a two-bend relationship into a straight-line relationship are:

Logit transformation:     Y* = log( Y / (1 − Y) )
Arc-sine transformation:  Y* = sin^{−1}( √Y ) = arcsin( √Y )
Probit transformation:    Y* = Φ^{−1}(Y), where Φ(.) is the standard Normal distribution function.

Once again it must be remembered that these transformations of the response variable may disturb, or may not rectify, the Normality and variance homogeneity assumptions.

1.3 Intrinsically linear models

The following models, although at first sight they may not appear to be linear, can in fact be converted into linear models without much difficulty.

(a) Y_i = α x_i^β ε_i  becomes  Y*_i = α* + β x*_i + ε*_i  under the transformation Y* = log Y, x* = log x.

(b) Y_i = α e^{β x_i} ε_i  becomes  Y*_i = α* + β x_i + ε*_i  by a logarithmic transformation of Y.

(c) Y_i = x_i / (α + β x_i + ε_i)  becomes  Y*_i = β + α x*_i + ε*_i  under the transformation Y* = 1/Y, x* = 1/x.

(d) Y_i = α / (1 + γ e^{β x_i + ε_i})  becomes  Y*_i = log γ + β x_i + ε_i  under the transformation Y* = log( α/Y − 1 ), provided α is known.

Note that, for the transformed models to be useful, the least squares assumptions on the error distribution must apply to the errors ε* after the transformation.
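For illustration, the three two-bend transformations can be computed directly; the sketch below uses scipy for the probit and made-up proportions rather than data from the notes.

import numpy as np
from scipy.stats import norm

p = np.array([0.05, 0.20, 0.50, 0.80, 0.95])   # hypothetical proportions surviving

logit   = np.log(p / (1 - p))                  # Y* = log(Y / (1 - Y))
arcsine = np.arcsin(np.sqrt(p))                # Y* = arcsin(sqrt(Y))
probit  = norm.ppf(p)                          # Y* = Phi^{-1}(Y)

Each transformation stretches proportions near 0 and 1, so that an S-shaped dose-response curve becomes approximately linear in the dose.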

1.4 Analytic ways of determining transformations

Transformations in the explanatory variable to simplify relationships: the Box-Tidwell method

Suppose we want to find the power transformation such that

Y_i = α + β w_i + ε_i,   i = 1, 2, ..., n,

i.e.

Y_i = α + β x_i^λ + ε_i,   i = 1, 2, ..., n,

where

w = x^λ  if λ ≠ 0,   and   w = log x  if λ = 0.

The Box-Tidwell method of determining the appropriate value λ_0 of λ argues as follows. For the appropriate value λ_0 we have the linear relationship between Y and x^{λ_0}

Y_i = α + β x_i^{λ_0} + ε_i,   i = 1, 2, ..., n,

with residual sum of squares RSS(λ_0). However, for an inappropriate choice λ* the relationship between Y and x^{λ*} is not linear, and if we insist on fitting

Y_i = α + β x_i^{λ*} + ε_i,   i = 1, 2, ..., n,

the fit will be poor; in particular, the RSS(λ*) of such a fit will have

RSS(λ*) ≥ RSS(λ_0).

Hence the appropriate choice λ_0 satisfies

RSS(λ_0) = min_λ RSS(λ),

i.e. λ_0 is that value of λ which minimises the RSS(λ) of the model

Y_i = α + β x_i^λ + ε_i,   i = 1, 2, ..., n.

The minimisation is achieved using the following iterative procedure.

Step 1. Let λ^(1) be a first approximation to the required value λ_0 and let

w_i^(1) = x_i^{λ^(1)},   i = 1, 2, ..., n

(usually we take λ^(1) = 1, so that w_i^(1) = x_i, i.e. no transformation is applied at the first iteration of the procedure). Fit the model

Y_i = α + β w_i^(1) + ε_i,   i = 1, 2, ..., n,

and let β̂^(1) be the l.s.e. of β in the fitted model.

If, in the correct relationship

Y_i = α + β x_i^{λ_0} + ε_i,   i = 1, 2, ..., n,

we take the first order Taylor expansion of x^{λ_0} about λ = λ^(1), we get to a first order approximation (recall that d/dλ x^λ = x^λ log x)

Y_i = α + β [ x_i^{λ^(1)} + (λ_0 − λ^(1)) x_i^{λ^(1)} log x_i ] + ε_i        (1)
    = α + β w_i^(1) + β ((λ_0 − λ^(1)) / λ^(1)) x_i^{λ^(1)} log x_i^{λ^(1)} + ε_i        (2)
    = α + β w_i^(1) + γ w_i^(1) log w_i^(1) + ε_i        (3)

where

γ = β ( λ_0 / λ^(1) − 1 ).        (4)

Step 2. Fit the model in (3) by regressing Y_i on the two explanatory variables w_i^(1) and w_i^(1) log w_i^(1). Let γ̂^(1) be the l.s.e. of γ in fitting the model in (3).

Aside: Notice that if we have taken, as recommended, λ^(1) = 1, then testing at this stage the hypothesis

H_0: γ = 0   against   H_1: γ ≠ 0

is equivalent to testing the hypothesis

H_0: λ_0 = 1   against   H_1: λ_0 ≠ 1,

i.e. it is equivalent to testing the null hypothesis that no transformation of the explanatory variable is required to linearise the relationship between the response variable and the explanatory variable, against the alternative that some transformation is required.

Result: In order to test whether the relationship between the response variable and an explanatory variable x is linear, fit the model which contains both the explanatory variable x and the generated explanatory variable x log x, and test the hypothesis that the coefficient γ of the variable x log x in the fitted model is equal to zero. If there is statistical evidence that γ ≠ 0, that is an indication that the relationship between the response variable and x is not linear. If there is no statistical evidence that γ ≠ 0, then that is an indication that the relationship between the response variable and x is linear.

Returning to the procedure for determining the value of the power λ_0 in the power transformation that linearises the relationship between the response and explanatory variables, we are ready to take the next step.

Step 3. From equation (4) we see that λ_0 is approximately equal to

λ^(2) = λ^(1) ( γ̂^(1)/β̂^(1) + 1 ),

i.e. λ^(2) is a better approximation to λ_0 than λ^(1). Note that β̂^(1) is the l.s.e. of β in Step 1 and γ̂^(1) is the l.s.e. of γ in Step 2.

Step 4. Repeat Steps 1 to 3, so that at the end of the rth iteration you have the improved approximation to λ_0

λ^(r+1) = ( γ̂^(r)/β̂^(r) + 1 ) λ^(r).

The iterations stop when λ^(r+1) agrees with λ^(r) to within the required degree of accuracy. Usually the convergence is fairly fast. Note that at the start of the (r+1)th iteration we have

w_i^(r+1) = x_i^{λ^(r+1)} = x_i^{λ^(r) (γ̂^(r)/β̂^(r) + 1)} = [ w_i^(r) ]^{(γ̂^(r)/β̂^(r) + 1)}.

Remark: A power transformation of the explanatory variable can succeed in linearising the relationship between the response variable and the explanatory variable only if the original relationship is a one-bend relationship.

Remark: This iterative procedure can be used unaltered to determine the required power transformation of a particular regressor in a multiple regression model, in order to linearise the relationship between the response variable and that particular regressor.
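The iteration is straightforward to program. The following is a minimal sketch in Python using numpy only; the function name, the convergence tolerance and the assumption that x > 0 are mine, not part of the notes. Each pass performs the same two regressions that are fitted with Minitab in the worked example below.

import numpy as np

def box_tidwell(x, y, lam=1.0, tol=1e-3, max_iter=20):
    # Estimate lambda_0 such that Y = alpha + beta * x**lambda_0 + error is
    # approximately linear, by the iteration described above.
    # Assumes x > 0 and that the iterates stay away from lambda = 0.
    for _ in range(max_iter):
        w = x ** lam
        # Step 1: regress Y on w to obtain beta_hat
        X1 = np.column_stack([np.ones_like(w), w])
        beta_hat = np.linalg.lstsq(X1, y, rcond=None)[0][1]
        # Step 2: regress Y on w and w*log(w) to obtain gamma_hat
        X2 = np.column_stack([np.ones_like(w), w, w * np.log(w)])
        gamma_hat = np.linalg.lstsq(X2, y, rcond=None)[0][2]
        # Step 3: improved approximation to lambda_0
        lam_new = (gamma_hat / beta_hat + 1.0) * lam
        if abs(lam_new - lam) < tol:      # Step 4: stop when converged
            return lam_new
        lam = lam_new
    return lam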

Example: An engineer is investigating the use of windmills for power generation. Below are the data collected by the engineer on the DC output from a new design of windmill and the corresponding wind velocity. The data are also plotted in Figure 1.

[Table of the windmill data, with columns Row, DCoutput and WVelocity.]

The scatter plot suggests that the relationship between DC output (Y) and wind velocity (x) is not a straight line but, being a one-bend relationship, it may be linearised using a power transformation of the regressor x. Starting with the initial guess λ^(1) = 1, we fit a straight-line model between DC output (Y) and wind velocity (x), with the following results.

Regression Analysis: DCoutput versus WVelocity

The regression equation is  DCoutput = ... + ... WVelocity

S = ...   R-Sq = 87.4%   R-Sq(adj) = 86.9%

Defining xlogx = WVelocity × log(WVelocity) and fitting the model

E(Y) = α + β x + γ (xlogx)

we get the results

Regression Analysis: DCoutput versus WVelocity, xlogx

The regression equation is  DCoutput = ... + ... WVelocity + ... xlogx

S = ...   R-Sq = 97.4%   R-Sq(adj) = 97.1%

The improved estimate of λ_0 is therefore

λ^(2) = γ̂^(1)/β̂^(1) + 1 = 0.92.

To perform the second iteration we define the new regressor w = x^0.92 and fit a straight-line model between DC output Y and w. The results are

Regression Analysis: DCoutput versus w

The regression equation is  DCoutput = ... + ... w

S = ...   R-Sq = 98.1%   R-Sq(adj) = 98.0%

Now define the second regressor wlogw = w log(w) and fit the model

E(Y) = α + β w + γ (wlogw).

The results are

Regression Analysis: DCoutput versus w, wlogw

The regression equation is  DCoutput = ... + ... w + ... wlogw

S = ...   R-Sq = 98.1%   R-Sq(adj) = 97.9%

The second-iteration estimate of λ_0 is

λ^(3) = ( γ̂^(2)/β̂^(2) + 1 ) λ^(2) = 0.84.

To perform the third iteration we define the new regressor w = x^0.84 and fit a straight-line model between DC output Y and w. The results are

Regression Analysis: DCoutput versus w

The regression equation is  DCoutput = ... + ... w

S = ...   R-Sq = 98.1%   R-Sq(adj) = 98.0%

Now define the second regressor wlogw and fit the model

E(Y) = α + β w + γ (wlogw).

The results are

Regression Analysis: DCoutput versus w, wlogw

The regression equation is  DCoutput = ... + ... w + ... wlogw

S = ...   R-Sq = 98.1%   R-Sq(adj) = 97.9%

The third-iteration estimate of λ_0 is

λ^(4) = ( γ̂^(3)/β̂^(3) + 1 ) λ^(3) = 0.83.

To perform the fourth iteration we define the new regressor w = x^0.83 and fit a straight-line model between DC output Y and w. The results are

Regression Analysis: DCoutput versus w

The regression equation is  DCoutput = ... + ... w

S = ...   R-Sq = 98.1%   R-Sq(adj) = 98.0%

Now define the second regressor wlogw and fit the model

E(Y) = α + β w + γ (wlogw).

The results are

Regression Analysis: DCoutput versus w, wlogw

The regression equation is  DCoutput = ... + ... w + ... wlogw

S = ...   R-Sq = 98.1%   R-Sq(adj) = 97.9%

The fourth-iteration estimate of λ_0 is

λ^(5) = ( γ̂^(4)/β̂^(4) + 1 ) λ^(4) = 0.83,

which, to two decimal places, is the same as λ^(4), the estimate from the previous iteration. The iterative procedure therefore terminates and λ_0 is taken as 0.83. Thus the relationship between E(Y) and w = x^0.83 is linear. The plot below confirms this.

[Figure: DC output plotted against w = x^0.83, showing an approximately linear relationship.]

1.5 Variance Stabilizing Transformations

1. Suppose that the variance σ² of the observations Y depends on the mean value of Y, i.e.

σ² = g(µ),  where µ = E(Y)

and g is a known function. We require a transformation Y* = T(Y) so that the variance of the transformed data Y* is constant, i.e. Var(Y*) = τ². As a first order approximation we have, through a Taylor series expansion about µ,

Y* = T(Y) ≈ T(µ) + (Y − µ) T′(µ),        (5)

where T′ denotes the first derivative. Thus

τ² = Var(Y*) = Var(Y) [T′(µ)]²,        (6)

i.e. τ² = g(µ) [T′(µ)]², or

T′(µ) = τ / [g(µ)]^{1/2}.

Hence

T(µ) ∝ ∫ dµ / [g(µ)]^{1/2}.        (7)

Examples

(a) If Y is Poisson with mean µ then σ² = µ, i.e. g(µ) = µ, and

T(y) ∝ ∫ y^{−1/2} dy ∝ √y,

i.e. if Y is Poisson the square root transformation will stabilize the variance.

(b) If nY is Binomial(n, µ) distributed then E(Y) = nµ/n = µ and

Var(Y) = (1/n²) nµ(1 − µ) = µ(1 − µ)/n = g(µ).

The variance stabilizing transformation is therefore

T(y) ∝ ∫ [y(1 − y)]^{−1/2} dy ∝ arcsin(√y).

(c) If g(µ) = µ^{2k}, for some k, then the variance stabilizing transformation is

T(y) ∝ ∫ y^{−k} dy ∝ y^{1−k},

i.e. a power transformation. The case k = 1 corresponds to the logarithmic transformation, i.e. T(y) = log y.

Warning: In examples (a) and (b), even though the transformations may turn the transformed values into Normal values, nonetheless, before rushing to take the suggested variance stabilizing transformation and regressing the transformed variable on a number of explanatory variables, consider first the possibility of using generalised linear regression models (covered in the second semester).
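A quick simulation illustrates (a) and (b). This is only an illustrative sketch; the means, the group size n and the number of replications are arbitrary choices.

import numpy as np

rng = np.random.default_rng(1)

# (a) Poisson: Var(Y) grows with the mean, while Var(sqrt(Y)) settles near 1/4
for mu in [2, 5, 10, 20, 50]:
    y = rng.poisson(mu, 100_000)
    print(mu, y.var().round(2), np.sqrt(y).var().round(3))

# (b) Binomial proportions: Var(arcsin(sqrt(Y))) settles near 1/(4n)
n = 50
for p in [0.1, 0.3, 0.5, 0.7, 0.9]:
    y = rng.binomial(n, p, 100_000) / n
    print(p, y.var().round(4), np.arcsin(np.sqrt(y)).var().round(4))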

2. Suppose that on inspecting the residual plots you find evidence that the variance of the observations depends on the size of one of the explanatory variables, say x_j, in the fitted model, i.e.

σ² = v(x_j) τ²,        (8)

where x_j is the jth explanatory variable in the fitted model. If the functional form v(.) is known, or can be guessed, then one can stabilize the variance of the observations by transforming all variables, response as well as explanatory variables, as follows:

Y*_i = Y_i / √v(x_ij),   i = 1, 2, ..., n,

and

x*_ir = x_ir / √v(x_ij),   i = 1, 2, ..., n;  r = 1, 2, ..., p.

Thus if the original model was

Y_i = β_0 + Σ_{r=1}^{p} β_r x_ir + ε_i,   i = 1, 2, ..., n,

then dividing throughout by √v(x_ij) we get

Y*_i = β_0 x*_i0 + Σ_{r=1}^{p} β_r x*_ir + ε*_i,   i = 1, 2, ..., n,

where x*_i0 = 1/√v(x_ij) and ε*_i = ε_i/√v(x_ij), so that

Var(ε*_i) = Var(ε_i) / v(x_ij) = τ².

Thus under the transformed model the observations have constant variance.
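In matrix terms this amounts to rescaling each case by 1/√v(x_ij) before fitting by ordinary least squares; it is the same as weighted least squares with weights 1/v(x_ij). A minimal sketch, assuming for illustration the guessed form v(x) = x (standard deviation growing like √x):

import numpy as np

def fit_with_stabilized_variance(y, X, xj, v=lambda x: x):
    # Divide the response and every column of the design matrix, including
    # the intercept column, by sqrt(v(x_j)), then fit by least squares.
    w = np.sqrt(v(xj))
    X0 = np.column_stack([np.ones(len(y)), X])   # prepend the intercept column
    y_star = y / w
    X_star = X0 / w[:, None]
    # The errors of the transformed model have (approximately) constant variance.
    return np.linalg.lstsq(X_star, y_star, rcond=None)[0]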

However, usually v(.) is not known. In the case when v(.) is monotonic, an approximation to it can be obtained as follows. Suppose that the plot of the residuals ê_i (or standardised residuals r_i) against the values x_ij of the jth explanatory variable x_j indicates that the variance of the observations either increases or decreases with the x_j values. (If the evidence is not compelling, or is ambiguous, it can be confirmed statistically with the following test. Order the observations/cases in the data set according to the size of the values of x_j. Remove the middle fifth of the ordered cases and keep the remaining cases as two distinct groups: the first two-fifths of the ordered cases in one group and the last two-fifths of the ordered cases in the second group. For each group of observations fit the model separately and calculate the residual mean squares RMS_1 and RMS_2 for the two groups. Clearly, if the observations are independent then so are RMS_1 and RMS_2 and, further, if the observations have constant variance σ² then both RMS_1 and RMS_2 estimate σ² and

F = RMS_1 / RMS_2 ~ F_{n_1 − k, n_2 − k},

where

n_1 = number of cases in the first group,
n_2 = number of cases in the second group,
k = number of parameters in the fitted model

(a proof of this will be provided in the Linear Models lectures). If, however, the variance of the observations is either increasing or decreasing with the values of x_j, then RMS_1/RMS_2 will be either too small or too large respectively. An F-test will therefore test the hypothesis of constant variance against the alternative of either increasing or decreasing variance.) If the test is significant [i.e. if F > F_{n_1 − k, n_2 − k; α/2} or F < F_{n_1 − k, n_2 − k; 1 − α/2}, where F_{n_1 − k, n_2 − k; γ} denotes the 100(1 − γ) percentile of the F-distribution and α is the level of significance you are working at], fit each of the five models

(i) ê_i = α_1 + α_2 x_ij + ε_i,
(ii) ê_i = α_1 + α_2 √x_ij + ε_i,
(iii) ê_i = α_1 + α_2 (1/x_ij) + ε_i,
(iv) ê_i = α_1 + α_2 (1/√x_ij) + ε_i,
(v) ê_i = α_1 + α_2 log x_ij + ε_i,

and test for the significance of α_1 and α_2 in each model. Choose the function v(.) to be the functional form in x_ij in the model, amongst the above five, for which α_1 is NOT significant and α_2 is significant. If none of the above models satisfies this condition, then choose the function v(.) to be the form on the right-hand side (less ε_i) of the model for which both α_1 and α_2 are significant.
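The grouping test described above can be coded directly. The sketch below refits the model in the lower and upper groups; the function name and the default significance level are my own choices.

import numpy as np
from scipy import stats

def grouped_variance_test(X, y, xj, alpha=0.05):
    # Order the cases by x_j, drop the middle fifth, refit the model in the
    # lower and upper two-fifths and compare residual mean squares by an F-test.
    n, k = X.shape                        # X includes the intercept column
    order = np.argsort(xj)
    m = 2 * n // 5
    rms = []
    for g in (order[:m], order[-m:]):
        beta = np.linalg.lstsq(X[g], y[g], rcond=None)[0]
        resid = y[g] - X[g] @ beta
        rms.append(resid @ resid / (len(g) - k))
    f = rms[0] / rms[1]
    lower = stats.f.ppf(alpha / 2, m - k, m - k)
    upper = stats.f.ppf(1 - alpha / 2, m - k, m - k)
    return f, not (lower <= f <= upper)   # statistic and whether it is significant

If the test is significant, one then fits the five residual models above and chooses v(.) accordingly.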

1.6 Transformations to improve Normality and stabilise the variance - the Box-Cox method

When fitting a linear model, if a probability plot and a residual plot indicate that the observations are not Normally distributed with constant variance, a transformation of the response variable may improve Normality and stabilise the variance. The Box-Cox method suggests the power transformation

Y^(λ) = (Y^λ − 1) / (λ c^{λ−1})   if λ ≠ 0,
Y^(λ) = c log Y                   if λ = 0,        (9)

where λ has to be decided by the experimenter so that the Y_i^(λ)'s are Normally distributed with constant variance σ_λ² and

E(Y_i^(λ)) = x_i^T θ_λ,   i = 1, 2, ..., n,

where x_i is the vector of covariate values associated with the ith case and θ_λ is the vector of their unknown coefficients.

To see how λ is chosen, note that if y = (y_1, y_2, ..., y_n)^T are the observed values of Y = (Y_1, Y_2, ..., Y_n)^T and y^(λ) = (y_1^(λ), y_2^(λ), ..., y_n^(λ))^T are the values of Y^(λ) = (Y_1^(λ), Y_2^(λ), ..., Y_n^(λ))^T corresponding to y, the likelihood of the values y is expressed in terms of the likelihood of the values y^(λ) as follows:

L(λ, θ_λ, σ_λ²) = f_Y(y) = f_{Y^(λ)}(y^(λ)) Π_{i=1}^{n} | ∂y_i^(λ)/∂y_i |
  = (2πσ_λ²)^{−n/2} exp( −(1/(2σ_λ²)) Σ_{i=1}^{n} (y_i^(λ) − x_i^T θ_λ)² ) Π_{i=1}^{n} (y_i/c)^{λ−1}
  = (2πσ_λ²)^{−n/2} exp( −S_λ/(2σ_λ²) ) (ẏ/c)^{n(λ−1)},

where ẏ = ( Π_{i=1}^{n} y_i )^{1/n} is the geometric mean of the y_i's and

S_λ = Σ_{i=1}^{n} (y_i^(λ) − x_i^T θ_λ)² = (y^(λ) − Xθ_λ)^T (y^(λ) − Xθ_λ)        (10)

is the sum of squares of the deviations of the model fitted to the transformed y_i^(λ)'s. Here X is the design matrix with x_i^T as its ith row. Choose the arbitrary constant c to be c = ẏ, so that the likelihood L(λ, θ_λ, σ_λ²) reduces to

L(λ, θ_λ, σ_λ²) = (2πσ_λ²)^{−n/2} exp( −S_λ/(2σ_λ²) )        (11)

and the log-likelihood to

l(λ, θ_λ, σ_λ²) = log L(λ, θ_λ, σ_λ²) = −(n/2) log σ_λ² − S_λ/(2σ_λ²) + constant.        (12)

This needs to be maximised with respect to θ_λ, σ_λ² and λ to make the observed data values y as likely as possible. Maximising first with respect to θ_λ, we see from (12) that this is equivalent to minimising S_λ = (y^(λ) − Xθ_λ)^T (y^(λ) − Xθ_λ) with respect to θ_λ. Thus the maximising value θ̂_λ is the l.s.e. of θ_λ in the linear model y^(λ) = Xθ_λ + ε, and the minimised value of S_λ is

min S_λ = (y^(λ) − Xθ̂_λ)^T (y^(λ) − Xθ̂_λ) = RSS(λ).        (13)

Following this maximisation with respect to θ_λ, the log-likelihood is now

l(λ, θ̂_λ, σ_λ²) = −(n/2) log σ_λ² − RSS(λ)/(2σ_λ²) + constant.        (14)

Now, maximising this with respect to σ_λ², we get the maximising value σ̃_λ² of σ_λ² as

σ̃_λ² = (1/n) RSS(λ) = ((n − 1)/n) σ̂_λ²,        (15)

where σ̂_λ² is the l.s.e. of σ_λ² in the linear model y^(λ) = Xθ_λ + ε. Following this maximisation with respect to σ_λ², the log-likelihood is now

l(λ, θ̂_λ, σ̃_λ²) = −(n/2) log RSS(λ) + κ = −(n/2) log σ̂_λ² + κ′,        (16)

where κ and κ′ are constants. Finally, as can be seen from (16), the maximising value λ̃ of λ minimises RSS(λ), or equivalently the l.s.e. σ̂_λ² of σ_λ², the common variance of the transformed values. In practice λ̃ is usually determined by calculating RSS(λ) (or σ̂_λ²) for different values of λ in the range −2 to 2, plotting RSS(λ) (or σ̂_λ²) against λ, and reading off the plot the point λ̃ at which the minimum is attained.

The Box-Cox procedure therefore involves three steps:

1. Transform the data y_1, y_2, ..., y_n by taking

y_i^(λ) = (y_i^λ − 1) / (λ ẏ^{λ−1})   if λ ≠ 0,
y_i^(λ) = ẏ log y_i                   if λ = 0,        (17)

for i = 1, 2, ..., n.

2. Fit the model y^(λ) = Xθ_λ + ε to the transformed data.

3. Calculate the RSS(λ) of the fitted model in (2).

Do these three steps for different values of λ in the range −2 to 2, plot RSS(λ) against λ, and read off the plot the minimising value λ̃.
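These three steps are easy to code. A minimal sketch (numpy only; the grid of λ values and the function name are my own choices, and X is assumed to contain an intercept column):

import numpy as np

def boxcox_rss(y, X, lambdas=np.linspace(-2, 2, 81)):
    # For each lambda, transform y as in (17), fit the linear model by least
    # squares and record RSS(lambda).  Requires y > 0.
    gm = np.exp(np.mean(np.log(y)))               # geometric mean, y-dot
    rss = []
    for lam in lambdas:
        if abs(lam) < 1e-12:
            y_lam = gm * np.log(y)
        else:
            y_lam = (y ** lam - 1) / (lam * gm ** (lam - 1))
        resid = y_lam - X @ np.linalg.lstsq(X, y_lam, rcond=None)[0]
        rss.append(resid @ resid)
    return lambdas, np.array(rss)

# lambdas, rss = boxcox_rss(y, X)
# lam_tilde = lambdas[np.argmin(rss)]             # the minimising value of lambda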

Confidence intervals for the appropriate power λ can be constructed as follows. Suppose we wished to test the hypothesis

H_0: appropriate λ = λ_0,

where λ_0 is a given value, against the alternative

H_1: appropriate λ ≠ λ_0.

When the null hypothesis is true, the best chance of getting the observed values y_1, y_2, ..., y_n is given by the maximised likelihood L(λ_0, θ̂_{λ_0}, σ̂²_{λ_0}), whose log-likelihood is given in (16), with θ̂_{λ_0} and σ̂²_{λ_0} the l.s.e.s respectively of θ_{λ_0} and σ²_{λ_0} in the linear model y^(λ_0) = Xθ_{λ_0} + ε. When indeed the null hypothesis H_0 is true, the log-likelihood l(λ_0, θ̂_{λ_0}, σ̂²_{λ_0}) will be close to l(λ̃, θ̂_{λ̃}, σ̂²_{λ̃}) and the difference

l(λ̃, θ̂_{λ̃}, σ̂²_{λ̃}) − l(λ_0, θ̂_{λ_0}, σ̂²_{λ_0}) = (n/2) [ log RSS(λ_0) − log RSS(λ̃) ]

will be small. Conversely, when this difference is small, that is an indication that the null hypothesis is acceptable. It can, in fact, be shown that

2 [ l(λ̃, θ̂_{λ̃}, σ̂²_{λ̃}) − l(λ_0, θ̂_{λ_0}, σ̂²_{λ_0}) ] = n [ log RSS(λ_0) − log RSS(λ̃) ]        (18)

is chi-squared distributed with 1 degree of freedom when H_0 is true. Thus one should accept the null hypothesis H_0 at the 5% level of significance if

n [ log RSS(λ_0) − log RSS(λ̃) ] ≤ χ²_{1;0.05},        (19)

or equivalently if

RSS(λ_0) ≤ RSS(λ̃) e^{χ²_{1;0.05}/n}.        (20)

But when n is large, e^{χ²_{1;0.05}/n} ≈ 1 + χ²_{1;0.05}/n. Thus one should accept the null hypothesis at the 5% level of significance if

RSS(λ_0) ≤ RSS(λ̃) ( 1 + χ²_{1;0.05}/n ).        (21)

We can therefore infer from the above that any λ_0 for which (21) is satisfied is an acceptable power, at the 5% level of significance, for a power transformation of the data. This is equivalent to saying that the set of values

C = { λ_0 : RSS(λ_0) ≤ RSS(λ̃) ( 1 + χ²_{1;0.05}/n ) }

is a 95% confidence interval

for λ. As can be seen from a Box-Cox plot, once λ̃ is identified and RSS(λ̃) is read from the vertical axis, a horizontal line at height RSS(λ̃)(1 + χ²_{1;0.05}/n) can be drawn on the plot. At the two points where this horizontal line intersects the curve of RSS(λ), two vertical lines are dropped onto the horizontal axis. The interval between these two vertical lines identifies the confidence interval C = { λ_0 : RSS(λ_0) ≤ RSS(λ̃)(1 + χ²_{1;0.05}/n) }. Clearly, if this confidence interval includes the value 1, then it would be advisable not to transform the data. On the other hand, if C does not include 1, then any value in it which is not awkward, makes the interpretation of the transformation easy, and is close to λ̃, can be used for the power transformation to Normalize the data and stabilize their variance.
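Continuing the boxcox_rss sketch above, the confidence set can also be read off numerically rather than from the plot. Note that χ²_{1;0.05} in the notes is the upper 5% point of the chi-squared distribution, i.e. its 95th percentile; the function name below is my own.

import numpy as np
from scipy import stats

def boxcox_confidence_set(lambdas, rss, n, level=0.95):
    # Approximate confidence set for lambda using the cut-off of (21):
    # all lambda_0 with RSS(lambda_0) <= RSS(lam_tilde) * (1 + chi2 / n).
    cutoff = rss.min() * (1 + stats.chi2.ppf(level, df=1) / n)
    keep = lambdas[rss <= cutoff]
    return keep.min(), keep.max()

# If the resulting interval contains 1, it is advisable not to transform the data.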

Comment: The Box-Cox transformation may not always achieve its aims. In fact this transformation is trying to achieve three things simultaneously:

1. fit the model E(Y^(λ)) = Xθ_λ;
2. stabilize the variance of the transformed data;
3. achieve Normality of the transformed response variable.

If the model E(Y^(λ)) = Xθ_λ that is fitted is too restrictive, then in trying to fit such a restrictive model the transformation may fail to achieve its latter two aims. It is therefore advisable to make the model that is fitted as flexible as possible to begin with; e.g. do not insist on a linear dependence on a given explanatory variable, but allow a quadratic term in this explanatory variable in the model. If there is more than one explanatory variable in the model, it may be helpful to allow an interaction term (to be discussed at length later on in the course) between two of the variables in your model.
