Parametrisations, splines


Analysis of variance and regression course
Marc Andersen, mja@statgroup.dk
Analysis of variance and regression for health researchers, December, 20

Outline
- Parameters
  - Definition
  - Re-parametrisation
  - Comparison of parametrisations
- Contrasts
- Model comparison
- Linear splines
- Dose-response models

Acknowledgements
- written by Lene Theil Skovgaard (1), 2006, 2007
- updated by Julie Lyng Forman (1), 2008, November 2009
- updated by Marc Andersen (2), April 2009, April 200, November 200, April 20, November 20
(1) Dept. of Biostatistics, (2) StatGroup

Parameter definition
Parameter: an unknown quantity that we want to estimate (to estimate means to provide a good guess), e.g.
- the decrease in blood pressure following treatment
- the difference in decrease for treatment and placebo
- the increase in insulin growth factor (IGF-1) with age
(Nuisance) parameter: needed for technical reasons to complete the model, e.g. the residual variance in ANOVA; part of a statistical model
Parametrisation: the choice of which parameters are to enter the model description
Re-parametrisation: a shift to a new set of parameters

Why choose a specific parametrisation?
- Ease: the program has some default parametrisations
- Test of specific hypotheses
  - difference between treatment and placebo
  - difference in height for boys and girls at the age of 4
- Estimation of specific quantities
  - the potency of a drug, ED50 or ED90 (explained later)

Example of re-parametrisations
Change of scale/units
- Do we measure height in cm or m?
- Take the relation of lung capacity versus height: $\text{fev} = \alpha + \beta \cdot \text{height}$
- If we change from measuring height in cm to m, the regression coefficient (the parameter) changes from $\beta$ to $\beta' = 100\,\beta$
Change of origin/intercept
- choice of another reference group in ANOVA
- subtracting e.g. 70 cm from all height measurements
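The change of units can be checked numerically. The following is a minimal sketch in Python rather than SAS, chosen only so that it runs without any statistical package. The helper `fit_line` and the data values are made up for illustration and are not part of the course material.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical data (not from the course): height in cm, FEV in litres.
height_cm = [150.0, 160.0, 165.0, 172.0, 180.0]
fev = [2.6, 3.1, 3.3, 3.9, 4.2]

# Same model, two parametrisations: height in cm and height in m.
a_cm, b_cm = fit_line(height_cm, fev)
height_m = [h / 100 for h in height_cm]
a_m, b_m = fit_line(height_m, fev)

fitted_cm = [a_cm + b_cm * h for h in height_cm]
fitted_m = [a_m + b_m * h for h in height_m]
```

Under these assumptions the slope in metres is 100 times the slope in centimetres, while the intercept and the fitted values are unchanged.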

Re-parametrisation
Re-parametrisations do not change the model as such!
- same fitted values
- same normal regions and prediction limits
- but a possibility for interpretations of specific interest

Other re-parametrisations
Re-parametrisation is applied in more advanced situations (beyond linearity): non-linear regression, logistic regression, correlated observations.
Reasons for re-parametrisation
- knowledge of distributional assumptions: some parameter estimates may be closer to the normal distribution than others
- we like to be able to construct symmetric confidence intervals, using the standard error
- avoid a negative lower confidence limit for a positive parameter
Linear models
- In linear models the estimates have exact normal distributions (provided the model assumptions are correct, of course...)

Example: Rheumatoid Arthritis
Population: 4 patients with Rheumatoid Arthritis
Treatment: randomised to one out of 6 possible treatments:
- Placebo
- Aspirin (20 mg)
- One of 4 doses of an active anti-inflammatory drug, X (dose: 10 mg, 15 mg, 20 mg, 25 mg)
Outcome: an index summing up the effectiveness of the treatment (decrease in various symptoms), the higher the better

Outcome: index-values
Reference: Woolson, R.F. & Clarke, W.R.: Statistical methods for the analysis of biomedical data. 2nd ed., Wiley (Exercise 0.4, page 409)

Representation of RA data in SAS
Listing of the data set with one row per patient and the columns Obs, group, type, dose, index. The variable type is placebo for the placebo group and active for the X-dose groups (e.g. group x20).

RA data: 4 active X-groups only

Two different parametrisations in One-way ANOVA
Reference level supplemented with differences to reference level
- One level for the reference group (in SAS by default the last, numerically or alphabetically), supplemented with differences in levels from this reference group to each of the remaining groups
- good for describing differences between groups
- Model: $Y_{gi} = \alpha + \beta_g + \varepsilon_{gi}$, $\beta_{\text{last}} = 0$
One level for each group
- good for describing the individual groups
- Model: $Y_{gi} = \mu_g + \varepsilon_{gi}$

Traditional One-way ANOVA in SAS
    proc glm data=drug;
      where type='active';
      class group;
      model index=group / solution;
    run;
which yields the estimates:
    Parameter    Estimate   Standard Error   t Value   Pr > |t|
    Intercept    B                                     <.0001
    group x10    B                                     <.0001
    group x15    B                                     <.0001
    group x20    B                                     <.0001
    group x25    B          .                .         .
NOTE: The X'X matrix has been found to be singular...

Test that group levels are the same
The GLM Procedure, Dependent Variable: index
    Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
    Model                                                            <.0001
    Error
    Corrected Total
    R-Square   Coeff Var   Root MSE   index Mean
    Source   DF   Type I SS     Mean Square   F Value   Pr > F
    group                                               <.0001
    Source   DF   Type III SS   Mean Square   F Value   Pr > F
    group                                               <.0001
The test corresponds to all differences being equal to zero.

Details on the ANOVA-parametrisation in SAS
Model written as a multiple regression: $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$
The x's are so-called dummy variables or design variables:
- $x_1$ is 1 if subject i belongs to the first group, and 0 otherwise
- $x_2$ is 1 if subject i belongs to the second group, and 0 otherwise
- $x_3$ is 1 if subject i belongs to the third group, and 0 otherwise
With this parametrisation,
- $\beta_0$ corresponds to the level for the last group (the reference group, here group 4)
- $\beta_1$ is the difference in level between group 1 and group 4
- $\beta_2$ is the difference in level between group 2 and group 4, ...

Parametrised with one level for each group
No intercept, thus no reference group: each group level is estimated.
    proc glm data=drug;
      where type='active';
      class group;
      model index=group / noint solution;
    run;
    Parameter    Estimate   Standard Error   t Value   Pr > |t|
    group x10                                          <.0001
    group x15                                          <.0001
    group x20                                          <.0001
    group x25                                          <.0001

Parametrised with one level for each group, continued
All the tests now refer to the hypothesis of a zero level (which is not interesting). Please note that DF of group equals 4!
Dependent Variable: index
    Source              DF   Sum of Squares   Mean Square   F Value   Pr > F
    Model                                                              <.0001
    Error
    Uncorrected Total
    R-Square   Coeff Var   Root MSE   index Mean
    Source   DF   Type I SS     Mean Square   F Value   Pr > F
    group                                               <.0001
    Source   DF   Type III SS   Mean Square   F Value   Pr > F
    group                                               <.0001

Summary: Two parametrisations in One-way ANOVA
One level, $\beta_0 = \mu_4$, for the reference group, supplemented with differences in level from the reference group to each of the remaining groups, $\beta_1$, $\beta_2$, and $\beta_3$:
$\beta_g = \mu_g - \mu_4$, so that $\beta_g = 0$ if and only if $\mu_g = \mu_4$
- good for testing of identity and certain pairwise comparisons
One level, $\mu_g$, for each group
- good for estimation, not suited for testing!
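That the two parametrisations describe the same model can be illustrated with a small sketch, written in Python rather than SAS so that it runs on its own. It relies on the fact that the least squares estimates in a one-way ANOVA are the group means. The function names and the data are hypothetical and not taken from the course data set.

```python
def group_means(groups):
    """Cell-means parametrisation: one level (mean) per group."""
    return {name: sum(values) / len(values) for name, values in groups.items()}


def reference_parametrisation(groups, reference):
    """Reference level plus differences from the reference level."""
    means = group_means(groups)
    intercept = means[reference]
    differences = {name: mean - intercept for name, mean in means.items()}
    return intercept, differences


# Hypothetical index values for four dose groups (not the course data).
groups = {
    "x10": [10.0, 12.0, 11.0],
    "x15": [20.0, 24.0, 22.0],
    "x20": [33.0, 35.0, 37.0],
    "x25": [41.0, 47.0, 44.0],
}

means = group_means(groups)
# The last group is used as reference, mimicking the SAS default.
intercept, differences = reference_parametrisation(groups, reference="x25")

# Both parametrisations give the same fitted value for every group.
fitted = {name: intercept + differences[name] for name in groups}
```

Only the meaning of the individual parameters differs: `differences` answers "how far is each group from the reference?", while `means` answers "what is the level in each group?".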

Contrasts
If we want to compare dose 10 with dose 15:
Test the hypothesis $H: \mu_1 = \mu_2$, or equivalently $H: \mu_2 - \mu_1 = 0$,
i.e. test that a contrast equals zero, namely
$\mu_2 - \mu_1 = -1 \cdot \mu_1 + 1 \cdot \mu_2 + 0 \cdot \mu_3 + 0 \cdot \mu_4$
The coefficients of the contrast are -1, 1, 0, and 0.

Estimate statements in GLM
To compute the contrast and test that it is equal to zero:
    proc glm data=drug;
      where type='active';
      class group;
      model index=group / noint solution;
      estimate 'dose 15 mg vs. dose 10 mg' group -1 1 0 0;
    run;
from which we get the additional output
    Parameter                    Estimate   Standard Error   t Value   Pr > |t|
    dose 15 mg vs. dose 10 mg                                          <.0001
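What such an estimate amounts to can be sketched by hand. The estimate is the weighted sum of group means, and its standard error uses the pooled residual variance from the one-way ANOVA. The sketch below is in Python rather than SAS, and both the function `contrast_estimate` and the data are hypothetical illustrations, not the course data.

```python
import math


def contrast_estimate(groups, coefficients):
    """Estimate sum(c_g * mean_g), its standard error and t statistic,
    using the pooled residual variance from the one-way ANOVA."""
    means = [sum(v) / len(v) for v in groups]
    n_total = sum(len(v) for v in groups)
    # Residual sum of squares around the group means.
    sse = sum((y - m) ** 2 for v, m in zip(groups, means) for y in v)
    mse = sse / (n_total - len(groups))
    estimate = sum(c * m for c, m in zip(coefficients, means))
    variance = mse * sum(c ** 2 / len(v) for c, v in zip(coefficients, groups))
    se = math.sqrt(variance)
    return estimate, se, estimate / se


# Hypothetical index values for four dose groups, in increasing dose order.
groups = [
    [10.0, 12.0, 11.0],
    [20.0, 24.0, 22.0],
    [33.0, 35.0, 37.0],
    [41.0, 47.0, 44.0],
]

# Second group versus first group: coefficients -1, 1, 0, 0.
estimate, se, t_value = contrast_estimate(groups, [-1, 1, 0, 0])
```

The t statistic would be compared with a t distribution on the residual degrees of freedom (here 12 - 4 = 8).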

Checking for variance heterogeneity
Note: We have disregarded the problem of variance heterogeneity:
    proc glm data=drug;
      where type='active';
      class group;
      model index=group / noint solution;
      means group / hovtest=levene;
    run;
Levene's Test for Homogeneity of index Variance
ANOVA of Squared Deviations from Group Means
    Source   DF   Sum of Squares   Mean Square   F Value   Pr > F
    group
    Error
A clear indication that the variance increases with dose.
In practice: Deal with this! (Welch's test, log-transform...)

RA Example continued: 4 X-doses with linear regression line
Can we use a simple model with linear dose effect?

Linear regression
We look at the simple linear regression model: $\text{index} = \alpha + \beta \cdot \text{dose} + \varepsilon$
$\alpha$ is the intercept and $\beta$ is the slope of the line; $\varepsilon$ is the vertical distance from observation to line (the residual).
SAS program statements:
    proc reg data=drug;
      where type='active';
      model index=dose / clb;
    run;

Linear regression results
    Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
    Model                                                            <.0001
    Error
    Corrected Total
Parameter Estimates
    Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
    Intercept                                                        <.0001
    dose                                                             <.0001
Conclusion: index increases approx. 2.8 points with each unit of dose in the range 10 to 25, if the model is correct...

Continued analysis
Is this a reasonable description? Can we test the linearity?
E.g. is this model almost as good as
- a quadratic model
- the ANOVA model
(Still ignoring differences in variances)

Quadratic fit: include new variable dose2 = dose²
Parameter Estimates
    Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
    Intercept
    dose                                                             <.0001
    dose2

Model reduction: group means to linear
F test for comparison of explained variation
We have to compare two models:
- Model 1: the ANOVA model with 4 separate groups
- Model 2: the linear regression model with actual dose as covariate
Note: The models have to be nested, i.e. one is derived from the other, typically by demanding some parameters to be equal to each other or equal to zero.
Indeed, the linear regression model is a special case of the ANOVA model, in which certain contrasts are equal.

Model comparison: Sums of squares
Model: Sum of Squares(Model) $= \sum_i (\hat{y}_i - \bar{y})^2$
Explained variation: How much do the predicted values vary? (the bigger the better)
Residuals: Sum of Squares(Residual) $= \sum_i (y_i - \hat{y}_i)^2$
Unexplained variation (residual variation): How large are the deviations from the model? (the smaller the better)
Total: Sum of Squares(Total) $= \sum_i (y_i - \bar{y})^2$
Total variation from the overall mean (this is fixed). The Model and Residual variation partition the total variation.
Notation: $y_i$: observed, $\hat{y}_i$: predicted, $y_i - \hat{y}_i$: residual, $\bar{y}$: overall mean.

Comparing models
How? Look at changes in the explained variation, i.e. the Sum of Squares for the model (where source is not error):
$\Delta\text{Sum Sq} = \text{Sum Sq(Model 1)} - \text{Sum Sq(Model 2)} \geq 0$
by definition, as more parameters can always explain more of the variation.
How much less is explained by the simpler model? Both models partition the same total variation:
Sum Sq(Total) = Sum Sq(Model 2) + Sum Sq(Residuals 2) = Sum Sq(Model 1) + Sum Sq(Residuals 1),
where Sum Sq(Model 1) = Sum Sq(Model 2) + $\Delta$Sum Sq.

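Comparing models: details
$\Delta df = df(\text{Model 1}) - df(\text{Model 2})$
$\Delta\text{Mean Sq} = \Delta\text{Sum Sq} / \Delta df$
$F = \Delta\text{Mean Sq} / \text{Mean Sq(Residual)}$, using the residual mean square of the larger model (Model 1)
How large can this get, just by chance/coincidence?
$F \sim F(df_2 - df_1, df_1)$
where $df_1$ and $df_2$ denote the degrees of freedom for the Mean Square(Residual) for the two models.
Note: $df(\text{Model 1}) + df_1 = N$, $df(\text{Model 2}) + df_2 = N$.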
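The calculation can be written out as a small function. This is a sketch in Python rather than SAS, and the function name and the numbers in the example are hypothetical. It uses the fact that the drop in explained variation from Model 1 to Model 2 equals the rise in residual variation, because both models partition the same total.

```python
def nested_f_statistic(ss_res_full, df_res_full, ss_res_reduced, df_res_reduced):
    """F statistic for a reduced model (Model 2) nested in a full model (Model 1).

    delta_ss = SS(Model 1) - SS(Model 2) = SS(Residual 2) - SS(Residual 1),
    and the denominator is the residual mean square of the full model.
    Returns (F, numerator df, denominator df).
    """
    delta_ss = ss_res_reduced - ss_res_full
    delta_df = df_res_reduced - df_res_full
    mean_sq_residual = ss_res_full / df_res_full
    f_value = (delta_ss / delta_df) / mean_sq_residual
    return f_value, delta_df, df_res_full


# Hypothetical sums of squares: full model has residual SS 100 on 20 df,
# reduced model has residual SS 130 on 22 df.
f_value, df_num, df_den = nested_f_statistic(100.0, 20, 130.0, 22)
```

With these made-up numbers the change is 30 on 2 degrees of freedom, the residual mean square is 5, and F = 15 / 5 = 3 on (2, 20) degrees of freedom. The P-value would then be read from the F distribution, which SAS reports directly.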

Comparing models using F-test
Table with columns df, SS, MS and rows: Model 1 - Groups, Model 2 - Linear, Change, Model 1 Residual.
The F statistic for the change is compared with an F distribution with 2 numerator degrees of freedom.
If we have many groups, the test is not very powerful.
Note: The excessive number of digits is only used to be able to trace and reproduce the calculations. Usually the numbers should be shown with fewer digits.

SAS Type I and Type III sums of squares
Type I: sum of squares obtained for the effect in question while keeping the preceding effects (sequential)
Type III: sum of squares obtained when removing the effect and keeping all other effects
Type I sums of squares depend on the order in which the terms in the model are specified. They are often used to document a sequence of model fits.
Type III sums of squares are used to identify effects that can be removed from the model.

Comparing models using SAS
The usual way of testing linearity: Trick SAS into doing the test by including dose both as a linear effect and as a class effect:
    proc glm data=drug;
      where type='active';
      class group;
      model index=dose group / solution;
    run;
The term group represents the variation of dose group means around the straight line.

SAS output, Sum of Squares and Type I analysis
Dependent Variable: index
    Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
    Model                                                            <.0001
    Error
    Corrected Total
    Source   DF   Type I SS   Mean Square   F Value   Pr > F
    dose                                              <.0001
    group

SAS output, Type III analysis and parameter estimates
    Source   DF   Type III SS   Mean Square   F Value   Pr > F
    dose
    group
    Parameter    Estimate   Standard Error   t Value   Pr > |t|
    Intercept    B
    dose         B                                     <.0001
    group x10    B
    group x15    B
    group x20    B          .                .         .
    group x25    B          .                .         .
Note: The effect group now has only 2 degrees of freedom, since we are not testing equality, but linearity!
Note: The Type III test of dose is not meaningful here and the parameters do not have a simple interpretation.

Identifying linearity using contrasts
Relation between linear regression and ANOVA model
If the linear regression model is true, then all the contrasts $\mu_2 - \mu_1$, $\mu_3 - \mu_2$, and $\mu_4 - \mu_3$ are equal to $5\beta$, as the doses are 10, 15, 20, 25.
Derivation of coefficients of the contrasts
- $\mu_2 - \mu_1 = \mu_3 - \mu_2$ if and only if $(\mu_3 - \mu_2) - (\mu_2 - \mu_1) = \mu_1 - 2\mu_2 + \mu_3 = 0$
- $\mu_3 - \mu_2 = \mu_4 - \mu_3$ if and only if $(\mu_4 - \mu_3) - (\mu_3 - \mu_2) = \mu_2 - 2\mu_3 + \mu_4 = 0$

Contrast statement in GLM
Contrast statement in SAS
    proc glm data=drug;
      where type='active';
      class group;
      model index=group / noint solution;
      contrast 'dev linearity' group 1 -2 1 0,
                               group 0 1 -2 1;
    run;
SAS output
    Contrast        DF   Contrast SS   Mean Square   F Value   Pr > F
    dev linearity

Unequally spaced doses
The previous simple expression was due to equally spaced doses.
In case doses were 10, 17, 19, and 25:
$\mu_2 - \mu_1 = 7\beta$, $\mu_3 - \mu_2 = 2\beta$, $\mu_4 - \mu_3 = 6\beta$
Hence test linearity by testing that
$\frac{\mu_2 - \mu_1}{7} = \frac{\mu_3 - \mu_2}{2} = \frac{\mu_4 - \mu_3}{6}$
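The contrast coefficients for testing linearity can be derived mechanically from the dose spacing. The sketch below is in Python rather than SAS; the function names are hypothetical, and the dose values 10, 15, 20, 25 and 10, 17, 19, 25 are assumptions about figures that are not fully legible in the transcription. Each row states that the slope to the right of an interior dose equals the slope to its left.

```python
def linearity_contrasts(doses):
    """One contrast per interior dose: equal slopes on both sides of it."""
    rows = []
    for j in range(1, len(doses) - 1):
        left = doses[j] - doses[j - 1]
        right = doses[j + 1] - doses[j]
        row = [0.0] * len(doses)
        # Coefficients of (mu[j+1] - mu[j]) / right - (mu[j] - mu[j-1]) / left
        row[j - 1] = 1.0 / left
        row[j] = -1.0 / left - 1.0 / right
        row[j + 1] = 1.0 / right
        rows.append(row)
    return rows


def apply_contrast(row, means):
    """Value of the contrast for a given vector of group means."""
    return sum(c * m for c, m in zip(row, means))


equal_spacing = linearity_contrasts([10, 15, 20, 25])
unequal_spacing = linearity_contrasts([10, 17, 19, 25])

# Group means lying exactly on a hypothetical line 3 + 2 * dose.
linear_means = [3 + 2 * d for d in [10, 17, 19, 25]]
values = [apply_contrast(row, linear_means) for row in unequal_spacing]
```

For equally spaced doses the rows are proportional to (1, -2, 1, 0) and (0, 1, -2, 1). Since a contrast can be rescaled freely without changing the test, any multiple of these rows could be used in a contrast statement.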

Example: Growth
Relate weight to age and gender, in the Juul-data for Tanner stage
Again: Problems with variance homogeneity...

Two different parametrisations in ANCOVA
Analysis of covariance, ANCOVA: a model with one class variable and one quantitative variable.
- One regression line for the reference group (in SAS the last, numerically or alphabetically), supplemented with the differences in intercept and slope from this reference group to the remaining group: good for describing the difference between groups
- One regression line for each group: good for describing the evolution in the individual groups

Test that regression lines are parallel
    Source    DF   Type I SS     Mean Square   F Value   Pr > F
    sex                                                  <.0001
    age                                                  <.0001
    age*sex
    Source    DF   Type III SS   Mean Square   F Value   Pr > F
    sex
    age                                                  <.0001
    age*sex

Type I and Type III analysis
The model can be described as $\text{weight} = \alpha_{\text{sex}} + \beta_{\text{sex}} \cdot \text{age} + \varepsilon$
Type I analysis is sequential testing (from below) of:
- $H_0: \beta_{\text{boy}} - \beta_{\text{girl}} = 0$, i.e. slopes are the same
- $H_1: \beta = 0$, i.e. no age-effect
- $H_2: \alpha_{\text{boy}} - \alpha_{\text{girl}} = 0$, i.e. no sex-effect either
Type III analysis is non-sequential:
- $H_0: \beta_{\text{boy}} - \beta_{\text{girl}} = 0$, i.e. slopes are the same
- $H_0: \beta_{\text{boy}} = 0$, i.e. slope of reference is zero
- $H_0: \alpha_{\text{boy}} - \alpha_{\text{girl}} = 0$, i.e. intercepts are the same, slopes may differ

Parameter estimates
    Parameter         Estimate   Standard Error   t Value   Pr > |t|
    Intercept         B
    sex female        B
    sex male          B          .                .         .
    age               B                                     <.0001
    age*sex female    B
    age*sex male      B          .                .         .
NOTE: The X'X matrix has been found to be singular, and a generalized inverse was used to solve the normal equations. Terms whose estimates are followed by the letter B are not uniquely estimable.
Interpretation: Boys increase their weight more rapidly than girls: approx. 692 g more per year.

SAS-computation of both lines simultaneously
Subgroup analysis
- Keep the interaction term sex*age
- Leave out the marginal effect age: this will merge the marginal effect into the interaction term
- Leave out the intercept (use option noint): this will merge the intercept into the sex effect
    proc glm data=juul;
      where tanner=;
      class sex;
      model weight=sex age*sex / noint solution clparm;
    run;

SAS-computation of both lines simultaneously
The GLM Procedure, Dependent Variable: weight
    Source              DF   Sum of Squares   Mean Square   F Value   Pr > F
    Model                                                              <.0001
    Error
    Uncorrected Total
    R-Square   Coeff Var   Root MSE   weight Mean
    Source    DF   Type I SS   Mean Square   F Value   Pr > F
    sex                                                <.0001
    age*sex                                            <.0001

SAS-computation of both lines simultaneously, estimates
    Source    DF   Type III SS   Mean Square   F Value   Pr > F
    sex                                                  <.0001
    age*sex                                              <.0001
    Parameter         Estimate   Standard Error   t Value   Pr > |t|
    sex female                                              <.0001
    sex male
    age*sex female                                          <.0001
    age*sex male                                            <.0001
    Parameter         95% Confidence Limits
    sex female
    sex male
    age*sex female
    age*sex male

Simultaneous test of both sex-effects
SAS code
    proc glm data=juul;
      where tanner=;
      class sex;
      model weight=sex age*sex / noint solution clparm;
      contrast 'sex and sex*age' sex 1 -1, age*sex 1 -1;
    run;
SAS output
    Contrast           DF   Contrast SS   Mean Square   F Value   Pr > F
    sex and sex*age

Estimate the expected weight difference at age 4 years
SAS code
    proc glm data=juul;
      where tanner=;
      class sex;
      model weight=sex age*sex / noint solution clparm;
      estimate 'difference at 4' sex -1 1 age*sex -4 4;
    run;
SAS output
    Parameter          Estimate   Standard Error   t Value   Pr > |t|
    difference at 4
    Parameter          95% Confidence Limits
    difference at 4

Summary: Same model, 2 different parametrisations
    proc glm data=juul;
      where tanner=;
      class sex;
      model weight=sex age age*sex / solution;
    run;
- An intercept for the reference group (sex='male')
- A difference from sex='female' to sex='male'
- An effect of age (slope) for the reference group
- A difference in slopes from sex='female' to sex='male'
    proc glm data=juul;
      where tanner=;
      class sex;
      model weight=sex age*sex / noint solution;
    run;
- An intercept for each group (sex)
- An effect of age (slope) for each group (sex)

Rheumatoid Arthritis example including placebo
Now we include the Placebo group as dose = 0.
Does the Placebo treatment fit in here? No, obviously not. Either we have a placebo effect or a threshold effect.
But how could we make a formal test?

Model with placebo-effect
Including placebo effect in model: $\text{index} = \alpha + \beta \cdot \text{dose} + \gamma \cdot I(\text{placebo}) + \varepsilon_{gi}$
- $\gamma$ is the deviation of the placebo group from the regression line
- $I(\text{placebo})$ is the indicator of the placebo group: 1 if the treatment is placebo, 0 otherwise
Test $H_0: \gamma = 0$, i.e. the placebo group is in line with the other doses.

Model with placebo-effect, SAS code and output
SAS code
    data drug;
      set drug;
      active_drug=(dose>0);
    run;
    proc glm data=drug;
      where group ne 'aspirin';
      class active_drug;
      model index=dose active_drug / solution;
    run;
SAS output
    Source        DF   Type I SS   Mean Square   F Value   Pr > F
    dose                                                   <.0001
    active_drug                                            <.0001

Model with placebo-effect, parameter estimates
SAS output
    Parameter        Estimate   Standard Error   t Value   Pr > |t|
    Intercept        B                                     <.0001
    dose                                                   <.0001
    active_drug 0    B                                     <.0001
    active_drug 1    B          .                .         .
NOTE: The X'X matrix has been found to be singular, and a generalized inverse was used to solve the normal equations. Terms whose estimates are followed by the letter B are not uniquely estimable.
Interpretation: The estimated level of the placebo group (active_drug=0) lies above the expected 0-dose level.

Example: Juul-data, Serum IGF-1 for boys
We have seen different age dependencies in separate Tanner stages. Is this simply due to a nonlinear age effect?

Modelling non-linear effects
- Polynomials: tend to be very wiggly (use i=rq or i=rc in the symbol-statement)
- Splines: piecewise interpolations, often linear or cubic

Separate lines for each age group
Approach
- Subdivide age into groups, using appropriate thresholds
- Fit linear effect of age in each age group
The result is unconnected lines.

Linear splines
Approach
- Subdivide age into groups, using appropriate thresholds
- Fit linear effect of age in each age group
- Make the linear pieces meet at the thresholds
The result is a bent line, a so-called linear spline.

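Linear spline explained
Using threshold ages 10, 12, 13, and 15 years:
$\text{ssigf} = \delta_0 + \delta_1\,\text{age} + \delta_2 (\text{age} - 10)_+ + \delta_3 (\text{age} - 12)_+ + \delta_4 (\text{age} - 13)_+ + \delta_5 (\text{age} - 15)_+$
where $(\text{age} - x)_+ = \text{age} - x$ if age > x and zero otherwise.
- the line for age 5 to 10 has slope $\beta_1 = \delta_1$
- the line for age 10 to 12 has slope $\beta_2 = \delta_1 + \delta_2$
- the line for age 12 to 13 has slope $\beta_3 = \delta_1 + \delta_2 + \delta_3$
- the line for age 13 to 15 has slope $\beta_4 = \delta_1 + \delta_2 + \delta_3 + \delta_4$
- the line for age 15 to 20 has slope $\beta_5 = \delta_1 + \delta_2 + \delta_3 + \delta_4 + \delta_5$
Parameters represent changes in slope: $\delta_2 = \beta_2 - \beta_1$, $\delta_3 = \beta_3 - \beta_2$, ...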
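The construction can be made concrete with a short sketch, in Python rather than SAS. It takes the threshold ages to be 10, 12, 13 and 15 years, which is an assumption where the transcription is unclear, and the coefficient values are made up for illustration only.

```python
KNOTS = [10, 12, 13, 15]  # threshold ages (assumed values)


def hinge(age, knot):
    """(age - knot)+ : number of years above the knot, zero below it."""
    return max(age - knot, 0.0)


def spline_value(age, deltas):
    """Linear spline with deltas = [delta0, delta1, ..., delta5]."""
    value = deltas[0] + deltas[1] * age
    for knot, delta in zip(KNOTS, deltas[2:]):
        value += delta * hinge(age, knot)
    return value


def segment_slopes(deltas):
    """Slopes beta1..beta5 are cumulative sums of delta1..delta5."""
    slopes, running = [], 0.0
    for delta in deltas[1:]:
        running += delta
        slopes.append(running)
    return slopes


# Hypothetical coefficients, chosen only to illustrate the arithmetic.
deltas = [1.0, 0.5, 0.25, 1.0, -0.75, -1.5]
slopes = segment_slopes(deltas)

# Rise over one year inside the segment from 13 to 15 equals its slope.
rise_13_to_14 = spline_value(14, deltas) - spline_value(13, deltas)
```

Because every hinge term is zero at its own knot, the pieces meet at the thresholds automatically. This is what distinguishes the spline from fitting unconnected lines in each age group.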

Linear spline, SAS code
    data juul;
      set juul;
      /* define new intercept value (ssigf-level at age 5) */
      age5=age-5;
      /* define number of years above certain threshold ages */
      extra_age10=max(age-10,0);
      extra_age12=max(age-12,0);
      extra_age13=max(age-13,0);
      extra_age15=max(age-15,0);
    run;
    /* fit splines for boys and girls separately */
    proc sort data=juul;
      by sex;
    run;
    proc reg data=juul;
      where age ge 5 and age le 20;
      by sex;
      model ssigf=age5 extra_age10 extra_age12 extra_age13 extra_age15;
    run;

Derived threshold variables
Listing with the columns Obs, age, age_group, extra_age10, extra_age12, extra_age13, extra_age15.

Fitting model with threshold variables
Analysis of Variance
    Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
    Model                                                            <.0001
    Error
    Corrected Total
Parameter Estimates
    Variable       DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
    Intercept                                                           <.0001
    age5
    extra_age10
    extra_age12
    extra_age13
    extra_age15                                                         <.0001

Can we reduce the model to a simpler one?
- Quadratic age effect... not possible - why not?
- Too simple linearity... not reasonable:
    proc glm data=juul;
      where age ge 5 and age le 20;
      by sex;
      model ssigf=age5 extra_age10 extra_age12 extra_age13 extra_age15 / solution;
      contrast 'all' extra_age10 1, extra_age12 1, extra_age13 1, extra_age15 1;
    run;
    Contrast   DF   Contrast SS   Mean Square   F Value   Pr > F
    all                                                   <.0001

Test of adequacy of linear spline?
Test against a more complicated model:
- separate regressions for each age group
- inclusion of tanner as a class variable...
Don't forget to check the residual plots!

Dose-response curves
Example of a typical dose-response relation, for moderate doses.
We have almost linearity in this dose range (dose 22 to 38).

Dose-response curves: full range
For extreme doses we see
- a clear deviation from linearity
- smaller variation at the ends

Theoretical dose response relation
This looks like a sigmoid shape: S-shaped, increasing from 0 to 100 %.
Example: a logistic curve
$\text{response} = \frac{100}{1 + \gamma \exp(-\beta \log(\text{dose}))}$
Note: the logit transformation is
$\text{logit}(\text{response}) = \log\left(\frac{\text{response}}{100 - \text{response}}\right)$
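That the logit transformation straightens out such a curve can be checked numerically. The sketch below is in Python rather than SAS and assumes the logistic curve has the form 100 / (1 + γ exp(-β log(dose))), which is a reading of a formula that is partly illegible in the transcription. The parameter values are hypothetical.

```python
import math


def logistic_response(dose, gamma, beta):
    """Response in percent: 100 / (1 + gamma * exp(-beta * log(dose)))."""
    return 100.0 / (1.0 + gamma * math.exp(-beta * math.log(dose)))


def logit(response):
    """Logit of a response given in percent (strictly between 0 and 100)."""
    return math.log(response / (100.0 - response))


# Hypothetical parameter values, for illustration only.
gamma, beta = 4.0, 1.5
doses = [0.5, 2.0, 8.0]

responses = [logistic_response(d, gamma, beta) for d in doses]
# On the logit scale the curve becomes a straight line in log(dose).
logits = [logit(r) for r in responses]
line = [-math.log(gamma) + beta * math.log(d) for d in doses]
```

Under this assumed form the line has intercept -log(γ) and slope β, so a plot of the logit against log(dose) should look linear if the logistic model holds.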

Example from anaesthesia
47 patients to be operated on under two different anaesthetics:
- Halothane
- Neurolept
Y: Twitch response at the ulnar nerve (at the thumb), in %
X: Dose of muscle relaxant
Data listings for group=halothane and group=neurolept, each with the columns patient, dose, depression.

Anaesthesia data
Figure panels: Halothane, Neurolept

Dose response curve
    proc gplot;
      plot logit*dose / haxis=axis1 vaxis=axis2 frame;
      axis1 logbase=2 logstyle=expand value=(h=2) minor=none
            label=(h=3 'Dose on log scale');
      axis2 value=(h=2) minor=none
            label=(a=90 r=0 h=3 'Logit transformed response');
      symbol1 v=circle i=sm60 h=3 c=black l=1 w=2;
    run;
For the logistic dose response model we get linearity by transforming the data using the logit transformation:
$\text{logit}(\text{response}) = -\log\gamma + \beta \log(\text{dose})$

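Transformation to linearity in the logistic dose response model
- y: twitch, transformed to $\text{logit\_twitch} = \log\left(\frac{\text{twitch}}{100 - \text{twitch}}\right)$
- x: dose, transformed to $\text{logdose} = \log(\text{dose})$
Assume a linear relation: $\text{logit\_twitch} = \alpha + \beta\,\text{logdose} + \varepsilon$, where $\alpha = -\log\gamma$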

Neurolept: Logistic dose response model seems OK

Neurolept: Parameter estimates
Parameter Estimates
    Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
    Intercept                                                        <.0001
    logdose                                                          <.0001
    Variable    DF   95% Confidence Limits
    Intercept
    logdose

Deriving estimates for α and β
We get estimates of $\alpha$ and $\beta$ from the equation $\text{logit\_twitch} = \alpha + \beta\,\text{logdose}$
But: What about the parameters of interest, i.e. ED50 and ED90?
From $\alpha + \beta \log(ED_{50}) = \text{logit}(50) = 0$ we get $\widehat{ED}_{50} = \exp(-\hat\alpha / \hat\beta)$
How do we calculate s.e.($\widehat{ED}_{50}$)?
Re-parametrisation:
$\gamma_1 = \log(ED_{50}) = -\alpha / \beta$
$\gamma_2 = \log(ED_{90}) = (\text{logit}(90) - \alpha) / \beta$
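The point estimates follow directly from the fitted line. The sketch below, in Python rather than SAS, computes the dose at which the fitted logit reaches a given percentage. The function names and the values of α and β are hypothetical and are not the estimates from the anaesthesia data. The standard error question raised above is not addressed here.

```python
import math


def logit(percent):
    """Logit of a response given in percent."""
    return math.log(percent / (100.0 - percent))


def effective_dose(percent, alpha, beta):
    """Dose at which alpha + beta * log(dose) equals logit(percent)."""
    return math.exp((logit(percent) - alpha) / beta)


def predicted_response(dose, alpha, beta):
    """Back-transform the fitted logit to a response in percent."""
    eta = alpha + beta * math.log(dose)
    return 100.0 / (1.0 + math.exp(-eta))


# Hypothetical estimates, for illustration only.
alpha, beta = -6.0, 2.0

ed50 = effective_dose(50, alpha, beta)  # equals exp(-alpha / beta)
ed90 = effective_dose(90, alpha, beta)
```

As a consistency check, plugging `ed50` and `ed90` back into `predicted_response` returns 50 % and 90 %.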

The model may then be written as:
$y = \text{logit\_twitch} = \text{logit}(90) \cdot \frac{x - \gamma_1}{\gamma_2 - \gamma_1} = 2.197 \cdot \frac{x - \gamma_1}{\gamma_2 - \gamma_1}$
This function is nonlinear in $\gamma_1$ and $\gamma_2$!
Direct estimation of $\gamma_1$ and $\gamma_2$ using non-linear regression... more about this in subsequent lectures.
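The re-parametrised form can be derived in a few steps. The following is a sketch, taking $\gamma_1 = \log(ED_{50}) = -\alpha/\beta$ and $\gamma_2 = \log(ED_{90}) = (\text{logit}(90) - \alpha)/\beta$ and writing $x$ for logdose; the minus signs are an interpretation of the transcription.

```latex
\begin{align*}
\gamma_2 - \gamma_1 &= \frac{\operatorname{logit}(90) - \alpha}{\beta} + \frac{\alpha}{\beta}
                     = \frac{\operatorname{logit}(90)}{\beta}
  \quad\Longrightarrow\quad \beta = \frac{\operatorname{logit}(90)}{\gamma_2 - \gamma_1}, \\
\alpha &= -\beta\,\gamma_1, \\
y = \alpha + \beta x &= \beta\,(x - \gamma_1)
                      = \operatorname{logit}(90)\,\frac{x - \gamma_1}{\gamma_2 - \gamma_1},
\qquad \operatorname{logit}(90) = \log\frac{90}{10} = \log 9 \approx 2.197 .
\end{align*}
```

The parameters $\gamma_1$ and $\gamma_2$ enter through a ratio, which is why the model is no longer linear in its parameters even though it describes the same straight line.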


More information

STAT 7030: Categorical Data Analysis

STAT 7030: Categorical Data Analysis STAT 7030: Categorical Data Analysis 5. Logistic Regression Peng Zeng Department of Mathematics and Statistics Auburn University Fall 2012 Peng Zeng (Auburn University) STAT 7030 Lecture Notes Fall 2012

More information

Answer to exercise: Blood pressure lowering drugs

Answer to exercise: Blood pressure lowering drugs Answer to exercise: Blood pressure lowering drugs The data set bloodpressure.txt contains data from a cross-over trial, involving three different formulations of a drug for lowering of blood pressure:

More information

ANALYSES OF NCGS DATA FOR ALCOHOL STATUS CATEGORIES 1 22:46 Sunday, March 2, 2003

ANALYSES OF NCGS DATA FOR ALCOHOL STATUS CATEGORIES 1 22:46 Sunday, March 2, 2003 ANALYSES OF NCGS DATA FOR ALCOHOL STATUS CATEGORIES 1 22:46 Sunday, March 2, 2003 The MEANS Procedure DRINKING STATUS=1 Analysis Variable : TRIGL N Mean Std Dev Minimum Maximum 164 151.6219512 95.3801744

More information

171:162 Design and Analysis of Biomedical Studies, Summer 2011 Exam #3, July 16th

171:162 Design and Analysis of Biomedical Studies, Summer 2011 Exam #3, July 16th Name 171:162 Design and Analysis of Biomedical Studies, Summer 2011 Exam #3, July 16th Use the selected SAS output to help you answer the questions. The SAS output is all at the back of the exam on pages

More information

unadjusted model for baseline cholesterol 22:31 Monday, April 19,

unadjusted model for baseline cholesterol 22:31 Monday, April 19, unadjusted model for baseline cholesterol 22:31 Monday, April 19, 2004 1 Class Level Information Class Levels Values TRETGRP 3 3 4 5 SEX 2 0 1 Number of observations 916 unadjusted model for baseline cholesterol

More information

Chapter 2 Inferences in Simple Linear Regression

Chapter 2 Inferences in Simple Linear Regression STAT 525 SPRING 2018 Chapter 2 Inferences in Simple Linear Regression Professor Min Zhang Testing for Linear Relationship Term β 1 X i defines linear relationship Will then test H 0 : β 1 = 0 Test requires

More information
