Ph.D. course: Regression models
Non-linear effect of a quantitative covariate
PKA & LTS Sect. 4.2.1, 4.2.2
8 May 2017
www.biostat.ku.dk/~pka/regrmodels17
Per Kragh Andersen
Linear effects

We have studied models with the linear predictor $LP_i = a + b x_i$ for a quantitative covariate $x$, for
- quantitative outcomes ($b$ is a mean value difference),
- binary outcomes ($b$ is a log(odds ratio)),
- survival times ($a = \log(h_0(t))$, $b$ is a log(hazard ratio)).

The slope $b$ has a simple interpretation: it is the change in the linear predictor per 1 unit change in $x$.

Linearity is simple but restrictive, and we need
- ways of checking the assumption of linearity,
- alternative models to use when linearity fits poorly.
Approaches

Different possibilities are:
- transformation of $x$, i.e. $LP_i = a + b f(x_i)$; the function $f$ must be known,
- scatterplot smoother: fine for description, not optimal for inference (Figure 1, next slide),
- methods based on choosing cut-points for $x$ (Sect. 4.2.1),
- polynomials (Sect. 4.2.2).

The last three approaches may suggest transformations, $f$.

We will use modeling the effect of bilirubin in the PBC-3 trial as illustration, but the ideas carry over to quantitative and binary outcomes.
[Figure 1: Scatterplot smoother for the binary outcome y (fetal death) when plotted against the covariate x (alcohol consumption). The distribution of x is indicated along the horizontal axis. Vertical axis: probability of fetal death; horizontal axis: drinks per week.]
Using cut-points for the covariate

- Piecewise constant effect
- Linear regression splines
- Quadratic/cubic (restricted) regression splines
[Figure 2: Illustration of models for the linear predictor that are alternatives to the simple linear model. The dotted curve represents the true relationship. Four panels, each showing the linear predictor against x.]
Bilirubin in quintiles

Cox regression model with bilirubin categorized in quintiles:

$$
h_i(t) =
\begin{cases}
h_0(t) & \text{if } x_i \le 10.3,\\
h_0(t)\exp(b_1) & \text{if } x_i \in (10.3, 16],\\
h_0(t)\exp(b_2) & \text{if } x_i \in (16, 26.7],\\
h_0(t)\exp(b_3) & \text{if } x_i \in (26.7, 51.4],\\
h_0(t)\exp(b_4) & \text{if } x_i > 51.4.
\end{cases}
$$

With dummy variables $I(x_i \le 10.3), \ldots, I(x_i > 51.4)$, the linear predictor for individual $i$ is the piecewise constant function

$$
LP_i(t) = \log(h_0(t)) + b_1 I(10.3 < x_i \le 16) + \cdots + b_4 I(x_i > 51.4).
$$

The estimates in this model are: $b_1 = 0.537\,(0.708)$, $b_2 = 1.120\,(0.494)$, $b_3 = 1.698\,(0.460)$, $b_4 = 2.670\,(0.437)$.
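As a minimal sketch of the dummy-variable coding above, the following maps a bilirubin value to its quintile group and returns the piecewise constant contribution to the linear predictor. The cut-points and coefficients are the values printed on the slide; the function names are illustrative and not taken from the course material.

```python
from bisect import bisect_left

# Quintile cut-points and estimates as printed on the slide.
CUTS = [10.3, 16.0, 26.7, 51.4]
# Coefficient 0 for the reference group x <= 10.3, then b1..b4.
COEFS = [0.0, 0.537, 1.120, 1.698, 2.670]


def quintile_group(x):
    """Index 0..4 of the interval (-inf, 10.3], (10.3, 16], ..., (51.4, inf)."""
    # bisect_left counts the cut-points strictly below x,
    # so a value equal to a cut-point stays in the lower interval.
    return bisect_left(CUTS, x)


def log_hazard_ratio(x):
    """Piecewise constant log hazard ratio relative to the group x <= 10.3."""
    return COEFS[quintile_group(x)]


if __name__ == "__main__":
    for x in (8.0, 16.0, 30.0, 120.0):
        print(x, quintile_group(x), log_hazard_ratio(x))
```

The function jumps at each cut-point, which is the discontinuity the later slides set out to remove.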
[Figure 3: Estimated linear predictor (solid curve) for the PBC-3 study assuming an effect of serum bilirubin that is piecewise constant in quintile groups. The dashed curve joins values of the linear predictor for the scores attached to each interval of bilirubin. The distribution of bilirubin is shown on the horizontal axis. Vertical axis: linear predictor; horizontal axis: bilirubin.]
Using interval scores

$$
s(x_i) =
\begin{cases}
7.66 & \text{if } x_i \le 10.3,\\
13.26 & \text{if } x_i \in (10.3, 16],\\
20.23 & \text{if } x_i \in (16, 26.7],\\
37.32 & \text{if } x_i \in (26.7, 51.4],\\
148.83 & \text{if } x_i > 51.4.
\end{cases}
$$

The model with linear predictor $\log(h_0(t)) + b\,s(x_i)$ is nested in the model with bilirubin categorized, and the likelihood ratio test for linearity is $19.0 \sim \chi^2_3$, $P = 0.0003$.

However, the model with a linear effect of $x$ is not nested in the categorized model.

Using plots of pseudo-observations, the fit may be evaluated.
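The score function and the reported P-value can be checked with a short sketch using only the standard library. The closed-form tail probability below holds specifically for a chi-square distribution with 3 degrees of freedom; the function names are illustrative assumptions.

```python
import math
from bisect import bisect_left

# Cut-points and interval scores as printed on the slide.
CUTS = [10.3, 16.0, 26.7, 51.4]
SCORES = [7.66, 13.26, 20.23, 37.32, 148.83]


def interval_score(x):
    """Score s(x) attached to the quintile interval containing x."""
    return SCORES[bisect_left(CUTS, x)]


def chi2_sf_3df(x):
    """Upper tail probability P(X > x) for X chi-square with 3 df.

    Closed form: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Likelihood ratio statistic from the slide: 19.0 on 3 degrees of freedom.
p_value = chi2_sf_3df(19.0)

if __name__ == "__main__":
    print(interval_score(12.0), round(p_value, 4))
```

The 3 degrees of freedom arise because the categorized model has 4 parameters and the score model has 1.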
[Figure 4: The estimated linear predictor for the PBC-3 study (assuming an effect of serum bilirubin which is piecewise constant in quintile groups) plotted against bilirubin together with smoothed pseudo-observations. The four panels correspond to quintiles of observed event times: time = 0.71, 1.18, 2.16, and 3.19. Horizontal axis: bilirubin.]
[Figure 5: The estimated linear predictor for the PBC-3 study (assuming an effect of log(serum bilirubin) which is piecewise constant in quintile groups) plotted against log(bilirubin) together with smoothed pseudo-observations. The four panels correspond to quintiles of observed event times: time = 0.71, 1.18, 2.16, and 3.19. Horizontal axis: log(bilirubin).]
Comments

The model with a piecewise constant effect of $x$:
- is easy to fit and easy to report,
- does not contain the model with a linear effect of $x$ as a sub-model,
- does not provide a smooth (in fact, not even continuous) relationship,
- is somewhat sensitive to the choice of cut-points.
Regression splines

A regression spline is a function of the form

$$
x_i^+ = (x_i - r)\, I(x_i > r)
$$

for some threshold $r$. Thus, $x_i^+ = 0$ for $x_i \le r$, and it increases linearly with $x_i$ from $x_i = r$ and upwards.

If we compose the linear predictor of several such spline terms,

$$
LP_i = a + b x_i + b_1 x_{i1}^+ + \cdots + b_4 x_{i4}^+,
$$

we get a broken linear function (Figure).

The parameter $b_j$ is the change of slope at cut-point $j$: the slope before the first cut-point is $b$, the slope between the first and second is $b + b_1$, etc.

Linearity: $b_1 = b_2 = \cdots = b_4 = 0$.
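The broken linear function can be sketched in a few lines. The helper names are illustrative, and the knots and coefficients in the usage example are hypothetical values chosen for easy arithmetic, not the PBC-3 estimates.

```python
def pos_part(x, r):
    """Spline term x+ = (x - r) * I(x > r)."""
    return x - r if x > r else 0.0


def linear_spline_lp(x, a, b, knots, bs):
    """Linear predictor a + b*x + sum_j b_j * (x - r_j)+."""
    return a + b * x + sum(bj * pos_part(x, r) for bj, r in zip(bs, knots))


def segment_slopes(b, bs):
    """Slopes on successive segments: b, b + b1, b + b1 + b2, ..."""
    slopes = [b]
    for bj in bs:
        slopes.append(slopes[-1] + bj)
    return slopes


if __name__ == "__main__":
    # Hypothetical knots and coefficients, for illustration only.
    knots = [1.0, 2.0]
    bs = [-0.5, 0.25]
    print(segment_slopes(1.0, bs))                    # slope on each segment
    print(linear_spline_lp(3.0, 0.0, 1.0, knots, bs))
```

Setting every entry of `bs` to zero gives back the straight line, which is why the linear model is a sub-model here and linearity can be tested directly.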
Results for PBC-3

For the PBC-3 example we get the estimates: $b = 0.245\,(0.182)$, $b_1 = 0.460\,(0.309)$, $b_2 = 0.122\,(0.185)$, $b_3 = 0.0592\,(0.0654)$, $b_4 = 0.0278\,(0.0174)$.

The likelihood ratio test for linearity is $39.13 \sim \chi^2_4$, $P < 0.001$.

The linear spline function is now continuous but still not smooth.

Linear predictor with quadratic splines:

$$
LP_i = a + b_1 x_i + b_2 x_i^2 + b_{1,1} (x_{i1}^+)^2 + \cdots + b_{1,4} (x_{i4}^+)^2.
$$

No simple interpretation of coefficients, but a smooth curve is obtained.

LR test for linearity, $b_2 = b_{1,1} = \cdots = b_{1,4} = 0$: $40.97 \sim \chi^2_5$.
[Figure 6: Estimated linear predictor for the PBC-3 study assuming an effect of serum bilirubin modeled as a linear spline (dashed), an unrestricted quadratic spline (solid), or a quadratic spline restricted to be linear for bilirubin values above 51.4 (dotted). The distribution of bilirubin is shown on the horizontal axis. Vertical axis: linear predictor; horizontal axis: bilirubin.]
Restricted splines

The quadratic effect $b_2 x_i^2$ may be quite dramatic for large (both positive and negative) values of $x_i$.

This may be avoided using restricted splines, see p. 220. The idea is that for large (positive or negative) $x$'s, the curve is linear instead of quadratic.

Cubic splines may also be defined, using terms $(x_{ij}^+)^3$.
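One way to see how such a restriction can work: above the last knot, the coefficient of $x^2$ in a quadratic spline is $b_2$ plus the sum of the spline coefficients, so the curve is linear there exactly when that sum is zero (below the first knot, linearity requires $b_2 = 0$). The sketch below imposes only the upper-tail constraint. It is an illustration under that assumption and not necessarily the construction used on p. 220; all names and numbers are hypothetical.

```python
def quad_spline_lp(x, a, b1, b2, knots, cs):
    """a + b1*x + b2*x^2 + sum_j c_j * ((x - r_j)+)^2."""
    return a + b1 * x + b2 * x ** 2 + sum(
        c * max(x - r, 0.0) ** 2 for c, r in zip(cs, knots)
    )


def restrict_upper_tail(b2, cs_free):
    """Append the last spline coefficient so that b2 + sum(cs) == 0,
    which cancels the x^2 term above the last knot."""
    return list(cs_free) + [-(b2 + sum(cs_free))]


if __name__ == "__main__":
    # Hypothetical knots and coefficients, for illustration only.
    knots = [1.0, 2.0, 3.0]
    b2 = 0.5
    cs = restrict_upper_tail(b2, [0.25, -0.5])
    # Equally spaced points above the last knot: a zero second
    # difference means the curve is a straight line there.
    y4, y5, y6 = (quad_spline_lp(x, 0.0, 1.0, b2, knots, cs) for x in (4.0, 5.0, 6.0))
    print(cs, y6 - 2 * y5 + y4)
```

The restriction costs one parameter, so a restricted spline with the same knots has one degree of freedom fewer than the unrestricted one.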
Polynomials

The simplest alternative to a linear function is a quadratic function, and a standard test for linearity is obtained by including $x^2$ and testing whether the corresponding coefficient is zero, $b_2 = 0$:

$$
LP_i = a + b_1 x_i + b_2 x_i^2.
$$

The resulting parabola has its minimum/maximum at $x = -b_1/(2 b_2)$, and it is happy (convex) if $b_2 > 0$, bad-tempered (concave) if $b_2 < 0$.

For the PBC-3 data: $b_1 = 0.0227\,(0.0031)$, $b_2 = 0.0000369\,(0.00000871)$.

Also higher order polynomials (cubic etc.).

- Simple approach including simple tests for linearity
- No simple interpretation of coefficients
- Influential points
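The location of the turning point follows from setting the derivative $b_1 + 2 b_2 x$ to zero. A minimal sketch, with hypothetical coefficients rather than the PBC-3 estimates and with illustrative function names:

```python
def parabola_vertex(b1, b2):
    """x-value where a + b1*x + b2*x^2 has its minimum or maximum."""
    if b2 == 0:
        raise ValueError("b2 = 0 gives a straight line with no turning point")
    return -b1 / (2 * b2)


def parabola_shape(b2):
    """'convex' (minimum) if b2 > 0, 'concave' (maximum) if b2 < 0."""
    if b2 == 0:
        raise ValueError("b2 = 0 gives a straight line")
    return "convex" if b2 > 0 else "concave"


if __name__ == "__main__":
    # Hypothetical coefficients, for illustration only.
    print(parabola_vertex(2.0, -0.5), parabola_shape(-0.5))
```

If the turning point lies inside the observed range of $x$, the fitted effect changes direction there, which is worth checking against subject-matter knowledge.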
[Figure 7: Cook's distance for the model with a quadratic effect of bilirubin plotted against bilirubin (left) and time in years (right). +: observed failure times, o: censored observations.]
Fractional polynomials

Instead of using just $x^2$ and perhaps $x^3$, use several powers $x^q$, e.g. $\sqrt{x} = x^{0.5}$, $1/\sqrt{x} = x^{-0.5}$, $x$, $1/x = x^{-1}$, $x^2$, $1/x^2 = x^{-2}$, $x^3$, $1/x^3 = x^{-3}$ (and the power $q = 0$ is taken to mean $\log(x)$).

This provides a lot of flexibility but no interpretable coefficients.

Since such models are purely descriptive, one often aims at finding best-fitting models with two or three terms. We did that for the PBC-3 study:
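The power convention, including the special case $q = 0$, can be written as a small helper. This is a sketch with illustrative names; it assumes a strictly positive covariate, which negative powers and the logarithm require.

```python
import math


def fp_term(x, q):
    """Fractional polynomial term: x**q, with q = 0 meaning log(x)."""
    if x <= 0:
        raise ValueError("fractional polynomial terms need x > 0")
    return math.log(x) if q == 0 else x ** q


def fp_covariates(x, powers):
    """Extra covariates for one subject, one per requested power."""
    return [fp_term(x, q) for q in powers]


if __name__ == "__main__":
    # Linear term plus the powers -0.5 and -2, evaluated at x = 4.
    print(fp_covariates(4.0, [1, -0.5, -2]))
```

Each returned value enters the linear predictor as an ordinary covariate, so standard regression software can fit the model.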
Table 1: Likelihood ratio tests comparing fractional polynomial models for the effect of bilirubin in the PBC-3 study to a model with a linear effect. First column: one additional term ($q_2$) in the model; next columns: two additional terms ($q_1$ and $q_2$) in the model.

            one                        q1
  q2       term     -3     -2     -1   -0.5      0    0.5      2
  -3       4.29
  -2      12.10  19.10
  -1      25.51  29.32  32.38
  -0.5    30.69  32.75  34.12  34.09
   0      32.32  33.10  33.24  32.57  32.35
   0.5    30.56  30.68  30.57  31.08  31.77  32.42
   2      20.82  21.47  23.65  28.69  31.17  32.50  32.44
   3      16.59  17.88  21.19  28.00  31.06  32.46  32.00  26.59
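The best-fitting powers can be read off Table 1 mechanically. The sketch below assumes the table is read with rows $q_2 = -3, -2, -1, -0.5, 0, 0.5, 2, 3$, the first entry of each row being the one-term model and the remaining entries corresponding to $q_1 = -3, -2, \ldots$ in that order. The variable names are illustrative.

```python
# Powers considered in addition to the linear term.
POWERS = [-3, -2, -1, -0.5, 0, 0.5, 2, 3]

# Likelihood ratio statistics from Table 1, one row per q2.
# Entry 0 of a row: one extra term (q2); entry j >= 1: extra terms q1 = POWERS[j-1] and q2.
ROWS = [
    [4.29],
    [12.10, 19.10],
    [25.51, 29.32, 32.38],
    [30.69, 32.75, 34.12, 34.09],
    [32.32, 33.10, 33.24, 32.57, 32.35],
    [30.56, 30.68, 30.57, 31.08, 31.77, 32.42],
    [20.82, 21.47, 23.65, 28.69, 31.17, 32.50, 32.44],
    [16.59, 17.88, 21.19, 28.00, 31.06, 32.46, 32.00, 26.59],
]

one_term = {q2: row[0] for q2, row in zip(POWERS, ROWS)}
two_term = {
    (POWERS[j], q2): value
    for q2, row in zip(POWERS, ROWS)
    for j, value in enumerate(row[1:])
}

# Largest statistic = largest improvement over the linear model.
best_one = max(one_term, key=one_term.get)
best_two = max(two_term, key=two_term.get)

if __name__ == "__main__":
    print(best_one, one_term[best_one])
    print(best_two, two_term[best_two])
```

Under this reading the largest one-term statistic is at $q = 0$, i.e. $\log(x)$, and the largest two-term statistic is at the pair $(-2, -0.5)$.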
Results

The best-fitting model with 1 extra term (in addition to just $x$) is

$$
LP_i = \log(h_0(t)) + b_1 x_i + b_0 \log(x_i)
$$

with estimates $b_1 = 0.000723\,(0.00222)$, $b_0 = 1.0661\,(0.202)$, i.e. the linear term is not significant while the term for $\log(x)$ is highly significant.

The best-fitting model with 2 extra terms is

$$
LP_i = \log(h_0(t)) + b_1 x_i + b_{-0.5}\, x_i^{-0.5} + b_{-2}\, x_i^{-2}
$$

with estimates $b_1 = 0.00242\,(0.00165)$, $b_{-0.5} = 40.301\,(13.889)$, $b_{-2} = 12.575\,(2.372)$, the last two terms being highly significant and the linear term not significant.
[Figure 8: Estimated linear predictor for the PBC-3 study assuming an effect of serum bilirubin which is modeled either as a fractional polynomial with powers 1 and 0 (dashed) or with powers 1, -0.5, and -2 (solid). The distribution of bilirubin is shown on the horizontal axis. Vertical axis: linear predictor; horizontal axis: bilirubin.]
Comments

- It is easy to fit models with non-linear effects using a linear predictor: just define appropriate extra covariates.
- Most such models provide estimates without a simple interpretation (linear splines are an exception).
- Note the distinction from truly non-linear models, e.g. the Gompertz growth curve model with $E(y_i) = a + b \exp(c x_i)$, for which special software is needed for the fitting.
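The distinction can be made concrete: a linear predictor is linear in its parameters even when it is non-linear in $x$, whereas the Gompertz mean is not linear in $c$. The sketch below checks additivity in the parameter vector numerically; the function names and parameter values are hypothetical.

```python
import math


def quadratic_lp(x, theta):
    """Non-linear in x, but linear in the parameters (a, b1, b2)."""
    a, b1, b2 = theta
    return a + b1 * x + b2 * x * x


def gompertz_mean(x, theta):
    """E(y) = a + b*exp(c*x): not linear in the parameter c."""
    a, b, c = theta
    return a + b * math.exp(c * x)


def is_additive_in_parameters(f, x, theta1, theta2, tol=1e-9):
    """Check f(x, theta1 + theta2) == f(x, theta1) + f(x, theta2)."""
    theta_sum = tuple(u + v for u, v in zip(theta1, theta2))
    return abs(f(x, theta_sum) - (f(x, theta1) + f(x, theta2))) < tol


if __name__ == "__main__":
    # Hypothetical parameter vectors, for illustration only.
    print(is_additive_in_parameters(quadratic_lp, 2.0, (1.0, 2.0, 3.0), (0.5, -1.0, 0.25)))
    print(is_additive_in_parameters(gompertz_mean, 2.0, (1.0, 2.0, 0.5), (0.5, 1.0, 0.25)))
```

Additivity is only one aspect of linearity, but it is enough to show why the first model can be fitted by adding covariates to standard software and the second cannot.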