Using P-splines to smooth two-dimensional Poisson data
Maria Durbán (1), Iain Currie (2), Paul Eilers (3)
17th IWSM, July 2002
(1) Dept. Statistics and Econometrics, Universidad Carlos III de Madrid, Spain
(2) Dept. Actuarial Mathematics and Statistics, Heriot-Watt University, Edinburgh, UK
(3) Department of Medical Statistics, Leiden University Medical Center, The Netherlands
What is this talk about?
Introduction
  The data
  P-splines
  Smoothing Poisson data with P-splines (one-dimensional case)
Several models for two-dimensional Poisson data
  Generalized additive model
  Two-dimensional smoothing with penalties
  Dimension reduction using P-splines
Discuss computational issues for large data sets
Analysis of mortality data
The data
Male policyholders; source: Continuous Mortality Investigation Bureau (CMIB).
For each calendar year (1947-1999) and each age (11-100) we have:
  Number of years lived (the exposure).
  Number of policy claims (deaths).
Mortality of male policyholders has improved rapidly over the last 30 years.
Aim: model mortality trends over time and their dependence on age.
P-splines
Use B-splines as the basis for the regression.
Modify the log-likelihood by a difference penalty on the regression coefficients.
$$y = f(x) + \epsilon, \qquad f(x) \approx Ba$$
$$S = (y - Ba)'(y - Ba) + \lambda a'D'Da$$
$$\hat{a} = (B'B + \lambda D'D)^{-1} B'y$$
[Figure: two panels, "B-spline basis" and "Scaled B-splines and their sum".]
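The closed-form estimate above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' code: the helper names `bspline_basis` and `pspline_fit`, the equally spaced knots, the cubic degree and the second-order difference penalty are all assumptions made for the example.

```python
import numpy as np

def bspline_basis(x, xl, xr, nseg=10, deg=3):
    """B-spline basis on equally spaced knots (Cox-de Boor recursion); assumes deg >= 1."""
    x = np.asarray(x, dtype=float)
    dx = (xr - xl) / nseg
    knots = xl + dx * np.arange(-deg, nseg + deg + 1)
    # Degree 0: indicator of the knot interval that contains each x.
    B = ((x[:, None] >= knots[None, :-1]) & (x[:, None] < knots[None, 1:])).astype(float)
    for d in range(1, deg + 1):
        left = (x[:, None] - knots[None, :-d - 1]) / (d * dx)
        right = (knots[None, d + 1:] - x[:, None]) / (d * dx)
        B = left * B[:, :-1] + right * B[:, 1:]
    return B  # shape (len(x), nseg + deg)

def pspline_fit(x, y, lam=1.0, nseg=20, deg=3, pord=2):
    """Penalized least squares: a_hat = (B'B + lam D'D)^{-1} B'y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    B = bspline_basis(x, x.min(), x.max(), nseg, deg)
    D = np.diff(np.eye(B.shape[1]), n=pord, axis=0)  # difference matrix of order pord
    a_hat = np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ y)
    return a_hat, B
```

The fitted curve is `B @ a_hat`. With a second-order penalty, a straight line is left unpenalized, so a large `lam` pulls the fit towards a linear trend.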
Poisson data and P-splines, 1D case
$E_x$ = number of years lived aged $x$
$y_x$ = number of deaths aged $x$
$$Y_x \sim \mathcal{P}(E_x \theta_x), \qquad \eta = \log(\theta_x) = Ba$$
Maximise
$$l(a; y_x) - \tfrac{1}{2} \lambda a'D'Da$$
$$\hat{a}_{t+1} = (B'W_t B + \lambda D'D)^{-1} B'W_t z_t$$
where $z = \eta + W^{-1}(y - \mu)$ is the working variable and $W = \mathrm{diag}(\mu)$ is the diagonal matrix of weights.
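The penalized scoring iteration can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the function names, the starting values $\log((y + 0.5)/E)$, the convergence tolerance and the default smoothing parameter are all choices made for the example. The exposure enters through $\mu = E \exp(Ba)$.

```python
import numpy as np

def bspline_basis(x, xl, xr, nseg=10, deg=3):
    """B-spline basis on equally spaced knots (Cox-de Boor recursion); assumes deg >= 1."""
    x = np.asarray(x, dtype=float)
    dx = (xr - xl) / nseg
    knots = xl + dx * np.arange(-deg, nseg + deg + 1)
    B = ((x[:, None] >= knots[None, :-1]) & (x[:, None] < knots[None, 1:])).astype(float)
    for d in range(1, deg + 1):
        left = (x[:, None] - knots[None, :-d - 1]) / (d * dx)
        right = (knots[None, d + 1:] - x[:, None]) / (d * dx)
        B = left * B[:, :-1] + right * B[:, 1:]
    return B

def poisson_pspline(x, y, E, lam=10.0, nseg=20, deg=3, pord=2, max_iter=50, tol=1e-8):
    """Iterate a = (B'WB + lam D'D)^{-1} B'Wz with W = diag(mu), z = eta + (y - mu)/mu."""
    x, y, E = (np.asarray(v, dtype=float) for v in (x, y, E))
    B = bspline_basis(x, x.min(), x.max(), nseg, deg)
    D = np.diff(np.eye(B.shape[1]), n=pord, axis=0)
    P = lam * D.T @ D
    eta = np.log((y + 0.5) / E)            # assumed starting values for log(theta)
    for _ in range(max_iter):
        mu = E * np.exp(eta)               # expected deaths
        z = eta + (y - mu) / mu            # working variable
        a = np.linalg.solve(B.T @ (mu[:, None] * B) + P, B.T @ (mu * z))
        eta_new = B @ a
        done = np.max(np.abs(eta_new - eta)) < tol
        eta = eta_new
        if done:
            break
    return a, B, E * np.exp(eta)           # coefficients, basis, fitted deaths
```

Because the B-splines sum to one and a difference penalty ignores constants, the fitted deaths add up to the observed deaths at convergence, which is a convenient sanity check.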
1. A generalized additive model
$Y = (y_{ij})$: matrix of deaths at age $i = 1, \ldots, m$ and year $j = 1, \ldots, n$.
$E = (E_{ij})$: matrix of exposures.
$\Theta = (\theta_{ij})$ and $\log \theta = Ba$, with
$$a = (\alpha, a_1', a_2')', \qquad B = (1 : B_a : B_y)$$
$B_a$, $N \times n_a$: set of B-splines for age.
$B_y$, $N \times n_y$: set of B-splines for years.
$$\hat{a}_{t+1} = (B'W_t B + P)^{-} B'W_t z_t$$
$P = \mathrm{blockdiag}(0, P_a, P_y)$; $P_a = \lambda_a D_a'D_a$ and $P_y = \lambda_y D_y'D_y$ are the penalty matrices for age and year.
Smoothing parameter selection:
$$\mathrm{dev}(y; a, \lambda_a, \lambda_y) + \delta \, \mathrm{tr}(H)$$
$\delta = 2$: AIC; $\delta = \log(n)$: BIC.
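A minimal sketch of the additive fit and the selection criteria follows. Several things here are assumptions rather than statements from the slides: the data are stacked year by year (age runs fastest), the singular system is handled with a small ridge term (one of the remedies listed for this model), the $n$ in the BIC weight is taken to be the number of observations $N = mn$, and all function and argument names are hypothetical.

```python
import numpy as np

def bspline_basis(x, xl, xr, nseg=10, deg=3):
    """B-spline basis on equally spaced knots (Cox-de Boor recursion); assumes deg >= 1."""
    x = np.asarray(x, dtype=float)
    dx = (xr - xl) / nseg
    knots = xl + dx * np.arange(-deg, nseg + deg + 1)
    B = ((x[:, None] >= knots[None, :-1]) & (x[:, None] < knots[None, 1:])).astype(float)
    for d in range(1, deg + 1):
        left = (x[:, None] - knots[None, :-d - 1]) / (d * dx)
        right = (knots[None, d + 1:] - x[:, None]) / (d * dx)
        B = left * B[:, :-1] + right * B[:, 1:]
    return B

def fit_gam(Y, E, age, year, lam_a, lam_y, nseg=10, pord=2, ridge=1e-6,
            max_iter=50, tol=1e-8):
    m, n = Y.shape
    Ba1 = bspline_basis(age, age.min(), age.max(), nseg)     # m x n_a
    By1 = bspline_basis(year, year.min(), year.max(), nseg)  # n x n_y
    # B = (1 : B_a : B_y) for data stacked column by column.
    B = np.hstack([np.ones((m * n, 1)),
                   np.kron(np.ones((n, 1)), Ba1),
                   np.kron(By1, np.ones((m, 1)))])
    na, ny = Ba1.shape[1], By1.shape[1]
    Da = np.diff(np.eye(na), n=pord, axis=0)
    Dy = np.diff(np.eye(ny), n=pord, axis=0)
    P = np.zeros((1 + na + ny,) * 2)                         # blockdiag(0, P_a, P_y)
    P[1:1 + na, 1:1 + na] = lam_a * Da.T @ Da
    P[1 + na:, 1 + na:] = lam_y * Dy.T @ Dy
    P += ridge * np.eye(P.shape[0])                          # assumed ridge remedy
    y, e = Y.ravel(order="F"), E.ravel(order="F")
    eta = np.log((y + 0.5) / e)
    for _ in range(max_iter):
        mu = e * np.exp(eta)
        z = eta + (y - mu) / mu
        a = np.linalg.solve(B.T @ (mu[:, None] * B) + P, B.T @ (mu * z))
        eta_new = B @ a
        done = np.max(np.abs(eta_new - eta)) < tol
        eta = eta_new
        if done:
            break
    mu = e * np.exp(eta)
    G = B.T @ (mu[:, None] * B)
    ed = np.trace(np.linalg.solve(G + P, G))                 # tr(H) in closed form
    pos = y > 0
    dev = 2.0 * (np.sum(y[pos] * np.log(y[pos] / mu[pos])) - np.sum(y - mu))
    return {"a": a, "mu": mu.reshape((m, n), order="F"), "dev": dev, "ed": ed,
            "aic": dev + 2.0 * ed, "bic": dev + np.log(m * n) * ed,
            "n_par": B.shape[1]}
```

In practice one would evaluate `fit_gam` over a grid of `(lam_a, lam_y)` values and keep the pair with the smallest AIC or BIC.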
Computational issues
No need for backfitting: closed form for $H$.
Singular matrix; possible remedies:
  Ridge penalty
  Generalized inverse
  Use a different parametrisation
Number of parameters = ncol(B), much smaller than $N$.
Fast when $N$ is large; not possible with cubic smoothing splines.
Model 1
[Figure: log(mu) against Year, two panels: Age 34 and Age 60.]
2. Two-dimensional smoothing with penalties
Suppose the log mortalities form a matrix of parameters, written by columns and by rows:
$$\log \Theta = A = (a_1, \ldots, a_n), \qquad A' = (a^r_1, \ldots, a^r_m)$$
and impose a smoothness condition on each row and column of $A$:
$$l(A; Y) - \tfrac{1}{2} \lambda_a \sum_{j=1}^{n} a_j' D_a'D_a a_j - \tfrac{1}{2} \lambda_y \sum_{i=1}^{m} a^{r\prime}_i D_y'D_y a^r_i$$
$$= l(a; y) - \tfrac{1}{2} a'(\lambda_a P_a + \lambda_y P_y) a$$
$$a = (a_1', \ldots, a_n')', \qquad P_a = I_n \otimes D_a'D_a, \qquad P_y = D_y'D_y \otimes I_m$$
$$\hat{a}_{t+1} = (W_t + P)^{-1} W_t z_t, \qquad P = \lambda_a P_a + \lambda_y P_y$$
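The link between the column and row penalties and their Kronecker form can be checked numerically. The sketch below uses a small random matrix as a stand-in for $A$; the sizes, the penalty order and the variable names are assumptions chosen only for illustration.

```python
import numpy as np

m, n, pord = 6, 5, 2                       # ages, years, penalty order (illustrative)
rng = np.random.default_rng(0)
A = rng.normal(size=(m, n))                # stand-in for the matrix of log mortalities
Da = np.diff(np.eye(m), n=pord, axis=0)    # differences down each column (age)
Dy = np.diff(np.eye(n), n=pord, axis=0)    # differences along each row (year)

a = A.ravel(order="F")                     # a = (a_1', ..., a_n')', columns stacked
Pa = np.kron(np.eye(n), Da.T @ Da)         # I_n (x) Da'Da
Py = np.kron(Dy.T @ Dy, np.eye(m))         # Dy'Dy (x) I_m

# Sums over columns and rows, written exactly as in the penalized likelihood.
col_pen = sum(A[:, j] @ Da.T @ Da @ A[:, j] for j in range(n))
row_pen = sum(A[i, :] @ Dy.T @ Dy @ A[i, :] for i in range(m))

kron_col_pen = a @ Pa @ a                  # equals col_pen
kron_row_pen = a @ Py @ a                  # equals row_pen
```

The stacking order matters: with columns stacked, the age penalty is block diagonal and the year penalty couples neighbouring blocks.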
Computational issues
Algorithm: iterate between rows and columns.
Working variable to update the column estimates, where $M = (\mu_{ij})$ and the division by $M$ is elementwise:
$$Z = (z_1, \ldots, z_n) = A + (Y - M - \lambda_y A D_y'D_y)/M$$
Updated estimate of $a_j$, $j = 1, \ldots, n$:
$$a_j = (\mathrm{diag}(\mu_j) + \lambda_a D_a'D_a)^{-1} \mathrm{diag}(\mu_j) z_j$$
Copes with the potential computational problems associated with two-dimensional smoothing with large data sets.
Problem: $\mathrm{tr}(H)$ cannot be calculated, so AIC and BIC cannot be computed.
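The row and column iteration might be sketched as below. Only the column update is given on the slide; the row update used here is assumed to be its mirror image (the age penalty moves into the working variable and each row is solved against the year penalty). The function name, starting values and stopping rule are also assumptions.

```python
import numpy as np

def fit_rows_columns(Y, E, lam_a, lam_y, pord=2, max_sweeps=500, tol=1e-9):
    """Alternate column and row updates of A = log(Theta); M = E * exp(A)."""
    Y = np.asarray(Y, dtype=float)
    E = np.asarray(E, dtype=float)
    m, n = Y.shape
    Da = np.diff(np.eye(m), n=pord, axis=0)
    Dy = np.diff(np.eye(n), n=pord, axis=0)
    Pa = lam_a * Da.T @ Da                 # m x m, acts on the columns of A
    Py = lam_y * Dy.T @ Dy                 # n x n, acts on the rows of A
    A = np.log((Y + 0.5) / E)              # assumed starting values
    for _ in range(max_sweeps):
        A_old = A.copy()
        # Column step (as on the slide): year penalty sits in the working variable.
        M = E * np.exp(A)
        Z = A + (Y - M - A @ Py) / M
        A = np.column_stack([
            np.linalg.solve(np.diag(M[:, j]) + Pa, M[:, j] * Z[:, j])
            for j in range(n)])
        # Row step (assumed mirror image): age penalty sits in the working variable.
        M = E * np.exp(A)
        Z = A + (Y - M - Pa @ A) / M
        A = np.vstack([
            np.linalg.solve(np.diag(M[i, :]) + Py, M[i, :] * Z[i, :])
            for i in range(m)])
        if np.max(np.abs(A - A_old)) < tol:
            break
    return A
```

Each solve involves only an $m \times m$ or $n \times n$ system, which is why the scheme sidesteps the size of the full two-dimensional problem. Convergence can be slow when the smoothing parameters are large relative to the expected counts.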
Model 2
[Figure: log(mu) against Year, two panels: Age 34 and Age 60.]
3. Dimension reduction using P-splines
$B_a$, $m \times n_a$: one-dimensional B-spline basis for smoothing by age for a single year.
$B_y$, $n \times n_y$: one-dimensional B-spline basis for smoothing by year for a single age.
Assume that
$$\log \theta = Ba, \qquad B = B_y \otimes B_a$$
Equivalent to Model 2 with $a$ in matrix form:
$$A = (a_1, \ldots, a_{n_y}), \qquad A' = (a^r_1, \ldots, a^r_{n_a})$$
$$l(a; y) - \tfrac{1}{2} a'(\lambda_a P_a + \lambda_y P_y) a$$
$$P_a = I_{n_y} \otimes D_a'D_a \quad \text{and} \quad P_y = D_y'D_y \otimes I_{n_a}$$
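The tensor-product basis and its penalties can be built directly with Kronecker products. In this sketch the age and year ranges follow the data description, while the numbers of segments, the random coefficient matrix and all variable names are assumptions for illustration.

```python
import numpy as np

def bspline_basis(x, xl, xr, nseg=10, deg=3):
    """B-spline basis on equally spaced knots (Cox-de Boor recursion); assumes deg >= 1."""
    x = np.asarray(x, dtype=float)
    dx = (xr - xl) / nseg
    knots = xl + dx * np.arange(-deg, nseg + deg + 1)
    B = ((x[:, None] >= knots[None, :-1]) & (x[:, None] < knots[None, 1:])).astype(float)
    for d in range(1, deg + 1):
        left = (x[:, None] - knots[None, :-d - 1]) / (d * dx)
        right = (knots[None, d + 1:] - x[:, None]) / (d * dx)
        B = left * B[:, :-1] + right * B[:, 1:]
    return B

age = np.arange(11.0, 101.0)                               # m = 90 ages
year = np.arange(1947.0, 2000.0)                           # n = 53 years
Ba = bspline_basis(age, age.min(), age.max(), nseg=10)     # m x n_a
By = bspline_basis(year, year.min(), year.max(), nseg=6)   # n x n_y
B = np.kron(By, Ba)                                        # (m n) x (n_a n_y)

na, ny = Ba.shape[1], By.shape[1]
A = np.random.default_rng(0).normal(size=(na, ny))         # coefficients in matrix form
a = A.ravel(order="F")                                     # columns of A stacked
log_theta_vec = B @ a                                      # vector form, log(theta) = Ba
log_theta_mat = Ba @ A @ By.T                              # same values as an m x n surface

Da = np.diff(np.eye(na), n=2, axis=0)
Dy = np.diff(np.eye(ny), n=2, axis=0)
Pa = np.kron(np.eye(ny), Da.T @ Da)                        # I_{n_y} (x) Da'Da
Py = np.kron(Dy.T @ Dy, np.eye(na))                        # Dy'Dy (x) I_{n_a}
```

The matrix form `Ba @ A @ By.T` gives the same surface as `B @ a` without ever building the large Kronecker basis.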
Computational issues
With bdeg = 0, $n_a = m$ and $n_y = n$ we get $B = I_{nm}$ and Model 2 = Model 3, but it is not possible to fit it.
Matrix $B$ is $N \times n_a n_y$: storage problems.
Solution:
  Work with the partitioned matrix $B = [B_1, B_2, B_3]$.
  Take advantage of the banded nature of $B$.
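One way to avoid storing $B$ is to accumulate the normal equations block by block. The slide does not spell out how the partition is chosen, so the sketch below assumes one block of rows per year and uses the identity $(b \otimes B_a)' W_j (b \otimes B_a) = (b b') \otimes (B_a' W_j B_a)$, where $b'$ is the row of $B_y$ for that year. The function name is hypothetical, and the banded structure is not exploited here.

```python
import numpy as np

def normal_equations_by_year(Ba, By, W, Z):
    """Accumulate B'WB and B'Wz for B = By (x) Ba without forming B.

    W and Z are m x n matrices of weights and working values (ages by years).
    """
    na, ny = Ba.shape[1], By.shape[1]
    G = np.zeros((na * ny, na * ny))
    r = np.zeros(na * ny)
    for j in range(By.shape[0]):
        b = By[j]                                   # row of By for year j
        Ga = Ba.T @ (W[:, j, None] * Ba)            # Ba' W_j Ba, only n_a x n_a
        G += np.kron(np.outer(b, b), Ga)            # contribution of year j to B'WB
        r += np.kron(b, Ba.T @ (W[:, j] * Z[:, j])) # contribution of year j to B'Wz
    return G, r
```

Only the $n_a n_y \times n_a n_y$ system is ever held in memory, so the cost in storage no longer depends on $N$.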
Model 3
[Figure: log(mu) against Year, two panels: Age 34 and Age 60.]
[Figure: three surface plots of log(mu) against Year and Age.]
Conclusions and future work
P-splines are a useful tool for modelling two-dimensional Poisson data.
Investigate a method for approximating the value of tr(H) in Model 2.
Develop methods for dealing with over-dispersion.
Fit the models in the context of GLMMs.
Compare with age-period-cohort models.
[Figure: surface plot with axes X, Y and Z.]