Using P-splines to smooth two-dimensional Poisson data

Maria Durbán (1), Iain Currie (2), Paul Eilers (3)
17th IWSM, July 2002

(1) Dept. Statistics and Econometrics, Universidad Carlos III de Madrid, Spain
(2) Dept. Actuarial Mathematics and Statistics, Heriot-Watt University, Edinburgh, UK
(3) Department of Medical Statistics, Leiden University Medical Center, The Netherlands

What is this talk about?

- Introduction
  - The data
  - P-splines
  - Smoothing Poisson data with P-splines (one-dimensional case)
- Several models for two-dimensional Poisson data
  - Generalized additive model
  - Two-dimensional smoothing with penalties
  - Dimension reduction using P-splines
- Discuss computational issues for large data sets
- Analysis of mortality data

The data

Male policyholders; source: Continuous Mortality Investigation Bureau (CMIB).

For each calendar year (1947-1999) and each age (11-100) we have:
- Number of years lived (the exposure).
- Number of policy claims (deaths).

Mortality of male policyholders has improved rapidly over the last 30 years.

Model mortality trends over time and dependence on age.

P-splines

- Use B-splines as the basis for the regression.
- Modify the log-likelihood by a difference penalty on the regression coefficients.

$$y = f(x) + \epsilon, \qquad f(x) \approx Ba$$
$$S = (y - Ba)'(y - Ba) + \lambda a'D'Da$$
$$\hat{a} = (B'B + \lambda D'D)^{-1} B'y$$

[Figure: two panels, "B-spline basis" and "Scaled B-splines and their sum".]
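The penalized least-squares estimate can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names `bspline_basis` and `pspline_fit`, the number of segments `nseg`, the spline degree `bdeg`, the penalty order `pord`, and the test data are all assumptions made for the example. The basis is built on equally spaced knots with the Cox-de Boor recursion.

```python
import numpy as np

def bspline_basis(x, xl, xr, nseg, bdeg=3):
    """B-spline basis on equally spaced knots (Cox-de Boor recursion)."""
    dx = (xr - xl) / nseg
    t = xl + dx * np.arange(-bdeg, nseg + bdeg + 1)    # knots, extended past [xl, xr]
    xc = x[:, None]
    B = ((xc >= t[:-1]) & (xc < t[1:])).astype(float)  # degree 0: interval indicators
    for k in range(1, bdeg + 1):
        B = ((xc - t[:-k - 1]) * B[:, :-1] + (t[k + 1:] - xc) * B[:, 1:]) / (k * dx)
    return B                                           # shape (len(x), nseg + bdeg)

def pspline_fit(x, y, lam, nseg=10, bdeg=3, pord=2):
    """Solve (B'B + lam D'D) a = B'y, with D a difference matrix of order pord."""
    B = bspline_basis(x, x.min(), x.max(), nseg, bdeg)
    D = np.diff(np.eye(B.shape[1]), n=pord, axis=0)
    a_hat = np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ y)
    return B, a_hat

# Illustrative data: an exactly linear signal.
x = np.linspace(0.0, 1.0, 50)
y = 2.0 + 3.0 * x
B, a_hat = pspline_fit(x, y, lam=10.0)
fitted = B @ a_hat
```

With a second-order difference penalty, a straight line carries no penalty, so the fit in this example reproduces the linear data whatever the value of `lam`; noisy data would instead be shrunk towards a line as `lam` grows.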

Poisson data and P-splines, 1D case

$E_x$ = number of years lived aged $x$
$y_x$ = number of deaths aged $x$

$$Y_x \sim \mathcal{P}(E_x \theta_x), \qquad \eta = \log(\theta_x) = Ba$$

Maximise $l(a; y_x) - \tfrac{1}{2} \lambda a'D'Da$:

$$\hat{a}_{t+1} = (B'W_t B + \lambda D'D)^{-1} B'W_t z_t$$

where $z = \eta + W^{-1}(y - \mu)$ is the working variable and $W = \operatorname{diag}(\mu)$ is the diagonal matrix of weights.
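The iteration above is a penalized version of iteratively reweighted least squares. A minimal NumPy sketch follows, assuming that the exposure enters as $\mu = E\,e^{\eta}$; the function names, the starting value, the convergence tolerance, and the simulated ages, exposures and rates are illustrative assumptions and are not taken from the talk.

```python
import numpy as np

def bspline_basis(x, xl, xr, nseg, bdeg=3):
    """B-spline basis on equally spaced knots (Cox-de Boor recursion)."""
    dx = (xr - xl) / nseg
    t = xl + dx * np.arange(-bdeg, nseg + bdeg + 1)
    xc = x[:, None]
    B = ((xc >= t[:-1]) & (xc < t[1:])).astype(float)
    for k in range(1, bdeg + 1):
        B = ((xc - t[:-k - 1]) * B[:, :-1] + (t[k + 1:] - xc) * B[:, 1:]) / (k * dx)
    return B

def poisson_pspline(x, y, E, lam, nseg=10, bdeg=3, pord=2, n_iter=100, tol=1e-10):
    """Penalized IRLS for y ~ Poisson(E * exp(B a))."""
    B = bspline_basis(x, x.min(), x.max(), nseg, bdeg)
    D = np.diff(np.eye(B.shape[1]), n=pord, axis=0)
    P = lam * D.T @ D
    a = np.full(B.shape[1], np.log(y.sum() / E.sum()))  # start at the crude overall rate
    for _ in range(n_iter):
        eta = B @ a
        mu = E * np.exp(eta)
        z = eta + (y - mu) / mu              # working variable
        BtW = B.T * mu                       # B'W with W = diag(mu)
        a_new = np.linalg.solve(BtW @ B + P, BtW @ z)
        done = np.max(np.abs(a_new - a)) < tol
        a = a_new
        if done:
            break
    return B, P, a

# Simulated data (assumed values, for illustration only).
rng = np.random.default_rng(0)
age = np.arange(30.0, 91.0)
E = np.full(age.size, 5000.0)
theta_true = np.exp(-9.0 + 0.09 * age)
y = rng.poisson(E * theta_true).astype(float)

B, P, a_hat = poisson_pspline(age, y, E, lam=10.0)
mu_hat = E * np.exp(B @ a_hat)
score = B.T @ (y - mu_hat) - P @ a_hat       # penalized score, zero at the maximum
```

At convergence the penalized score $B'(y - \mu) - \lambda D'Da$ should vanish, which gives a simple check that the iteration has reached the maximum of the penalized log-likelihood.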

1. A generalized additive model

$Y = (y_{ij})$, matrix of deaths at age $i = 1, \ldots, m$ and year $j = 1, \ldots, n$.
$E = (E_{ij})$, exposure.
$\Theta = (\theta_{ij})$ and $\log \theta = Ba$, with

$$a = (\alpha, a_1', a_2')', \qquad B = (1 : B_a : B_y)$$

- $B_a$, $N \times n_a$, set of B-splines for age
- $B_y$, $N \times n_y$, set of B-splines for years

$$\hat{a}_{t+1} = (B'W_t B + P)^{-} B'W_t z_t$$

$$P = \operatorname{blockdiag}(0, P_a, P_y)$$

where $P_a = \lambda_a D_a'D_a$ and $P_y = \lambda_y D_y'D_y$ are the penalty matrices for age and year, and $^{-}$ denotes a generalized inverse.

Smoothing parameter selection:

$$\operatorname{dev}(y; a, \lambda_a, \lambda_y) + \delta \operatorname{tr}(H)$$

- $\delta = 2$: AIC
- $\delta = \log(N)$: BIC, with $N = mn$ the number of observations
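A rough sketch of the additive model and its information criteria is given below. It assumes the data matrices are stacked column by column (age varying fastest), uses a small ridge term to handle the singular system, and simulates its own data; the function names, `ridge`, `nseg`, the smoothing parameters and all numerical values are assumptions for illustration only. Note that the ridge is applied to every coefficient, including the intercept.

```python
import numpy as np

def bspline_basis(x, xl, xr, nseg, bdeg=3):
    """B-spline basis on equally spaced knots (Cox-de Boor recursion)."""
    dx = (xr - xl) / nseg
    t = xl + dx * np.arange(-bdeg, nseg + bdeg + 1)
    xc = x[:, None]
    B = ((xc >= t[:-1]) & (xc < t[1:])).astype(float)
    for k in range(1, bdeg + 1):
        B = ((xc - t[:-k - 1]) * B[:, :-1] + (t[k + 1:] - xc) * B[:, 1:]) / (k * dx)
    return B

def poisson_deviance(y, mu):
    # y * log(y / mu) is taken as 0 when y == 0
    ylog = np.where(y > 0, y * np.log(np.where(y > 0, y, 1.0) / mu), 0.0)
    return 2.0 * np.sum(ylog - (y - mu))

def fit_additive(Y, E, ages, years, lam_a, lam_y, nseg=5, pord=2, ridge=1e-6):
    m, n = Y.shape
    Ba = bspline_basis(ages, ages.min(), ages.max(), nseg)
    By = bspline_basis(years, years.min(), years.max(), nseg)
    na, ny = Ba.shape[1], By.shape[1]
    # B = (1 : B_a : B_y), rows ordered as vec(Y)
    B = np.hstack([np.ones((m * n, 1)),
                   np.kron(np.ones((n, 1)), Ba),
                   np.kron(By, np.ones((m, 1)))])
    Da = np.diff(np.eye(na), n=pord, axis=0)
    Dy = np.diff(np.eye(ny), n=pord, axis=0)
    P = ridge * np.eye(1 + na + ny)          # ridge term: B'WB + P is otherwise singular
    P[1:1 + na, 1:1 + na] += lam_a * Da.T @ Da
    P[1 + na:, 1 + na:] += lam_y * Dy.T @ Dy
    y, e = Y.ravel(order="F"), E.ravel(order="F")
    a = np.zeros(B.shape[1])
    a[0] = np.log(y.sum() / e.sum())
    for _ in range(100):
        eta = B @ a
        mu = e * np.exp(eta)
        z = eta + (y - mu) / mu
        BtW = B.T * mu
        a = np.linalg.solve(BtW @ B + P, BtW @ z)
        if np.max(np.abs(B @ a - eta)) < 1e-9:
            break
    mu = e * np.exp(B @ a)
    BtWB = (B.T * mu) @ B
    ed = np.trace(np.linalg.solve(BtWB + P, BtWB))   # tr(H), in closed form
    dev = poisson_deviance(y, mu)
    return {"dev": dev, "ed": ed, "aic": dev + 2.0 * ed,
            "bic": dev + np.log(y.size) * ed}

# Simulated additive surface (assumed values, for illustration only).
rng = np.random.default_rng(1)
ages, years = np.arange(30.0, 90.0), np.arange(1950.0, 2000.0)
E = np.full((ages.size, years.size), 5000.0)
log_theta = -9.0 + 0.09 * ages[:, None] - 0.01 * (years[None, :] - 1950.0)
Y = rng.poisson(E * np.exp(log_theta)).astype(float)
res = fit_additive(Y, E, ages, years, lam_a=10.0, lam_y=10.0)
```

In practice `fit_additive` would be evaluated over a grid of `(lam_a, lam_y)` values and the pair minimising AIC or BIC retained; the grid search is omitted here.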

Computational issues

- No need for backfitting: closed form for $H$.
- Singular matrix. Options:
  - ridge penalty
  - generalized inverse
  - use a different parametrisation
- Number of parameters = ncol($B$), much smaller than $N$.
- Fast when $N$ is large; not possible with cubic smoothing splines.

[Figure: Model 1. log(mu) against year, for age 34 and age 60.]

2. Two-dimensional smoothing with penalties

Suppose the log mortalities form a matrix of parameters,

$$\log \Theta = A = (a_1, \ldots, a_n), \qquad A' = (a^r_1, \ldots, a^r_m),$$

and impose a smoothness condition on each row and column of $A$:

$$l(A; Y) - \tfrac{1}{2} \lambda_a \sum_{j=1}^{n} a_j' D_a'D_a a_j - \tfrac{1}{2} \lambda_y \sum_{i=1}^{m} {a^r_i}' D_y'D_y a^r_i$$

$$= l(a; y) - \tfrac{1}{2} a'(\lambda_a P_a + \lambda_y P_y) a$$

$$a = (a_1', \ldots, a_n')', \qquad P_a = I_n \otimes D_a'D_a, \qquad P_y = D_y'D_y \otimes I_m$$

$$\hat{a}_{t+1} = (W_t + P)^{-1} W_t z_t, \qquad P = \lambda_a P_a + \lambda_y P_y$$
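The equality between the column-wise and row-wise sums and the Kronecker form of the penalty can be checked numerically. The sketch below uses a small random matrix in place of $A$ and second-order differences; the dimensions, the penalty order and the random values are assumptions for illustration.

```python
import numpy as np

m, n, pord = 6, 5, 2                       # small illustrative dimensions
rng = np.random.default_rng(1)
A = rng.normal(size=(m, n))                # stand-in for the matrix of log mortalities

Da = np.diff(np.eye(m), n=pord, axis=0)    # differences down the columns (age)
Dy = np.diff(np.eye(n), n=pord, axis=0)    # differences along the rows (year)

a = A.ravel(order="F")                     # a = (a_1', ..., a_n')', columns stacked
Pa = np.kron(np.eye(n), Da.T @ Da)         # P_a = I_n (x) D_a'D_a
Py = np.kron(Dy.T @ Dy, np.eye(m))         # P_y = D_y'D_y (x) I_m

col_pen = sum(A[:, j] @ Da.T @ Da @ A[:, j] for j in range(n))
row_pen = sum(A[i, :] @ Dy.T @ Dy @ A[i, :] for i in range(m))

kron_col_pen = a @ Pa @ a
kron_row_pen = a @ Py @ a
```

The ordering of the Kronecker factors depends on stacking $A$ by columns; stacking by rows would swap the two factors in each product.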

Computational issues

Algorithm: iterate between rows and columns.

Working variable to update the column estimates:

$$Z = (z_1, \ldots, z_n) = A + (Y - M - \lambda_y A D_y'D_y)/M,$$

where $M = (\mu_{ij})$ and the division is elementwise.

Updated estimate of $a_j$, $j = 1, \ldots, n$:

$$a_j = (\operatorname{diag}(\mu_j) + \lambda_a D_a'D_a)^{-1} \operatorname{diag}(\mu_j) z_j.$$

- Copes with the potential computational problems associated with two-dimensional smoothing with large data sets.
- Problem: $\operatorname{tr}(H)$ cannot be calculated, so AIC and BIC cannot be computed.
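The column update can be sketched as follows. This is a simplified illustration rather than the full algorithm: only the column step given above is implemented and repeated, the analogous row step is omitted, and the function name `column_sweep`, the starting value, the number of sweeps, the smoothing parameters and the simulated data are all assumptions. With these assumed values (large expected counts relative to `lam_y`) the repeated sweep settles at a solution of the penalized score equations; that behaviour is not guaranteed for other settings.

```python
import numpy as np

def column_sweep(A, Y, E, Da, Dy, lam_a, lam_y):
    """One pass of the column update, using the working matrix Z."""
    M = E * np.exp(A)                                   # expected deaths
    Z = A + (Y - M - lam_y * A @ Dy.T @ Dy) / M         # elementwise division
    Pa = lam_a * Da.T @ Da
    A_new = np.empty_like(A)
    for j in range(A.shape[1]):
        mu_j = M[:, j]
        A_new[:, j] = np.linalg.solve(np.diag(mu_j) + Pa, mu_j * Z[:, j])
    return A_new

# Simulated data (assumed values, for illustration only).
rng = np.random.default_rng(2)
m, n = 20, 15
age = np.arange(m)[:, None]
year = np.arange(n)[None, :]
E = np.full((m, n), 5000.0)
Y = rng.poisson(E * np.exp(-4.0 + 0.08 * age - 0.02 * year)).astype(float)

Da = np.diff(np.eye(m), n=2, axis=0)
Dy = np.diff(np.eye(n), n=2, axis=0)
lam_a, lam_y = 1.0, 1.0

A = np.log((Y + 0.5) / E)                  # start from the (slightly shifted) raw log rates
for _ in range(100):
    A = column_sweep(A, Y, E, Da, Dy, lam_a, lam_y)

# Penalized score in matrix form; it vanishes at the penalized maximum.
M = E * np.exp(A)
score = Y - M - lam_a * Da.T @ Da @ A - lam_y * A @ Dy.T @ Dy
```

Each sweep solves $n$ systems of size $m$ instead of one system of size $mn$, which is where the computational saving comes from.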

[Figure: Model 2. log(mu) against year, for age 34 and age 60.]

3. Dimension reduction using P-splines

- $B_a$, $m \times n_a$, one-dimensional B-spline basis for smoothing by age for a single year
- $B_y$, $n \times n_y$, one-dimensional B-spline basis for smoothing by year for a single age

Assume that

$$\log \theta = Ba, \qquad B = B_y \otimes B_a.$$

Equivalent to Model 2 with $a$ in matrix form:

$$A = (a_1, \ldots, a_{n_y}), \qquad A' = (a^r_1, \ldots, a^r_{n_a}).$$

$$l(a; y) - \tfrac{1}{2} a'(\lambda_a P_a + \lambda_y P_y) a$$

$$P_a = I_{n_y} \otimes D_a'D_a \quad \text{and} \quad P_y = D_y'D_y \otimes I_{n_a}$$
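A small end-to-end sketch of this model is given below: the Kronecker basis, the two penalties, and a penalized IRLS fit. The grid of ages and years, the exposures, the true surface, `nseg`, and the smoothing parameters are assumed values chosen so that the full matrix $B$ stays small; nothing here is taken from the CMIB data.

```python
import numpy as np

def bspline_basis(x, xl, xr, nseg, bdeg=3):
    """B-spline basis on equally spaced knots (Cox-de Boor recursion)."""
    dx = (xr - xl) / nseg
    t = xl + dx * np.arange(-bdeg, nseg + bdeg + 1)
    xc = x[:, None]
    B = ((xc >= t[:-1]) & (xc < t[1:])).astype(float)
    for k in range(1, bdeg + 1):
        B = ((xc - t[:-k - 1]) * B[:, :-1] + (t[k + 1:] - xc) * B[:, 1:]) / (k * dx)
    return B

rng = np.random.default_rng(3)
ages = np.arange(30.0, 70.0)               # m = 40
years = np.arange(1960.0, 1990.0)          # n = 30
m, n = ages.size, years.size

Ba = bspline_basis(ages, ages[0], ages[-1], nseg=5)
By = bspline_basis(years, years[0], years[-1], nseg=5)
na, ny = Ba.shape[1], By.shape[1]
B = np.kron(By, Ba)                        # rows follow vec(Y): age fastest within year

Da = np.diff(np.eye(na), n=2, axis=0)
Dy = np.diff(np.eye(ny), n=2, axis=0)
lam_a, lam_y = 10.0, 10.0
P = lam_a * np.kron(np.eye(ny), Da.T @ Da) + lam_y * np.kron(Dy.T @ Dy, np.eye(na))

# Simulated deaths on a smooth surface (assumed values).
E = np.full((m, n), 5000.0)
log_theta = -9.0 + 0.09 * ages[:, None] - 0.01 * (years[None, :] - 1960.0)
Y = rng.poisson(E * np.exp(log_theta)).astype(float)
y, e = Y.ravel(order="F"), E.ravel(order="F")

a = np.full(na * ny, np.log(y.sum() / e.sum()))
for _ in range(100):
    eta = B @ a
    mu = e * np.exp(eta)
    z = eta + (y - mu) / mu
    BtW = B.T * mu
    a_new = np.linalg.solve(BtW @ B + P, BtW @ z)
    converged = np.max(np.abs(a_new - a)) < 1e-10
    a = a_new
    if converged:
        break

mu_hat = e * np.exp(B @ a)
score = B.T @ (y - mu_hat) - P @ a         # penalized score, zero at the maximum
A_hat = a.reshape((na, ny), order="F")     # coefficients in matrix form
fitted_log_theta = Ba @ A_hat @ By.T       # same as B @ a, arranged as an m x n surface
```

The last line shows the matrix form of the linear predictor, $B_a A B_y'$, which gives the same values as $Ba$ without using the large Kronecker product.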

Computational issues

- With bdeg = 0, $n_a = m$ and $n_y = n$, we get $B = I_{nm}$ and Model 2 = Model 3, but it is not possible to fit it.
- Matrix $B$ is $N \times n_a n_y$: storage problems.
- Solution:
  - work with the partitioned matrix $B = [B_1, B_2, B_3]$
  - take advantage of the banded nature of $B$
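The size and sparsity of $B$ can be illustrated with the dimensions of the data set (ages 11-100, years 1947-1999). The numbers of segments below are assumed values, and the sketch only counts nonzero entries; it does not implement the partitioned or banded computation itself.

```python
import numpy as np

def bspline_basis(x, xl, xr, nseg, bdeg=3):
    """B-spline basis on equally spaced knots (Cox-de Boor recursion)."""
    dx = (xr - xl) / nseg
    t = xl + dx * np.arange(-bdeg, nseg + bdeg + 1)
    xc = x[:, None]
    B = ((xc >= t[:-1]) & (xc < t[1:])).astype(float)
    for k in range(1, bdeg + 1):
        B = ((xc - t[:-k - 1]) * B[:, :-1] + (t[k + 1:] - xc) * B[:, 1:]) / (k * dx)
    return B

bdeg = 3
ages = np.arange(11.0, 101.0)              # 90 ages
years = np.arange(1947.0, 2000.0)          # 53 years
Ba = bspline_basis(ages, 11.0, 100.0, nseg=17, bdeg=bdeg)    # 90 x 20 (nseg assumed)
By = bspline_basis(years, 1947.0, 1999.0, nseg=10, bdeg=bdeg)  # 53 x 13 (nseg assumed)

B = np.kron(By, Ba)                        # N x (n_a n_y) = 4770 x 260
nnz_per_row = np.count_nonzero(B, axis=1)
density = np.count_nonzero(B) / B.size
```

Each row of a one-dimensional basis of degree `bdeg` has at most `bdeg + 1` nonzero entries, so each row of the Kronecker product has at most `(bdeg + 1) ** 2`, here 16 out of 260 columns. A sparse or banded representation would store only those entries.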

[Figure: Model 3. log(mu) against year, for age 34 and age 60.]

[Figure: three surface plots of log(mu) against year and age.]

Conclusions and future work

- P-splines are a useful tool to model two-dimensional Poisson data.
- Investigate a method for approximating the value of $\operatorname{tr}(H)$ in Model 2.
- Develop methods for dealing with over-dispersion.
- Fit the models in the context of GLMM.
- Comparison with age-period-cohort models.