Hanxiang Department of Mathematical Sciences Indiana University-Purdue University at Indianapolis March 4, 2009
Outline Project I: Free Knot Spline Cox Model
Project I: Free Knot Spline Cox Model
Consider a parametric Cox model:
$$h(t) = h_0(t)\exp(g_\theta(t, Z(t))), \tag{1}$$
where $g_\theta$ is a smooth function. Our candidate for $g_\theta$ is a free-knot spline.
i. A quadratic free-knot polynomial spline with a knot in time:
$$g_\theta(t, z) = \left(\beta_0 + \beta_1 t + \beta_2 t^2 + \beta_3 (t-\gamma)_+^2\right) z,$$
where $\theta = [\beta_0, \beta_1, \beta_2, \beta_3, \gamma]$ is the parameter of interest. The knot $\gamma$ can be a threshold value such as a changepoint.
ii. A quadratic free-knot polynomial spline with a knot in the covariate:
$$g_\theta(z) = \beta_1 z + \beta_2 z^2 + \beta_3 (z-\kappa)_+^2,$$
where $\theta = [\beta_1, \beta_2, \beta_3, \kappa]$ is the parameter of interest. The knot $\kappa$ can be a threshold value such as a nadir (of BMI), a changepoint, etc.
iii. Applications? B-splines? Natural splines?
iv. $g_\theta$ has a continuous first derivative with respect to $\theta$,
$$\dot g_\theta(z) = \left[z,\; z^2,\; (z-\kappa)^2 1_{\{z>\kappa\}},\; -2\beta_3 (z-\kappa) 1_{\{z>\kappa\}}\right],$$
but the second derivative does not exist at the knot $\kappa$.
v. Consistency? Asymptotic normality? Under a continuous first derivative, we have obtained both.
vi. Consistency? Asymptotic normality?
Knots in covariates: $g_\theta(z) = \beta_1 z + \beta_2 (z-\kappa)_+$, $\quad g_\theta(z) = \beta\left(1_{\{z \ge \kappa\}} - 1_{\{z<\kappa\}}\right)$.
Knots in time: $g_\theta(t, z) = \left(\beta_0 + \beta_1 t + \beta_2 (t-\gamma)_+\right) z$, $\quad g_\theta(t, z) = \beta\left(1_{\{t \ge \gamma\}} - 1_{\{t<\gamma\}}\right) z$.
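The quadratic spline with a knot in the covariate and its gradient in $\theta$ can be sketched numerically. The block below is only an illustration: the function names and the parameter values (a BMI-like knot at 25) are hypothetical and not taken from the slides. It compares the analytic gradient with a central finite difference, including at the knot itself, where the gradient remains continuous.

```python
def g(z, theta):
    """Quadratic free-knot spline g_theta(z) with one knot kappa in the covariate."""
    b1, b2, b3, kappa = theta
    u = max(z - kappa, 0.0)          # truncated part (z - kappa)_+
    return b1 * z + b2 * z * z + b3 * u * u


def grad_g(z, theta):
    """Analytic gradient of g with respect to theta = [b1, b2, b3, kappa]."""
    b1, b2, b3, kappa = theta
    u = max(z - kappa, 0.0)
    return [z, z * z, u * u, -2.0 * b3 * u]


def numeric_grad(z, theta, eps=1e-6):
    """Central finite-difference approximation of the same gradient."""
    out = []
    for j in range(len(theta)):
        up, dn = list(theta), list(theta)
        up[j] += eps
        dn[j] -= eps
        out.append((g(z, up) - g(z, dn)) / (2.0 * eps))
    return out


# Hypothetical parameter values, chosen only for illustration.
theta = [-0.4, 0.01, 0.02, 25.0]
for z in (22.0, 25.0, 30.0):         # below, at, and above the knot
    print(z, grad_g(z, theta), numeric_grad(z, theta))
```

The last gradient component, the derivative in $\kappa$, is zero below the knot and linear in $z-\kappa$ above it, so it is continuous but has a kink at the knot.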
Motivations
Data consisting of groups of dependent binary random variables arise commonly in many fields, including developmental toxicity studies, longitudinal studies, studies of familial diseases, cluster sample surveys, and others.
The independence assumption is not appropriate in many areas of science. For example, in a developmental toxicity study, fetuses from the same litter are correlated and may respond more similarly to a stimulus than fetuses from different litters. This intra-litter correlation causes over-dispersion (extra-binomial variation).
The concept of exchangeability, intensely studied over the past century, is meant to capture the notion of symmetry in a collection of random variables and is often used as an alternative to independence.
Disadvantages of Some Commonly Used Methods
Statistical inference based on conditional analysis does not use any information about the distribution of the latent effects, since that information is lost in the conditioning.
The EM algorithm is conceptually simple, but the computations may be formidable because each E-step may require a numerical integration.
The empirical Bayes approach may also be computationally difficult.
The quasi-likelihood and GEE approaches (e.g. Liang, Qaqish and Zeger, 1992) use only the first two moments, while higher-order correlation is approximated by a working matrix.
The Proposed Approach
By relaxing independence to exchangeability, we introduce a rich family of parsimonious distributions resulting from completely monotonic functions, extending GLMs.
We present a general framework that unifies existing procedures such as the beta-binomial and Williams procedures, and gives many new distributions.
Our approach uses all the distributional information, including moments of all orders, so it is a full likelihood approach.
The pmfs are simple both mathematically and computationally.
Exchangeable Binomial
$B_1, \dots, B_m$ are exchangeable if, for all $\{0, 1\}$-valued $b_1, \dots, b_m$,
$$P(B_1 = b_1, \dots, B_m = b_m) = P(B_{\pi_1} = b_1, \dots, B_{\pi_m} = b_m) \tag{2}$$
for every permutation $\pi_1, \dots, \pi_m$ of $1, \dots, m$.
Let $Y = B_1 + \dots + B_m$. Then
$$P(Y = y) = \binom{m}{y} \sum_{k=0}^{m-y} (-1)^k \binom{m-y}{k} \lambda_{y+k}, \quad y = 0, 1, \dots, m,$$
where $\lambda_0 = 1$ and $\lambda_k = P(B_1 = 1, \dots, B_k = 1)$, $1 \le k \le m$, are the marginal probabilities.
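The pmf of $Y$ is a finite alternating sum in the marginal probabilities, so it is straightforward to evaluate. The sketch below (the function name is an illustrative choice, not from the slides) computes it from a list `lam` with `lam[k]` playing the role of $\lambda_k$, and checks the independent special case $\lambda_k = p^k$, which should reproduce the ordinary binomial pmf.

```python
from math import comb


def exchangeable_binomial_pmf(lam):
    """Return [P(Y=0), ..., P(Y=m)] from marginal probabilities lam[0..m], lam[0] = 1."""
    m = len(lam) - 1
    pmf = []
    for y in range(m + 1):
        # Inclusion-exclusion sum over the m - y trials that must equal zero.
        s = sum((-1) ** k * comb(m - y, k) * lam[y + k] for k in range(m - y + 1))
        pmf.append(comb(m, y) * s)
    return pmf


# Sanity check under independence: lam_k = p^k gives Binomial(m, p).
p, m = 0.3, 4
lam = [p ** k for k in range(m + 1)]
pmf = exchangeable_binomial_pmf(lam)
binomial = [comb(m, y) * p ** y * (1 - p) ** (m - y) for y in range(m + 1)]
print(max(abs(a - b) for a, b in zip(pmf, binomial)))  # difference is at rounding level
```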
Complete Monotonicity
The marginal probabilities $\lambda = \{\lambda_i : i = 0, 1, \dots, m\}$ ($\lambda_0 = 1$) are completely monotone (CM):
$$(-1)^k \Delta^k \lambda_i \ge 0, \quad i = 0, 1, \dots, m, \quad k + i \le m, \tag{3}$$
where $\Delta$ is the difference operator: $\Delta \lambda_i = \lambda_{i+1} - \lambda_i$.
Write $\theta = (\beta, \vartheta)$, where $\beta$ is the parameter of interest, while $\vartheta$ is treated as a nuisance parameter. We model
$$\lambda_j = h_j(\beta^\top X; \vartheta), \quad j = 0, 1, 2, \dots, M.$$
Generalizing the GLMs: (1) CM links extend the usual links; (2) exchangeable binomials extend binomials.
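Condition (3) can be checked directly by building the table of iterated forward differences. The following is a minimal sketch with illustrative function names and an arbitrary numerical tolerance. Note that $(-1)^k \Delta^k \lambda_i = \sum_{j=0}^{k} (-1)^j \binom{k}{j} \lambda_{i+j}$, which is the probability of $i$ ones followed by $k$ zeros, so the condition amounts to requiring that these probabilities be nonnegative.

```python
def forward_differences(lam):
    """Return table with table[k][i] = Delta^k lam_i for all k + i <= m."""
    table = [list(lam)]
    while len(table[-1]) > 1:
        prev = table[-1]
        table.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return table


def is_completely_monotone(lam, tol=1e-12):
    """Check (-1)^k Delta^k lam_i >= 0 for all k + i <= m, up to a small tolerance."""
    table = forward_differences(lam)
    return all((-1) ** k * d >= -tol for k, row in enumerate(table) for d in row)


print(is_completely_monotone([1.0, 0.5, 0.25, 0.125]))  # lam_k = 0.5^k (independent case)
print(is_completely_monotone([1.0, 0.5, 0.1, 0.09]))    # decreasing, but third difference has the wrong sign
```

The second sequence is decreasing and convex, yet it is not completely monotone, so it cannot be the marginal probabilities of exchangeable binary variables with $m = 3$.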
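Table: The Complete Monotone Links.
Name          Link ($\theta = (\theta_1, \theta_2)$)                  Parameters
Ind-Bin       $\theta^t$                                              $\theta \in (0, 1)$
MM-Bin        $\theta/(\theta + t)$                                   $\theta \in (0, \infty)$
Beta-Bin      $B(\theta_1 + t, \theta_2)/B(\theta_1, \theta_2)$       $\theta \in (0, \infty)^2$
Gamma-Bin     $(1 + \theta_2 t)^{-\theta_1}$                          $\theta \in (0, \infty)^2$
Poisson-Bin   $\exp(\theta(e^{-t} - 1))$                              $\theta \in (0, \infty)$
Normal-Bin    $2\exp((\sigma t)^2/2)(1 - \Phi(\sigma t))$             $\sigma^2 \in (0, \infty)$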
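As a rough illustration of how a link from the table produces a pmf, the sketch below assumes that the marginal probabilities are obtained by evaluating the link at the integers, $\lambda_k = h(k)$; that reading, the function names, and the parameter values are assumptions made for this example. Under that assumption the Beta-Bin link should reproduce the familiar beta-binomial pmf $\binom{m}{y} B(y + \theta_1, m - y + \theta_2)/B(\theta_1, \theta_2)$, which the code uses as a cross-check.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """log B(a, b) computed through log-gamma for numerical stability."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_bin_link(t, theta1, theta2):
    """Beta-Bin link: B(theta1 + t, theta2) / B(theta1, theta2)."""
    return exp(log_beta(theta1 + t, theta2) - log_beta(theta1, theta2))


def gamma_bin_link(t, theta1, theta2):
    """Gamma-Bin link: (1 + theta2 * t)^(-theta1)."""
    return (1.0 + theta2 * t) ** (-theta1)


def pmf_from_link(link, m):
    """Exchangeable binomial pmf with lam_k = link(k) (an assumption of this sketch)."""
    lam = [link(k) for k in range(m + 1)]
    return [
        comb(m, y) * sum((-1) ** k * comb(m - y, k) * lam[y + k] for k in range(m - y + 1))
        for y in range(m + 1)
    ]


def beta_binomial_pmf(m, a, b):
    """Standard closed-form beta-binomial pmf, used only as a reference."""
    return [comb(m, y) * exp(log_beta(y + a, m - y + b) - log_beta(a, b)) for y in range(m + 1)]


m, a, b = 5, 2.0, 3.0                       # hypothetical values
from_link = pmf_from_link(lambda t: beta_bin_link(t, a, b), m)
closed_form = beta_binomial_pmf(m, a, b)
print(max(abs(u - v) for u, v in zip(from_link, closed_form)))

gamma_pmf = pmf_from_link(lambda t: gamma_bin_link(t, 2.0, 0.5), m)
print(gamma_pmf, sum(gamma_pmf))
```

The alternating sum can lose precision for large cluster sizes, so a production implementation would likely need a more careful evaluation; for small $m$ the direct formula is adequate.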
Efficient Estimation in a Semiparametric Copula Model
Sklar's Theorem. Let $H$ be a joint CDF with margins $F$ and $G$. Then there exists a copula $C$ such that
$$H(x, y) = C(F(x), G(y)), \quad x, y \in \mathbb{R}.$$
If $F$ and $G$ are continuous, then $C$ is unique.
Parametric copulas: Archimedean copulas, Pareto copulas, Plackett copulas, etc.
Suppose $F = F_\alpha$ and $G = G_\beta$, while $C$ is completely unspecified. Suppose we have bivariate data $(X_i, Y_i)$. How can we efficiently estimate $C$?
Two issues: (1) How well can we estimate? (2) How do we construct an efficient estimate?
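To make the decomposition in Sklar's theorem concrete, the sketch below builds a joint CDF from two parametric margins and a copula. Everything specific in it is an assumption chosen for illustration: exponential margins stand in for $F_\alpha$ and $G_\beta$, and a Clayton copula (one member of the Archimedean family) stands in for $C$. In the semiparametric model on the slide, $C$ is left unspecified, so this shows only the construction, not the estimation problem.

```python
from math import exp


def clayton_copula(u, v, a):
    """Clayton copula C(u, v) = (u^-a + v^-a - 1)^(-1/a), for a > 0."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** (-a) + v ** (-a) - 1.0) ** (-1.0 / a)


def exp_cdf(x, rate):
    """Exponential CDF, a hypothetical choice for the parametric margins."""
    return 1.0 - exp(-rate * x) if x > 0 else 0.0


def joint_cdf(x, y, alpha, beta, a):
    """H(x, y) = C(F_alpha(x), G_beta(y)) as in Sklar's theorem."""
    return clayton_copula(exp_cdf(x, alpha), exp_cdf(y, beta), a)


alpha, beta, a = 1.0, 2.0, 1.5              # hypothetical parameter values
# Letting y grow recovers the first margin, since C(u, 1) = u.
print(joint_cdf(1.0, 1e9, alpha, beta, a), exp_cdf(1.0, alpha))
# A joint probability never exceeds either marginal probability.
print(joint_cdf(1.0, 0.5, alpha, beta, a), min(exp_cdf(1.0, alpha), exp_cdf(0.5, beta)))
```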