STA 216 Project: Spline Approach to Discrete Survival Analysis

November 4, 2005

1 Introduction

Although continuous survival analysis differs considerably from discrete survival analysis, there is a link between the two modeling approaches. By selecting a set of knots and grouping the continuous survival times into bins, we can build a discrete hazard model on continuous survival data. For example, putting a piecewise constant baseline hazard in Cox's semi-parametric proportional hazards model leads to the same result as a discrete hazard model with a complementary log-log link. Even in continuous survival analysis, grouping the data into bins offers a convenient interpretation and modeling perspective. For instance, [Gustafson et al.(2003)] use MCMC methods to fit a piecewise linear log-gamma hazard model.

This class project investigates the piecewise linear discrete hazard model. Section 2 describes the model at fixed knots and a possible model averaging approach. Section 3 describes an importance sampling strategy to calculate the marginal likelihood for a given model, which provides a criterion for comparing models with different knots specifications. We also include the BIC approximation there for comparison. Section 4 applies our method to a real data set from a toxicology and carcinogenesis study of ethylbenzene. Section 5 discusses the limitations of our work and proposes possible ways to modify this approach.

2 Piecewise Linear Discrete Hazard Model

Suppose we have fixed knots $0 \le x_1 < x_2 < \cdots < x_C < x_{C+1}$, where $x_1$ is fixed (it could be zero, or some positive value, meaning that the data are truncated below a certain initial period). For simplicity, suppose there are no covariates at this stage (we shall include the dose effect as a covariate in Section 4). We also assume that lifetimes between $x_1$ and $x_C$ are all observed, and that lifetimes between $x_C$ and $x_{C+1}$ are all right censored. This is only for notational simplicity; in general, the method can deal with paired observations $(t_i, \delta_i)$, where $\delta_i$ denotes the censoring state.
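Denote by $I_j = [x_j, x_{j+1})$ the $j$th bin. The usual discrete hazard model can then be written as

$$P\Big(T \in I_j \,\Big|\, T \in \bigcup_{l \ge j} I_l\Big) = \Phi(\alpha_j), \quad j = 1, 2, \ldots, C-1, \tag{1}$$

where $\Phi$ is the standard normal distribution function. The likelihood of a survival time $T \in I_j$ is then

$$P(T) = \begin{cases} \Phi(\alpha_j) \prod_{l=1}^{j-1} [1 - \Phi(\alpha_l)], & \text{if } j < C; \\ \prod_{l=1}^{C-1} [1 - \Phi(\alpha_l)], & \text{if } j = C. \end{cases} \tag{2}$$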
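As a minimal sketch of the likelihood in (2), the bin probabilities can be accumulated with a running survival product. The function name `bin_probabilities` and the numeric values of alpha are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF, the link used in (1)


def bin_probabilities(alpha):
    """Return P(T in I_j), j = 1..C, given alpha = (alpha_1, ..., alpha_{C-1})."""
    probs = []
    survive = 1.0  # running product of [1 - Phi(alpha_l)] over earlier bins
    for a in alpha:
        hazard = PHI(a)                 # discrete hazard of the current bin
        probs.append(hazard * survive)  # first case of (2)
        survive *= 1.0 - hazard
    probs.append(survive)               # last bin I_C: second case of (2)
    return probs


# Hypothetical hazard parameters for C = 4 bins.
probs = bin_probabilities([-1.0, 0.0, 0.5])
```

Because each term is a hazard times the probability of surviving all earlier bins, the probabilities over the C bins sum to one, which is a quick sanity check on any implementation.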

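A direct modification of this model is to replace $\alpha_j$ in equations (1) and (2) with a piecewise linear function of $T$. While keeping the model in the discrete time framework, we replace the discrete hazard with

$$P\Big(T \,\Big|\, T \in \bigcup_{l \ge j} I_l\Big) = \Phi\Big(a_j + \frac{T - x_j}{x_{j+1} - x_j}(a_{j+1} - a_j)\Big), \quad \text{if } T \in I_j, \text{ for } j \in \{1, \ldots, C-1\}, \tag{3}$$

and we pick a point in $I_j$ and evaluate the hazard at that point to represent the average hazard in the interval. For example, choosing the midpoint of $I_j$, which is $(x_j + x_{j+1})/2$, gives

$$P\Big(T \in I_j \,\Big|\, T \in \bigcup_{l \ge j} I_l\Big) = \Phi\Big(\frac{a_j + a_{j+1}}{2}\Big), \quad j \in \{1, 2, \ldots, C-1\}. \tag{4}$$

Combining (3) and (4), the likelihood of $T \in I_j$ is

$$P(T) = \begin{cases} \Phi\Big(a_j + \dfrac{T - x_j}{x_{j+1} - x_j}(a_{j+1} - a_j)\Big) \prod_{l=1}^{j-1} \Big[1 - \Phi\Big(\dfrac{a_l + a_{l+1}}{2}\Big)\Big], & \text{if } j < C; \\ \prod_{l=1}^{C-1} \Big[1 - \Phi\Big(\dfrac{a_l + a_{l+1}}{2}\Big)\Big], & \text{if } j = C. \end{cases} \tag{5}$$

We can use a data augmentation trick to implement this method. Suppose $T_i \in I_j$ with $j < C$. Replace $T_i$ with a length-$j$ vector $y^{(i)} = (0, \ldots, 0, 1)^T$ and a corresponding $j \times C$ design matrix

$$X^{(i)} = \begin{pmatrix} X_1^{(i)T} \\ X_2^{(i)T} \\ \vdots \\ X_{j-1}^{(i)T} \\ X_j^{(i)T} \end{pmatrix} = \begin{pmatrix} \tfrac12 & \tfrac12 & 0 & \cdots & 0 & 0 & 0 & \cdots & 0 \\ 0 & \tfrac12 & \tfrac12 & \cdots & 0 & 0 & 0 & \cdots & 0 \\ \vdots & & & & & & & & \vdots \\ 0 & 0 & 0 & \cdots & \tfrac12 & \tfrac12 & 0 & \cdots & 0 \\ 0 & 0 & 0 & \cdots & 0 & \frac{x_{j+1} - T_i}{x_{j+1} - x_j} & \frac{T_i - x_j}{x_{j+1} - x_j} & \cdots & 0 \end{pmatrix}, \tag{6}$$

where the two non-zero entries of row $l < j$ sit in columns $l$ and $l+1$, and those of row $j$ sit in columns $j$ and $j+1$. Let $z_l^{(i)} \sim N(X_l^{(i)T} a, 1)$, where $a = (a_1, \ldots, a_{C-1}, a_C)^T$. Then

$$P(z_l^{(i)} \le 0) = P(y_l^{(i)} = 0) = 1 - \Phi(X_l^{(i)T} a), \qquad P(z_l^{(i)} > 0) = P(y_l^{(i)} = 1) = \Phi(X_l^{(i)T} a). \tag{7}$$

The joint posterior of $(z, a)$ can be written as

$$\pi(z, a \mid y, X) \propto \prod_{i=1}^{n} \prod_{l=1}^{j(i)} \pi(y_l^{(i)} \mid z_l^{(i)})\, \pi(z_l^{(i)} \mid X_l^{(i)}, a)\; \pi(a), \tag{8}$$

where $j(i)$ is the length of $y^{(i)}$ and

$$\pi(y_l^{(i)} \mid z_l^{(i)}) = 1_{\{z_l^{(i)} > 0\}}\, y_l^{(i)} + 1_{\{z_l^{(i)} \le 0\}}\, (1 - y_l^{(i)}). \tag{9}$$

The full conditionals of the $z_l^{(i)}$ are truncated normals:

$$\pi(z_l^{(i)} \mid y_l^{(i)}, X_l^{(i)}, a) = \begin{cases} N_+(z_l^{(i)};\, X_l^{(i)T} a), & \text{when } y_l^{(i)} = 1, \\ N_-(z_l^{(i)};\, X_l^{(i)T} a), & \text{when } y_l^{(i)} = 0, \end{cases} \tag{10}$$

where $N_+$ and $N_-$ denote the unit-variance normal density with the given mean, truncated to $z > 0$ and $z \le 0$ respectively.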
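The augmentation step can be sketched as follows: one helper builds the pair (y, X) of equation (6) for a single observed time, and another draws the latent z of equation (10) by inverse-CDF sampling. The names `augment` and `draw_z`, the knot values, and the use of inverse-CDF sampling are illustrative assumptions rather than the method used in the project.

```python
import random
from statistics import NormalDist


def augment(t, knots):
    """Return (y, X) for a time t observed in bin I_j with j < C.

    knots = [x_1, ..., x_{C+1}]; X has j rows and C columns, as in (6).
    """
    C = len(knots) - 1
    # 0-based index of the bin that contains t
    j = next(k for k in range(C) if knots[k] <= t < knots[k + 1])
    if j == C - 1:
        raise ValueError("times in the last bin are treated as censored")
    X = []
    for l in range(j):            # earlier bins: midpoint hazard, as in (4)
        row = [0.0] * C
        row[l] = row[l + 1] = 0.5
        X.append(row)
    w = (t - knots[j]) / (knots[j + 1] - knots[j])
    row = [0.0] * C               # event bin: linear interpolation, as in (3)
    row[j], row[j + 1] = 1.0 - w, w
    X.append(row)
    y = [0] * j + [1]
    return y, X


def draw_z(mean, y, rng):
    """Draw z ~ N(mean, 1) truncated to z > 0 if y == 1, else to z <= 0.

    Inverse-CDF sampling; only reliable for moderate |mean|.
    """
    std = NormalDist()
    p0 = std.cdf(-mean)                     # P(z <= 0)
    u = rng.random()
    p = p0 + u * (1.0 - p0) if y == 1 else u * p0
    p = min(max(p, 1e-12), 1.0 - 1e-12)     # keep inv_cdf away from 0 and 1
    return mean + std.inv_cdf(p)


# Hypothetical knots and survival time (in days).
y, X = augment(600.0, [0.0, 500.0, 650.0, 700.0, 734.0])
```

Here the time 600 falls in the second bin, so y has length two and the last row of X carries the interpolation weights 1/3 and 2/3 on the second and third coefficients.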

If we have a normal prior $a \sim N(a_0, \Sigma_0)$, then the full conditional of $a$ is

$$\pi(a \mid z, y, X) = N\big(\hat\Sigma(\Sigma_0^{-1} a_0 + X^T z),\ \hat\Sigma\big), \qquad \hat\Sigma = (\Sigma_0^{-1} + X^T X)^{-1}. \tag{11}$$

Equation (11) can easily be modified to a truncated normal distribution if we have a truncated normal prior on $a$.

Suppose we have a covariate $d$ called the dose level, and we are interested in the differences in survival among the dose groups. Let $d = 1$ denote the control, with doses in increasing order $1, 2, \ldots$ up to the maximum level $D$. Replace the $l$th element $a_l$ of the vector $a$ with $\sum_{h=1}^{d} b_l^{[h]}$. We can interpret $b^{[1]}$ as the baseline hazard, and $b^{[d]}$ as the hazard difference between dose groups $d$ and $d-1$ for $d > 1$. A natural choice of prior for $b$ is a normal with large variance on $b^{[1]}$, and a mixture of a point mass at 0 and a normal truncated below at zero for $b^{[d]}$ when $d > 1$, if the dose is supposed to be detrimental. Alternatively, we can remove the sign constraint when there is no prior information on whether the dose is beneficial or harmful. With this prior we can apply the Gibbs algorithm, since the full conditionals are described in equations (10) and (11). Note that (11) then becomes a mixture of a point mass at zero and a (truncated) normal distribution.

After discarding the first several iterations of the Gibbs sampler to allow for convergence, we obtain posterior samples of the model. We can count the proportion of iterations spent in each model, which can be viewed as an approximation of the posterior model probabilities as long as the model space is not too large. We can obtain an averaged model based on the posterior samples, or pick out the best several models visited by the sampler.

3 Marginal Likelihood

For fixed knots, we can compare different models under that knots setting as described in Section 2, but we may also be interested in comparing models under different knots settings. Consider a given model $\gamma$ with a fixed set of knots, and let $\theta$ denote the non-zero parameters in the model ($a$ or $b^{[d]}$ in Section 2).
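The model-counting step of Section 2 amounts to a frequency tally over the inclusion patterns visited by the sampler. A minimal sketch follows, assuming each retained Gibbs iteration is summarized by a tuple of 0/1 indicators; the function name and the example draws are hypothetical and do not come from the study.

```python
from collections import Counter


def model_proportions(draws):
    """Estimate posterior model probabilities by visit frequency.

    draws: iterable of inclusion-indicator tuples (1 = coefficient non-zero),
    one per retained Gibbs iteration.
    """
    counts = Counter(tuple(d) for d in draws)
    total = sum(counts.values())
    # most frequently visited models first
    return [(model, c, c / total) for model, c in counts.most_common()]


# Hypothetical indicator draws for a model with three coefficients.
draws = [(1, 0, 1), (1, 0, 1), (1, 1, 1), (1, 0, 1), (1, 0, 0)]
ranking = model_proportions(draws)
```

The tally is only a usable approximation when the sampler visits each plausible model many times, which is why the text restricts it to model spaces that are not too large.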
The prior on $\theta$ given $\gamma$ is no longer a mixture, but normal or truncated normal. We would like to estimate the marginal likelihood of the data given this model,

$$p(y \mid \gamma) = \int p(y \mid \theta, \gamma)\, \pi(\theta \mid \gamma)\, d\theta \tag{12}$$

$$= \int p(y \mid \theta, \gamma)\, \frac{\pi(\theta \mid \gamma)}{q(\theta \mid \gamma)}\, q(\theta \mid \gamma)\, d\theta. \tag{13}$$

Sampling $\theta$ directly from the prior and computing the Monte Carlo estimate of equation (12) is not practical: there may be a large difference between the prior and the posterior distribution of $\theta$, so the direct estimate has very low efficiency. However, if we can come up with a good approximation to the posterior of $\theta$, say $q(\theta)$, then importance sampling based on (13) has fairly high efficiency. In practice, we run the Gibbs sampling algorithm of Section 2 and obtain posterior samples of $\theta$. We estimate the first two moments of each component of $\theta$, construct normal or truncated normal distributions centered at the first moment estimate with variance equal to the second (central) moment estimate, and call this distribution $q(\theta)$. We then sample $\theta^{(i)}$ from $q(\theta)$, compute $p(y \mid \theta^{(i)}, \gamma)$ with weights $w_i = \pi(\theta^{(i)} \mid \gamma) / q(\theta^{(i)} \mid \gamma)$, and estimate

$$p(y \mid \gamma) \approx \frac{\sum_{i=1}^{N} w_i\, p(y \mid \theta^{(i)}, \gamma)}{\sum_{i=1}^{N} w_i}. \tag{14}$$
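To illustrate the estimator in (14), here is a minimal sketch on a toy conjugate model where the marginal likelihood is known in closed form: y given theta is N(theta, 1) and theta has a N(0, 1) prior, so that y is N(0, 2) marginally. The toy model, the function names, and the choice to widen the proposal standard deviation to 1 (the exact posterior standard deviation is about 0.71) are assumptions made for illustration; the wider proposal keeps the weights prior/q well behaved.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def marginal_likelihood_is(y, q_mean, q_sd, n_draws, rng):
    """Estimate p(y) by (14) for y | theta ~ N(theta, 1), theta ~ N(0, 1)."""
    num = 0.0
    den = 0.0
    for _ in range(n_draws):
        theta = rng.gauss(q_mean, q_sd)               # theta^(i) drawn from q
        w = normal_pdf(theta, 0.0, 1.0) / normal_pdf(theta, q_mean, q_sd)
        num += w * normal_pdf(y, theta, 1.0)          # w_i * p(y | theta^(i))
        den += w
    return num / den


rng = random.Random(0)
y_obs = 1.0
estimate = marginal_likelihood_is(y_obs, q_mean=y_obs / 2.0, q_sd=1.0,
                                  n_draws=20000, rng=rng)
exact = normal_pdf(y_obs, 0.0, math.sqrt(2.0))        # closed form for the toy model
```

Comparing `estimate` with `exact` gives a direct check of the Monte Carlo error, something that is not available for the survival model itself.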

A comparison of this calculation with the model proportions in the posterior samples for a fixed knots specification is shown in Section 4, and indicates that this method is reasonable. Moreover, this method can also compare models with different knots specifications.

Another way to calculate the marginal likelihood is to use the Laplace approximation. However, the Laplace approximation cannot deal with sign constraints, such as the truncated normal prior. Nevertheless, we briefly describe the BIC approximation using the Laplace method in the following paragraphs.

Again we want to estimate equation (12). For clarity, we drop the model $\gamma$ from the notation in the following discussion, since we focus on the likelihood for one model only. Let $g(\theta) = \log\{p(y \mid \theta)\pi(\theta)\}$. By Taylor expansion around a point $\tilde\theta$,

$$g(\theta) = g(\tilde\theta) + (\theta - \tilde\theta)^T g'(\tilde\theta) + \tfrac12 (\theta - \tilde\theta)^T g''(\tilde\theta)(\theta - \tilde\theta) + o(\|\theta - \tilde\theta\|^2). \tag{15}$$

Set $\tilde\theta$ to be the maximizer of $g$, so that $g'(\tilde\theta) = 0$. The approximation is not good unless $\theta$ is close to $\tilde\theta$. However, when $n$ is large this rationale is valid; for a formal argument see [Tierney & Kadane(1986)]. Then

$$p(y) = \int \exp[g(\theta)]\, d\theta \tag{16}$$

$$\approx \exp[g(\tilde\theta)] \int \exp\big[\tfrac12 (\theta - \tilde\theta)^T g''(\tilde\theta)(\theta - \tilde\theta)\big]\, d\theta \tag{17}$$

$$= \exp[g(\tilde\theta)]\, (2\pi)^{d/2}\, |-g''(\tilde\theta)|^{-1/2}, \tag{18}$$

where $d$ is the dimension of $\theta$, and the error in equation (18) is $O(n^{-1})$ on the log scale. In large samples, $\tilde\theta \approx \hat\theta$, where $\hat\theta$ is the MLE, and $-g''(\tilde\theta) \approx n\,i$, where $i$ is the expected Fisher information matrix for one observation. Therefore $|-g''(\tilde\theta)| \approx n^d |i|$, which introduces an $O(n^{-1/2})$ error, and equation (18) becomes

$$\log p(y) = \log p(y \mid \hat\theta) + \log \pi(\hat\theta) + \frac{d}{2}\log(2\pi) - \frac{d}{2}\log n - \frac12 \log|i| + O(n^{-1/2}). \tag{19}$$

Suppose the prior $\pi(\theta)$ is multivariate normal with mean $\hat\theta$ and variance matrix $i^{-1}$. This seems a reasonable representation of a prior that contains the same amount of information as, on average, one observation. Then

$$\log \pi(\hat\theta) = -\frac{d}{2}\log(2\pi) + \frac12 \log|i|, \tag{20}$$

and substituting (20) into (19) gives

$$\log p(y) = \log p(y \mid \hat\theta) - \frac{d}{2}\log n + O(n^{-1/2}). \tag{21}$$

When the baseline model is a saturated model, let $D_{\mathrm{res}}(\gamma)$ denote the difference in twice the log-likelihood between the saturated model and the current model $\gamma$.
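The Laplace step in (16) to (18) can be checked numerically in one dimension. The sketch below uses a toy conjugate model (y given theta is N(theta, 1), theta has a N(0, 1) prior), for which g is exactly quadratic and the approximation is exact; the toy model, the function names, and the finite-difference second derivative are illustrative assumptions.

```python
import math


def log_joint(theta, y):
    """g(theta) = log p(y | theta) + log pi(theta) for the toy model."""
    return -math.log(2.0 * math.pi) - 0.5 * (y - theta) ** 2 - 0.5 * theta ** 2


def laplace_log_marginal(y, h=1e-3):
    """One-dimensional (18): g(mode) + (1/2) log(2 pi) - (1/2) log(-g''(mode))."""
    mode = y / 2.0  # maximizer of g for this toy model
    # central finite difference for the second derivative g''(mode)
    g2 = (log_joint(mode + h, y) - 2.0 * log_joint(mode, y)
          + log_joint(mode - h, y)) / h ** 2
    return log_joint(mode, y) + 0.5 * math.log(2.0 * math.pi) - 0.5 * math.log(-g2)


y_obs = 1.0
approx = laplace_log_marginal(y_obs)
exact = -0.5 * math.log(4.0 * math.pi) - y_obs ** 2 / 4.0  # log density of N(0, 2) at y
```

For the survival model the log joint density is not quadratic, so the same computation would only be approximate, and the mode would have to be found numerically.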
Then the BIC for model $\gamma$, denoted $\mathrm{BIC}_\gamma$, can be written as

$$\mathrm{BIC}_\gamma = -\big(n \log n - D_{\mathrm{res}}(\gamma)\big) + d \log n. \tag{22}$$

In R, we can use the standard glm function to obtain $D_{\mathrm{res}}$, and use equation (22) to calculate the BIC. Moreover, the BMA package in R provides automated BIC values for the models at a given knots setting. Generally speaking, we can use the marginal likelihood (or the deviance) to compare, select, or average different models under different knots settings, or we can use other criteria such as BIC to perform a similar analysis.

4 Example

We applied our method to a National Toxicology Program (NTP) 2-year study of ethylbenzene (CAS No. 100-41-4). Groups of 50 male F344/N rats were exposed to 0, 75, 250 or 750 ppm ethylbenzene by inhalation, 6 hours per day, 5 days per week, for 2 years. The number of surviving rats in each group is 15, 14, 13 and 2 respectively, and they were all killed on day 734. The histogram of the survival times in each group (without censored data) is shown in Figure 1.

As described in Section 2, let the number of dose groups be $D = 4$. We put a normal prior on $b^{[1]}$ and a mixture prior on $b^{[d]}$, $d > 1$. We experimented with four choices of knots, shown in Table 1. We tried the model averaging algorithm with and without the sign constraints on $b^{[d]}$, $d > 1$. For each mixture prior, we set an equal probability of 0.5 for each coefficient to be in or out of the model. The normal or truncated normal part of the prior used mean 0 and standard deviation 5. After discarding the first 2000 iterations, which is a sufficient burn-in period here, we averaged the hazard curve and survival curve over an additional 5000 iterations. The results are shown in Figures 2 to 9.

We conclude that there is not much difference among the first three dose groups, but the dose 750 group differs greatly from the other three. The dose 750 group has a much higher hazard than the other groups. Under knots choices 3 and 4, the results suggest that the hazard curves are almost the same before day 505, after which the dose 750 group develops a higher risk than the remaining groups. However, this second statement may be an artifact of the arbitrary choice of knots.
In any event, the analysis agrees with the conclusion from NTP: there was clear evidence of carcinogenic activity of ethylbenzene in male F344/N rats based on increased incidences of renal tubule neoplasms.

For knots choice 2 with no sign constraints, we ran the Gibbs sampler and obtained 20,000 posterior samples after discarding the first 3,000 iterations. Table 2 shows the best 6 models in the posterior samples, their numbers of appearances (counts), and the relative proportions of counts among the six models. It also includes a point estimate (importance sampling mean) of the likelihood, and its relative proportion among the six models (for a uniform prior on the model space). We can see that, although there is some numerical uncertainty in the calculation of the marginal likelihood, the method described in Section 3 appears to work. We also compared our method of model selection with the built-in function bic.glm() in the BMA package. They end up with the same best model, but differ in the choice of the second best, third best models and so on; see Table 3.

However, there are some other complications. Suppose we use knots choice 1, but vary the second knot $x_2$ from day 430 to day 670 over a grid of points. For each setting, we find the best model through the Gibbs algorithm and calculate the marginal likelihood. We plot this likelihood against $x_2$ in Figure 10. It is U-shaped, which indicates that knots near the boundaries have higher likelihood and are hence preferred. If we use the Laplace approximation and calculate the deviance, we end up with a similar U-shaped likelihood; see Figure 11. The BIC values for the best models under each knots setting prefer smaller $x_2$, as shown in Figure 12, which is not what we expected.

Another problem is comparing the best or averaged models for different numbers of knots. In that case, the more knots there are, the smaller the likelihood of the augmented data tends to be. We discuss these two problems in Section 5.

5 Discussion

The two problems raised in the last paragraphs of Section 4 do not appear by accident. They are a property of the discrete hazard model (both piecewise constant and piecewise linear). Suppose the number of knots is given, and we only move the position of one knot. For simplicity, consider the example in Section 4. As the knot $x_2$ moves from $x_1$ to $x_3$, most of the data points $t_i$ in $I_1 \cup I_2$ have their corresponding $y^{(i)}$ switch from $(0, 1)$ to $(1)$. The model ends up picking an estimate of $a$ that favors the majority of the augmented data, which is $(0, 1)$ when $x_2$ is near $x_1$, and $(1)$ when $x_2$ is near $x_3$. When $x_2$ is not close to its boundaries $x_1$ and $x_3$, the data lying in $I_1 \cup I_2$ have a balanced mix of $(0, 1)$ and $(1)$, which brings down the marginal likelihood. As for BIC, a smaller $x_2$ leads to a longer vector $y$ in general, and hence to a smaller BIC if the models have the same or similar dimensions (note that the derivative of equation (22) with respect to $n$ is negative). Similarly, when the number of knots increases, the length of $y$ generally increases, so models with more knots have a smaller BIC. A similar argument holds for the marginal likelihood of the data.

We are no longer in the situation of comparing, selecting, or averaging models for a fixed sample size $n$, where the criterion only penalizes the dimension $p$ of the model. Here both $n$ and $p$ vary over the full model space, so marginal likelihood and BIC perform poorly for model comparison, selection, and averaging. In terms of computational efficiency, the Laplace approximation is much faster, but it cannot deal with constraints on the coefficients. However, if the constraint is not a big issue, and if we have a reasonable criterion to compare models at different knots settings, we could use reversible jump (Green, 1995) to move in the model space. One possible way to overcome the difficulty is to switch back to the continuous time framework, in which the dimension of the data $T$ is fixed.
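The dependence of the augmented sample size on the knot position can be illustrated with a small sketch. It assumes that an event in bin j contributes j augmented entries and that a time censored in the last bin contributes C - 1 zeros, as in the likelihood (5), and that the BIC of (22) has the form -(n log n - D_res) + d log n. The survival times, knots, and function names are hypothetical.

```python
import math


def augmented_length(times, knots):
    """Total length of the augmented response y over all observations."""
    C = len(knots) - 1
    total = 0
    for t in times:
        # 1-based index of the bin that contains t
        j = next(k for k in range(C) if knots[k] <= t < knots[k + 1]) + 1
        total += j if j < C else C - 1   # last bin: censored, C - 1 zeros
    return total


def bic(deviance, n, d):
    """Assumed form of (22): -(n log n - D_res) + d log n."""
    return -(n * math.log(n) - deviance) + d * math.log(n)


times = [450, 520, 560, 610, 640, 690, 720, 733]   # hypothetical survival days
lengths = {x2: augmented_length(times, [400, x2, 729, 743])
           for x2 in (500, 600, 700)}              # candidate second knots
```

With these values, moving the second knot from 700 down to 500 lengthens y from 10 to 15 entries, and for a fixed deviance and dimension the assumed BIC decreases as n grows, which matches the preference for smaller $x_2$ described above.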
Another improvement could be to incorporate historical control data to improve the baseline estimation, e.g. [Dunson & Dinse(2000)].

References

[Dunson & Dinse(2000)] Dunson, D. B. & Dinse, G. E. (2000). Distinguishing effects on tumor multiplicity and growth rate in chemoprevention experiments. Biometrics 56, 1068-1075.

[Gustafson et al.(2003)] Gustafson, P., Aeschliman, D. & Levy, A. R. (2003). A simple approach to fitting Bayesian survival models. Lifetime Data Analysis 9, 5-19.

[Tierney & Kadane(1986)] Tierney, L. & Kadane, J. B. (1986). Accurate approximations for posterior moments and marginal densities. Journal of the American Statistical Association 81, 82-86.

Table 1: Knots choices used in the Section 2 method.

Knots Choice 1    3, 631, 729, 743
Knots Choice 2    3, 561, 631, 729, 743
Knots Choice 3    169, 393, 505, 561, 617, 673, 701, 729, 743
Knots Choice 4    169, 393, 505, 561, 617, 645, 673, 701, 729, 743

Table 2: Gibbs sampling method: best models for knots choice 2.

Rank   Model Variables                    Counts (%)      log likelihood (%)
1      1 1 1 1 0 0 0 0 0 0 0 0 0 1 0 1    5152 (54.70)    -65.51 (57.18)
2      1 1 1 1 0 0 0 0 0 0 0 1 0 1 0 1    1150 (12.21)    -67.01 (12.86)
3      1 1 1 1 1 0 0 0 0 0 0 0 0 1 0 1    951 (10.10)     -67.50 (7.82)
4      1 1 1 1 0 0 0 0 0 0 0 0 0 1 1 1    922 (9.79)      -67.32 (9.39)
5      1 1 1 1 0 0 0 1 0 0 0 0 0 1 0 1    771 (8.19)      -67.54 (7.53)
6      1 1 1 1 0 0 0 0 0 0 0 0 0 0 1 1    472 (5.01)      -67.90 (5.23)

Table 3: BIC method: best models for knots choice 2. Columns: p!=0, EV, SD, model 1, model 2, model 3, model 4, model 5; a dot marks a coefficient that is not in the model.

Intercept   100.0018 0.3887.981 1.9311.999.68 1.9005
X1          100.0 1.76115 0.6086 1.880 1.3470 .1075 1.7645 1.830
X2          100.0 1.70449 0.3661 1.8407 1.4761 1.6017 1.8641 1.4990
X3          100.0 .91690 0.5388 .9436 .5751 3.0 3.1043 .7348
X4          0.0 0.00000 0.0000 . . . . .
X5          0.0 0.00000 0.0000 . . . . .
X6          0.0 0.00000 0.0000 . . . . .
X7          0.0 0.00000 0.0000 . . . . .
X8          3.5 0.0519 0.1676 . . . . .
X9          0.0 0.00000 0.0000 . . . . .
X10         0.0 0.00000 0.0000 . . . . .
X11         8.8 0.06580 0.585 . . . . .
X12         22.0 0.3319 0.7105 . 1.5067 . . 1.5066
X13         78.8 0.84908 0.5635 0.9640 1.4111 . 0.970 1.4158
X14         3.5 0.3450 0.475 . . 1.1593 . .
X15         58.6 1.3878 1.350 .3195 .3181 . . .
nvar        5 6 4 4 5
BIC         44.9918 441.571 441.3773 440.986 439.5698
post prob   0.30 0.149 0.135 0.111 0.055

Figure 1: Histogram of the non-censored survival time in each dose group (four panels, Life[Life < 734 & Dose == d] for d = 1, 2, 3, 4; vertical axis: Frequency).

Figure 2: Hazard and survival curve for knots choice 1, no sign constraints.

Figure 3: Hazard and survival curve for knots choice 2, no sign constraints.

Figure 4: Hazard and survival curve for knots choice 3, no sign constraints.

Figure 5: Hazard and survival curve for knots choice 4, no sign constraints.

Figure 6: Hazard and survival curve for knots choice 1, with sign constraints.

Figure 7: Hazard and survival curve for knots choice 2, with sign constraints.

Figure 8: Hazard and survival curve for knots choice 3, with sign constraints.

Figure 9: Hazard and survival curve for knots choice 4, with sign constraints.

Figure 10: Importance sampling method: marginal log likelihood of data for the best model for knots choice 1, with the second knot $x_2$ varying, and no sign constraints.

Figure 11: Laplace approximation method: marginal log likelihood of data for the best model for knots choice 1, with the second knot $x_2$ varying, and no sign constraints.

Figure 12: BIC for the best model (chosen by bic.glm() in the BMA package in R) for knots choice 1, with the second knot $x_2$ varying, and no sign constraints.