Chapter 7 Parametric Likelihood Fitting Concepts: Chapter 7 Parametric Likelihood Fitting Concepts: Objectives Show how to compute a likelihood for a parametric model using discrete data. Show how to compute a likelihood for samples containing right censored observations and left censored observations. William Q. Meeker and Luis A. Escobar Iowa State University and Louisiana State University Copyright 1998-2008 W. Q. Meeker and L. A. Escobar. Based on the authors text Statistical Methods for Reliability Data, John Wiley & Sons Inc. 1998. December 14, 2015 8h 9min 7-1 Use a parametric likelihood as a tool for data analysis and inference. Illustrate the use of likelihood and normal-approximation methods of computing confidence intervals for model parameters and other quantities of interest. Explain the appropriate use of the density approximation for observations reported as exact failures. 7-2 Example: Between α-particle Emissions of Americium-241 (Berkson 1966) Berkson (1966) investigates the randomness of α-particle emissions of Americium-241, which has a half-life of about 458 years. Data: Interarrival times (units: 1/5000 seconds). n =10,220 observations. Data binned into intervals from 0 to 4000 time units. Interval sizes ranging from 25 to 100 units. Additional interval for observed times exceeding 4,000 time units. Smaller samples analyzed here to illustrate sample size effect. We start the analysis with n =200. Data for α-particle Emissions of Americium-241 Interarrival s Interval Endpoint All s Random Sample of s lower upper n = 10220 n = 200 0 100 1609 41 100 300 2424 44 300 500 1770 24 500 700 1306 32 700 1000 1213 29 1000 2000 1528 21 2000 4000 354 9 4000 16 0 10220 200 7-3 7-4 Histogram of the n = 200 Sample of α-particle Interarrival Data Exponential Probability Plot of the n = 200 Sample of α-particle Interarrival Data. The Plot also Shows Approximate 95% Simultaneous Nonparametric Confidence Bands. Frequency 0 10 20 30 40 0 1000 2000 3000 4000 Probability.99.98.95.9.8.7.6.5.3.1 0 500 1000 1500 2000 7-5 7-6
Parametric Likelihood Probability of the Data Using the model Pr(T t) = F(t;) for continuous T, the likelihood (probability) for a single observation in the interval (t i 1,t i ] is L i (;data i ) = Pr(t i 1 < T t i ) = F(t i ;) F(t i 1 ;). Can be generalized to allow for explanatory variables, multiple sources of variability, and other model features. and Likelihood for Interval Data Data: α-particle emissions of americium-241 The exponential distribution is ( F(t;) = 1 exp t ), t > 0. = E(T), the mean time between arrivals. The total likelihood is the joint probability of the data. Assuming n independent observations n L() = L(;DATA) = C L i (;data i ). i=1 Want to estimate and g(). We will find to make L() large. 7-7 The interval-data likelihood has the form n 8 [ L() = L i () = F(tj ;) F(t j 1 ;) ] d j i=1 j=1 8 [ ( = exp t ) ( j 1 exp t )] dj j j=1 where d j is the number of interarrival times in the jth interval (i.e., times between t j 1 and t j ). 7-8 R() = L()/L( ) for the n = 200 α-particle Interarrival Data. Vertical Lines Give an Approximate 95% Likelihood-Based Confidence Interval for Exponential Probability Plot for the n = 200 Sample of α-particle Interarrival Data. The Plot Also Shows Parametric Exponential ML Estimate and 95% Confidence Intervals for F(t). <- estimate of mean using all n=10220 times.98.95 n=200 -> Probability.9.8 400 600 800 1000 1200 1600.7.6.5.4.2 ML Fit 95% Pointwise Confidence Intervals 0 500 1000 1500 2000 7-9 7-10 Example. α-particle Pseudo Data Constructed with Constant Proportion within Each Bin R() = L()/L( ) for the n = 20, 200, and 2000 Pseudo Data. Vertical Lines Give Corresponding Approximate 95% Likelihood-Based Confidence Intervals Interarrival s Interval Endpoint Samples of s lower upper n=20000 n=2000 n=200 n=20 0 100 3000 300 30 3 100 300 5000 500 50 5 300 500 3000 300 30 3 500 700 3000 300 30 3 700 1000 2000 200 20 2 1000 2000 3000 300 30 3 2000 4000 1000 100 10 1 4000 0000 000 0 0 20000 2000 200 20 <- estimate of mean using all n=20000 times n=20 -> n=200 -> <- n=2000 400 600 800 1000 1200 1600 7-11 7-12
Example. α-particle Random Samples Interarrival s Interval Endpoint All s Random Samples of s lower upper n = 10220 n = 2000 n = 200 n=20 R() = L()/L( ) for the n = 20, 200, and 2000 Samples from the α-particle Interarrival Data. Vertical Lines Give Corresponding Approximate 95% Likelihood-Based Confidence Intervals. estimate of mean using all n=10220 times -> 0 100 1609 292 41 3 100 300 2424 494 44 7 300 500 1770 332 24 4 500 700 1306 236 32 1 700 1000 1213 261 29 3 1000 2000 1528 308 21 2 2000 4000 354 73 9 0 4000 16 4 0 0 10220 2000 200 20 n=2000 n=20 n=200 200 400 600 800 1000 7-13 7-14 Likelihood as a Tool for Modeling/Inference What can we do with the (log) likelihood? n L() = log[l()] = L i (). i=1 Likelihood as a Tool for Modeling/Inference (Continued) Regions of high likelihood are credible; regions of low likelihood are not credible (suggests confidence regions for parameters). Study the surface. Maximize with respect to (ML point estimates). Look at curvature at maximum (gives estimate of Fisher information and asymptotic variance). Observe effect of perturbations in data and model on likelihood (sensitivity, influence analysis). 7-15 If the length of is > 1 or 2 and interest centers on subset of (need to get rid of nuisance parameters), look at profiles (suggests confidence regions/intervals for parameter subsets). Calibrate confidence regions/intervals with χ 2 or simulation (or parametric bootstrap). Use reparameterization to study functions of. 7-16 for Simulated Exponential ( = 5) Samples of Size n = 3 for Simulated Exponential ( = 5) Samples of Size n = 1000 Profile Likelihood Profile Likelihood 0 10 20 30 40 50 theta 4.6 4.8 5.0 5.2 5.4 5.6 theta 7-17 7-18
Large-Sample Approximate Theory for Likelihood Ratios for a Scalar Parameter Relative likelihood for is R() = L() L( ). If evaluated at the true, then, asymptotically, 2 log[r()] follows, a chisquare distribution with 1 degree of freedom. An approximate 100(1 α)% likelihood-based confidence region for is the set of all values of such that Normal-Approximation Confidence Intervals for A 100(1 α)% normal-approximation (or Wald) confidence interval for is [, ] = ±z (1 α/2) ŝe. where ŝe = [ d 2 L()/d 2] 1 is evaluated at. Based on = NOR(0,1) Z ŝe 2log[R()] < χ 2 (1 α;1) or, equivalently, the set defined by R() > exp [ χ 2 (1 α;1) /2]. General theory in the Appendix. From the definition of NOR(0, 1) quantiles Pr [ ] z (α/2) < z Z (1 α/2) 1 α implies that Pr [ z (1 α/2) < ŝe +z (1 α/2) ŝe ] 1 α. 7-19 7-20 Normal-Approximation Confidence Intervals for (continued) A 100(1 α)% normal-approximation (or Wald) confidence interval for is [, ] = [ /w, w] where w = exp[z (1 α/2) ŝe / ]. This follows after transforming (by exponentiation) the confidence interval [log(), log()] = log( )±z (1 α/2) ŝe log( ) which is based on Z log( ) = log( ) log() ŝe log( ) NOR(0,1) Because log( ) is unrestricted in sign, generally Z log( ) is Comparisons for α-particle Data All s Sample of s n =10,220 n = 200 n=20 ML Estimate 596 572 440 Standard Error ŝe 6.1 41.7 101 95% Confidence Intervals for Based on Likelihood [585, 608] [498, 662] [289, 713] Z log( ) NOR(0,1) [585, 608] [496, 660] [281, 690] NOR(0,1) [585, 608] [491, 654] [242, 638] Z ML Estimate λ 10 5 168 175 227 Standard Error ŝe λ 10 5 1.7 13 52 95% Confidence Intervals for λ 10 5 Based on Likelihood [164, 171] [151, 201] [140, 346] Z log( λ) NOR(0,1) [164, 171] [152, 202] [145, 356] NOR(0,1) [164, 171] [149, 200] [125, 329] Z λ closer to an NOR(0,1) distribution than is Z. 7-21 7-22 Density Approximation for Exact Observations Confidence Intervals for Functions of For one-parameter distributions, confidence intervals for can be translated directly into confidence intervals for monotone functions of. The arrival rate λ = 1/ is a decreasing function of. [λ, λ] = [1/, 1/ ] = [.00151,.00201]. F(t;) is a decreasing function of [F (t e), F(t e)] = [F(t e; ), F(t e; )]. If t i 1 = t i i, i > 0, and the correct likelihood F(t i ;) F(t i 1 ;) = F(t i ;) F(t i i ;) can be approximated with the density f(t) as ti [F(t i ;) F(t i i ;)] = f(t)dt f(t i;) i (t i i ) then the density approximation for exact observations may be appropriate. L i (;data i ) = f(t i ;) For most common models, the density approximation is adequate for small i. There are, however, situations where the approximation breaks down as i 0. 7-23 7-24
ML Estimates for the Mean Based on the Density Approximation With r exact failures and n r right-censored observations the ML estimate of is = TTT ni=1 t i = r r TTT = n i=1 t i, total time in test, is the sum of the failure times plus the censoring time of the units that are censored. Confidence Interval for the Mean Life of a New Insulating Material A life test for a new insulating material used 25 specimens which were tested simultaneously at a high voltage of 30 kv. The test was run until 15 of the specimens failed. The 15 failure times (hours) were recorded as: Using the observed curvature in the likelihood: [ ] 1 = ŝe d2 L() 2 d 2 = r = r. If the data are complete or failure censored, 2TTT/ χ 2 2r. Then an exact 100(1 α)% confidence interval for is [, ] = 2(TTT) 2(TTT) χ 2, χ (1 α/2;2r) 2. (α/2;2r) 7-25 1.15, 3.16, 10.38, 10.75, 12.53, 16.74, 22.54, 25.01, 33.02, 33.93, 36.17, 39.06, 44.56, 46.65, 55.93 Then TTT = 1.15+ +55.93+10 55.93 = 958hours. The ML estimate of and a 95% confidence interval are: = 958/15 = 63.392hours [ ], = 2(958) χ 2, 2(958) χ (.975;30) 2 = (.025;30) = [48, 113.26]. [ 1901.76 46.98, 1901.76 ] 16.79 7-26 Exponential Analysis With Zero Failures ML estimate for the Exponential distribution mean cannot be computed unless the available data contains one or more failures. For a sample of n units with running times t 1,...,t n and an assumed exponential distribution, a conservative 100(1 α)% lower confidence bound for is = 2(TTT) χ 2 = 2(TTT) 2log(α) = TTT log(α). (1 α;2) The lower bound can be translated into an lower confidence bound for functions like t p for specified p or a upper confidence bound for F(t e) for a specified t e. This bound is based on the fact that under the exponential failure-time distribution, with immediate replacement of failed units, the number of failures observed in a life test with a fixed total time on test has a Poisson distribution. 7-27 Analysis of the Diesel Generator Fan Data (Assuming Removal After 200 Hours of Service) Here we do the analysis of the fan data after 200 hours of testing when all the fans were still running. Thus TTT =14,000 hours. A conservative 95% lower confidence bound on is = 2(TTT) χ 2 = 28000 5.991 = 4674. (.95;2) Using the entire data set, = 28,701 and a likelihoodbased approximate 95% lower confidence bound is = 18,485 hours. A conservative 95% upper confidence bound on F(10000; ) is F(10000) = F(10000; ) = 1 exp( 10000/4674) =.882. This shows how little information comes from a short test with zero or few failures. 7-28 Other Topics in Chapter 7 Inferences when there are no failures. 7-29