
Chapter 7. Parametric Likelihood Fitting Concepts

William Q. Meeker and Luis A. Escobar
Iowa State University and Louisiana State University

Copyright 1998-2008 W. Q. Meeker and L. A. Escobar. Based on the authors' text Statistical Methods for Reliability Data, John Wiley & Sons, Inc., 1998. December 14, 2015.

Objectives (7-1, 7-2)

- Show how to compute a likelihood for a parametric model using discrete data.
- Show how to compute a likelihood for samples containing right censored observations and left censored observations.
- Use a parametric likelihood as a tool for data analysis and inference.
- Illustrate the use of likelihood and normal-approximation methods of computing confidence intervals for model parameters and other quantities of interest.
- Explain the appropriate use of the density approximation for observations reported as exact failures.

Example: Times Between α-Particle Emissions of Americium-241 (Berkson 1966) (7-3)

Berkson (1966) investigates the randomness of α-particle emissions of americium-241, which has a half-life of about 458 years. The data are interarrival times (units: 1/5000 seconds):

- n = 10,220 observations.
- Data binned into intervals from 0 to 4000 time units, with interval sizes ranging from 25 to 100 units, plus an additional interval for observed times exceeding 4000 time units.
- Smaller samples are analyzed here to illustrate the effect of sample size; we start the analysis with n = 200.

Interarrival Times for α-Particle Emissions of Americium-241 (7-4)

  Interval Endpoint     All Times    Random Sample
  lower     upper       n = 10220    n = 200
      0       100            1609         41
    100       300            2424         44
    300       500            1770         24
    500       700            1306         32
    700      1000            1213         29
   1000      2000            1528         21
   2000      4000             354          9
   4000         ∞              16          0
  Total                     10220        200

Figure (7-5): Histogram of the n = 200 sample of α-particle interarrival data.

Figure (7-6): Exponential probability plot of the n = 200 sample of α-particle interarrival data. The plot also shows approximate 95% simultaneous nonparametric confidence bands.

Parametric Likelihood: Probability of the Data (7-7)

Using the model Pr(T ≤ t) = F(t; θ) for continuous T, the likelihood (probability) for a single observation in the interval (t_{i−1}, t_i] is

  L_i(θ; data_i) = Pr(t_{i−1} < T ≤ t_i) = F(t_i; θ) − F(t_{i−1}; θ).

This can be generalized to allow for explanatory variables, multiple sources of variability, and other model features.

The total likelihood is the joint probability of the data. Assuming n independent observations,

  L(θ) = L(θ; DATA) = C ∏_{i=1}^n L_i(θ; data_i).

We want to estimate θ and functions g(θ). We will find the value θ̂ that makes L(θ) large.

Likelihood for Interval Data (7-8)

Data: α-particle emissions of americium-241. The exponential distribution has

  F(t; θ) = 1 − exp(−t/θ), t > 0,

where θ = E(T) is the mean time between arrivals. The interval-data likelihood has the form

  L(θ) = ∏_{i=1}^n L_i(θ) = ∏_{j=1}^8 [F(t_j; θ) − F(t_{j−1}; θ)]^{d_j}
       = ∏_{j=1}^8 [exp(−t_{j−1}/θ) − exp(−t_j/θ)]^{d_j},

where d_j is the number of interarrival times in the jth interval (i.e., times between t_{j−1} and t_j).

Figure (7-9): R(θ) = L(θ)/L(θ̂) for the n = 200 α-particle interarrival data. Vertical lines give an approximate 95% likelihood-based confidence interval for θ.

Figure (7-10): Exponential probability plot for the n = 200 sample of α-particle interarrival data. The plot also shows the parametric exponential ML estimate and 95% pointwise confidence intervals for F(t).

Example: α-Particle Pseudo Data Constructed with Constant Proportion within Each Bin (7-11)

  Interval Endpoint     Pseudo Samples
  lower     upper       n = 20000   n = 2000   n = 200   n = 20
      0       100            3000        300        30        3
    100       300            5000        500        50        5
    300       500            3000        300        30        3
    500       700            3000        300        30        3
    700      1000            2000        200        20        2
   1000      2000            3000        300        30        3
   2000      4000            1000        100        10        1
   4000         ∞               0          0         0        0
  Total                     20000       2000       200       20

Figure (7-12): R(θ) = L(θ)/L(θ̂) for the n = 20, 200, and 2000 pseudo data. Vertical lines give the corresponding approximate 95% likelihood-based confidence intervals; an arrow marks the estimate of the mean using all n = 20000 times.
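The interval-data likelihood above is straightforward to maximize numerically. A minimal sketch in Python (not the course's own software): the endpoints and counts are the n = 200 α-particle sample from the table, and the fitted mean should land near the slides' ML estimate of 572.

```python
# Maximize the exponential interval-data likelihood for the n = 200 sample.
import numpy as np
from scipy.optimize import minimize_scalar

endpoints = np.array([0.0, 100, 300, 500, 700, 1000, 2000, 4000])
counts = np.array([41, 44, 24, 32, 29, 21, 9])  # d_j; the >4000 bin is empty

def loglik(theta):
    # log L(theta) = sum_j d_j * log[exp(-t_{j-1}/theta) - exp(-t_j/theta)]
    p = np.exp(-endpoints[:-1] / theta) - np.exp(-endpoints[1:] / theta)
    return float(np.sum(counts * np.log(p)))

fit = minimize_scalar(lambda t: -loglik(t), bounds=(10, 5000), method="bounded")
theta_hat = fit.x  # close to the slides' ML estimate of 572
```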

Example: α-Particle Random Samples (7-13)

  Interval Endpoint     All Times    Random Samples
  lower     upper       n = 10220    n = 2000   n = 200   n = 20
      0       100            1609         292        41        3
    100       300            2424         494        44        7
    300       500            1770         332        24        4
    500       700            1306         236        32        1
    700      1000            1213         261        29        3
   1000      2000            1528         308        21        2
   2000      4000             354          73         9        0
   4000         ∞              16           4         0        0
  Total                     10220        2000       200       20

Figure (7-14): R(θ) = L(θ)/L(θ̂) for the n = 20, 200, and 2000 samples from the α-particle interarrival data. Vertical lines give the corresponding approximate 95% likelihood-based confidence intervals; an arrow marks the estimate of the mean using all n = 10220 times.

Likelihood as a Tool for Modeling/Inference (7-15)

What can we do with the log likelihood

  ℒ(θ) = log[L(θ)] = Σ_{i=1}^n ℒ_i(θ)?

- Study the surface.
- Maximize with respect to θ (ML point estimates).
- Look at the curvature at the maximum (gives an estimate of the Fisher information and asymptotic variances).
- Observe the effect of perturbations in the data and model on the likelihood (sensitivity, influence analysis).
- Regions of high likelihood are credible; regions of low likelihood are not credible (suggests confidence regions for parameters).

Likelihood as a Tool for Modeling/Inference, Continued (7-16)

- If the length of θ is > 1 or 2 and interest centers on a subset of θ (nuisance parameters need to be removed), look at profiles (suggests confidence regions/intervals for parameter subsets).
- Calibrate confidence regions/intervals with χ² or simulation (parametric bootstrap).
- Use reparameterization to study functions of θ.

Figures (7-17, 7-18): Profile likelihoods for simulated exponential (θ = 5) samples of size n = 3 and n = 1000; the likelihood concentrates sharply around θ = 5 as n grows.
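The sample-size effect shown in the figures can be checked directly with the pseudo data: scaling every bin count by a common factor leaves the ML estimate unchanged but narrows the likelihood-based interval by roughly the square root of the factor. A hedged sketch (bin counts from the pseudo-data table; `lr_interval` is an illustrative helper written for this example, not a library function):

```python
# Compare likelihood-ratio intervals for pseudo data with identical bin
# proportions at n = 20 and n = 2000 (counts scaled by 100).
import numpy as np
from scipy.optimize import brentq, minimize_scalar
from scipy.stats import chi2

endpoints = np.array([0.0, 100, 300, 500, 700, 1000, 2000, 4000])
base = np.array([3, 5, 3, 3, 2, 3, 1])  # pseudo counts for n = 20

def loglik(theta, counts):
    p = np.exp(-endpoints[:-1] / theta) - np.exp(-endpoints[1:] / theta)
    return float(np.sum(counts * np.log(p)))

def lr_interval(counts, level=0.95):
    # Interval: all theta with 2[logL(theta_hat) - logL(theta)] < chi2(level; 1)
    fit = minimize_scalar(lambda t: -loglik(t, counts),
                          bounds=(10, 50000), method="bounded")
    theta_hat, crit = fit.x, chi2.ppf(level, 1)
    g = lambda t: 2 * (-fit.fun - loglik(t, counts)) - crit
    return theta_hat, brentq(g, 10, theta_hat), brentq(g, theta_hat, 50000)

hat20, lo20, hi20 = lr_interval(base)        # wide interval
hat2k, lo2k, hi2k = lr_interval(100 * base)  # same theta_hat, much narrower
```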

Large-Sample Approximate Theory for Likelihood Ratios for a Scalar Parameter θ (7-19)

The relative likelihood for θ is

  R(θ) = L(θ)/L(θ̂).

If evaluated at the true θ, then, asymptotically, −2 log[R(θ)] follows a chi-square distribution with 1 degree of freedom. An approximate 100(1 − α)% likelihood-based confidence region for θ is the set of all values of θ such that

  −2 log[R(θ)] < χ²_(1−α;1),

or, equivalently, the set defined by

  R(θ) > exp[−χ²_(1−α;1)/2].

General theory is given in the Appendix.

Normal-Approximation Confidence Intervals for θ (7-20)

A 100(1 − α)% normal-approximation (or Wald) confidence interval for θ is

  [θ_lower, θ_upper] = θ̂ ± z_(1−α/2) ŝe_θ̂,

where ŝe_θ̂ = [−d²ℒ(θ)/dθ²]^(−1/2), evaluated at θ̂. This is based on

  Z_θ̂ = (θ̂ − θ)/ŝe_θ̂ ∼̇ NOR(0, 1).

From the definition of NOR(0, 1) quantiles,

  Pr[z_(α/2) < Z_θ̂ ≤ z_(1−α/2)] ≈ 1 − α

implies that

  Pr[θ̂ − z_(1−α/2) ŝe_θ̂ < θ ≤ θ̂ + z_(1−α/2) ŝe_θ̂] ≈ 1 − α.

Normal-Approximation Confidence Intervals for θ, Continued (7-21)

An alternative 100(1 − α)% normal-approximation (or Wald) confidence interval for θ is

  [θ_lower, θ_upper] = [θ̂/w, θ̂w], where w = exp[z_(1−α/2) ŝe_θ̂/θ̂].

This follows after transforming (by exponentiation) the confidence interval

  [log(θ)_lower, log(θ)_upper] = log(θ̂) ± z_(1−α/2) ŝe_log(θ̂),

which is based on

  Z_log(θ̂) = [log(θ̂) − log(θ)]/ŝe_log(θ̂) ∼̇ NOR(0, 1).

Because log(θ̂) is unrestricted in sign, the distribution of Z_log(θ̂) is generally closer to a NOR(0, 1) distribution than is that of Z_θ̂.

Comparisons for α-Particle Data (7-22)

                                       All Times     Sample      Sample
                                       n = 10,220    n = 200     n = 20
  ML estimate θ̂                        596           572         440
  Standard error ŝe_θ̂                  6.1           41.7        101
  95% confidence intervals for θ based on:
    Likelihood                         [585, 608]    [498, 662]  [289, 713]
    Z_log(θ̂) ∼̇ NOR(0, 1)               [585, 608]    [496, 660]  [281, 690]
    Z_θ̂ ∼̇ NOR(0, 1)                    [585, 608]    [491, 654]  [242, 638]
  ML estimate λ̂ × 10⁵                  168           175         227
  Standard error ŝe_λ̂ × 10⁵            1.7           13          52
  95% confidence intervals for λ × 10⁵ based on:
    Likelihood                         [164, 171]    [151, 201]  [140, 346]
    Z_log(λ̂) ∼̇ NOR(0, 1)               [164, 171]    [152, 202]  [145, 356]
    Z_λ̂ ∼̇ NOR(0, 1)                    [164, 171]    [149, 200]  [125, 329]
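The Wald intervals in the comparison table can be reproduced from the observed curvature of the log likelihood. A sketch for the n = 200 sample, using a numerical second derivative at the slides' ML estimate θ̂ = 572 in place of an analytic information formula:

```python
# Wald 95% intervals for theta on the original and log scales, n = 200 data.
import numpy as np

endpoints = np.array([0.0, 100, 300, 500, 700, 1000, 2000, 4000])
counts = np.array([41, 44, 24, 32, 29, 21, 9])

def loglik(theta):
    p = np.exp(-endpoints[:-1] / theta) - np.exp(-endpoints[1:] / theta)
    return float(np.sum(counts * np.log(p)))

theta_hat, z, h = 572.0, 1.959964, 0.5   # ML estimate from the slides
# Central-difference approximation to d^2 logL / d theta^2 at theta_hat:
d2 = (loglik(theta_hat + h) - 2 * loglik(theta_hat) + loglik(theta_hat - h)) / h**2
se = 1.0 / np.sqrt(-d2)                  # se_hat = [-d^2 logL/d theta^2]^(-1/2)

wald = (theta_hat - z * se, theta_hat + z * se)   # about [491, 654]
w = np.exp(z * se / theta_hat)
wald_log = (theta_hat / w, theta_hat * w)         # about [496, 660]
```

Note how the log-scale interval sits slightly higher, consistent with the table's Z_log(θ̂) row.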
Confidence Intervals for Functions of θ (7-23)

For one-parameter distributions, confidence intervals for θ can be translated directly into confidence intervals for monotone functions of θ.

- The arrival rate λ = 1/θ is a decreasing function of θ, so
    [λ_lower, λ_upper] = [1/θ_upper, 1/θ_lower] = [0.00151, 0.00201].
- F(t; θ) is a decreasing function of θ, so
    [F_lower(t_e), F_upper(t_e)] = [F(t_e; θ_upper), F(t_e; θ_lower)].

Density Approximation for Exact Observations (7-24)

If t_{i−1} = t_i − Δ_i, Δ_i > 0, and the correct likelihood contribution

  F(t_i; θ) − F(t_{i−1}; θ) = F(t_i; θ) − F(t_i − Δ_i; θ) = ∫_{t_i−Δ_i}^{t_i} f(t; θ) dt

can be approximated with the density f(t; θ) as

  F(t_i; θ) − F(t_i − Δ_i; θ) ≈ f(t_i; θ) Δ_i,

then the density approximation for exact observations,

  L_i(θ; data_i) = f(t_i; θ),

may be appropriate. For most common models, the density approximation is adequate for small Δ_i. There are, however, situations where the approximation breaks down as Δ_i → 0.
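The monotone-transformation rule is just arithmetic on the interval endpoints. A sketch using the n = 200 likelihood interval [498, 662] for θ; the evaluation time t_e = 100 is an arbitrary choice for illustration:

```python
# Translate a confidence interval for theta into intervals for lambda = 1/theta
# and for F(t; theta); both are decreasing in theta, so the endpoints swap.
import math

theta_lo, theta_hi = 498.0, 662.0            # 95% likelihood interval, n = 200

lam_lo, lam_hi = 1 / theta_hi, 1 / theta_lo  # [0.00151, 0.00201]

def F(t, theta):                             # exponential cdf
    return 1 - math.exp(-t / theta)

t_e = 100.0                                  # illustrative evaluation time
F_lo, F_hi = F(t_e, theta_hi), F(t_e, theta_lo)
```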

ML Estimate for the Mean Based on the Density Approximation (7-25)

With r exact failures and n − r right-censored observations, the ML estimate of θ is

  θ̂ = TTT/r,

where TTT = Σ_{i=1}^n t_i, the total time in test, is the sum of the failure times plus the censoring times of the units that are censored. Using the observed curvature in the likelihood,

  ŝe_θ̂ = [−d²ℒ(θ)/dθ²]^(−1/2) |_{θ=θ̂} = θ̂/√r.

If the data are complete or failure censored, 2(TTT)/θ ∼ χ²_2r. Then an exact 100(1 − α)% confidence interval for θ is

  [θ_lower, θ_upper] = [2(TTT)/χ²_(1−α/2;2r), 2(TTT)/χ²_(α/2;2r)].

Confidence Interval for the Mean Life of a New Insulating Material (7-26)

A life test for a new insulating material used 25 specimens that were tested simultaneously at a high voltage of 30 kV. The test was run until 15 of the specimens failed. The 15 failure times (hours) were recorded as:

  1.15, 3.16, 10.38, 10.75, 12.53, 16.74, 22.54, 25.01, 33.02, 33.93, 36.17, 39.06, 44.56, 46.65, 55.93.

Then TTT = 1.15 + ··· + 55.93 + 10 × 55.93 = 950.88 hours. The ML estimate of θ and a 95% confidence interval are

  θ̂ = 950.88/15 = 63.392 hours,
  [θ_lower, θ_upper] = [2(950.88)/χ²_(.975;30), 2(950.88)/χ²_(.025;30)]
                     = [1901.76/46.98, 1901.76/16.79] = [40.48, 113.26].

Exponential Analysis with Zero Failures (7-27)

The ML estimate for the exponential distribution mean cannot be computed unless the available data contain one or more failures. For a sample of n units with running times t_1, ..., t_n and an assumed exponential distribution, a conservative 100(1 − α)% lower confidence bound for θ is

  θ_lower = 2(TTT)/χ²_(1−α;2) = 2(TTT)/[−2 log(α)] = −TTT/log(α).

The lower bound for θ can be translated into a lower confidence bound for functions like t_p for specified p, or an upper confidence bound for F(t_e) for a specified t_e. This bound is based on the fact that, under the exponential failure-time distribution with immediate replacement of failed units, the number of failures observed in a life test with a fixed total time on test has a Poisson distribution.
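The insulating-material calculation can be verified end to end. A sketch using scipy's chi-square quantiles; the data are the 15 recorded failure times, with the 10 unfailed specimens censored at 55.93 hours:

```python
# Exact chi-square interval for the exponential mean under failure censoring.
from scipy.stats import chi2

failures = [1.15, 3.16, 10.38, 10.75, 12.53, 16.74, 22.54, 25.01,
            33.02, 33.93, 36.17, 39.06, 44.56, 46.65, 55.93]
n, r = 25, len(failures)
ttt = sum(failures) + (n - r) * failures[-1]  # total time on test, 950.88 h
theta_hat = ttt / r                           # 63.392 h

lo = 2 * ttt / chi2.ppf(0.975, 2 * r)         # 1901.76 / 46.98
hi = 2 * ttt / chi2.ppf(0.025, 2 * r)         # 1901.76 / 16.79
```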
Analysis of the Diesel Generator Fan Data, Assuming Removal After 200 Hours of Service (7-28)

Here we do the analysis of the fan data as it would have looked after 200 hours of testing, when all of the fans were still running; thus TTT = 14,000 hours. A conservative 95% lower confidence bound on θ is

  θ_lower = 2(TTT)/χ²_(.95;2) = 28000/5.991 = 4674 hours.

Using the entire data set, θ̂ = 28,701 hours, and a likelihood-based approximate 95% lower confidence bound is θ_lower = 18,485 hours.

A conservative 95% upper confidence bound on F(10000; θ) is

  F_upper(10000) = F(10000; θ_lower) = 1 − exp(−10000/4674) = 0.882.

This shows how little information comes from a short test with zero or few failures.

Other Topics in Chapter 7 (7-29)

- Inferences when there are no failures.
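The zero-failure bound for the fan data follows the same chi-square recipe with 2 degrees of freedom; since χ²_(.95;2) = −2 log(0.05) ≈ 5.991, the two forms of the bound agree. A short sketch:

```python
# Conservative 95% lower bound on theta with zero failures (fan data,
# TTT = 14,000 hours), and the implied upper bound on F(10,000).
import math
from scipy.stats import chi2

ttt, alpha = 14000.0, 0.05
theta_lb = 2 * ttt / chi2.ppf(1 - alpha, 2)   # = -ttt / log(alpha), about 4673
f_ub = 1 - math.exp(-10000 / theta_lb)        # about 0.882
```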