Quality Control Using Inferential Statistics in Weibull-Based Reliability Analyses

S. F. Duffy (Cleveland State University) and A. Parikh (N & R Engineering)

ASTM Symposium on Graphite Testing for Nuclear Applications: The Significance of Test Specimen Volume and Geometry and the Statistical Significance of Test Specimen Population, September 19-20, 2013, Seattle, WA, USA

MOTIVATION

Recently a standard practice on Reporting Uniaxial Strength Data and Estimating Weibull Distribution Parameters for Advanced Graphites (ASTM D 7486) survived the ASTM voting process and has been accepted as a Society Standard Practice. Efforts within the ASME Boiler and Pressure Vessel committees have produced an extensive treatment of designing reactor components from graphite materials. The presentation today is a small attempt to continue efforts to bridge the ASTM graphite testing community and the ASME graphite design community. The seemingly simple question of how many test specimens should be tested is addressed.

INTRODUCTION

Simple procedures have been proposed for accepting a graphite material based on strength. An enhanced approach is presented that incorporates the inherent variability associated with parameter estimation as well as component reliability. Tensile strength is a random variable characterized by a two-parameter Weibull distribution. The Weibull distribution is an extreme value distribution, and this facet makes it a preferred distribution for characterizing the minimum strength of graphite. In ASTM Standard Practice D 7486 the Weibull distribution parameters are estimated using maximum likelihood estimators. Point estimates computed from data are approximate values of the true distribution parameter values.
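
The fitting step itself is straightforward to sketch in code. Below is a minimal illustration (not part of the original slides) of obtaining maximum likelihood point estimates of the Weibull modulus and characteristic strength from a small strength data set; the data are simulated from an assumed population (m = 17, σθ = 400) to mirror the virtual data set used later in the talk.

```python
# A minimal sketch (not from the presentation) of maximum likelihood
# estimation of two-parameter Weibull strength parameters. The shape
# parameter is the Weibull modulus m and the scale parameter is the
# characteristic strength sigma_theta; the location is fixed at zero.
import numpy as np
from scipy.stats import weibull_min

rng = np.random.default_rng(0)
# Hypothetical data: n = 10 specimens from an assumed population
# (m = 17, sigma_theta = 400), mirroring the simulated set used later.
strengths = weibull_min.rvs(17.0, scale=400.0, size=10, random_state=rng)

m_hat, _, sigma_hat = weibull_min.fit(strengths, floc=0.0)
print(f"m_hat = {m_hat:.2f}, sigma_theta_hat = {sigma_hat:.1f}")
```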

The point estimates depend on the number of strength tests conducted, and the question of whether the estimated values are of sufficiently high quality is directly related to the fundamental question of how many samples must be tested. The more appropriate question, however, is how many samples must be tested to establish a given confidence level for a stipulated component reliability. We address the latter question by utilizing interval estimates along with hypothesis testing. Confidence intervals are used to indicate the quality of estimated parameters. Confidence intervals on parameter estimates represent the range of values for the distribution parameters that are both reasonable and plausible; within the interval one can expect to find the true value of a distribution parameter with a quantifiable degree of confidence. When confidence intervals are combined with hypothesis testing, a rejection region is established, and the notion of a likelihood ratio ring can be developed wherein the estimates should reside and, moreover, where the true distribution parameter pair can be found.

Hypothesis Testing

Hypothesis testing is applied to estimates of the Weibull distribution parameters (m, σθ). Achieving statistical significance is equivalent to accepting that the observed results (the point estimates) are plausible, in which case the null hypothesis (H0) is not rejected. If properly crafted, the null hypothesis helps to define a rejection region in the distribution parameter space. To do this a test statistic is developed that aids in our decision to reject the null hypothesis. For our use the statistic, based on a ratio of likelihood functions, helps in defining the relative size of the rejection region. The objective is to define the process relative to strength testing of graphite materials and establish a robust material acceptance criterion. The process has been discussed within the ASME nuclear design code committee for graphite.

Parameters for a distribution are identified generically by the vector

$$\theta = (\theta_1, \theta_2, \theta_3, \ldots)$$

Since the tensile strength for graphite is assumed characterized by a two-parameter Weibull distribution, a vector of distribution parameters whose components are the MLE parameter estimates can be identified as

$$\hat{\theta} = (\hat{m}, \hat{\sigma}_\theta)$$

A null hypothesis is stipulated such that

$$H_0 : (\theta_1, \theta_2) = \theta_0$$

that is, the components of the parameter vector, i.e., the MLE parameter estimates, are equal to the true distribution parameters θ0. The alternative hypothesis is that the vector components are not equal:

$$H_1 : (\theta_1, \theta_2) \in \theta_0^{\,c}$$

Constructing the Test Statistic

For the two-parameter Weibull distribution define a log likelihood function as

$$L = \ln \prod_{i=1}^{n} f(x_i)$$

Given a specific data set the likelihood function can be constructed in the (m, σθ) parameter space. The MLE parameter values define the location of the peak of this function. The functional value of the log likelihood function at the maximum is represented in green. If we knew the true population distribution parameters, the functional value of the log likelihood function for the same data set would be represented by the red line.

[Figure: log likelihood function over the (m, σθ) plane, marking the functional values at the MLE parameter estimates and at the true parameters, separated by T/2, together with the likelihood ratio ring.]
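
As an illustration (not from the original slides), the log likelihood surface described above can be evaluated directly on a grid of (m, σθ) pairs, reusing the hypothetical simulated data set from the earlier sketch.

```python
# A minimal sketch of evaluating the two-parameter Weibull log likelihood
# L(m, sigma_theta) = ln prod_i f(x_i) on a grid in the (m, sigma_theta)
# parameter space; the MLE pair sits at the peak of this surface.
import numpy as np
from scipy.stats import weibull_min

rng = np.random.default_rng(0)
strengths = weibull_min.rvs(17.0, scale=400.0, size=10, random_state=rng)

def log_likelihood(m, sigma, data):
    """Log likelihood of a two-parameter Weibull evaluated at (m, sigma)."""
    return np.sum(weibull_min.logpdf(data, m, scale=sigma))

m_grid = np.linspace(5.0, 30.0, 200)
s_grid = np.linspace(350.0, 450.0, 200)
surface = np.array([[log_likelihood(m, s, strengths) for m in m_grid]
                    for s in s_grid])
peak = np.unravel_index(surface.argmax(), surface.shape)
print(f"grid peak near m = {m_grid[peak[1]]:.2f}, "
      f"sigma_theta = {s_grid[peak[0]]:.1f}")
```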

The value of the likelihood function associated with a specific data set, functionally evaluated at the MLE parameter estimates, is expressed as

$$\hat{L} = \prod_{i=1}^{n} f(x_i \mid \hat{m}, \hat{\sigma}_\theta)$$

A second likelihood function for the same data set is functionally evaluated at the true distribution parameters, i.e.,

$$\tilde{L} = \prod_{i=1}^{n} f(x_i \mid \tilde{m}, \tilde{\sigma}_\theta)$$

A test statistic now is introduced that is defined as the natural log of the ratio of the likelihood functions, i.e.,

$$T = 2 \ln\!\left(\frac{\hat{L}}{\tilde{L}}\right)$$

The Neyman-Pearson lemma (1933) states that this likelihood ratio test is the most powerful test statistic available for testing the null hypothesis stipulated earlier. Wilks (1938) showed that as n increases the test statistic T becomes asymptotically χ²-distributed.
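
A minimal sketch (not from the slides) of computing T for the hypothetical simulated data set, taking the known simulation values as the "true" parameters:

```python
# A minimal sketch of the likelihood ratio statistic T = 2 ln(L_hat / L_tilde):
# L_hat is the likelihood at the MLE pair, L_tilde the likelihood at the
# hypothesized true parameters (here the known simulation values 17, 400).
import numpy as np
from scipy.stats import weibull_min

rng = np.random.default_rng(0)
strengths = weibull_min.rvs(17.0, scale=400.0, size=10, random_state=rng)

m_hat, _, s_hat = weibull_min.fit(strengths, floc=0.0)
log_L_hat = np.sum(weibull_min.logpdf(strengths, m_hat, scale=s_hat))
log_L_true = np.sum(weibull_min.logpdf(strengths, 17.0, scale=400.0))

T = 2.0 * (log_L_hat - log_L_true)  # nonnegative: the MLE maximizes L
print(f"T = {T:.3f}")
```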

One can either compute the test statistic T and compare it to χ² values to determine the corresponding significance level, or define a rejection region by assuming a significance level, calculating the corresponding χ² value (one degree of freedom), and finding the parameter pairs that satisfy the resulting value of T. This is outlined in the next section. The test statistic is designed in such a way that the probability of a Type I error does not exceed the constant α, a value that we control. Thus the probability of a Type I error is fixed at an acceptably low value. The ratio of the two likelihood functions defined previously should be low in the optimal critical region, a result of minimizing α and maximizing (1 - β). The ratio is high in the complementary region.
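
A minimal sketch of the decision rule just described, under the one-degree-of-freedom convention used here:

```python
# A minimal sketch of the decision rule: fix alpha (the Type I error rate
# we control), compute the chi-squared critical value with one degree of
# freedom, and reject H0 when the statistic T from the preceding sketch
# exceeds it.
from scipy.stats import chi2

T = 1.37      # placeholder: substitute the value computed in the prior sketch
alpha = 0.10  # acceptably low probability of a Type I error
T_crit = chi2.ppf(1.0 - alpha, df=1)  # about 2.706 for alpha = 0.10
print(f"T_crit = {T_crit:.3f}, reject H0: {T > T_crit}")
```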

Likelihood Ratio Ring (Region of Acceptance)

The likelihood ratio confidence bound is based on the equation

$$-2\ln\!\left(\frac{L(\theta)}{L(\hat{\theta})}\right) = \chi^2_{\alpha;1}$$

For a two-parameter Weibull distribution this equation is expressed as

$$L(m, \sigma_\theta) = L(\hat{m}, \hat{\sigma}_\theta)\, e^{-\chi^2_{\alpha;1}/2}$$

[Figure: 90% confidence ring (α = 10%) in the (m, σθ) plane, with the true value (17.0, 400.0) and the estimated value (15.223, 393.381) marked; axes are Weibull modulus (m) and Weibull characteristic strength (σθ).]

Above, a virtual data set was generated using Monte Carlo simulation. The sample size was n = 10. Here the true distribution parameters are known (17, 400).
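
A minimal sketch (not from the slides) of tracing such a ring numerically for the same hypothetical simulated data set: the ring is the contour where the log likelihood falls χ²(α; 1)/2 below its maximum.

```python
# A minimal sketch of tracing the likelihood ratio ring: contour the log
# likelihood surface at log(L_hat) - chi2(alpha; 1)/2 in the (m, sigma_theta)
# plane. Grid resolution and plotting ranges are arbitrary choices.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import weibull_min, chi2

rng = np.random.default_rng(0)
strengths = weibull_min.rvs(17.0, scale=400.0, size=10, random_state=rng)
m_hat, _, s_hat = weibull_min.fit(strengths, floc=0.0)
log_L_hat = np.sum(weibull_min.logpdf(strengths, m_hat, scale=s_hat))

m_grid = np.linspace(5.0, 30.0, 250)
s_grid = np.linspace(350.0, 450.0, 250)
M, S = np.meshgrid(m_grid, s_grid)
log_L = np.array([[np.sum(weibull_min.logpdf(strengths, m, scale=s))
                   for m in m_grid] for s in s_grid])

alpha = 0.10  # gamma = 90% confidence ring
level = log_L_hat - 0.5 * chi2.ppf(1.0 - alpha, df=1)
plt.contour(M, S, log_L, levels=[level], colors="red")
plt.plot(m_hat, s_hat, "k+")  # point estimate at the center of the ring
plt.xlabel("Weibull modulus (m)")
plt.ylabel("Weibull characteristic strength")
plt.show()
```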

In real life we do not know the values of the true distribution parameters. However, the confidence bounds provide us with a degree of comfort as to where to expect to find them. That measure of comfort is quantified by the confidence level γ. [Figure: likelihood ratio rings for four data sets.] The four data sets in the figure to the right ranged in size from 83 to 253. (Figure courtesy of R.L. Bratton, INL TPOC)

Component Performance Curves

The acceptance criterion depends on establishing an acceptable probability of failure for the component under design. This can be quantified using a hazard rate format, i.e., expressed as a fraction with a numerator of one. The hazard rate is simply the reciprocal of the probability of failure, a quantity usually expressed as a percentage. With the tensile strength characterized by a two-parameter Weibull distribution, a component probability of failure curve can be generated in an (m, σθ) graph. Using the expression

$$P_f = 1 - \exp\!\left[-\left(\frac{\sigma}{\sigma_\theta}\right)^{m}\right]$$

points along the curve in the following figure represent parameter pairs consistent with a specified stress and probability of failure given by this, or a similar, expression. The curve generated by this expression will be referred to as a component performance curve.
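
Generating the curve is a one-line rearrangement: for a stipulated design stress σ and target P_f, solve for σθ as a function of m. A minimal sketch (not from the slides) with hypothetical design values:

```python
# A minimal sketch of a component performance curve. Rearranging
# P_f = 1 - exp[-(sigma/sigma_theta)^m] gives
# sigma_theta = sigma * (-ln(1 - P_f))^(-1/m). The design stress and
# hazard rate below are hypothetical placeholders.
import numpy as np

sigma_design = 300.0        # hypothetical design stress
p_f_target = 1.0 / 500_000  # hazard-rate form: 1 in 500,000

m = np.linspace(5.0, 30.0, 200)
sigma_theta = sigma_design * (-np.log(1.0 - p_f_target)) ** (-1.0 / m)
# Pairs (m, sigma_theta) on this curve meet the target P_f exactly;
# larger sigma_theta at the same m yields a lower failure probability.
```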

One component design curve defines two regions of the (m, σθ) space: an acceptance region and a rejection region relative to a specified component probability of failure. Parameter pairs to the right of the component performance curve have lower probabilities of failure for a given design stress, and parameter pairs to the left are associated with higher probabilities of failure. We overlay this curve with point estimates of the distribution parameters obtained from tensile strength data. [Figure: component performance curve separating the rejection region from the acceptance region, with a parameter estimate pair plotted.]

Alternative component performance curves exist. The graph to the right presents Weibull parameter estimates in terms of their equivalent mean and standard deviation. Is it enough to have parameter estimates simply reside in the acceptance region? How close can we come to the design curve?

Probability of Failure Curves (ASME Appendix II, Equation 21)

The figure to the right is an example of a component performance curve overlaid with the previous examples of likelihood ratio rings for NBG-18. An increase in the failure rate would shift the curve to the left; a decrease would shift the curve to the right. Curves are graphed based on the following expression

$$P_f(S) = 1 - \exp\!\left[-\left(\frac{S}{S_c\, g_{95\%}}\right)^{m_{95\%}}\right]$$

which is from the ASME Boiler and Pressure Vessel Code, Article HHA-II-3000, Section III, Division 5. (Figure courtesy of R.L. Bratton, INL TPOC)
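
A hedged sketch of evaluating this expression, assuming S_c denotes the component characteristic strength and g_95%, m_95% are 95% lower-bound parameter values; all inputs are hypothetical placeholders, not code values:

```python
# A hedged sketch of the ASME-style expression
# P_f(S) = 1 - exp[-(S / (S_c * g_95))^(m_95)], assuming S_c is the
# component characteristic strength and g_95, m_95 are 95% lower-bound
# values; all inputs below are hypothetical placeholders.
import math

def p_f(S, S_c, g_95, m_95):
    """Failure probability at stress S under the lower-bound Weibull model."""
    return 1.0 - math.exp(-((S / (S_c * g_95)) ** m_95))

print(f"P_f = {p_f(S=250.0, S_c=400.0, g_95=0.9, m_95=14.0):.3e}")
```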

In Parikh's thesis (2011) generic data sets were generated via Monte Carlo simulation. The true distribution parameters were known. The red curve is a likelihood ratio ring that corresponds to a data set with 10 test specimens. The blue curve is the component performance curve (CPC). [Figure: confidence ring centered near the point estimate (15.22, 393.38), partitioned by the CPC.] Point estimates were made using maximum likelihood estimators. The ring was generated at the 90% confidence level (α = 10%) and it is partitioned by the component performance curve.

To ensure that the acceptance region enclosed by the likelihood ratio ring is completely to the right of the component performance curve, we can either adjust the probability of failure by increasing it (not a good approach), [Figure: confidence ring around the point estimate (15.22, 393.38), with one CPC at a P_f of 1 in 500,000 and another at a P_f of 1 in 63,268.31.]

or we can decrease the confidence level until the likelihood ratio ring is tangent to the component performance curve (see the figure to the right). [Figure: confidence rings and component performance curve for γ = 90%, 80%, 70% (α = 10%, 20%, 30%), point estimate (15.22, 393.38).]
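
A minimal sketch (not from the slides) of that check, under the same hypothetical data and design targets: step the confidence level down and test whether every grid point inside the ring satisfies the target failure probability.

```python
# A minimal sketch of checking, at decreasing confidence levels gamma,
# whether the likelihood ratio ring lies entirely in the acceptance region
# for a hypothetical design stress and target failure probability.
import numpy as np
from scipy.stats import weibull_min, chi2

rng = np.random.default_rng(0)
strengths = weibull_min.rvs(17.0, scale=400.0, size=10, random_state=rng)
m_hat, _, s_hat = weibull_min.fit(strengths, floc=0.0)
log_L_hat = np.sum(weibull_min.logpdf(strengths, m_hat, scale=s_hat))

sigma_design, p_f_target = 300.0, 1.0 / 500_000  # hypothetical targets

m_grid = np.linspace(5.0, 30.0, 250)
s_grid = np.linspace(350.0, 450.0, 250)
M, S = np.meshgrid(m_grid, s_grid)
log_L = np.array([[np.sum(weibull_min.logpdf(strengths, m, scale=s))
                   for m in m_grid] for s in s_grid])

for gamma in (0.90, 0.80, 0.70):  # alpha = 10%, 20%, 30%
    inside = log_L >= log_L_hat - 0.5 * chi2.ppf(gamma, df=1)
    p_f = 1.0 - np.exp(-((sigma_design / S) ** M))
    ok = bool(np.all(p_f[inside] <= p_f_target))
    print(f"gamma = {gamma:.0%}: ring inside acceptance region: {ok}")
```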

Likelihood Confidence Rings with Increasing Sample Size (N) and Fixed γ (= 90%)

Or we can increase the size of the data set and hold the confidence level steady to shrink the size of the confidence ring. This should be the preferred approach for regulators. [Figure: 90% likelihood confidence rings for sample sizes of 10 through 100 specimens, shrinking around the true value (17, 400); axes are Weibull modulus (m) and Weibull characteristic strength (σθ).]
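
A minimal sketch (not from the slides) of that sample-size study: hold γ at 90%, grow n, and report the width of the ring along the m axis as a simple measure of how the ring shrinks.

```python
# A minimal sketch of the sample-size study: hold gamma at 90% and grow n,
# then report the width of the likelihood ratio ring along the m axis as a
# simple measure of how the ring shrinks. Grid ranges are arbitrary.
import numpy as np
from scipy.stats import weibull_min, chi2

rng = np.random.default_rng(1)
m_true, s_true, gamma = 17.0, 400.0, 0.90
m_grid = np.linspace(5.0, 35.0, 200)
s_grid = np.linspace(340.0, 460.0, 200)

for n in (10, 30, 50, 100):
    data = weibull_min.rvs(m_true, scale=s_true, size=n, random_state=rng)
    m_hat, _, s_hat = weibull_min.fit(data, floc=0.0)
    log_L_hat = np.sum(weibull_min.logpdf(data, m_hat, scale=s_hat))
    cutoff = log_L_hat - 0.5 * chi2.ppf(gamma, df=1)
    inside = np.array([[np.sum(weibull_min.logpdf(data, m, scale=s)) >= cutoff
                        for m in m_grid] for s in s_grid])
    width = np.ptp(m_grid[inside.any(axis=0)])  # ring extent in m
    print(f"n = {n:3d}: ring width in m is roughly {width:.2f}")
```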

Summary

A topic was presented to bridge one of several gaps between experiment and component modeling, i.e., how many tests should be conducted for a given component design. The question was answered by combining two common statistical inference procedures: confidence interval estimation and hypothesis testing. Confidence interval estimates provide a range of possible values within which the true, unknown population parameters reside. Hypothesis testing speaks to the quality of the data. The two concepts can be combined to produce a material acceptance criterion. The concepts presented here are being considered for incorporation into the ASME nuclear design code. Other issues remain: size dependence in graphite appears to be a function of density, and a more robust size scaling methodology is needed that incorporates density in a coherent fashion. A coherent solution requires a joint effort among experimentalists and designers.