Introduction to Reliability Theory (part 2)

Similar documents
Introduction to Imprecise Probability and Imprecise Statistical Methods

Reliability of Technical Systems

Step-Stress Models and Associated Inference

Notes largely based on Statistical Methods for Reliability Data by W.Q. Meeker and L. A. Escobar, Wiley, 1998 and on their class notes.

Computer Simulation of Repairable Processes

Time-varying failure rate for system reliability analysis in large-scale railway risk assessment simulation

An Integral Measure of Aging/Rejuvenation for Repairable and Non-repairable Systems

Mahdi karbasian* & Zoubi Ibrahim

The structure function for system reliability as predictive (imprecise) probability

The random counting variable. Barbara Russo

Introduction to repairable systems STK4400 Spring 2011

Examination paper for TMA4275 Lifetime Analysis

Fleet Maintenance Simulation With Insufficient Data

Hybrid Censoring; An Introduction 2

e 4β e 4β + e β ˆβ =0.765

Multistate Modeling and Applications

IEOR 6711: Stochastic Models I Fall 2012, Professor Whitt, Thursday, October 4 Renewal Theory: Renewal Reward Processes

Reliability Engineering I

Parametric Models. Dr. Shuang LIANG. School of Software Engineering TongJi University Fall, 2012

Key Words: Lifetime Data Analysis (LDA), Probability Density Function (PDF), Goodness of fit methods, Chi-square method.

System reliability using the survival signature

Software Reliability & Testing

Semiparametric Regression

STATISTICAL INFERENCE IN ACCELERATED LIFE TESTING WITH GEOMETRIC PROCESS MODEL. A Thesis. Presented to the. Faculty of. San Diego State University

Statistical Inference on Constant Stress Accelerated Life Tests Under Generalized Gamma Lifetime Distributions

A hidden semi-markov model for the occurrences of water pipe bursts

Cox s proportional hazards/regression model - model assessment

Failure rate in the continuous sense. Figure. Exponential failure density functions [f(t)] 1

ST495: Survival Analysis: Maximum likelihood

These notes will supplement the textbook not replace what is there. defined for α >0

Fault Tolerant Computing CS 530 Software Reliability Growth. Yashwant K. Malaiya Colorado State University

Fundamentals of Reliability Engineering and Applications

Chapter 9 Part II Maintainability

Evaluating the value of structural heath monitoring with longitudinal performance indicators and hazard functions using Bayesian dynamic predictions

A Few Notes on Fisher Information (WIP)

SEQUENCING RELIABILITY GROWTH TASKS USING MULTIATTRIBUTE UTILITY FUNCTIONS

CHAPTER 3 MATHEMATICAL AND SIMULATION TOOLS FOR MANET ANALYSIS

Censoring mechanisms

Reliability Growth in JMP 10

I I FINAL, 01 Jun 8.4 to 31 May TITLE AND SUBTITLE 5 * _- N, '. ', -;

In contrast, parametric techniques (fitting exponential or Weibull, for example) are more focussed, can handle general covariates, but require

Part 3 Robust Bayesian statistics & applications in reliability networks

TMA 4275 Lifetime Analysis June 2004 Solution

BAYESIAN MODELING OF DYNAMIC SOFTWARE GROWTH CURVE MODELS

Lecture 5 Models and methods for recurrent event data

Repairable Systems Reliability Trend Tests and Evaluation

CHAPTER 10 RELIABILITY

Reliability and Availability Simulation. Krige Visser, Professor, University of Pretoria, South Africa

Variability within multi-component systems. Bayesian inference in probabilistic risk assessment The current state of the art

11 Survival Analysis and Empirical Likelihood

Parametric and Topological Inference for Masked System Lifetime Data

Semi-parametric predictive inference for bivariate data using copulas

Stochastic Renewal Processes in Structural Reliability Analysis:

AN INTEGRAL MEASURE OF AGING/REJUVENATION FOR REPAIRABLE AND NON REPAIRABLE SYSTEMS

Hazard Function, Failure Rate, and A Rule of Thumb for Calculating Empirical Hazard Function of Continuous-Time Failure Data

PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA

B.H. Far

Dependable Systems. ! Dependability Attributes. Dr. Peter Tröger. Sources:

Analysis of Time-to-Event Data: Chapter 4 - Parametric regression models

arxiv: v1 [math.ho] 25 Feb 2008

Imprecise Software Reliability

ST745: Survival Analysis: Nonparametric methods

Survival Analysis Math 434 Fall 2011

Hybrid Censoring Scheme: An Introduction

Inferring from data. Theory of estimators

Likelihood Construction, Inference for Parametric Survival Distributions

STAT 6350 Analysis of Lifetime Data. Failure-time Regression Analysis

System Simulation Part II: Mathematical and Statistical Models Chapter 5: Statistical Models

Survival Distributions, Hazard Functions, Cumulative Hazards

1 Delayed Renewal Processes: Exploiting Laplace Transforms

Survival Analysis. Lu Tian and Richard Olshen Stanford University

HANDBOOK OF APPLICABLE MATHEMATICS

Abstract. 1. Introduction

Temporal point processes: the conditional intensity function

Continuous case Discrete case General case. Hazard functions. Patrick Breheny. August 27. Patrick Breheny Survival Data Analysis (BIOS 7210) 1/21

Statistical Inference and Methods

Cheng Soon Ong & Christian Walder. Canberra February June 2018

Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model

Confidence Intervals for Reliability Growth Models with Small Sample Sizes. Reliability growth models, Order statistics, Confidence intervals

Failure Correlation in Software Reliability Models

Guidelines for Analysis of Data Related to Aging of Nuclear Power Plant Components and Systems

Two hours. To be supplied by the Examinations Office: Mathematical Formula Tables and Statistical Tables THE UNIVERSITY OF MANCHESTER.

10 Introduction to Reliability

COM336: Neural Computing

ABC methods for phase-type distributions with applications in insurance risk problems

MAS3301 / MAS8311 Biostatistics Part II: Survival

Analysis of Gamma and Weibull Lifetime Data under a General Censoring Scheme and in the presence of Covariates

Software Reliability Growth Modelling using a Weighted Laplace Test Statistic

Problem Set 3: Bootstrap, Quantile Regression and MCMC Methods. MIT , Fall Due: Wednesday, 07 November 2007, 5:00 PM

Unit 20: Planning Accelerated Life Tests

Software Reliability.... Yes sometimes the system fails...

Understanding the Cox Regression Models with Time-Change Covariates

Non-Parametric Tests for Imperfect Repair

Bayesian Reliability Demonstration

Parametric Evaluation of Lifetime Data

CHAPTER 1 A MAINTENANCE MODEL FOR COMPONENTS EXPOSED TO SEVERAL FAILURE MECHANISMS AND IMPERFECT REPAIR

Poisson Processes. Particles arriving over time at a particle detector. Several ways to describe most common model.

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Fault Tolerant Computing ECE 655

A novel repair model for imperfect maintenance

Lecture 3. Truncation, length-bias and prevalence sampling

Transcription:

Introduction to Reliability Theory (part 2) Frank Coolen UTOPIAE Training School II, Durham University 3 July 2018 (UTOPIAE) Introduction to Reliability Theory 1 / 21

Outline Statistical issues Software reliability Decision making Survival signature Imprecision Resilience (UTOPIAE) Introduction to Reliability Theory 2 / 21

Statistical Issues Random failure time T > 0 with CDF F(t) = P(T t) Survival Function S(t) = P(T > t) = 1 F(t) pdf f (t) = F (t) = S (t) Hazard Rate h(t) = f (t)/s(t) Interpretation, for small δt: h(t)δt P(T t + δt T > t) Cumulative Hazard Function (CHF) H(t) = t 0 h(x)dx H(t) = t 0 f (x) S(x) dx = ln S(t) S(t) = exp{ H(t)} = exp{ t 0 h(x)dx} (UTOPIAE) Introduction to Reliability Theory 3 / 21

Most statistical methods for estimation of parameters in assumed models (e.g. Weibull) use the Likelihood Function, either maximising it or combined with a prior distribution in Bayesian analysis. Reliability data are often affected by right-censoring: an item has not failed during a particular period of time. Let t 1,..., t n be observed failure times, and c 1,..., c m right-censored observations. For a parametric model with parameter θ, the likelihood function is L(θ t 1,..., t n ; c 1,..., c m ) = n m f (t j θ) S(c i θ). This assumes that the censoring mechanism is independent of the failure process. j=1 i=1 (UTOPIAE) Introduction to Reliability Theory 4 / 21

Regression models for reliability data For regression models in reliability applications Weibull models are often used, with the survival function, depending on a vector of covariates x, given by { ( ) t ηx } S(t; x) = exp. α x Some simple forms are often used for the shape and scale parameters as functions of x, e.g. the loglinear model for α x, specified via ln α x = x T β, with β a vector of parameters, and similar models for η x. The statistical methodology is then pretty similar to general regression methods (e.g. fitting by MLE or Bayesian methods) and implemented in statistical software packages. (UTOPIAE) Introduction to Reliability Theory 5 / 21

Failure processes A system fails at certain times, there may be some actions during this period which affect failure behaviour, e.g. minimal repairs to allow the system to continue its function, or replacement of some components, or other improvements of the system. Normally this should not include full replacement of the system. Let the random quantities T 1 < T 2 < T 3 <... be the failure times of the system, let X i = T i T i 1 (with T 0 = 0). (UTOPIAE) Introduction to Reliability Theory 6 / 21

Rate of OCcurrence Of Failure (ROCOF) Let N(t) be the number of failures in the period (0, t], then the ROCOF is: v(t) = d dt EN(t). Increasing (decreasing) ROCOF models a system that gets worse (better) over time. Note that the ROCOF is not the same as the hazard rate (the definitions are clearly different!), although intuitively they are a bit similar. If we consider a standard Poisson process, with iid times between failures being exponentially distributed, then the ROCOF and hazard rate happen to be identical. (UTOPIAE) Introduction to Reliability Theory 7 / 21

Non-Homogeneous Poisson Process models (NHPP) The crucial assumption in these models, as for standard Poisson processes which are just a special case, is that the numbers of failures in distinct intervals are independent if the process characteristics are known. A NHPP with ROCOF v(t) is easiest defined by the property that the number of failures in interval (t 1, t 2 ] is a Poisson distributed random quantity, with mean m(t 1, t 2 ) = t2 t 1 v(t)dt. (UTOPIAE) Introduction to Reliability Theory 8 / 21

Suppose we have observed the system over time period [0, r], and have observed failures at times t 1 < t 2 <... < t n r, then the likelihood function is { n } L = v(t i ) i=1 This enables inference as usual. [ r ] exp v(t)dt. 0 Two basic, often used parametric ROCOFs are v 1 (t) = exp(β 0 + β 1 t) and v 2 (t) = γηt η 1, (UTOPIAE) Introduction to Reliability Theory 9 / 21

Software Reliability Many models that have been suggested, during the last five decades, for software reliability, are NHPPs which model the software testing process as a fault counting process. Many models are variations to a basic model which assumes: (1) Software contains an unknown number of bugs, N. (2) At each failure, one bug is detected and corrected. (3) The ROCOF is proportional to the number of bugs present. (UTOPIAE) Introduction to Reliability Theory 10 / 21

The basic model is a NHPP with for some constant λ. v(t) = (N i + 1)λ, for t [T i 1, T i ), N and λ are both considered unknown, and estimated from data, where N tends to be of most interest, or in particular the number of remaining bugs. (UTOPIAE) Introduction to Reliability Theory 11 / 21

Software reliability is a very important topic area, in particular statistical support for software testing is challenging. Book to appear (Wiley, August 2018): Analytic Methods in Systems and Software Testing, edited by Keneth, Ruggeri and Faltin. Includes chapter by Wooff, Goldstein and Coolen on the use of Bayesian Graphical Models for software testing, including many aspects of implementation. Challenge: considering reliability of systems with hardware and software, including interaction. (UTOPIAE) Introduction to Reliability Theory 12 / 21

Decision Making Traditionally, decisions related to reliability (e.g. planning of replacements or inspections, warranties, etc) were mainly based on Renewal (Reward) Theorem: It is assumed that the same process goes on for a very long time ( infinitely ), consisting of consecutive cycli which are stochastic copies of each other. Let X i be the random length of cyclus i, and R i a random reward associated with cyclus i. Assume that the X i s are iid, and that the R i s are iid, but allow dependence of R i on X i. For a process starting at time 0, let R(t) be the random cumulative reward upto time t. Then R(t) t E(R i) E(X i ) if t. So long-run average reward per unit of time is used. (UTOPIAE) Introduction to Reliability Theory 13 / 21

Assumption underlying use of renewal reward theorem often unrealistic; optimisation over one cycle, or a few cycli, or a fixed period of time more realistic. Analysis may be harder, and decisions typically less risky. In recent years, emphasis has shifted to service contracts of fixed length. This is a main game-changer, with possible advantages for reliability theory (e.g. wrt data ownership) and many new challenges for decision making, e.g. planning and location of spare parts. (UTOPIAE) Introduction to Reliability Theory 14 / 21

Survival Signature System with K 2 types of components m k components of type k {1, 2,..., K }, with K k=1 m k = m State vector x = (x 1, x 2,..., x K ), with x k = (x1 k, x 2 k,..., x m k k ) the sub-vector representing the states of the components of type k with xi k = 1 if the ith component of type k functions and xi k = 0 if not. Structure function φ(x) = 1 if system functions with state x and φ(x) = 0 if not. (UTOPIAE) Introduction to Reliability Theory 15 / 21

The Survival Signature Φ(l 1, l 2,..., l K ), with l k = 0, 1,..., m k, is the probability that a system functions given that precisely l k of its components of type k function, for each k {1, 2,..., K }. There are ( m k l k ) state vectors x k with precisely l k of their m k components x k i = 1, so with m k i=1 x k i = l k. Let S l1,...,l K denote the set of all state vectors for the whole system for which m k i=1 x i k = l k, k = 1, 2,..., K. Assuming exchangeability of the failure times of the m k components of type k Φ(l 1,..., l K ) = [ K k=1 ( mk l k ) 1 ] x S l1,...,l K φ(x) (UTOPIAE) Introduction to Reliability Theory 16 / 21

Probability system functions at time t Let Ct k {0, 1,..., m k } denote the number of components of type k in the system that function at time t > 0. If the probability distribution for the failure time of components of type k is known and has CDF F k (t), and we assume failure times of components of different types to be independent, then the probability that the system functions at time t > 0 is m 1 P(T S > t) = m 1 l 1 =0 m K l K =0 [ l 1 =0 m K l K =0 Φ(l 1,..., l K ) Φ(l 1,..., l K )P( K k=1 (( mk l k K {Ct k = l k }) = k=1 ) ) ] [F k (t)] m k l k [1 F k (t)] l k (UTOPIAE) Introduction to Reliability Theory 17 / 21

Presented at 1st UTOPIAE Training School (Glasgow, November 2017). Main challenges include upscaling to very large real-world systems (with zooming in ), applications to systems with multiple functions and to networks with different routes through them. For some questions of practical relevance other summaries of the full structure function may be required. (UTOPIAE) Introduction to Reliability Theory 18 / 21

Imprecision Use of imprecise probability theory (also shown at 1st UTOPIAE Training School) is attractive in reliability, particularly when considering new or updated systems or components, or when there is doubt about some model assumptions. Inference and decision problems with imprecise probability tend to require solution of constrained optimisation problems, with the inference as target function and the set of probabilities as constraint. There are more opportunities to use idea of imprecision to provide robustness, enabling simple models to be used (e.g. in Accelerated Life Testing). Overall goal: Make the jobs of practitioners easier or better, preferably both! Many opportunities and challenges! (UTOPIAE) Introduction to Reliability Theory 19 / 21

Resilience In recent years there has been increasing attention to Resilience : if things go wrong, how can negative effect be minimized? Hugely important topic! For example for systems with multiple functions or multiple phases, some critical functions or phases may still be possible after a failure, possibly by re-configuring the system. Of course, this also involves the usual aspects of maintenance, repair, etc. Very many challenges, perhaps generalizing structure function as (imprecise) probability will prove useful. (UTOPIAE) Introduction to Reliability Theory 20 / 21

The topic of Reliability Theory is extremely wide, including aspects from very many traditional areas of mathematics and engineering. For substantial progress it is important to develop theory and methods closely linked to real-world problems, as the gap between text-book problems and reality is often very large. UTOPIAE ESRs are well placed to make good contributions! Feel free to contact me for more information or references. (UTOPIAE) Introduction to Reliability Theory 21 / 21