STOCHASTIC MODELS FOR RELIABILITY, AVAILABILITY, AND MAINTAINABILITY


STOCHASTIC MODELS FOR RELIABILITY, AVAILABILITY, AND MAINTAINABILITY. Daniel F. Silva, Ph.D., Assistant Professor, Industrial and Systems Engineering, Auburn University. RAM IX Summit, November 2nd, 2016

Outline
- Introduction to Markov models
- Examples of Markov chains in RAM
- Extensions and more complex models
- Computational tools for Markov models
- International standards for Markov models in RAM
- Summary & Conclusion

INTRODUCTION TO MARKOV MODELS

What Is a Stochastic Process?
A mathematical representation of a system that evolves over time, subject to random variation. The state of the system at a given time is captured by a random variable X_n or X(t). Stochastic processes can evolve in:
- Discrete time: {X_n, n = 0, 1, 2, ...}
- Continuous time: {X(t), t ≥ 0}

Examples of Stochastic Processes
- Stock price at the closing of each trading day
- Stock price at any given time
- Inventory of multiple items at the end of each day
- Traffic on a website at any given time
- Failure state of a component at any given time
- Failure state of several components at any given time

Applications of Stochastic Processes
- Revenue management
- Inventory planning
- Google's search engine
- Call center staffing
- Derivatives pricing
- Reliability, availability, and maintainability

Defining a Stochastic Process
- States: What are the states the system can occupy?
- Events: What can happen in the system that triggers a change in the state?
- Probabilities: What are the odds of each event happening?

Markov Chains
Markov chains (MCs) are a class of stochastic processes. MCs have a discrete (countable) state space S, which may be:
- Finite, e.g. {0, 1} or {a, b, c}
- Infinite, e.g. {0, 1, 2, 3, ...}
MCs can evolve in discrete time or in continuous time.

Markov Chains
MCs have the Markov property:
$P[X(t_0 + s) \mid X(t),\ t \le t_0] = P[X(t_0 + s) \mid X(t_0)]$
In words, the future depends only on the present, not on the past. In continuous-time MCs, the times between events follow an exponential distribution.

The Exponential Distribution
If the random variable X is exponential with rate λ, then:
$P(X > t) = e^{-\lambda t}, \quad E[X] = \frac{1}{\lambda}, \quad P(X > t + s \mid X > s) = P(X > t) = e^{-\lambda t}$
The last identity is called the memorylessness property.
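The memorylessness identity is a one-line consequence of the exponential tail; the derivation (standard, and implied but not shown on the slide) is:

$$P(X > t + s \mid X > s) = \frac{P(X > t + s)}{P(X > s)} = \frac{e^{-\lambda(t+s)}}{e^{-\lambda s}} = e^{-\lambda t}$$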

Markov Chain Dynamics
- The MC starts at state i ∈ S with probability a(i) at time t = 0.
- The MC remains at state i for an exponentially distributed amount of time, with parameter λ_i.
- After that, the MC transitions to a new state j, with each state j ∈ S having transition probability p_ij.
- And so on.

Defining a Markov Chain
- State space S: all possible states the system can occupy
- Initial distribution vector a: a(s) = probability that the system is in state s at t = 0
- Generator matrix Q:
$Q_{ij} = \begin{cases} \lambda_i p_{ij} & \text{if } i \neq j \\ -\lambda_i & \text{if } i = j \end{cases}$

Transition Diagrams for MCs
A graph where:
- The nodes are the states s ∈ S
- The arcs are the transition rates λ_i p_ij > 0, for i ≠ j; i, j ∈ S
Example: S = {1, 2}, λ = (1, 10),
$P = \begin{pmatrix} 0.5 & 0.5 \\ 0.2 & 0.8 \end{pmatrix}, \quad Q = \begin{pmatrix} -0.5 & 0.5 \\ 2 & -2 \end{pmatrix}$
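To make the recipe from λ and P to Q concrete, here is a minimal Python sketch (mine, not from the talk) that reproduces the generator of this two-state example; the convention used is that self-loops carry no rate and each row of Q sums to zero:

```python
import numpy as np

# Build the generator Q from holding-time rates lambda_i and the embedded
# jump-probability matrix P: Q_ij = lambda_i * p_ij for i != j, with the
# diagonal set so that each row of Q sums to zero.
lam = np.array([1.0, 10.0])          # holding-time rates of states 1 and 2
P = np.array([[0.5, 0.5],
              [0.2, 0.8]])           # embedded-chain transition probabilities

Q = lam[:, None] * P                 # off-diagonal rates lambda_i * p_ij
np.fill_diagonal(Q, 0.0)             # self-loops contribute no rate
np.fill_diagonal(Q, -Q.sum(axis=1))  # rows of a generator sum to zero

print(Q)  # [[-0.5  0.5]
          #  [ 2.  -2. ]]  matching the Q in the example above
```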

Measures of Performance
- p_s(t): probability of being in state s at time t; in vector form, P(t)
- π_s: long-run average probability of being in state s; in vector form, π
- MTTA: mean time to absorption; relevant when there is a set of absorbing states

Transient Analysis
The objective is to calculate P(t) for some t:
$P(t) = a\, e^{Qt} = a \left( I + \sum_{n=1}^{\infty} \frac{(Qt)^n}{n!} \right)$
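A minimal numerical sketch (not from the slides), using SciPy's matrix exponential to evaluate P(t) for the two-state example above:

```python
import numpy as np
from scipy.linalg import expm

Q = np.array([[-0.5,  0.5],
              [ 2.0, -2.0]])
a = np.array([1.0, 0.0])        # initial distribution: start in state 1

for t in (0.1, 1.0, 10.0):
    P_t = a @ expm(Q * t)       # P(t) = a e^{Qt}
    print(t, P_t)               # approaches the stationary distribution
```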

Steady-State Analysis
The objective is to compute π, by solving the following system of equations:
$\pi Q = 0, \quad \sum_{s \in S} \pi_s = 1$
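A standard way to solve this system numerically is to replace one (redundant) balance equation with the normalization constraint; a sketch, again for the two-state example:

```python
import numpy as np

Q = np.array([[-0.5,  0.5],
              [ 2.0, -2.0]])
n = Q.shape[0]

A = Q.T.copy()                  # rows of A are the balance equations pi Q = 0
A[-1, :] = 1.0                  # replace one equation with sum(pi) = 1
b = np.zeros(n); b[-1] = 1.0
pi = np.linalg.solve(A, b)
print(pi)                       # [0.8 0.2] for this example
```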

Mean Time To Absorption
We can calculate the MTTA as
$\text{MTTA} = \sum_{s \in B} z_s$
where A is the set of absorbing states, B is the set of non-absorbing states, A ∪ B = S, and
$z_s = \int_0^\infty p_s(t)\, dt$
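Rather than integrating each p_s(t) directly, the z_s can be obtained from one linear solve: restricting the generator to the non-absorbing states B gives z = a_B(-Q_BB)^{-1}, a standard fundamental-matrix identity. A sketch (the single-component numbers are illustrative assumptions):

```python
import numpy as np

def mtta(Q_BB, a_B):
    # MTTA from the generator restricted to the non-absorbing states B and
    # the initial distribution over B; z solves z (-Q_BB) = a_B.
    z = np.linalg.solve(-Q_BB.T, a_B)
    return z.sum()

# Example: a component failing at rate lam with no repair; B = {up}.
lam = 0.001
print(mtta(np.array([[-lam]]), np.array([1.0])))   # 1000.0 = 1/lam
```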

EXAMPLES OF MARKOV CHAIN MODELS FOR RELIABILITY, AVAILABILITY AND MAINTAINABILITY

Two-State Component
Consider a single component that fails at a constant hazard rate λ. Assume that repairs occur at a rate μ.
- What is the probability that the component will be working in 1000 hours?
- What is the long-run availability of the component?
- What is the expected time to the first failure?

Two-State Component
State space: S = {up, down}
Transition diagram: up → down at rate λ; down → up at rate μ
Initial state: up

Two-State Component
The probability of being in each state at time t is:
$p_{up}(t) = \frac{\mu}{\lambda + \mu} + \frac{\lambda}{\lambda + \mu} e^{-(\lambda + \mu)t}$
$p_{down}(t) = \frac{\lambda}{\lambda + \mu} - \frac{\lambda}{\lambda + \mu} e^{-(\lambda + \mu)t}$
Therefore:
$\lim_{t \to \infty} p_{up}(t) = \pi_{up} = \frac{\mu}{\lambda + \mu}, \quad \lim_{t \to \infty} p_{down}(t) = \pi_{down} = \frac{\lambda}{\lambda + \mu}$
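As a numerical cross-check (the rates here are illustrative assumptions, not from the slides), the closed form for p_up(t) agrees with the matrix-exponential computation:

```python
import numpy as np
from scipy.linalg import expm

lam, mu = 0.001, 0.01                # assumed failure/repair rates, per hour
Q = np.array([[-lam,  lam],          # state order: (up, down)
              [  mu,  -mu]])

t = 1000.0
p_up = (np.array([1.0, 0.0]) @ expm(Q * t))[0]                # start in 'up'
closed = mu/(lam + mu) + lam/(lam + mu)*np.exp(-(lam + mu)*t)
print(p_up, closed)                  # both ~0.9091, near mu/(lam + mu)
```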

N Parallel Components with Shared Repairs
Consider a system with N identical components in parallel, each of which fails at a constant rate λ. Assume that repairs occur one at a time, at a rate μ per repair; there is only one repair resource.
- What is the long-run availability of the system?
- On average, how many components are operational?

N Parallel Components with Shared Repairs
State = number of working components
State space: S = {0, 1, 2, ..., N}
Transition diagram: a birth-death chain on the states N, N-1, ..., 1, 0; state n moves down to n-1 at rate nλ (failures) and up to n+1 at rate μ (the single repair resource).

N Parallel Components with Shared Repairs
Balance equations:
$N\lambda \pi_N = \mu \pi_{N-1}, \quad \ldots, \quad 2\lambda \pi_2 = \mu \pi_1, \quad \lambda \pi_1 = \mu \pi_0$
Normalization:
$\sum_{n=0}^{N} \pi_n = 1$
Then,
$\text{Availability} = 1 - \pi_0 = 1 - \left( \sum_{n=0}^{N} \frac{\mu^n}{n!\, \lambda^n} \right)^{-1}$

N Parallel Components with Shared Repairs
For example, for λ = 0.01 hr⁻¹, μ = 0.1 hr⁻¹, N = 5:
$\text{Availability} = 1 - \pi_0 = 1 - \left( \sum_{n=0}^{5} \frac{\mu^n}{n!\, \lambda^n} \right)^{-1} = 99.932\%$
$E[\text{Operational}] = \sum_{n=1}^{N} n \frac{\mu^n}{n!\, \lambda^n} \pi_0 = 4.36$
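These two numbers can be reproduced directly from the closed-form expressions; a short sketch:

```python
from math import factorial

lam, mu, N = 0.01, 0.1, 5
w = [(mu/lam)**n / factorial(n) for n in range(N + 1)]   # mu^n / (n! lam^n)
pi0 = 1.0 / sum(w)

availability = 1.0 - pi0
expected_up = sum(n * w[n] for n in range(N + 1)) * pi0
print(f"{availability:.3%}", round(expected_up, 2))      # 99.932% 4.36
```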

Workstation-Fileserver
Consider a system with 2 identical workstations and one fileserver, connected by a network. The system is operational as long as:
- At least 1 workstation is up, and
- The fileserver is up

Workstation-Fileserver
Assume exponentially distributed times to failure:
- λ_w: failure rate of a workstation
- λ_f: failure rate of the fileserver
Assume that components are repairable:
- μ_w: repair rate of a workstation
- μ_f: repair rate of the fileserver
Repairs are shared: the fileserver has priority for repair over the workstations.

Workstation-Fileserver
State = (number of operational workstations, number of operational fileservers)
S = {(2,1), (1,1), (0,1), (2,0), (1,0), (0,0)}

Workstation-Fileserver
Transition diagram: workstation failures move (2,f) → (1,f) at rate 2λ_w and (1,f) → (0,f) at rate λ_w; the fileserver fails, (w,1) → (w,0), at rate λ_f and is repaired, (w,0) → (w,1), at rate μ_f; workstation repairs, (0,1) → (1,1) and (1,1) → (2,1), occur at rate μ_w only while the fileserver is up (repair priority).

Workstation-Fileserver
Long-run average availability:
$\text{Availability} = \pi_{(2,1)} + \pi_{(1,1)}$
Example: λ_w = 0.0001 h⁻¹, λ_f = 0.00005 h⁻¹, μ_w = 1.0 h⁻¹, μ_f = 0.5 h⁻¹. Then Availability ≈ 0.9999.
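A sketch of the full six-state model (mine, not the talk's code), building Q from the transition rules above and solving for π:

```python
import numpy as np

lw, lf, mw, mf = 1e-4, 5e-5, 1.0, 0.5        # rates from the example
S = [(2,1), (1,1), (0,1), (2,0), (1,0), (0,0)]
idx = {s: i for i, s in enumerate(S)}

Q = np.zeros((6, 6))
def arc(s, s2, r):                            # add one transition rate
    Q[idx[s], idx[s2]] += r

for w in (2, 1, 0):
    arc((w,1), (w,0), lf)                     # fileserver fails
    arc((w,0), (w,1), mf)                     # fileserver repaired (priority)
for f in (1, 0):
    arc((2,f), (1,f), 2*lw)                   # one of two workstations fails
    arc((1,f), (0,f), lw)                     # last workstation fails
arc((1,1), (2,1), mw)                         # workstation repair only while
arc((0,1), (1,1), mw)                         #   the fileserver is up
np.fill_diagonal(Q, -Q.sum(axis=1))

A = Q.T.copy(); A[-1, :] = 1.0                # pi Q = 0 plus normalization
pi = np.linalg.solve(A, np.eye(6)[-1])
print(pi[idx[(2,1)]] + pi[idx[(1,1)]])        # availability, ~0.9999
```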

Workstation-Fileserver
Instantaneous availability:
$A(t) = p_{(2,1)}(t) + p_{(1,1)}(t)$
with λ_w = 0.0001 h⁻¹, λ_f = 0.00005 h⁻¹, μ_w = 1.0 h⁻¹, μ_f = 0.5 h⁻¹.

Workstation-Fileserver
To calculate the time until system failure, make the failure states (0,1), (2,0), and (1,0) absorbing. The remaining transitions are (2,1) → (1,1) at rate 2λ_w, (1,1) → (2,1) at rate μ_w, (1,1) → (0,1) at rate λ_w, and (w,1) → (w,0) at rate λ_f.

Workstation-Fileserver
Let λ_w = 0.0001 h⁻¹, λ_f = 0.00005 h⁻¹, μ_w = 1.0 h⁻¹, μ_f = 0.5 h⁻¹.
$\text{MTTF} = z_{(2,1)} + z_{(1,1)}$
We can compute z_(2,1) and z_(1,1) numerically, using MATLAB, as:
$z_{(2,1)} = \int_0^\infty p_{(2,1)}(t)\, dt, \quad z_{(1,1)} = \int_0^\infty p_{(1,1)}(t)\, dt$
Then MTTF = 19992 hours.
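The slide integrates p_s(t) in MATLAB; equivalently (a Python sketch using the fundamental-matrix identity from the MTTA slide), one linear solve over the two working states gives the same answer:

```python
import numpy as np

lw, lf, mw = 1e-4, 5e-5, 1.0                 # rates from the example

# Generator restricted to B = {(2,1), (1,1)}; all failure states absorbing.
Q_BB = np.array([[-(2*lw + lf),      2*lw      ],
                 [      mw,   -(mw + lw + lf)]])
a_B = np.array([1.0, 0.0])                   # start in (2,1)

z = np.linalg.solve(-Q_BB.T, a_B)            # z = a_B (-Q_BB)^{-1}
print(z.sum())                               # ~19992 hours, as on the slide
```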

Workstation-Fileserver
Suppose the workstations cannot be repaired. The absorbing-state diagram loses the μ_w repair arc: the transitions are (2,1) → (1,1) at rate 2λ_w, (1,1) → (0,1) at rate λ_w, and (w,1) → (w,0) at rate λ_f.

Workstation-Fileserver
Let λ_w = 0.0001 h⁻¹, λ_f = 0.00005 h⁻¹, μ_w = 1.0 h⁻¹, μ_f = 0.5 h⁻¹.
$\text{MTTF} = z_{(2,1)} + z_{(1,1)}$, with $z_{(2,1)} = \int_0^\infty p_{(2,1)}(t)\, dt$ and $z_{(1,1)} = \int_0^\infty p_{(1,1)}(t)\, dt$ computed numerically as before.
Then MTTF = 9993 hours.

EXTENSIONS AND MORE COMPLEX MODELS

Infinite Markov Chains
Suppose that your system consists of infinitely many states. Example: the state represents the number of components awaiting repair, from an infinite pool. Some infinite-state MCs are well understood:
- Traditional queues: M/M/1, M/M/c, etc.
- So-called quasi-birth-death processes
These kinds of models can be solved by:
- Analytical methods (for queues)
- Matrix-analytic methods (for quasi-birth-death processes)
- Fluid approximations to a continuous system

Phase-Type Distributions
Suppose that not all times are exponential. Example: repair times are nearly deterministic. PH-type distributions allow us to model the holding times at some states as other distributions. A PH-type distribution uses a small auxiliary MC to model a single transition. Some distributions that can be well approximated by PH-type distributions are:
- Erlang
- Deterministic
- Hyper- and hypo-exponential
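For instance, an Erlang-k time is a PH-type distribution that passes through k exponential phases in series; as k grows it concentrates around its mean, approximating a near-deterministic repair time. A sketch with assumed parameters (k and the mean are illustrative):

```python
import numpy as np
from scipy.linalg import expm

def erlang_ph(k, mean):
    # Erlang-k as a PH pair (alpha, S): k phases in series, each at rate
    # k/mean; the last phase exits. F(t) = 1 - alpha expm(S t) 1.
    rate = k / mean
    S = -rate*np.eye(k) + rate*np.eye(k, k=1)   # phase i -> i+1
    return np.eye(k)[0], S                      # start in phase 1

alpha, S = erlang_ph(k=10, mean=5.0)
for t in (4.0, 5.0, 6.0):
    cdf = 1.0 - alpha @ expm(S*t) @ np.ones(10)
    print(t, round(cdf, 4))                     # steep rise around t = 5
```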

Markov Decision Processes (MDPs)
Suppose you can make a decision at each transition. Example: choose which component gets repaired. MDPs add an action space A to the definition of a MC. Now the problem is not just to determine performance: we choose a policy (the action to take in each state), and the objective is to optimize a certain measure of performance, such as:
- Maximize time to failure
- Maximize average availability
- Minimize number of repairs
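A toy illustration (entirely hypothetical numbers, not from the talk): a component with states up/degraded/failed, a repair action with a per-period cost, and value iteration to find the policy maximizing expected discounted uptime:

```python
import numpy as np

# States: 0 = up, 1 = degraded, 2 = failed. Actions: 0 = do nothing, 1 = repair.
P = {0: np.array([[0.90, 0.08, 0.02],        # transitions if we do nothing
                  [0.00, 0.80, 0.20],
                  [0.00, 0.00, 1.00]]),
     1: np.array([[0.90, 0.08, 0.02],        # repair returns the component
                  [1.00, 0.00, 0.00],        #   to 'up' from degraded/failed
                  [1.00, 0.00, 0.00]])}
r = {0: np.array([1.0, 1.0, 0.0]),           # reward 1 per period while not failed
     1: np.array([0.7, 0.7, -0.3])}          # repair costs 0.3 per period

gamma, V = 0.95, np.zeros(3)
for _ in range(1000):                        # value iteration to convergence
    V = np.max([r[a] + gamma * P[a] @ V for a in (0, 1)], axis=0)
policy = np.argmax([r[a] + gamma * P[a] @ V for a in (0, 1)], axis=0)
print(policy)                                # repairs in the degraded/failed states
```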

COMPUTATIONAL TOOLS FOR MARKOV MODELS

Software for Markov Chains
- SMCSolver
- Butools
- SHARPE
- MKV
- jmarkov, available at www.jmarkov.org

jmarkov
- Object-oriented framework for Markov models
- Designed for modeling and solving
- Coded in Java; can be imported into other programs
- Consists of four modules:
  - Core module: general finite MCs
  - jqbd: highly structured infinite MCs
  - jphase: structured MCs with phase-type times
  - jmdp: Markov decision processes

jmarkov Core Module
- Builds the MC from simple rules
- Handles finite discrete- and continuous-time MCs
- Calculates π, P(t), and expected rewards

jqbd
- Only needs specification of the repeating structure
- Calculates expected rewards

jphase
- Allows specification of PH-type random variables
- Permits including PH-type variables in MC models
- Includes fitting capabilities for estimating PH parameters
- Generates PH-type random variates for simulations

jmdp
- Builds MDP models from simple rules
- Supports finite- and infinite-horizon, discrete- and continuous-time MDPs
- Calculates optimal deterministic policies for total, average, and discounted cost criteria
- Solvers calculate model parameters on the fly

INTERNATIONAL STANDARDS FOR MARKOV MODELS IN RAM

IEC Standards
The International Electrotechnical Commission (IEC) has two standards concerning Markov modeling techniques:
- IEC 61165: Application of Markov techniques
- IEC 61508: Functional safety of electrical/electronic/programmable electronic safety-related systems

IEC 61165
- Provides guidance on the application of Markov techniques to model and analyze a system and estimate reliability, availability, maintainability, and safety measures
- Gives an overview of available methods
- Reviews the relative merits of each method and their applicability

IEC 61508
- Part 1: General requirements
- Part 2: Requirements for electrical/electronic/programmable electronic safety-related systems
- Part 3: Software requirements
- Part 4: Definitions and abbreviations
- Part 5: Examples of methods for the determination of safety integrity levels
- Part 6: Guidelines on the application of IEC 61508-2 and -3
- Part 7: Overview of techniques and measures

SUMMARY & CONCLUSION

Conclusions
- Markov models are a powerful mathematical modeling technique.
- Markov chains can be used in many applications of reliability, availability, and maintainability.
- Markov models can provide exact analytical solutions to small problems.
- More elaborate Markov models can incorporate decision-making, non-exponential distributions, and other extensions.
- Computational tools, such as jmarkov, can be used to solve large-scale Markov models easily.

THANK YOU Contact: Daniel F. Silva www.jmarkov.org