Reliability ad Availablity This set of otes is a combiatio of material from Prof. Doug Carmichael's otes for 13.21 ad Chapter 8 of Egieerig Statistics Hadbook. NIST/SEMATECH e-hadbook of Statistical Methods, http://www.itl.ist.gov/div898/hadbook/, 25. available free from: see: NIST/SEMATECH e-hadbook of Statistical Methods o CD Icludig ad improvig reliability of propulsio (ad other) systems is a challegig goal for system desigers. A approach has developed to tackle this challege: 1. a desig ad developmet philosophy 2. a test procedure for compoets ad total systems 3. a modellig procedure based o test results, field tests ad probability (statistics Desig ad developmet philosophy recogitio that reliability is a product is essetiall the abscece of failures or substadard performace of all critical systems i the desig, followed by a examiatio of the factors leadig to failure. Causes of failure: a. loadig: (iaccurate estimates of) thermal, mechaical or electriacl icludig vibratios b. stregth: (iaccurate estimates of) the load carryig capacity of the compoets c. eviromet: presece of dirt, high temperature, shock, corrosio, moisture, etc. d. huma factors: heavy haded operators ("sailor proof"), wrog decisios (operator error), crimial activities (sabatoge), poor desig, tools left i critical compoets, use of icorect replacemets e. quality cotrol: or lack thereof; loose cotrol of materials ad maufacture, lack of ispectio, loose specificatios f. accidet; act of God, freak accidets, collisios g. acts of war: terorism, war damage desiger should recogize these potetial causes for failure ad try to desig devices that will resist failure. Detailed Desig Features a. try to accout for all possible situatios i the desig stage ad elimiate possible failures. Deliverig maximumloads ad miimum stregths b. assume that every compoet ca fail, examie the outcome of the failure ad try to reduce the risk of damage. Failure Modes ad Effects Aalysis (FEMA) c. istitute strict quality cotrol i maufacture ad maiteace d. have cleaarly defied specificatios (icludig material specificatios ad methods of testig) e. develop techology to meet ew challeges. coduct developmet testig. f. cosider possible war damage ad ship collisio g. carry out developmet testig i arduous coditios System Desig Features a. calculate probability of failures. (reliability ad availability aalysis b. improve system desig by stadby or redudat systems c. aalyze failures, ote treds d. specify clearly all operatig procedures (good operatig mauals) e. require ispectio, maiteace ad replacemet procedures (tred aalysis) Failure testig ad aalysis from field or laboratory tests o compoets or systems determie umber of operatig uits as a fuctio of time (life): 12/13/25 1
set up N_surv a typical survival curve might look like this: 1 8 omial survival curve umber survivig 6 4 2 2 4 6 8 1 defie the failure rate at time t as proportio_failig_i_δt δn() t 1 λ δt Nt () δt 1 d Nt () Nt () dt time - as "rate" > cosistet with time 1 populatio declie uits are: 1 d Nt () dt Nt () λ to make some estimates based o this sample: 1 dn () t Nt () fail_rate t λ dt l Nt ( + δt ) ( Nt, δt ) : Nt () δt () Nt () t λτ dτ () N ( ) ( ) Nt N I exp ( ) t λτ dτ calculate for modest δt.1 ad t 1, 4, 6, 12 fail_rate ( 4,.1).1 fail_rate ( 1,.1).1 fail_rate ( 6,.1).1 fail_rate ( 12,.1).1 looks like λ failure rate is a costat, ot uusual d Nt () dt defie... N I N ( t ) 1 λ Nt () or... t set... () N I exp λ dτ or... N( t) Nt 1 d λ Nt () costat Nt () dt d dt Nt () itegrate from N() t t 1 λ Nt () to t l ( ) λ dτ N λ :.1 N I : 1 : N I exp( λ t ) 1 Nt () 5 5 1 t 12/13/25 2
N.B. failure rate is ot ecessarily the same as (but ca be related to) (i this case it is) the probability of failure see Egieerig Statistics Hadbook a actual failure rate curve might look like this: set up bath tub 2 omial failure rate 1 8 three regios are evidet: failure rate 1-1 early failure period ifat mortality rate 1-8 itrisic failure period aka stable failure period > itrisic failure rate > 8 wearout failure period - materials wear out ad degradatio failures occur at a ever 2 4 6 8 1 icreasig rate for most systems, the failure rate time is relatively costat except for wer i ad wear out. If the failure rate is costat, the compoet is said to have radom failure. Reliability (applies to a particular missio with a defied duratio.) defied as the probability of operatig without degraded performace durig a specific time period. At time t 1, the umber operatig is N(t 1 ) ad N I is the iitial umber. The reliability is: Nt 1 Nt 1 t 1 Nt t 1 1 ( ) dn () t ( ) ( ) () N I ( ) N I Nt Rt 1 sice... 1 λ dt l λ dt Rt 1 exp λ dt with λ costat ( ) exp ( λ t 1 ) Rt 1 ad expadig i a series... ( ) N I (λ t 1 ) 2 (λ t 1 ) 3 ( ) 1 +.. Rt 1 λ t 1 + ad if... λ*t1 << 1, Rt ( 1 ) 1 λ t 1 e.g. λ t 1 :.5 1 λ t 1.95 exp ( λ t 1 ).951 Mea Time Betwee (Operatioal Missio) Failure (MTB(OM)F with field testig,data is collected i the form of operatig time, failures ad repair time. Durig the field operatio of a compoet or a system, there is a total umber of operatig hours ad a total umber of failures. MTB(OM)F is defied accumulated_life MBT ( OM)F umber_of_failures 2! 3! For radom failures, the failure rate λ umber_of_failures accumulated_life 1 MBT ( OM) F if... t 1 ( ) 1 t 1 < 1 Rt 1 λ t 1 1 MBT ( OM)F MBT ( OM) F 12/13/25 3
Probability of Failure (Q or F) t 1 if... λ*t1 sice probability of success + failure 1 R + Q 1 Q 1 R 1 exp( λ t 1 ) λ t 1 MTBF << 1 ow cosider separate compoets C1 ad C2 havig R 1 ad R 2 ad Q 1 ad Q 2. the... (R 1 + Q 1 ) ( R 2 + Q 2 ) 1 (R 1 + Q 1 ) ( R 2 + Q 2 ) expad R 1 R 2 + R 1 Q 2 + Q 1 R 2 + Q 1 Q 2 R 1 R 2 probability_both_c1_ad_c2_operatig R 1 Q 2 probability_c1_operatig_ad_c2_failed R 2 Q 1 probability_c2_operatig_ad_c1_failed Q 1 Q 2 probability_c1_ad_c2_failed Series Systems If it is ecessary for all systems to operate, the this termed a series system ad is represeted as a circuit as: From above; the probability that both are operatig is... R series R 1 R 2 more geerally, R series R 1 R 2 R 3.. R exp (λ i t 1 ) R i e.g.... R 1 :.9 R 2 :.9 R 3 :.9 2 compoets R 1 R 2.81 R :.9 6 compoets R 6.531 Parallel Systems If there is redudacy, ad either C1 or C2 is required for operatio the this is a parallel scheme... R parallel R 1 R 2 + R 1 Q 2 + Q 1 R 2 1 Q 1 Q 2 geerally... R parallel 1 Q 1 Q 2 Q 3.. Q 1 Q i whe Qi Q R parallel 1 Q i e.g.... R 1 :.9 R 2 :.9 R 3 :.9 Q i :.1 R :.9 2 compoets 1 Q i 2.99 12/13/25 4
R out of N see Hadbook of Statistical Methods sectio 8.1.8.4.R out of N model If a system has compoets ad reqires ay r to be operatioal; assumig all compoets have thesame reliability Ri all compoets operate idepedet of oe aother (as far as failure is cocered) the system ca survive ay ( - r) compoets failig, but fails at the istat the - r - 1)th compoet fails System reliability is give by the probability of exactly r compoets survivig to time t + the probability of exactly (r + 1) compoets survivig to time t... up to all survivig. These are biomial probabilities: R s () i t ) R i i ( 1 R i r i r for example (where Ri are ot 4 r 2 i.e. four compoets of which two are ecessarily equal... required for operatio 2 compoets R 1 R 2 Q 3 Q 4 + R 1 R 3 Q 2 Q 4 + R 1 R 4 Q 2 Q 3 + R 2 R 3 Q 1 Q 4 + R 2 R 4 Q 1 Q 3 + R 3 R 4 Q 1 Q 2 3 compoets 4 compoets R 1 R 2 R 3 Q 4 + R 1 R 3 R 4 Q 2 + R 1 R 2 R 4 Q 3 + R 2 R 3 R 4 Q 1 R 1 R 2 R 3 R 4 sum all these for R s R s R 1 R 2 Q 3 Q 4 + R 1 R 3 Q 2 Q 4 + R 1 R 4 Q 2 Q 3 + R 2 R 3 Q 1 Q 4 + R 2 R 4 Q 1 Q 3 + R 3 R 4 Q 1 Q 2... + R 1 R 2 R 3 Q 4 + R 1 R 3 R 4 Q 2 + R 1 R 2 R 4 Q 3 + R 2 R 3 R 4 Q 1... + R 1 R 2 R 3 R 4 N.B. a series system is oe with r i.e. all compoets must operate. a parallel system is oe with r 1 Stadby Systems Stadby sceario will be more reliable tha parallel as see i Hadbook of Statistical Methods sectio 8.1.8.5.Stadby model Availability Availability is the probability that a compoet is operatioal, i.e. it is ot beig repaired MTTR mea_time_to_repair total_time_for_repairs umber_of_repairs For every failure there should be a repair, so that the average compoet is repaired for the average time after it has operated for the average time betwee failures. Average time betwee failures is MTBF ad for repair MTTR, so assumig compoet is either operatig or beig repaired... availability A operatig_time operatig_time + repair_time MTBF MTBF + MTTR if... MTBF > MTTR ( << ) which it should be... 12/13/25 5
MTBF A 1 MTTR 1 (1 + a) 1 1 a a < 1 ( << ) MTBF + MTTR MTBF 1 + a probability that it is beig repaired is... Q A MTTR Q A 1 A MTBF ad as above, availability for series systems would be.. A series A 1 A 2 A 3.. A A i ad parallel... A parallel 1 Q 1 Q 2 Q 3.. Q 1 Q i 12/13/25 6