FAULT TOLERANT SYSTEMS
http://www.ecs.umass.edu/ece/koren/FaultTolerantSystems
Part 4 - Analysis Methods
Chapter 2 - HW Fault Tolerance

Duplex Systems
- Both processors execute the same task
- If the outputs are in agreement, the result is assumed to be correct
- If the results are different, we cannot identify the failed processor
- Higher-level software has to decide how the failure is to be handled
- This can be done using one of several methods
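
Duplex Reliability
- Two active identical processors, each with reliability $R(t)$
- Lifetime of the duplex - time until both processors fail
- $C$ - Coverage Factor - probability that a faulty processor will be correctly diagnosed, identified and disconnected
- $R_{duplex}(t)$ - the reliability of the duplex system:
  $$R_{duplex}(t) = R_{comp}(t)\,\left[ R^2(t) + 2C\,R(t)\,(1 - R(t)) \right]$$
- $R_{comp}(t)$ - reliability of the comparator

Duplex - Constant Failure Rates
- Each processor has a constant failure rate $\lambda$
- Ideal comparator - $R_{comp}(t) = 1$
- Duplex reliability:
  $$R_{duplex}(t) = e^{-2\lambda t} + 2C\,e^{-\lambda t}\,(1 - e^{-\lambda t})$$
- Mean time to failure:
  $$MTTF_{duplex} = \frac{1}{2\lambda} + \frac{C}{\lambda}$$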
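
The MTTF expression is the integral of the duplex reliability over $[0, \infty)$. The following is a minimal sketch that evaluates both and compares them numerically. The function names, the integration horizon of $40/\lambda$, and the sample values $\lambda = 10^{-3}$ and $C = 0.95$ are illustrative assumptions, not taken from the slides.

```python
import math


def duplex_reliability(t, lam, coverage):
    """R_duplex(t) for constant failure rate lam and an ideal comparator."""
    r = math.exp(-lam * t)  # reliability of a single processor
    return r * r + 2.0 * coverage * r * (1.0 - r)


def duplex_mttf(lam, coverage):
    """Closed-form MTTF: 1/(2*lam) + coverage/lam."""
    return 1.0 / (2.0 * lam) + coverage / lam


def mttf_by_integration(lam, coverage, steps=200_000):
    """Trapezoidal integral of R_duplex over [0, 40/lam] (assumed horizon)."""
    horizon = 40.0 / lam
    h = horizon / steps
    total = 0.5 * (duplex_reliability(0.0, lam, coverage)
                   + duplex_reliability(horizon, lam, coverage))
    for i in range(1, steps):
        total += duplex_reliability(i * h, lam, coverage)
    return total * h


if __name__ == "__main__":
    lam, coverage = 1e-3, 0.95  # hypothetical values
    print(duplex_mttf(lam, coverage), mttf_by_integration(lam, coverage))
```

With perfect coverage ($C = 1$) the closed form gives $1.5/\lambda$, and with $C = 0$ it drops to $1/(2\lambda)$, the mean time to the first failure of either processor.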

Fault Detection: First Method - Acceptance Tests
- Acceptance Test - a range check of each processor's output
- Example - the pressure in a gas container must be in some known range
- We use semantic information about the task to predict which output values indicate an error
- How should the acceptance range be picked?

Acceptance Test - Sensitivity vs. Specificity
- Narrow acceptance range: high probability of identifying an incorrect output, but also a high probability that a correct output will be misidentified as erroneous (false positive)
- Wide acceptance range: low probability of both
- Sensitivity - the conditional probability that the test will recognize an erroneous output as such
- Specificity - the conditional probability that the output is erroneous if the test identified it as such
- Narrow range - high sensitivity but low specificity
- Wide range - low sensitivity but high specificity
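
The trade-off can be made concrete with a small calculation. The sketch below uses the definitions given above (sensitivity as the probability of flagging an erroneous output, specificity as the probability that a flagged output is really erroneous). Everything else is an assumption made for illustration: correct outputs are modeled as normally distributed around 100 with standard deviation 5, erroneous outputs as uniform on [0, 200], and 1% of outputs are erroneous.

```python
import math


def normal_cdf(x, mean, sigma):
    """Cumulative distribution function of Normal(mean, sigma)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))


def acceptance_test_metrics(half_width, p_error=0.01, mean=100.0, sigma=5.0,
                            lo=0.0, hi=200.0):
    """Return (sensitivity, specificity) for the range mean +/- half_width.

    Assumed model (hypothetical): correct outputs ~ Normal(mean, sigma),
    erroneous outputs ~ Uniform(lo, hi), with the range inside [lo, hi].
    """
    # Sensitivity: erroneous output falls outside the acceptance range.
    sensitivity = 1.0 - (2.0 * half_width) / (hi - lo)
    # False positive: correct output falls outside the acceptance range.
    accepted_if_correct = (normal_cdf(mean + half_width, mean, sigma)
                           - normal_cdf(mean - half_width, mean, sigma))
    false_positive = 1.0 - accepted_if_correct
    # Specificity as defined in the text: P(erroneous | flagged), by Bayes' rule.
    flagged = p_error * sensitivity + (1.0 - p_error) * false_positive
    specificity = p_error * sensitivity / flagged
    return sensitivity, specificity


if __name__ == "__main__":
    print("narrow:", acceptance_test_metrics(half_width=5.0))
    print("wide:  ", acceptance_test_metrics(half_width=20.0))
```

Under these assumptions the narrow range catches more erroneous outputs but most of its alarms are false, while the wide range misses more errors but its alarms are almost always genuine.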

Second Method - Hardware Testing
- Both processors are subjected to diagnostic tests
- The processor that fails the test is identified as faulty
- Real-life tests are never perfect
- Test Coverage - same as test sensitivity - the probability that the diagnostic test can identify a faulty processor as such
- Test Transparency - the complement of the test coverage - the probability that the test passes a faulty processor as good

Third Method - Forward Recovery
- Use a third processor to repeat the computation carried out by the duplex
- If only one of the three processors is faulty, the one that disagrees is the faulty one
- It is possible to use a combination of these methods
- Acceptance test - quickest to run, but often the least sensitive

Pair & Spare System
- Avoid disruption of operation upon a mismatch between the two modules in a duplex
- Disconnect the duplex and transfer the task to a spare pair
- Test offline, and if the fault is transient, mark the duplex as a good spare

Triplex-Duplex Architecture
- Form a triplex out of duplexes
- When the processors in a duplex disagree, both are switched out
- Allows simple identification of faulty processors
- The triplex can function even if only one duplex is left - the duplex allows fault detection
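
The Poisson Process - Assumptions
- Non-deterministic events of some kind occurring over time with the following probabilistic behavior
- For some constant $\lambda$ and a very short interval of length $\Delta t$:
  1. Probability of one event occurring during $\Delta t$ is $\lambda \Delta t$ plus a negligible term
  2. Probability of more than one event occurring during $\Delta t$ is negligible
  3. Probability of no events occurring during $\Delta t$ is $1 - \lambda \Delta t$ plus a negligible term

Poisson Process - Derivation
- $N(t)$ - number of events occurring during $[0, t]$
- For a given $t$, $N(t)$ is a random variable
- $P_k(t) = \mathrm{Prob}\{N(t) = k\}$ - probability of $k$ events occurring during a time period of length $t$ ($k = 0, 1, 2, \ldots$)
- Based on the previous assumptions:
  $$P_k(t + \Delta t) \approx P_{k-1}(t)\,\lambda \Delta t + P_k(t)\,(1 - \lambda \Delta t) \quad \text{for } k = 1, 2, \ldots$$
  and
  $$P_0(t + \Delta t) \approx P_0(t)\,(1 - \lambda \Delta t)$$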
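
Poisson Process - Differential Equations
- These result in the differential equations:
  $$\frac{dP_k(t)}{dt} = \lambda P_{k-1}(t) - \lambda P_k(t) \quad (k \ge 1) \qquad \text{and} \qquad \frac{dP_0(t)}{dt} = -\lambda P_0(t)$$
- With the initial conditions $P_k(0) = 0$ for $k \ge 1$ and $P_0(0) = 1$
- The solution for $k = 0, 1, 2, \ldots$ is
  $$P_k(t) = \mathrm{Prob}\{N(t) = k\} = e^{-\lambda t}\,\frac{(\lambda t)^k}{k!}$$
- For a given $t$, $N(t)$ has a Poisson distribution with the parameter $\lambda t$
- For all values of $t$, $N(t)$ is a Poisson process with rate $\lambda$

Poisson Process - Properties
For a Poisson process with rate $\lambda$:
- Expected number of events in an interval of length $t$ is $\lambda t$
- Length of time between consecutive events has an exponential distribution with parameter $\lambda$ and mean $1/\lambda$
- Numbers of events in disjoint intervals are statistically independent
- Sum of two Poisson processes with parameters $\lambda_1$ and $\lambda_2$ is a Poisson process with parameter $\lambda_1 + \lambda_2$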
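
As a sanity check on the Poisson solution $P_k(t) = e^{-\lambda t}(\lambda t)^k / k!$, the sketch below evaluates it and compares a finite-difference estimate of $dP_k/dt$ with the right-hand side of the differential equations. The function names, the step size `h`, and the sample values are illustrative assumptions.

```python
import math


def poisson_pmf(k, lam, t):
    """P_k(t): probability of exactly k events in an interval of length t."""
    return math.exp(-lam * t) * (lam * t) ** k / math.factorial(k)


def ode_sides(k, lam, t, h=1e-6):
    """Return (numerical dP_k/dt, right-hand side of the Poisson ODE)."""
    lhs = (poisson_pmf(k, lam, t + h) - poisson_pmf(k, lam, t - h)) / (2.0 * h)
    rhs = -lam * poisson_pmf(k, lam, t)
    if k >= 1:
        rhs += lam * poisson_pmf(k - 1, lam, t)
    return lhs, rhs


if __name__ == "__main__":
    lam, t = 2.0, 3.0  # hypothetical rate and interval length
    total = sum(poisson_pmf(k, lam, t) for k in range(100))
    mean = sum(k * poisson_pmf(k, lam, t) for k in range(100))
    print("sum of probabilities:", total)
    print("expected number of events:", mean, "vs lam*t =", lam * t)
    print("ODE check for k=3:", ode_sides(3, lam, t))
```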
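
Example of a Poisson Process - Duplex with Redundancy
- Two active processors plus an unlimited number of inactive spares
- Induction process is instantaneous, spares are always functional
- Each processor has a constant failure rate $\lambda$
- Lifetime of a processor - exponential distribution with parameter $\lambda$
- Time between two consecutive failures of the same logical processor - exponentially distributed with parameter $\lambda$
- $N(t)$ - number of failures in one logical processor during $[0, t]$
- $M(t)$ - number of failures in the duplex system during $[0, t]$

Duplex with Redundancy - Reliability Calculation
- Duplex has two processors - failure rate is $2\lambda$
- Comparator failure rate - negligible
- Probability of $k$ failures in the duplex in $[0, t]$:
  $$\mathrm{Prob}\{M(t) = k\} = e^{-2\lambda t}\,\frac{(2\lambda t)^k}{k!} \quad \text{for } k = 0, 1, 2, \ldots$$
- For the duplex not to fail, each of these failures must be detected and successfully replaced - probability $C$
- For $k$ failures - probability $C^k$
  $$R_{duplex}(t) = \sum_{k=0}^{\infty} \mathrm{Prob}\{k \text{ failures}\}\,C^k = \sum_{k=0}^{\infty} e^{-2\lambda t}\,\frac{(2\lambda t)^k C^k}{k!} = e^{-2\lambda t} \sum_{k=0}^{\infty} \frac{(2\lambda t C)^k}{k!} = e^{-2\lambda t}\,e^{2\lambda t C} = e^{-2\lambda (1 - C) t}$$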
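
The summation over the number of failures can be checked numerically against the closed form $e^{-2\lambda(1-C)t}$. This is a minimal sketch: the function names, the truncation at 200 terms, and the sample values are assumptions made for illustration.

```python
import math


def reliability_by_series(t, lam, coverage, terms=200):
    """Sum over k of Prob{k failures in the duplex} * coverage**k (truncated)."""
    x = 2.0 * lam * t          # expected number of failures in [0, t]
    term = math.exp(-x)        # k = 0 term
    total = 0.0
    for k in range(terms):
        total += term
        term *= x * coverage / (k + 1)  # next term: multiply by x*C/(k+1)
    return total


def reliability_closed_form(t, lam, coverage):
    """exp(-2*lam*(1 - coverage)*t)."""
    return math.exp(-2.0 * lam * (1.0 - coverage) * t)


if __name__ == "__main__":
    t, lam, coverage = 1000.0, 1e-3, 0.99  # hypothetical values
    print(reliability_by_series(t, lam, coverage))
    print(reliability_closed_form(t, lam, coverage))
```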

Duplex with Redundancy Reliability - Alternative Derivation
- Individual processors fail at rate $\lambda$
- Rate of failures in the duplex is $2\lambda$
- Each failure is successfully dealt with with probability $C$, and causes duplex failure with probability $1 - C$
- Failures that crash the duplex occur with rate $2\lambda(1 - C)$
- The reliability of the system is $e^{-2\lambda(1 - C)t}$

More Complex Systems
- NMR systems in which failing processors are identified and replaced from an infinite pool of spares - similar calculation to the duplex
- Finite set of spares - the summation in the reliability derivation is capped at that number of spares, rather than going to infinity
- Other variations of duplex systems:
  - One processor is active while the second is a standby spare
  - Processors can be repaired when they become faulty
- Combinatorial arguments may be insufficient for reliability calculation in more complex systems
- If failure rates are constant, we can use Markov Models for reliability calculations
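
Markov Chains - Introduction
- Markov Models provide a structured approach for the derivation of the reliability of complex systems
- A Markov Chain is a stochastic process $X(t)$ - an infinite sequence of random variables indexed by time $t$, with a special probabilistic structure
- For a stochastic process to be a Markov Chain, its future behavior must depend only on its present state, and not on any past state
- $X(t+s)$ depends on $X(t)$, but given $X(t)$, $X(t+s)$ does not depend on any $X(\tau)$ for $\tau < t$
- If $X(t) = i$, the chain is in state $i$ at time $t$
- We deal only with Markov Chains with continuous time ($t \ge 0$) and discrete state ($X(t) = 0, 1, 2, \ldots$)

Markov Chain - Probabilistic Interpretation
- $\mathrm{Prob}\{X(t+s) = j \mid X(t) = i, X(\tau) = k\} = \mathrm{Prob}\{X(t+s) = j \mid X(t) = i\}$ for $\tau < t$
- Once the chain moves into state $i$, it stays there for a length of time that has an exponential distribution with parameter $\lambda_i$ - it has a constant rate $\lambda_i$ of leaving state $i$
- The probability that, when leaving state $i$, the chain will move to state $j$ (with $j \ne i$) is $p_{ij}$, where $\sum_{j \ne i} p_{ij} = 1$
- Transition rate from state $i$ to state $j$ is $\lambda_{ij} = p_{ij}\,\lambda_i$, so that $\sum_{j \ne i} \lambda_{ij} = \lambda_i$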
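
State Probabilities
- $P_i(t)$ - probability that the process is in state $i$ at time $t$, given that it started at state $i_0$ at time 0
- For a given time instant $t$, state $i$ and a very small interval of time $\Delta t$, the chain can be in state $i$ at time $t + \Delta t$ in one of the following cases:
  - It was in state $i$ at time $t$ and has not moved during the interval $\Delta t$ - probability $P_i(t)\,(1 - \lambda_i \Delta t)$
  - It was at some state $j$ at time $t$ ($j \ne i$) and moved from $j$ to $i$ during $\Delta t$ - probability $P_j(t)\,\lambda_{ji} \Delta t$
- Probability of more than one transition is negligible if $\Delta t$ is small enough
- These assumptions result in
  $$P_i(t + \Delta t) \approx P_i(t)\,(1 - \lambda_i \Delta t) + \sum_{j \ne i} P_j(t)\,\lambda_{ji} \Delta t$$

Differential Equations for State Probabilities
  $$\frac{dP_i(t)}{dt} = -\lambda_i P_i(t) + \sum_{j \ne i} \lambda_{ji} P_j(t)$$
- Since $\lambda_i = \sum_{j \ne i} \lambda_{ij}$:
  $$\frac{dP_i(t)}{dt} = -\sum_{j \ne i} \lambda_{ij} P_i(t) + \sum_{j \ne i} \lambda_{ji} P_j(t)$$
- These equations (for $i = 0, 1, 2, \ldots$) can now be solved, using the initial conditions $P_{i_0}(0) = 1$ and $P_i(0) = 0$ for $i \ne i_0$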
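
When a closed-form solution is inconvenient, the state-probability equations can be integrated numerically. The sketch below is one possible approach, not the method used in the slides: it stores the transition rates $\lambda_{ij}$ in a matrix `rates[i][j]` (a naming assumption) and advances the probabilities with a classical fourth-order Runge-Kutta step.

```python
def state_derivatives(p, rates):
    """dP_i/dt = sum over j != i of (rates[j][i] * P_j - rates[i][j] * P_i)."""
    n = len(p)
    d = [0.0] * n
    for i in range(n):
        for j in range(n):
            if i != j:
                d[i] += rates[j][i] * p[j] - rates[i][j] * p[i]
    return d


def solve_state_probabilities(rates, p0, t_end, steps=2000):
    """Integrate the state equations from 0 to t_end with fixed-step RK4."""
    h = t_end / steps
    p = list(p0)
    for _ in range(steps):
        k1 = state_derivatives(p, rates)
        k2 = state_derivatives([x + 0.5 * h * k for x, k in zip(p, k1)], rates)
        k3 = state_derivatives([x + 0.5 * h * k for x, k in zip(p, k2)], rates)
        k4 = state_derivatives([x + h * k for x, k in zip(p, k3)], rates)
        p = [x + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return p


if __name__ == "__main__":
    # Hypothetical two-state chain: state 1 (working) fails into state 0
    # at rate 0.5, so P_1(t) should follow exp(-0.5 * t).
    rates = [[0.0, 0.0],
             [0.5, 0.0]]
    print(solve_state_probabilities(rates, [0.0, 1.0], t_end=2.0))
```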
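
Markov Chain for a Duplex with a Standby
- Example: one active processor and one standby spare, which is connected when the active unit fails
- Constant failure rate $\lambda$ of an active processor
- $C$ - coverage factor - probability that a failure of the active processor is correctly detected and the spare processor is successfully connected
- The Markov chain:
  [Figure: Markov chain with states 2, 1 and 0; transitions 2 to 1 at rate $\lambda C$, 2 to 0 at rate $\lambda(1 - C)$, and 1 to 0 at rate $\lambda$]

Differential Equations for Duplex with Standby
- General form:
  $$\frac{dP_i(t)}{dt} = -\sum_{j \ne i} \lambda_{ij} P_i(t) + \sum_{j \ne i} \lambda_{ji} P_j(t)$$
- For the duplex with a standby:
  $$\frac{dP_2(t)}{dt} = -\lambda C\,P_2(t) - \lambda(1 - C)\,P_2(t) = -\lambda P_2(t)$$
  $$\frac{dP_1(t)}{dt} = \lambda C\,P_2(t) - \lambda P_1(t)$$
  $$\frac{dP_0(t)}{dt} = \lambda P_1(t) + \lambda(1 - C)\,P_2(t)$$
- Initial conditions: $P_2(0) = 1$, $P_1(0) = P_0(0) = 0$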
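
The three equations for the duplex with a standby can be integrated numerically and compared with the closed-form reliability $e^{-\lambda t}(1 + C\lambda t)$ obtained by solving them analytically. This is a minimal sketch with a fixed-step Runge-Kutta integrator. The function names, step count, and sample values of $\lambda$ and $C$ are assumptions.

```python
import math


def standby_derivatives(p, lam, coverage):
    """Right-hand sides for states (2, 1, 0) of the duplex with a standby."""
    p2, p1, _p0 = p
    return [
        -lam * p2,
        lam * coverage * p2 - lam * p1,
        lam * (1.0 - coverage) * p2 + lam * p1,
    ]


def integrate_standby(lam, coverage, t_end, steps=10_000):
    """RK4 integration starting from P_2(0) = 1; returns [P_2, P_1, P_0]."""
    h = t_end / steps
    p = [1.0, 0.0, 0.0]
    for _ in range(steps):
        k1 = standby_derivatives(p, lam, coverage)
        k2 = standby_derivatives([x + 0.5 * h * k for x, k in zip(p, k1)], lam, coverage)
        k3 = standby_derivatives([x + 0.5 * h * k for x, k in zip(p, k2)], lam, coverage)
        k4 = standby_derivatives([x + h * k for x, k in zip(p, k3)], lam, coverage)
        p = [x + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return p


def standby_reliability_closed_form(t, lam, coverage):
    """R_system(t) = exp(-lam*t) * (1 + coverage*lam*t)."""
    return math.exp(-lam * t) * (1.0 + coverage * lam * t)


if __name__ == "__main__":
    lam, coverage, t = 1e-3, 0.9, 1000.0  # hypothetical values
    p2, p1, p0 = integrate_standby(lam, coverage, t)
    print("numerical R:", p2 + p1)
    print("closed form:", standby_reliability_closed_form(t, lam, coverage))
```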
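
Reliability of Duplex with Standby
- Solution of the differential equations:
  $$P_2(t) = e^{-\lambda t} \qquad P_1(t) = C \lambda t\,e^{-\lambda t} \qquad P_0(t) = 1 - P_1(t) - P_2(t)$$
  $$R_{system}(t) = 1 - P_0(t) = P_2(t) + P_1(t) = e^{-\lambda t} + C \lambda t\,e^{-\lambda t}$$
- Exercise - derive this expression using combinatorial arguments

Markov Chain for a Duplex with Repair
- Two active processors, each with failure rate $\lambda$ and repair rate $\mu$ (repair time is exponential with parameter $\mu$)
- The Markov model:
  [Figure: Markov chain with states 2, 1 and 0; failure transitions 2 to 1 at rate $2\lambda$ and 1 to 0 at rate $\lambda$; repair transitions 1 to 2 at rate $\mu$ and 0 to 1 at rate $2\mu$]
- The differential equations, from the general form $\frac{dP_i(t)}{dt} = -\sum_{j \ne i} \lambda_{ij} P_i(t) + \sum_{j \ne i} \lambda_{ji} P_j(t)$:
  $$\frac{dP_2(t)}{dt} = -2\lambda P_2(t) + \mu P_1(t)$$
  $$\frac{dP_1(t)}{dt} = 2\lambda P_2(t) + 2\mu P_0(t) - (\lambda + \mu) P_1(t)$$
  $$\frac{dP_0(t)}{dt} = \lambda P_1(t) - 2\mu P_0(t)$$
- Initial conditions: $P_2(0) = 1$, $P_1(0) = P_0(0) = 0$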
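
Duplex with Repair - State Probabilities
- The solution to the differential equations:
  $$P_2(t) = \frac{\mu^2}{(\lambda + \mu)^2} + \frac{2\lambda\mu}{(\lambda + \mu)^2}\,e^{-(\lambda + \mu)t} + \frac{\lambda^2}{(\lambda + \mu)^2}\,e^{-2(\lambda + \mu)t}$$
  $$P_1(t) = \frac{2\lambda\mu}{(\lambda + \mu)^2} + \frac{2\lambda(\lambda - \mu)}{(\lambda + \mu)^2}\,e^{-(\lambda + \mu)t} - \frac{2\lambda^2}{(\lambda + \mu)^2}\,e^{-2(\lambda + \mu)t}$$
  $$P_0(t) = 1 - P_1(t) - P_2(t)$$

Availability vs. Reliability
- In systems without repair, mainly the reliability measure is of significance
- With repair, availability is more meaningful than reliability
- Point Availability - $A_p(t) = \mathrm{Prob}\{\text{the system is operational at time } t\} = 1 - P_0(t)$
- Reliability - $R(t) = \mathrm{Prob}\{\text{the system is operational during } [0, t]\}$ - can be calculated by removing the transition from state 0 to state 1 and solving the resulting new differential equations; then $R(t) = 1 - P_0(t)$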
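
The difference between point availability and reliability can be illustrated numerically for the duplex with repair. The sketch below assumes the transition rates $2\lambda$ (state 2 to 1), $\lambda$ (1 to 0), $\mu$ (1 to 2) and $2\mu$ (0 to 1), which is to say that each processor fails and is repaired independently. Reliability is obtained by dropping the 0-to-1 repair transition so that state 0 becomes absorbing. Function names, step counts, and sample values are illustrative assumptions.

```python
import math


def rk4(f, p, t_end, steps):
    """Fixed-step fourth-order Runge-Kutta integration of dp/dt = f(p)."""
    h = t_end / steps
    for _ in range(steps):
        k1 = f(p)
        k2 = f([x + 0.5 * h * k for x, k in zip(p, k1)])
        k3 = f([x + 0.5 * h * k for x, k in zip(p, k2)])
        k4 = f([x + h * k for x, k in zip(p, k3)])
        p = [x + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return p


def repair_derivatives(p, lam, mu, allow_recovery):
    """States (2, 1, 0); allow_recovery=False removes the 0 -> 1 transition."""
    p2, p1, p0 = p
    back = 2.0 * mu * p0 if allow_recovery else 0.0
    return [
        -2.0 * lam * p2 + mu * p1,
        2.0 * lam * p2 - (lam + mu) * p1 + back,
        lam * p1 - back,
    ]


def point_availability(lam, mu, t, steps=20_000):
    """A_p(t) = 1 - P_0(t) for the chain with repair from state 0."""
    p = rk4(lambda q: repair_derivatives(q, lam, mu, True), [1.0, 0.0, 0.0], t, steps)
    return 1.0 - p[2]


def reliability(lam, mu, t, steps=20_000):
    """R(t) = 1 - P_0(t) for the chain with state 0 made absorbing."""
    p = rk4(lambda q: repair_derivatives(q, lam, mu, False), [1.0, 0.0, 0.0], t, steps)
    return 1.0 - p[2]


def point_availability_closed_form(lam, mu, t):
    """1 - (lam/(lam+mu))**2 * (1 - exp(-(lam+mu)*t))**2, under the same assumed rates."""
    s = lam + mu
    return 1.0 - (lam / s) ** 2 * (1.0 - math.exp(-s * t)) ** 2


if __name__ == "__main__":
    lam, mu, t = 0.01, 1.0, 100.0  # hypothetical values
    print("A_p(t):", point_availability(lam, mu, t))
    print("R(t):  ", reliability(lam, mu, t))
```

Reliability can only decrease over time because state 0 is never left, whereas point availability settles at a steady value.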
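
Long-Run Availability
- We calculate $A$ - the long-run availability - the proportion of time in the long run that the system is operational
- We first calculate the steady-state probabilities $P_2(\infty)$, $P_1(\infty)$ and $P_0(\infty)$ (or $P_2$, $P_1$, $P_0$)
- These steady-state probabilities can be calculated by one of two methods:
  - letting $t$ approach $\infty$ in $P_i(t)$
  - setting $dP_i(t)/dt = 0$ ($i = 0, 1, 2$) and solving the linear equations for $P_i$, using the relationship $P_2 + P_1 + P_0 = 1$
- $A = 1 - P_0$

Duplex with Repair - Long-Run Availability
- Steady-state probabilities:
  $$P_2 = \frac{\mu^2}{(\lambda + \mu)^2} \qquad P_1 = \frac{2\lambda\mu}{(\lambda + \mu)^2} \qquad P_0 = \frac{\lambda^2}{(\lambda + \mu)^2}$$
- Long-run availability:
  $$A = P_2 + P_1 = 1 - P_0 = \frac{\mu(\mu + 2\lambda)}{(\lambda + \mu)^2} = 1 - \left(\frac{\lambda}{\lambda + \mu}\right)^2$$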
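
The second method (setting the derivatives to zero) can be sketched for the duplex with repair. The code assumes the transition rates $2\lambda$, $\lambda$, $\mu$ and $2\mu$ between states 2, 1 and 0, as for two independently failing and independently repaired processors. It solves the balance equations by substitution and checks the result against $A = 1 - (\lambda/(\lambda+\mu))^2$. Function names and sample values are hypothetical.

```python
def steady_state(lam, mu):
    """Solve dP_i/dt = 0 with P_2 + P_1 + P_0 = 1; returns (P_2, P_1, P_0).

    From dP_2/dt = 0:  P_1 = (2*lam/mu) * P_2
    From dP_0/dt = 0:  P_0 = (lam/(2*mu)) * P_1 = (lam/mu)**2 * P_2
    """
    r = lam / mu
    p2 = 1.0 / (1.0 + 2.0 * r + r * r)  # normalization
    p1 = 2.0 * r * p2
    p0 = r * r * p2
    return p2, p1, p0


def long_run_availability(lam, mu):
    """A = 1 - (lam / (lam + mu))**2."""
    return 1.0 - (lam / (lam + mu)) ** 2


if __name__ == "__main__":
    lam, mu = 0.01, 1.0  # hypothetical failure and repair rates
    p2, p1, p0 = steady_state(lam, mu)
    print("steady state:", p2, p1, p0)
    print("A from balance equations:", 1.0 - p0)
    print("A from closed form:      ", long_run_availability(lam, mu))
```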