Adaptive Fault Tolerance on ROS: A Component-Based Approach
Jean-Charles Fabre, Michael Lauer, Matthieu Amy
LAAS-CNRS, Ave du Colonel Roche, F-31400 Toulouse, France

Definitions
- Dependability: Ability to provide services that can defensibly be trusted within a time-period.
- Fault Tolerance (FT): Design and implementation of mechanisms to control errors (residual, random, systematic) by detecting them and ensuring transitions to a safe state.
- Resiliency: The persistence of dependability when facing changes.
- Adaptive Fault Tolerance (AFT): Design and implementation of Fault Tolerant Mechanisms (FTM) to ensure the dependability of the system at runtime when facing changes.

Problem statement and key concepts
Once the system is deployed, it faces changes. System designers cannot predict everything. Persistence of dependability requires the adaptation of safety mechanisms.
Key concepts of Adaptive Fault Tolerance
- Separation of concerns
- Design for adaptation
- Remote fine-grained updates

Overall process: FTM as a Lego system
- Change
- Safety analysis / FMECA
- Impact on safety mechanism
- Agile update of FTM
- Remote update
Remote update of the component graph
- Suspend execution
- Modification of the graph
- Re-activate
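
The three remote-update steps (suspend, modify the graph, re-activate) can be illustrated with a toy model. This is a minimal sketch, not the authors' implementation: the class `ComponentGraph` and the function `remote_update` are hypothetical names, and the component graph is reduced to a set of names plus a set of (source, destination) bindings.

```python
class ComponentGraph:
    """Toy component graph: named components plus directed bindings."""

    def __init__(self, components, bindings):
        self.components = set(components)
        self.bindings = set(bindings)  # (source, destination) pairs
        self.running = True

    def suspend(self):
        # Step 1: stop execution so the graph can be edited safely.
        self.running = False

    def replace(self, old, new):
        # Step 2: swap one component and rewire every binding that used it.
        if self.running:
            raise RuntimeError("suspend the graph before modifying it")
        self.components.remove(old)
        self.components.add(new)
        self.bindings = {
            (new if src == old else src, new if dst == old else dst)
            for src, dst in self.bindings
        }

    def reactivate(self):
        # Step 3: restart only if no binding points at a missing component.
        for src, dst in self.bindings:
            if src not in self.components or dst not in self.components:
                raise ValueError(f"dangling binding {src}->{dst}")
        self.running = True


def remote_update(graph, old, new):
    """Suspend, modify, re-activate (hypothetical helper)."""
    graph.suspend()
    try:
        graph.replace(old, new)
    finally:
        graph.reactivate()
    return graph


graph = ComponentGraph(
    {"before", "proceed", "after"},
    {("before", "proceed"), ("proceed", "after")},
)
remote_update(graph, "before", "tr_before")
print(sorted(graph.bindings))
```

Only the replaced component and its bindings change; the rest of the graph is untouched, which is the point of a fine-grained update.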
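
Assumptions and FTM Characteristics
TRANSITIONS
FTMs: LFR, LFR+TR, PBR, PBR+TR
Transition labels
- FT: between LFR and LFR+TR, and between PBR and PBR+TR
- A, R: between LFR and PBR, and between LFR+TR and PBR+TR
Triggers
- High rate of HW transient faults observed
- Non-deterministic SW application version
- Bandwidth drop below a given threshold
PBR = Primary-Backup Replication, LFR = Leader-Follower Replication, TR = Time Redundancy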
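
The transition diagram can be read as a small lookup table from (current FTM, trigger) to the next FTM. The sketch below is illustrative only. The slide text does not say which trigger drives which edge, so the mapping is an assumption: transient hardware faults add TR, a non-deterministic application moves from LFR to PBR, and a bandwidth drop moves from PBR to LFR. The trigger identifiers are hypothetical.

```python
# Assumed mapping; the slide lists the triggers but not the edges they fire.
TRANSITIONS = {
    ("PBR", "hw_transient_fault_rate_high"): "PBR+TR",
    ("LFR", "hw_transient_fault_rate_high"): "LFR+TR",
    ("LFR", "non_deterministic_application"): "PBR",
    ("LFR+TR", "non_deterministic_application"): "PBR+TR",
    ("PBR", "bandwidth_below_threshold"): "LFR",
    ("PBR+TR", "bandwidth_below_threshold"): "LFR+TR",
}


def next_ftm(current, trigger):
    """Return the FTM to switch to, or the current one if no edge applies."""
    return TRANSITIONS.get((current, trigger), current)


ftm = "PBR"
for trigger in ("hw_transient_fault_rate_high", "bandwidth_below_threshold"):
    ftm = next_ftm(ftm, trigger)
print(ftm)
```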
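
Componentization of FTM
- Component-based implementation
- Transitions between FTMs
- Design for adaptation of FTMs
- Change model
Diagram: the Client sends a request to the Server and receives a reply; the Server combines fault tolerant processing with the application service.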
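
Componentization of FTM
- Component-based implementation
- Transitions between FTMs
- Design for adaptation of FTMs
- Change model
Diagram: the Client sends a request to the Server and receives a reply; on the Server, the fault tolerant processing is split into before, proceed and after steps around the application service.
FTM components: protocol, syncbefore, proceed, syncafter, replylog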
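
The before/proceed/after decomposition can be sketched as a class whose three steps are separate, replaceable methods. This is a minimal illustration under assumptions: the class name `FaultTolerantServer`, the method names and the request identifier are hypothetical, and the reply log is modelled as a plain dictionary used to answer a repeated request without re-executing it.

```python
class FaultTolerantServer:
    """Wraps an application service with before / proceed / after steps."""

    def __init__(self, service):
        self.service = service
        self.reply_log = {}  # request id -> reply already sent
        self.trace = []      # records which steps ran, for illustration

    def sync_before(self, request_id, payload):
        # Hook for FTM-specific work done before the service runs.
        self.trace.append("before")

    def proceed(self, payload):
        # Calls the application service itself.
        self.trace.append("proceed")
        return self.service(payload)

    def sync_after(self, request_id, reply):
        # Hook for FTM-specific work done after the service runs.
        self.trace.append("after")

    def handle(self, request_id, payload):
        # Protocol step: a request seen before is answered from the log.
        if request_id in self.reply_log:
            return self.reply_log[request_id]
        self.sync_before(request_id, payload)
        reply = self.proceed(payload)
        self.sync_after(request_id, reply)
        self.reply_log[request_id] = reply
        return reply


server = FaultTolerantServer(lambda x: x * 2)
print(server.handle(1, 21), server.trace)
```

A different FTM would override only the hooks it needs, leaving `handle` and the application service unchanged.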

Design of FTM adaptation on ROS
Generic computation graph without FTM (boxes represent nodes)
- Nodes (2): Client, Server
- Topics (0)
- Services: clt2sv (client to server)

Design of FTM adaptation on ROS
Generic computation graph of FTM (boxes represent nodes)
- Nodes (5+2): Proxy, Protocol, Before, Proceed, After, plus Client and Server
- Topics (6): pxy2p, pxy2bf, bf2pd, pd2aft, aft2p, p2pxy
- Services: clt2pxy (client to proxy) and pd2sv (proceed to server)
Diagram labels: clt2sv, pxy2p, p2pxy, p2bf, bf2pd, pd2aft, aft2p, pd2sv (legend: Service, Topic)
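
To make the request path through the FTM nodes concrete, the following sketch replaces ROS with a tiny in-process publish/subscribe bus. It is not ROS code: `Bus`, `relay` and `build_ftm` are hypothetical helpers, each node is reduced to a relay between two topics, and the pd2sv service call is modelled as a plain function call inside the Proceed relay. Topic names follow the diagram labels.

```python
from collections import defaultdict


class Bus:
    """Minimal synchronous stand-in for topic-based communication."""

    def __init__(self):
        self.subscribers = defaultdict(list)
        self.log = []  # topics in the order messages were published

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        self.log.append(topic)
        for callback in self.subscribers[topic]:
            callback(message)


def relay(bus, source, target, transform=lambda message: message):
    """A node that listens on one topic and republishes on another."""
    bus.subscribe(source, lambda message: bus.publish(target, transform(message)))


def build_ftm(bus, server):
    relay(bus, "pxy2p", "p2bf")            # Protocol forwards to Before
    relay(bus, "p2bf", "bf2pd")            # Before forwards to Proceed
    relay(bus, "bf2pd", "pd2aft", server)  # Proceed calls the server
    relay(bus, "pd2aft", "aft2p")          # After reports to Protocol
    relay(bus, "aft2p", "p2pxy")           # Protocol answers the Proxy


bus = Bus()
build_ftm(bus, server=lambda x: x + 1)
replies = []
bus.subscribe("p2pxy", replies.append)  # the Proxy collects the reply
bus.publish("pxy2p", 41)
print(replies, bus.log)
```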

Implementing PBR on ROS
- CLIENT: Client, Recovery, Proxy
- PRIMARY (MASTER): Protocol, Before, Proceed, After, Server_M, CD_M
- BACK-UP (SLAVE): Protocol, Before, Proceed, After, Server_S, CD_S
Diagram labels: clt2pxy, pxy2p, p2bf, bf2pd, pd2aft, aft2p, p2pxy, cd2ec, recovery, getstate, pd2sv_m, setstate, pd2sv_s, aft2aft (legend: Service, Topic)

Implementing PBR on ROS
- CLIENT: Client, Proxy, Recovery
- BACK-UP (SLAVE): Protocol, Before, Proceed, After, Server_S, CD_S
Diagram labels: clt2pxy, recovery, pxy2p, p2pxy, p2bf, aft2aft, bf2pd, pd2sv_s, pd2aft, setstate, aft2p, cd2ec (legend: Service, Topic)
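
The roles of getstate, setstate and the crash detector in primary-backup replication can be sketched without ROS. This is a simplified illustration, not the authors' code: `CounterServer` and `PrimaryBackup` are hypothetical, the checkpoint is sent to the backup after every request, and crash detection is reduced to an explicit `on_crash_detected` call.

```python
class CounterServer:
    """Stateful toy service: accumulates the amounts it receives."""

    def __init__(self):
        self.total = 0

    def handle(self, amount):
        self.total += amount
        return self.total

    def get_state(self):
        return {"total": self.total}

    def set_state(self, state):
        self.total = state["total"]


class PrimaryBackup:
    """Primary serves requests; the backup only receives checkpoints."""

    def __init__(self, primary, backup):
        self.primary = primary
        self.backup = backup
        self.primary_alive = True

    def handle(self, request):
        if not self.primary_alive:
            # After recovery the backup serves requests itself.
            return self.backup.handle(request)
        reply = self.primary.handle(request)
        # Checkpoint: copy the primary state to the backup after processing.
        self.backup.set_state(self.primary.get_state())
        return reply

    def on_crash_detected(self):
        # Stands in for the crash detector notifying the recovery node.
        self.primary_alive = False


pbr = PrimaryBackup(CounterServer(), CounterServer())
pbr.handle(5)
pbr.handle(3)
pbr.on_crash_detected()
print(pbr.handle(2))
```

Because the backup holds the last checkpoint, the client sees a continuous service state across the crash.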

Implementing TR on ROS
- CLIENT: Client, Proxy
- MASTER: Protocol, TR Before, Proceed, After, Server_M
Diagram labels: clt2pxy, pxy2p, p2pxy, p2bf, aft2bf, aft2p, bf2pd, pd2aft, pd2sv_m, getstate_m, setstate_m (legend: Service, Topic)
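
Time redundancy can be sketched as running the same request twice from the same saved state and comparing the replies. The version below is an assumption-laden illustration: the function `time_redundant_call`, the third tie-breaking execution and the `Counter` class with injected reply corruption are all hypothetical and only meant to show why state capture and restore are needed between executions.

```python
class Counter:
    """Toy stateful service with optional injected transient faults."""

    def __init__(self, faulty_calls=()):
        self.total = 0
        self.calls = 0
        self.faulty_calls = set(faulty_calls)

    def handle(self, amount):
        self.calls += 1
        self.total += amount
        if self.calls in self.faulty_calls:
            return self.total + 1000  # simulated corrupted reply
        return self.total

    def get_state(self):
        return self.total

    def set_state(self, state):
        self.total = state


def time_redundant_call(server, request):
    snapshot = server.get_state()
    first = server.handle(request)
    server.set_state(snapshot)        # replay from the same state
    second = server.handle(request)
    if first == second:
        return second
    server.set_state(snapshot)        # disagreement: run a third time
    third = server.handle(request)
    if third in (first, second):
        return third
    server.set_state(snapshot)        # give up and leave the state untouched
    raise RuntimeError("no two executions agree")


server = Counter(faulty_calls={1})
print(time_redundant_call(server, 5), server.calls)
```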

Combining FTM on ROS
Generic composition graph of FTM
- Client, Proxy, Server
- FTM1: Protocol, Before, After
- FTM2: Protocol, Before, Proceed, After
Diagram labels: clt2sv, pxy2p, p2pxy, p2bf, bf2pd, pd2aft, aft2p, pd2sv (legend: Service, Topic)
Notes
- The Protocol node is a software rack of nodes (Before, Proceed, After): activation of service protocols.
- A Protocol node can substitute for a Proceed node.
- It can be viewed as a frontend of the server.
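
The idea that a Protocol node can take the place of a Proceed node amounts to nesting: the inner FTM is called where the outer FTM would otherwise call the server. A minimal sketch, with the hypothetical helper `make_ftm` and FTMs reduced to plain callables:

```python
def make_ftm(name, proceed, trace):
    """Build an FTM whose proceed step is any callable, server or FTM."""

    def ftm(request):
        trace.append(f"{name}.before")
        reply = proceed(request)  # the server, or another FTM's entry point
        trace.append(f"{name}.after")
        return reply

    return ftm


trace = []
server = lambda x: x + 1
ftm2 = make_ftm("FTM2", server, trace)  # inner FTM fronts the server
ftm1 = make_ftm("FTM1", ftm2, trace)    # outer FTM proceeds into FTM2
print(ftm1(1), trace)
```

From the outer FTM's point of view, the inner FTM behaves like a frontend of the server.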

Combining PBR+TR on ROS
- CLIENT: Client, Recovery, Proxy
- MASTER (PRIMARY): Protocol, Before, After, CD_M; nested TR: Protocol, TR Before, Proceed, After, Server_M
- BACK-UP (SLAVE): Protocol, Before, After, CD_S; nested TR: Protocol, TR Before, Proceed, After, Server_M
Diagram labels
- Outer graph: clt2pxy, pxy2p, p2bf, bf2pd_s, aft2p, p2pxy, /cd2ec, recovery
- TR on PRIMARY: p2bf, bf2pd, pd2aft, aft2p, getstate_m, setstate_m, pd2sv_m, aft2bf
- TR on SLAVE: p2bf, bf2pd, pd2aft, aft2p, getstate_s, setstate_s, pd2sv_s, aft2bf
- Between replicas: pd2aft_s, bf2pd_m, pd2aft_m, p2bf, aft2p, aft2aft, getstate_m, estestate_s

Case Study
Initialization
- Initialization time around 0.5 s
- Time due to the initialization of communications by the ROS Master
Execution
- Around 5 ms for the PBR and 2 ms for the TR
- Requests every 7 cm for a car driving at 50 km.h-1
Recovery
- Reactivation of 2 topics
- Recovery time around 1 ms
Adaptation & Composition
- Adaptation: initialization of new nodes
- Same order as initialization time (0.3 s)
Platform: Ubuntu Trusty 14.04, i5 dual core 2.5 GHz, 8 GB DDR3 RAM
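
One way to read the "7 cm" figure is as the distance the car covers during one 5 ms PBR execution at 50 km/h. That link is an interpretation rather than something the slide states; the arithmetic itself is straightforward, and the function name below is hypothetical.

```python
def distance_travelled_cm(speed_kmh, duration_ms):
    """Distance covered at a constant speed during a given duration."""
    metres_per_second = speed_kmh * 1000 / 3600
    return metres_per_second * (duration_ms / 1000) * 100


# 50 km/h is about 13.9 m/s, so 5 ms corresponds to roughly 6.9 cm.
print(round(distance_travelled_cm(50, 5), 1))
```

The same calculation gives under 3 cm for the 2 ms TR execution.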

ROS Master: A single point of failure
The ROS Master is requisite for:
- The control over the system
- The control over communication
- The control over the graph
- The control over the nodes
If the ROS Master crashes:
- Loss of the software architecture
- Nodes have to be reloaded
- The state of the system is reinitialized
- Critical loss in case of embedded systems
Solutions to assure the reliability of the ROS Master:
- Launching it on a distinct and reliable machine
- Checkpointing its state and restoring it

DMTCP: Checkpointing the ROS Master
DMTCP, how does it work:
- Works with Linux kernel 2.6.9 and later
- Transparent (no recompilation)
- Virtualization of process ID
Checkpointing with DMTCP:
- The process is launched along with the coordinator
- A checkpoint image is created for each process
- A restart script is created by the coordinator
DMTCP should be able to checkpoint the ROS Master
The loss of the ROS Master should no longer be a problem

Lessons learnt
Adaptive fault tolerance
- Separation of Concerns and Design for Adaptation: SC+D4A
- FTM isolation and componentization
Installation and adaptation of an FTM online
- Nodes can be started and stopped
- Mapping at initialization
Node management
- APIs are not provided by ROS for node management
- Signals and system calls fulfill the missing requirements
Implementing dynamic binding
- Natural dynamic binding is also not provided by ROS
- Topics and services are remapped at initialization
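
Both workarounds can be sketched in a few lines. The first function builds a `rosrun` command line that uses the ROS 1 remapping syntax (`old:=new`, plus the special `__name:=` key) so that topics are bound when the node starts. The other two use Unix signals to suspend and resume a node process. This is a sketch under assumptions, not the authors' tooling: the package, executable, node and topic names in the example are hypothetical, and nothing is launched here.

```python
import os
import signal


def rosrun_command(package, executable, node_name, remappings):
    """Build the argv that starts a node with its topics remapped."""
    argv = ["rosrun", package, executable, f"__name:={node_name}"]
    argv += [f"{old}:={new}" for old, new in sorted(remappings.items())]
    return argv


def suspend_node(pid):
    """Freeze a node process (Unix only)."""
    os.kill(pid, signal.SIGSTOP)


def resume_node(pid):
    """Let a previously frozen node process continue (Unix only)."""
    os.kill(pid, signal.SIGCONT)


# Hypothetical package and topic names, for illustration only.
print(rosrun_command("ftm", "before", "tr_before", {"p2bf": "p2bf_tr"}))
```

Passing the resulting list to `subprocess.Popen` would start the node and give access to the process id needed by `suspend_node` and `resume_node`.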

Summary of dynamic adaptation
- SC: ROS nodes, component mapping to nodes
- D4A: Componentized FT design patterns (Protocol-Before-Proceed-After)
- Nodes Mngmnt: Unix system calls and ROS commands
- Dynamic Binding: ROS services, ports, topics; additional logic to create ports and topics
- Master CKPT: Checkpoint of the ROS Master; the ROS Master is no longer a Single Point of Failure

Conclusion
Now
- Adaptive Fault Tolerance for Resilient Computing is possible on ROS
- Design and validation of FTMs is always carried out offline
- If the application can be terminated and re-launched: adaptation OK
- Dynamic adaptation: extended API for dynamic binding
- Consistency of reconfiguration?
Proceeding...
- Experiments on ADAS with Renault SAS
- Evolution of AUTOSAR into Adaptive AUTOSAR
- Experimentation on ROS Master with DMTCP