Risk Elicitation in Complex Systems: Application to Spacecraft Re-entry Simon Wilson 1 Cristina De Persis 1 Irene Huertas 2 Guillermo Ortega 2 1 School of Computer Science and Statistics Trinity College Dublin 2 European Space Research and Technology Center ESA-Noordwijk 20th May 2016
The motivation
The motivation An estimated 5400 tonnes of material have survived re-entry from orbit over the last 40 years; No reported casualties; More than 50 debris objects recovered and documented.
The motivation Surviving fragments pose risk to people and property; Greatest risk is probably the regulatory effect on the industry if there were a fatality; Sophisticated deterministic models of re-entry exist, based on finite element approaches: No attempt to discuss uncertainties; Models fail in cases of a highly energetic break-up event; Number of re-entries, controlled and not controlled, is increasing: Seen as a way to control the space junk problem.
The general research question Implement and evaluate a statistical risk assessment model that can: Derive the probability for the top event (explosion); This will be based on a combination of expert opinion and (sparse) data; There is (and will only be) limited data; Diverse expert opinion (no one is an expert on everything); Access to experts is time-limited; May be large variations in conditions surrounding the event;
Stage 1: Model [Figure: fault tree for the top event "Explosion", with OR gates. Events shown: Chemical reaction propellant+air; Chemical reaction between hypergolic propellants; Burst of battery cells; Burst of pressure vessels; Slow release of propellant; Sudden release of propellant; Simultaneous release of hypergolic propellants; Valve leakage; Tank destruction; Pipe rupture; Exothermal chemical reactions; Overpressure; Short-circuit; Overcharge; Overdischarge; Corrosion.]
Probabilistic fault tree Build a fault tree of events that lead to failure; Assign a probability $\theta_{1j}$ to each elementary event $j = 1, \ldots, N$; Under the assumption of independence, these imply the probabilities of intermediate and top events, e.g. $\mathrm{Prob}(\text{Slow release of propellant}) = 1 - (1 - \mathrm{Prob}(\text{Valve leakage}))(1 - \mathrm{Prob}(\text{Tank destruction}))(1 - \mathrm{Prob}(\text{Pipe rupture}))$; Model parameterised by the elementary event probabilities $\theta_{1j}$ only.
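The gate arithmetic under independence can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the three probability values are hypothetical.

```python
from math import prod

def or_gate(probs):
    """P(at least one of several independent events occurs)."""
    return 1.0 - prod(1.0 - p for p in probs)

def and_gate(probs):
    """P(all of several independent events occur)."""
    return prod(probs)

# Hypothetical elementary event probabilities (illustrative values only).
theta = {"valve_leakage": 0.01, "tank_destruction": 0.02, "pipe_rupture": 0.05}

# Slow release of propellant is the OR of the three elementary events.
p_slow_release = or_gate(theta.values())
print(p_slow_release)
```

Intermediate and top events are obtained by composing these two functions along the tree, so the elementary probabilities are the only free parameters.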
Stage 2: Elicitation Need a prior on the probability $\theta_{1j}$ of each elementary event; Group these events by expert (or group of experts); We discuss everything with respect to nominal conditions; This is a time-consuming and difficult process, so: Ask experts to specify a probability distribution for one of the elementary events in their group.
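The beta distribution $p(\theta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, \theta^{\alpha - 1} (1 - \theta)^{\beta - 1}$, $0 \le \theta \le 1$; Elicit values of $\alpha$ and $\beta$. [Figure: beta density Prob(θ | opinions) plotted against θ on [0, 1].]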
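For readers who want to evaluate the density for candidate values of α and β during elicitation, here is a small sketch using only the Python standard library. The function name `beta_pdf` is hypothetical, and the sketch is restricted to the open interval (0, 1) to avoid taking the logarithm of zero.

```python
from math import exp, lgamma, log

def beta_pdf(theta, alpha, beta):
    """Beta density at theta, for 0 < theta < 1 and alpha, beta > 0."""
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    # Work on the log scale so large alpha and beta do not overflow Gamma.
    log_norm = lgamma(alpha + beta) - lgamma(alpha) - lgamma(beta)
    return exp(log_norm + (alpha - 1) * log(theta) + (beta - 1) * log(1 - theta))

density = beta_pdf(0.5, 2.0, 2.0)
print(density)
```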
Pairwise comparisons of event probabilities We use an idea from the analytic hierarchy process (AHP) to rank the $\theta_{1j}$; Experts are asked to specify, based on their knowledge and experience, whether the occurrence of one event is equally (=1), moderately more (=3), strongly more (=5), very strongly more (=7) or absolutely more (=9) probable than that of another; Can use AHP to map these comparisons to a weight $w_j$ for each event: Better than just using the raw comparison as a weight?
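The slides do not say how the weights are computed. A common choice in AHP is the principal eigenvector of the reciprocal comparison matrix, normalised to sum to one, and the sketch below assumes that choice. The function name, the power-iteration approach and the 3-event comparison matrix are all illustrative assumptions.

```python
def ahp_weights(matrix, iterations=100):
    """Approximate the normalised principal eigenvector by power iteration."""
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        product = [sum(matrix[i][k] * weights[k] for k in range(n)) for i in range(n)]
        total = sum(product)
        weights = [value / total for value in product]
    return weights

# Hypothetical comparisons: event 0 is moderately more probable (3) than event 1
# and strongly more probable (5) than event 2; event 1 is moderately more
# probable (3) than event 2. Entries below the diagonal are reciprocals.
comparisons = [
    [1.0, 3.0, 5.0],
    [1.0 / 3.0, 1.0, 3.0],
    [1.0 / 5.0, 1.0 / 3.0, 1.0],
]
w = ahp_weights(comparisons)
print(w)
```

The weights preserve the ranking implied by the comparisons while smoothing out inconsistencies between individual judgements, which is one argument for using them instead of the raw comparison values.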
Mapping AHP weights to prior distributions We have one beta prior, say for event $j^*$: $p(\theta_{1j^*})$; We have a weight $w_j$ for each event; $w_j > w_{j'}$ means event $j$ is more likely than event $j'$, so $\theta_{1j} >_{st} \theta_{1j'}$; We take a high prior probability interval for $\theta_{1j^*}$, say $(\theta_L, \theta_U)$ with $P(\theta_L < \theta_{1j^*} < \theta_U) = 0.95$; Create the equivalent interval for each $\theta_{1j}$: $\frac{w_j}{w_{j^*}}\, \theta_L < \theta_{1j} < \min\left\{ 1, \frac{w_j}{w_{j^*}}\, \theta_U \right\}$; Identify a beta distribution with these 95% probability limits.
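The interval rescaling step can be sketched as follows. The function and argument names are hypothetical, as are the numbers. Finding the beta parameters that match the resulting limits is a separate numerical root-finding step, which this sketch does not attempt.

```python
def scaled_interval(theta_l, theta_u, w_j, w_ref):
    """Rescale the elicited 95% interval of the reference event to event j.

    theta_l, theta_u: limits elicited for the reference event.
    w_j, w_ref: AHP weights of event j and of the reference event.
    """
    ratio = w_j / w_ref
    # The upper limit is capped at 1 because it bounds a probability.
    return ratio * theta_l, min(1.0, ratio * theta_u)

# Hypothetical numbers: reference interval (0.01, 0.10), event j has half the weight.
lower, upper = scaled_interval(0.01, 0.10, w_j=0.2, w_ref=0.4)
print(lower, upper)
```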
Stage 3: Prediction A prior for each elementary event is assessed; Logic of the fault tree gives P(top event), or the probability of any intermediate event, as a function of the $\theta_{1j}$; Prior distribution of P(top event) derived by simulation: Simulate sets of $\theta_{1j}$ from the prior; For each set, derive P(top event) from the fault tree logic.
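The simulation step can be illustrated with a toy tree in which the top event is a single OR gate over three elementary events. The beta parameters, the tree shape and all names below are assumptions made for illustration; a real tree would compose several gates.

```python
import random
from math import prod

def or_gate(probs):
    """P(at least one of several independent events occurs)."""
    return 1.0 - prod(1.0 - p for p in probs)

# Hypothetical beta priors (alpha, beta) for three elementary events.
priors = {
    "valve_leakage": (2.0, 50.0),
    "tank_destruction": (1.0, 40.0),
    "pipe_rupture": (3.0, 30.0),
}

def simulate_top_event(priors, n_samples, seed=0):
    """Draw elementary probabilities from the priors and push each draw through the tree."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_samples):
        theta = [rng.betavariate(a, b) for a, b in priors.values()]
        draws.append(or_gate(theta))
    return draws

samples = simulate_top_event(priors, 5000)
mean_top = sum(samples) / len(samples)
print(mean_top)
```

The list `samples` is a Monte Carlo approximation of the prior distribution of P(top event), so quantiles and histograms can be read directly from it.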
Stage 4: Updating with Data From a particular re-entry we record: Whether the top event occurred, did not occur or was unobserved; Similarly for any intermediate event; Similarly for any elementary event; Likelihood is then the probability of observing all of what we observed, in terms of the $\theta_{1j}$: $\text{Likelihood} = \prod_{\text{observed events}} \text{Likelihood of observation of event}$; Two cases: 1 Data are observed under nominal conditions; 2 Data are observed under other (non-nominal) conditions. Principle: we always do inference for the $\theta_{1j}$ under the nominal conditions.
Nominal Likelihood Fault tree gives a causal relationship between events; Likelihood of an event is conditional on its parent events; We have a two-stage procedure for determining the likelihood: 1 Work up the fault tree to the top event and logically deduce whether any unobserved events must have occurred or not: e.g. if an unobserved event is an OR, and one parent is observed to have occurred, then it must have occurred; 2 Work up the fault tree to the top event and evaluate the likelihood term for each observed node in the tree: e.g. if an observed event is an OR, and one parent is also observed to have occurred, then it does not contribute to the likelihood.
Nominal Likelihood Relationship to parent events. OR: At least one parent observed to have occurred? YES: Likelihood (1); NO: All parents observed and all did not occur? YES: Likelihood (2); NO: Likelihood (3). AND: All parents observed and all occurred? YES: Likelihood (1); NO: At least one parent observed not to have occurred? YES: Likelihood (2); NO: Likelihood (4). Likelihood (1): P(event occurs) = 1; Likelihood (2): P(event occurs) = 0.
Nominal Likelihood Likelihood (3), for an OR: $P(\text{event occurs}) = 1 - \prod_{j} \left( 1 - P(\text{event } j \text{ occurs}) \right)$; Likelihood (4), for an AND: $P(\text{event occurs}) = \prod_{j} P(\text{event } j \text{ occurs})$; In both, the product is over $j$: unobserved or not deduced parent.
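The decision rules for a single node can be sketched as a small function. This is one reading of the flowchart, not the authors' implementation: the encoding of parent states (`True`, `False`, or `None` for a parent that is unobserved and not deduced) and all names are assumptions.

```python
from math import prod

def node_probability(gate, parents):
    """P(event occurs) given what is known about its parents.

    parents: list of (state, prob) pairs. state is True (occurred),
    False (did not occur) or None (unobserved and not deduced);
    prob is the parent's probability and is used only when state is None.
    """
    if not parents:
        raise ValueError("a gate needs at least one parent")
    any_occurred = any(state is True for state, _ in parents)
    any_not_occurred = any(state is False for state, _ in parents)
    unknown = [prob for state, prob in parents if state is None]
    if gate == "OR":
        if any_occurred:
            return 1.0                                   # Likelihood (1)
        if not unknown:
            return 0.0                                   # Likelihood (2)
        return 1.0 - prod(1.0 - p for p in unknown)      # Likelihood (3)
    if gate == "AND":
        if any_not_occurred:
            return 0.0                                   # Likelihood (2)
        if not unknown:
            return 1.0                                   # Likelihood (1)
        return prod(unknown)                             # Likelihood (4)
    raise ValueError("gate must be 'OR' or 'AND'")

p_or = node_probability("OR", [(False, None), (None, 0.1), (None, 0.2)])
p_and = node_probability("AND", [(True, None), (None, 0.5), (None, 0.4)])
print(p_or, p_and)
```

If the node was observed to occur, its likelihood term is the returned probability; if it was observed not to occur, the term is one minus that value.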
Non-nominal Likelihood We map data from non-nominal cases to a likelihood under the nominal case; We elicit the relative risk of each elementary event in the case in question compared with the nominal case, and the $\theta_{1j}$ are weighted in the likelihood accordingly; Example: Expert weights the chance of elementary event $j$ as moderately more likely (=3) in the nominal case than in the case in question; Each $\theta_{1j}$ in the likelihood is replaced by $\theta_{1j}^{3}$; Intuition: seeing this happen once is like it happening 3 times under nominal conditions; This also permits a prediction of P(top event).
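The power transformation in the example can be written as a one-line function. The name `non_nominal_probability` and the argument `comparison` (the elicited AHP value by which the event is more likely under nominal conditions, 3 in the example) are hypothetical.

```python
def non_nominal_probability(theta_nominal, comparison):
    """Probability used in the likelihood for a non-nominal observation.

    theta_nominal: elementary event probability under nominal conditions.
    comparison: elicited exponent; values above 1 make the event less
    likely in the case in question than under nominal conditions.
    """
    return theta_nominal ** comparison

# With a nominal probability of 0.2 and the slide's example exponent of 3:
p_case = non_nominal_probability(0.2, 3)
print(p_case)
```

Because the transformed value is still a function of the nominal $\theta_{1j}$, non-nominal observations update the same parameters as nominal ones.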
Posterior Not a binomial likelihood, so the beta prior is not conjugate; We use importance sampling to generate samples from the posterior of the $\theta_{1j}$; For high-dimensional situations can use MCMC; Can obtain samples of the posterior of P(top event) from samples of the $\theta_{1j}$, as before.
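A minimal importance sampling sketch follows, assuming the prior is used as the proposal so that each weight is simply the likelihood of the draw. The slides do not state the proposal, so this is an assumption, as are the toy tree (one OR gate over two elementary events), the priors and the data (the top event occurred in one re-entry and did not occur in another).

```python
import random
from math import prod

# Hypothetical beta priors (alpha, beta) for two elementary events.
priors = [(1.0, 9.0), (1.0, 9.0)]

def top_probability(theta):
    """Top event is an OR of the elementary events."""
    return 1.0 - prod(1.0 - t for t in theta)

def likelihood(theta):
    """Top event occurred once and failed to occur once (toy data)."""
    p_top = top_probability(theta)
    return p_top * (1.0 - p_top)

def importance_sample(priors, n_samples=20000, seed=1):
    """Draw from the prior and weight each draw by its likelihood."""
    rng = random.Random(seed)
    draws = [[rng.betavariate(a, b) for a, b in priors] for _ in range(n_samples)]
    raw = [likelihood(theta) for theta in draws]
    total = sum(raw)
    return draws, [w / total for w in raw]

draws, weights = importance_sample(priors)
# Posterior mean of P(top event) as a weighted average over the prior draws.
posterior_mean_top = sum(w * top_probability(t) for w, t in zip(weights, draws))
print(posterior_mean_top)
```

Using the prior as the proposal is simple but becomes inefficient when the data are informative or the dimension is high, which is where the MCMC alternative mentioned on the slide would come in.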
The entire procedure [Figure: flowchart of the procedure. INITIALISE MODEL: Construct fault tree; Prior elicitation of elementary event probabilities in nominal case. FOR EACH OCCURRENCE: Next event under nominal conditions?; Elicit relative risk of elementary events; Update posterior distribution of elementary event probabilities; Prediction for Prob(top event); Observe data.]
Worked example fault tree
Worked example data AHP to get initial priors; Then data on 3 re-entries: 1 Nominal conditions, explosion not observed; 2 Nominal conditions, observe explosion, $A_3$ and $A_4$; 3 Non-nominal conditions, observe explosion and $B_2$. Relative risk for all $C_j$ is 1/5; Likelihoods are: $1 - \prod_j (1 - \theta_{1j})$, $\theta_{1,A_3} \left[ 1 - \prod_j (1 - \theta_{1,D_j}) \right]$, $1 - \prod_j \left( 1 - \theta_{1,C_j}^{5} \right)$.
Worked example prior and posteriors for probabilities of elementary events
Worked example prior and posterior for probability of top event
Conclusion A procedure for producing a risk probability in a reasonably complex system; Tackled the challenges of doing this when: Data are sparse; Many experts and/or too much expert elicitation would be needed for the prior; Access to experts' time is limited; Conditions around the event are important. Issues: All the usual ones to do with AHP (recall Fabrizio's tutorial); Sensitivity analysis; Fully probabilistic fault trees.