Statistical methods for decision making in mine action

Similar documents
Confidence Distribution

Space Assets for Demining Assistance (SADA) Feasibility Studies

Sequential Monitoring of Clinical Trials Session 4 - Bayesian Evaluation of Group Sequential Designs

Detection theory 101 ELEC-E5410 Signal Processing for Communications

REASONING UNDER UNCERTAINTY: CERTAINTY THEORY

Machine Learning

Machine Learning

STAT 499/962 Topics in Statistics Bayesian Inference and Decision Theory Jan 2018, Handout 01

Introduction to Artificial Intelligence. Prof. Inkyu Moon Dept. of Robotics Engineering, DGIST

Psychology 282 Lecture #4 Outline Inferences in SLR

Reasoning with Uncertainty

Introduction to Bayesian Statistics

CS540 Machine learning L9 Bayesian statistics

Artificial Intelligence

Bios 6649: Clinical Trials - Statistical Design and Monitoring

Decision theory. 1 We may also consider randomized decision rules, where δ maps observed data D to a probability distribution over

Inference for a Population Proportion

COMP 551 Applied Machine Learning Lecture 19: Bayesian Inference

Introduction to Statistical Inference

Model Averaging (Bayesian Learning)

Neutron inverse kinetics via Gaussian Processes

Uncertainty. Jayakrishnan Unnikrishnan. CSL June PhD Defense ECE Department

Uncertain Inference and Artificial Intelligence

Computational Cognitive Science

Novel spectrum sensing schemes for Cognitive Radio Networks

Exact Inference by Complete Enumeration

Quantifying uncertainty & Bayesian networks

Robust Monte Carlo Methods for Sequential Planning and Decision Making

Estimation of reliability parameters from Experimental data (Parte 2) Prof. Enrico Zio

Ensemble Methods. NLP ML Web! Fall 2013! Andrew Rosenberg! TA/Grader: David Guy Brizan

Bayesian Networks Basic and simple graphs

Detection theory. H 0 : x[n] = w[n]

Bayesian Concept Learning

Contents. Preface to Second Edition Preface to First Edition Abbreviations PART I PRINCIPLES OF STATISTICAL THINKING AND ANALYSIS 1

From statistics to data science. BAE 815 (Fall 2017) Dr. Zifei Liu

Bayesian Reasoning. Adapted from slides by Tim Finin and Marie desjardins.

Bayesian Learning (II)

Hypothesis Testing. Part I. James J. Heckman University of Chicago. Econ 312 This draft, April 20, 2006

A hierarchical fusion of expert opinion in the Transferable Belief Model (TBM) Minh Ha-Duong, CNRS, France

Bios 6649: Clinical Trials - Statistical Design and Monitoring

Computational Cognitive Science

Available online at ScienceDirect. Procedia Engineering 119 (2015 ) 13 18

Data Privacy in Biomedicine. Lecture 11b: Performance Measures for System Evaluation

Failure prognostics in a particle filtering framework Application to a PEMFC stack

Efficient Likelihood-Free Inference

Optimal Sensor Placement for Intruder Detection

Detection and Estimation Chapter 1. Hypothesis Testing

Announcements. Proposals graded

Introduction to Artificial Intelligence. Unit # 11

A.I. in health informatics lecture 2 clinical reasoning & probabilistic inference, I. kevin small & byron wallace

Belief Update in CLG Bayesian Networks With Lazy Propagation

Principles of Bayesian Inference

PMR Learning as Inference

Probabilistic Graphical Models

Bayesian Models in Machine Learning

Part 3 Robust Bayesian statistics & applications in reliability networks

Introduction: MLE, MAP, Bayesian reasoning (28/8/13)

Classical and Bayesian inference

Partial Operator Induction with Beta Distribution. Nil Geisweiller AGI-18 Prague

Data Mining Chapter 4: Data Analysis and Uncertainty Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University

Bayesian Analysis for Natural Language Processing Lecture 2

Principles of Bayesian Inference

Should all Machine Learning be Bayesian? Should all Bayesian models be non-parametric?

Lecture 3. STAT161/261 Introduction to Pattern Recognition and Machine Learning Spring 2018 Prof. Allie Fletcher

Machine Learning Linear Classification. Prof. Matteo Matteucci

Machine Learning Overview

ECE531 Lecture 2b: Bayesian Hypothesis Testing

Likelihood and Bayesian Inference for Proportions

The Bayes Theorema. Converting pre-diagnostic odds into post-diagnostic odds. Prof. Dr. F. Vanstapel, MD PhD Laboratoriumgeneeskunde UZ KULeuven

Probabilistic Reasoning. (Mostly using Bayesian Networks)

Course Introduction. Probabilistic Modelling and Reasoning. Relationships between courses. Dealing with Uncertainty. Chris Williams.

Defining posterior hypotheses in C2 systems

Bayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework

Learning from Data. Amos Storkey, School of Informatics. Semester 1. amos/lfd/

Statistical Multisource-Multitarget Information Fusion

Chap 4. Software Reliability

Statistical Techniques in Robotics (16-831, F12) Lecture#21 (Monday November 12) Gaussian Processes

Chapter 2. Binary and M-ary Hypothesis Testing 2.1 Introduction (Levy 2.1)

Grundlagen der Künstlichen Intelligenz

Choosing priors Class 15, Jeremy Orloff and Jonathan Bloom

An Empirical-Bayes Score for Discrete Bayesian Networks

Bayesian inference. Fredrik Ronquist and Peter Beerli. October 3, 2007

Machine Learning

Principles of Bayesian Inference

The Generalized Likelihood Uncertainty Estimation methodology

On the errors introduced by the naive Bayes independence assumption

Evidence with Uncertain Likelihoods

Bayesian Approaches Data Mining Selected Technique

Bayesian Inference. p(y)

Lecturer: Dr. Adote Anum, Dept. of Psychology Contact Information:

Bayesian Estimation of Input Output Tables for Russia

Introduction to Bayesian Inference

ORF 245 Fundamentals of Statistics Chapter 9 Hypothesis Testing

Related Concepts: Lecture 9 SEM, Statistical Modeling, AI, and Data Mining. I. Terminology of SEM

Detection Theory. Chapter 3. Statistical Decision Theory I. Isael Diaz Oct 26th 2010

Uncertainty. Outline

Probability Theory Review

STONY BROOK UNIVERSITY. CEAS Technical Report 829

CS-E3210 Machine Learning: Basic Principles

The Naïve Bayes Classifier. Machine Learning Fall 2017

Transcription:

Statistical methods for decision making in mine action Jan Larsen Intelligent Signal Processing Technical University of Denmark jl@imm.dtu.dk, www.imm.dtu.dk/~jl Jan Larsen 1

Why do we need statistical models? Mine action is influenced by many uncertain factors statistical modeling is the principled framework to handle uncertainty The use of statistical modeling enables empirical, consistent and robust decisions with associated risk estimates from acquired data and prior knowledge Pitfalls and misuse of statistical methods sometimes wrongly leads to the conclusion that they are of little practical use Jan Larsen 2

The elements of statistical decision theory Data Sensor Calibration Post clearance External factors Prior knowledge Physical knowledge Experience Environment Statistical models Loss function Inference: Decision Risk assessment Assign probabilties to hypotheses Jan Larsen 3

Bayes theorem posterior = likelihood prior probability of data Jan Larsen 4

What are the requirements for mine action risk Tolerable risk for individuals comparable to other natural risks As Facts high cost efficiency as possible requires detailed risk analysis e.g. some areas might better be fenced 99.6% than is not cleared an unrealistic requirement Need But for today s professional methods risk achieve analysis, at most management 90% and and control are hard involving to evaluate!!! all GICHD partners and (MAC, FFI are NGOs, commercial etc.) currently working on such methods [Håvard Bach, Ove Dullum NDRF SC2006] Jan Larsen 5

Outline Statistical modeling What are the requirements for mine detection? The design and evaluation of mine equipment Improving performance by statistical learning and information fusion The advantage of using combined method Jan Larsen 6

A simple inference model assigning probabilities to data The detection system provides the probability of detection a mine in a specific area: Prob(detect) The land area usage behavior For discussion pattern of provides the probability of encounter: assumptions Prob(mine and encounter) involved factors see Risk Assessment of Prob(casualty)=(1-Prob(detect)) Minefields * Prob in (mine HMA encounter) a Bayesian Approach PhD Thesis, IMM/DTU 2005 by Jan Vistisen Jan Larsen 7

A simple loss/risk model Minimize the number of casualties Under mild assumptions this equivalent to minimizing the probability of casualty Jan Larsen 8

Requirements on detection probability Prob(causality)=(1-Prob(detection))*Prob(encounter) Prob(detection)=1-Prob(causality)/Prob(encounter) Prob(encounter)= ρ*a ρ : homogeneous mine density (mines/m 2 ), a: yearly footprint area (m 2 ) Prob(causality)=10-5 per year Jan Larsen 9

Maximum yearly footprint area in m 2 P(detection) ρ : mine density (mines/km 2 ) 0.1 1 10 100 1000 0.996 25000 2500 250 25 2.5 0.9 1000 100 10 1 0.1 Reference: Bjarne Haugstad, FFI Jan Larsen 10

Outline Statistical modeling What are the requirements for mine detection? The design and evaluation of mine equipment Improving performance by statistical learning and information fusion The advantage of using combined method Jan Larsen 11

Optimizing the MA operation System design phase Evaluation phase Changing environment Operation phase Mine types, placement Soil and physical properties Unmodeled confounds Overfitting Insufficient coverage of data Unmodeled confounding factors Insufficient model fusion and selection Jan Larsen 12

Designing a mine clearance system Confounding parameters target operational environment Methods sensors, dog, mechanical Statistical learning is a principled framework for combining information and achieving optimal decisions Prior knowledge expert, informal Design and test data Jan Larsen 13

Evaluation and testing How do we assess the performance/detection probability? What is the confidence? Jan Larsen 14

Confusion matrix True Estimated yes no yes a b no c d Detection probability (sensitivity): a/(a+c) False alarm: b/(a+b) Jan Larsen 15

Receiver operations curve (ROC) detection probability % 100 0 0 100 false alarm % Jan Larsen 16

Inferring the detection probability N independent mine areas for evaluation y detections observed true detection probability θ N Py ( θ) ~ Binom( θ N) = θ θ y y N y Jan Larsen 17

Posterior probability via Bayes formula θ θ prior Py ( θ ) = ( ) p P y ( ) Py ( ) Jan Larsen 18

Prior probability of θ No prior Non-informative prior Informative prior p( θ) = Uniform( θ 0,1) p( θ) = Beta( θ α, β) Jan Larsen 19

Prior distribution mean=0.6 Jan Larsen 20

Posterior probability is also Beta P( y) Beta( y, n y) y+ α n y+ β θ = θ + α β + θ θ Jan Larsen 21

HPD credible sets the Bayesian confidence interval { } C 1-ε= θ: P( θ y) k( ε), P( C y) > 1 ε Jan Larsen 22

The required number of samples N We need to be confident about the estimated detection probability Prob( θ > 99.6%) = C 1 ε C 95% C 99% C 95% C 99% θ = 99.7% est 9303 18994 θ = 99.7% est 8317 18301 θ = 99.8% est 2285 3995 θ = 99.8% est 2147 3493 Uniform prior Informative prior α=0.9, β=0.6 Jan Larsen 23

Credible sets when detecting 100% Minimum number of samples N Prob( θ > 80%) Prob( θ > 99.6%) Prob( θ > 99.9%) C 95% 13 747 2994 C 99% 20 1148 4602 Jan Larsen 24

Outline Statistical modeling What are the requirements for mine detection? The design and evaluation of mine equipment Improving performance by statistical learning and information fusion The advantage of using combined method Jan Larsen 25

Improving performance by fusion of methods Methods (sensors, mechanical etc.) supplement each other by exploiting different aspect of physical environment Late integration Hierarchical integration Early integration Jan Larsen 26

Early integration sensor fusion Sensor 1 Sensor n Trainable sensor fusion Detection database Jan Larsen 27

Late integration decision fusion Sensor Signal processing Mechanical system Decision fusion Decision Jan Larsen 28

Suggestion Apply binary (mine/no mine) decision fusion to existing detection equipment Jan Larsen 29

Advantages Combination leads to a possible exponential increase in detection performance Combination leads to better robustness against changes in environmental conditions Jan Larsen 30

Challenges Need for certification procedure of equipment under well-specified conditions (ala ISO) Need for new procedures which estimate statistical dependences between existing methods Need for new procedures for statistically optimal combination Jan Larsen 31

Dependencies between methods Contingency tables Mine present yes Method j no Method i yes c11 c10 no c01 c00 Jan Larsen 32

Optimal combination Method 1 Method K 0/1 0/1 Combiner 0/1 Optimal combiner depends on contingency tables Jan Larsen 33

Jan Larsen 34 Informatics and Mathematical Modelling / Intelligent Signal Processing Optimal combiner 1 0 1 0 1 0 1 1 1 1 1 0 0 1 1 0 0 1 1 1 1 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 7 6 5 4 3 2 1 2 1 Combiner Method 1 2 2 1 K possible combiners OR rule is optimal for independent methods

OR rule is optimal for independent methods Method 1: 1 0 0 1 0 0 1 0 1 0 Method 2: 0 1 0 0 1 0 1 1 1 0 Combined: 1 1 0 1 1 0 1 1 1 0 P ( OR) = P( yˆ ˆy = 1 y = 1) d 1 2 = 1 Py ( ˆ = 0 yˆ = 0 y= 1) 1 2 = 1 Py ( ˆ = 0 y= 1) Py ( ˆ = 0 y= 1) 1 2 = 1 (1 P ) (1 P ) d1 d2 independence Jan Larsen 35

False alarm follows a similar rule P fa ( OR) = Py ( ˆ ˆy = 1 y= 0) 1 2 = 1 Py ( ˆ = 0 yˆ = 0 y= 0) 1 2 = 1 Py ( ˆ = 0 y= 0) Py ( ˆ = 0 y= 0) 1 2 = 1 (1 P ) (1 P ) fa1 fa2 Jan Larsen 36

Example p = 0.8, p = 0.1 p = 0.7, p = 0.1 d1 fa1 d2 fa2 p p d fa = 1 (1 0.8) (1 0.7) = 0.94 = 1 (1 0.1) (1 0.1) = 0.19 Exponential increase in detection rate Linear increase in false alarm rate Joint discussions with: Bjarne Haugstad Jan Larsen 37

Testing independence Fisher s exact test Method i Method j yes no yes c11 c10 no c01 c00 Hypothesis: Method i and j are independent Alternatives: Dependent or positively (negatively) correlated H: Py ( ˆ = 0, yˆ = 0) = Py ( ˆ = 0) Py ( ˆ = 0) i j i j A: Py ( ˆ = 0, yˆ = 0) > Py ( ˆ = 0) Py ( ˆ = 0) i j i j Jan Larsen 38

Artificial example N=23 mines Method 1: P(detection)=0.8, P(false alarm)=0.1 Method 2: P(detection)=0.7, P(false alarm)=0.1 Resolution: 64 cells How does number of mines and cells influence the analysis? How does detection and false alarm rate influence the possibility of gaining by combining methods? Jan Larsen 39

Resolution High Many cells provide possibility accurate spatial localization of mines Good estimation of false alarm rate Poor detection rate Low Increased possibility of reliably estimating P(no mines in area) Poor spatial localization Jan Larsen 40

Confusion matrix for method 1 True yes no Estimated yes 19 5 no 4 36 Jan Larsen 41

Confidence of estimated detection rate With N=23 mines 95%-credible intervals for detection rates are extremely large!!!! Method1 (flail): Method2 (MD): [64.5% 82.6% 93.8%] [50.4% 69.6% 84.8%] Jan Larsen 42

Confidence for false alarm rates Determined by deployed resolution Large resolution - many cells gives many possibilities to evaluate false alarm. In present case: 64-23=41 non-mine cells Method1 (flail): [4.9% 12.2% 24.0%] Jan Larsen 43

100 Detection rates 90 80 70 60 Flail Metal detector % 50 40 Combined 30 20 10 Flail : 82.6 Metal detector: 69.6 Combined: 91.3 0 2 4 6 1 3 5 7 combination number Jan Larsen 44

40 35 30 False alarm rates Flail : 12.2 Metal detector: 7.3 Combined: 17.1 25 % 20 15 Flail Combined 10 Metal detector 5 0 2 4 6 1 3 5 7 combination number Jan Larsen 45

Comparing methods Is the combined method better than any of the two orginal? Since methods are evaluated on same data a paired statistical McNemar with improved power is useful Method1 (flail): 82.6% < 91.3% Combined Method2 (MD): 69.6% < 91.3% Combined Jan Larsen 46

They keys to a successful mine clearance system Use statistical learning which combines all available information in an optimal way informal knowledge data from design test phase confounding parameters (environment, target, operational) Combine many different methods using statistical fusion MineHunt System and HOSA concepts have been presented at NDRF summer conferences (98,99,01) Jan Larsen 47

Outline Statistical modeling What are the requirements for mine detection? The design and evaluation of mine equipment Improving performance by statistical learning and information fusion The advantage of using combined method Jan Larsen 48

How do we proceed? NDRF has decided to take a leading role in initiating a project COLUMBINE to carry out suggested research Jan Larsen 49

Project proposal Combine existing techniques: mechanical flail, dogs, metal detector, ground penetrating radar The methods use very different physical properties of mines and environment, hence the error patterns are likely to be independent Combining a few 60-90% methods will reach the goal Cost of 60%-90% systems are lower and requires fewer samples to evaluate and certify reliably Full efficiency and economic advantages has to include quality and management aspects Jan Larsen 50

Project work packages Current technologies identification a number of available and techniques aim is to clarify how information about the methods and their operation can be extracted and stored in an efficient way Physical properties of current technologies the aim is to get knowledge about how independent the methods are from a physical perspective suggest a list of promising method combination schemes under various environment conditions Controlled test of combined methods deployment of different methods and multiple runs on the same test lanes. objective is to clarify the degree of statistical dependence among methods under specific mine objects, environments, and equipment conditions. Jan Larsen 51

Project work packages Procedures for the use of complementary methods development of a mathematical modeling framework for combination of methods practical procedures for deploying complementary methods modeling will be based on prior information and data from test sites the belief function framework is a principled way to incorporate prior knowledge about the environment, mine density, informal knowledge (such as interviews with local people) etc. prior information will be combined with test data using a statistical decision theoretic framework sensor-based methods offer information integration at various levels: early integration of sensor signals via the Dempster-Shafer belief framework to late statistical based integration of object detections from single sensors very heterogeneous methods such as e.g. dogs and metal detector can only be combined at the decision level Jan Larsen 52

Project work packages Validation of proposed procedures test and validated on test sites in close cooperation with end-users suggest practical procedures with optimal cost-benefit tradeoff - requires significant engagement of end-users needs and views Mine action information management system All information about individual methods, the procedures, prior knowledge, environment etc. will be integrated in an information management system aim of providing a Total Quality Management of the mine action Jan Larsen 53

Are today s methods good enough? some operators believe that we already have sufficient clearance efficiency no single method achieve more than 90% efficiency clearance efficiency is perceived to be higher since many mine suspected areas actually have very few mines or a very uneven mine density today s post clearance control requires an unrealistically high number of sample to get statistically reliable results Jan Larsen 54

Are combined methods not already the common practice? today s combined schemes are ad hoc practices with limited scientific support and qualification believe that the full advantage of combined methods and procedures has not yet been achieved Jan Larsen 55

Does the project require a lot of new development? No basic research or development is required start from today s best practice and increase knowledge about the optimal use of the existing toolbox Jan Larsen 56

Is it realistic to design optimal strategies under highly variable operational conditions? it is already very hard to adapt existing methods to work with constantly high and proven efficiency under variable operational conditions proposed combined framework sets lower demand on clearance efficiency of the individual method and hence less sensitivity to environmental changes the uncertainty about clearance efficiency will be much less important when combining methods overall system will have an improved robustness to changing operational conditions Jan Larsen 57

Conclusions Statistical decision theory and modeling is essential for optimally using prior information and empirical evidence It is very hard to assess the necessary high performance which is required to have a tolerable risk of casualty Combined method are promising to overcome current problems Jan Larsen 58