Introduction to Imprecise Probability and Imprecise Statistical Methods


Frank Coolen
UTOPIAE Training School, Strathclyde University, 22 November 2017

Probability

Classical, precise probability: $P(A)$ for each event $A$ of interest, satisfying Kolmogorov's axioms.
Very successful! Applied probability, stochastic processes, operations research, statistics, decision support, reliability, risk, et cetera.

Problems?
- Requires a very high level of precision and consistency of information.
- May be too restrictive to cope carefully with the multi-dimensional nature of uncertainty.
- The quality of the underlying knowledge and information cannot be adequately represented.

Figure: Edited volume "Introduction to Imprecise Probabilities", edited by Thomas Augustin, Frank P. A. Coolen, Gert de Cooman and Matthias C. M. Troffaes (Wiley, 2014).

Imprecise Probability

Lower probability $\underline{P}(A)$ and upper probability $\overline{P}(A)$, with $0 \le \underline{P}(A) \le \overline{P}(A) \le 1$.
If $\underline{P}(A) = \overline{P}(A) = P(A)$ for all events $A$: precise probability.
If $\underline{P}(A) = 0$ and $\overline{P}(A) = 1$: complete lack of knowledge about $A$.
For disjoint events $A$ and $B$: $\underline{P}(A \cup B) \ge \underline{P}(A) + \underline{P}(B)$ and $\overline{P}(A \cup B) \le \overline{P}(A) + \overline{P}(B)$.
$\underline{P}(\text{not-}A) = 1 - \overline{P}(A)$.

A closed convex set $\mathcal{P}$ of precise probability distributions gives
$$\underline{P}(A) = \inf_{p \in \mathcal{P}} p(A) \qquad \text{and} \qquad \overline{P}(A) = \sup_{p \in \mathcal{P}} p(A).$$
Subjective interpretation:
- $\underline{P}(A)$: maximum price at which buying the gamble paying 1 if $A$ occurs and 0 otherwise is desirable.
- $\overline{P}(A)$: minimum price at which selling this gamble is desirable.
$\underline{P}(A)$ can be interpreted as reflecting the evidence in favour of event $A$, and $1 - \overline{P}(A)$ as reflecting the evidence against $A$, hence in favour of not-$A$.
The imprecision $\Delta(A) = \overline{P}(A) - \underline{P}(A)$ reflects the lack of perfect information about (the probability of) $A$.
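To make the inf/sup characterisation concrete, here is a minimal numerical sketch (my own illustration, not from the slides), assuming a credal set described by finitely many probability vectors over three outcomes; since $p \mapsto p(A)$ is linear, the infimum and supremum over the convex hull are attained at these extreme points.

```python
import numpy as np

# Hypothetical credal set: extreme points of a convex set of probability
# vectors over three outcomes (illustration only).
credal_extremes = np.array([
    [0.2, 0.5, 0.3],
    [0.3, 0.4, 0.3],
    [0.1, 0.6, 0.3],
])
A = [0, 2]  # event A as a set of outcome indices

event_probs = credal_extremes[:, A].sum(axis=1)  # p(A) for each extreme point
lower_P = event_probs.min()   # \underline{P}(A) = inf over the credal set
upper_P = event_probs.max()   # \overline{P}(A)  = sup over the credal set

print(lower_P, upper_P, upper_P - lower_P)  # 0.4 0.6 0.2 (imprecision)
```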

"Imprecise probability" is an unfortunate misnomer, as lower and upper probabilities enable more accurate quantification of uncertainty than precise probability.
It brings together a variety of different theories (including Dempster-Shafer theory and interval probability).
Successful applications are starting to appear, e.g. in classification, reliability and risk analysis (e.g. $\underline{P}(A) = 0$ and $\overline{P}(A) > 0$ if $A$ has never yet been observed).
www.sipta.org

Using Imprecise Probabilities

$\mathcal{P}$ can be defined directly (e.g. as a set of (prior) distributions) or indirectly (e.g. via constraints from expert elicitation).
Inference using imprecise probabilities is mostly generalized Bayesian (a small sketch follows below):
- it can reflect prior-data conflict;
- it gives bounds on inferences as functions over all $p \in \mathcal{P}$;
- robustness; indecision? It may require further criteria to reach a final decision.
Exploring the wider opportunities is still at an early stage.
Nonparametric Predictive Inference: an ('objective') frequentist theory, www.npi-statistics.com
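As a hedged illustration of generalized Bayes with a set of priors (my own toy example; the lecture's own example follows in the next lecture), consider a Bernoulli parameter with conjugate priors $\text{Beta}(s t, s(1-t))$ for $t$ in an interval; updating every prior in the set and taking the infimum and supremum of the posterior predictive probability of success gives lower and upper probabilities.

```python
def imprecise_beta_bernoulli(k, n, s=2.0, t_low=0.2, t_high=0.8):
    """Generalized-Bayes sketch (assumed parametrisation, not from the
    slides): priors Beta(s*t, s*(1-t)) with t in [t_low, t_high].
    After observing k successes in n Bernoulli trials, the posterior
    predictive probability of a success is (s*t + k)/(s + n), which is
    monotone in t, so the bounds are attained at the interval endpoints."""
    lower = (s * t_low + k) / (s + n)
    upper = (s * t_high + k) / (s + n)
    return lower, upper

# Little data: wide interval; more data: the bounds converge.
print(imprecise_beta_bernoulli(k=1, n=3))     # (0.28, 0.52)
print(imprecise_beta_bernoulli(k=30, n=100))  # approx. (0.298, 0.310)
```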

Very many research challenges! Both on theory and on applications, ideally combined.
These include computational challenges (optimisation!) and, for example, challenges in simulation.
The main challenge: use imprecision to do things better or easier, preferably both!

Imprecise statistical methods

An example of generalized Bayes (a standard precise model with a set of prior distributions) follows in the next lecture.
I also strongly believe that imprecision, as a wider idea than imprecise probability, can be useful for new methods for dealing with uncertainty and information. For example, such methods may provide new perspectives on machine learning, or they may be quite ad hoc yet relatively simple and useful. Two examples follow.

SVR-SRGM

A robust weighted support vector regression-based software reliability growth model.
We have $n$ training data (examples) $\{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$, in which $x_i \in \mathbb{R}^m$ represents a feature vector involving $m$ features and $y_i \in \mathbb{R}$ is an output variable.
$f(x) = \langle a, \varphi(x) \rangle + b$, with parameters $a = (a_1, \ldots, a_m)$ and $b$; $\varphi(x)$ is a feature map $\mathbb{R}^m \to G$ such that the data points are mapped into an alternative higher-dimensional feature space $G$; $\langle \cdot, \cdot \rangle$ denotes the dot product.

$$R_{\mathrm{SVR}} = \min_{a,b} \max_{w \in \mathcal{P}} \left( C \sum_{i=1}^{n} w_i \, l(y_i, f(x_i)) + \frac{1}{2} \langle a, a \rangle \right)$$
Imprecision comes into this method through the use of a set $\mathcal{P}$ of weights $w = (w_1, \ldots, w_n)$ satisfying
$$w_i \le \frac{1}{(1-\varepsilon)n}, \qquad 1 \ge w_1 \ge w_2 \ge \cdots \ge w_n, \qquad w_1 + \cdots + w_n = 1.$$
We use $\varepsilon$ as a further parameter that we choose for optimal performance on the given data. This method almost always works better than the standard approach with fixed weights $1/n$ (a sketch of the inner maximisation follows below).
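For a fixed regression function the inner maximisation over the weight set is a linear program. The following sketch (my own illustration; the weight constraints are the reconstruction above, not code from the slides) computes the worst-case weighted loss for given per-example losses with scipy.

```python
import numpy as np
from scipy.optimize import linprog

def worst_case_weighted_loss(losses, eps):
    """Inner maximisation of the robust objective: maximise sum_i w_i * l_i
    over weights with sum(w) = 1, 0 <= w_i <= 1/((1-eps)*n) and
    w_1 >= w_2 >= ... >= w_n (assumed reconstruction of the weight set)."""
    losses = np.asarray(losses, dtype=float)
    n = len(losses)
    cap = 1.0 / ((1.0 - eps) * n)

    c = -losses                    # linprog minimises, so negate to maximise
    A_eq = np.ones((1, n))
    b_eq = np.array([1.0])

    # monotonicity constraints: w_{i+1} - w_i <= 0
    A_ub = np.zeros((n - 1, n))
    for i in range(n - 1):
        A_ub[i, i] = -1.0
        A_ub[i, i + 1] = 1.0
    b_ub = np.zeros(n - 1)

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0.0, cap)] * n)
    return -res.fun, res.x         # worst-case loss, corresponding weights

# Toy example: the adversarial weights shift mass towards early observations,
# up to the cap 1/((1-eps)*n); with eps = 0 only uniform weights remain.
print(worst_case_weighted_loss([0.9, 0.1, 0.4, 0.2], eps=0.2))
```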

Accelerated Life Testing

Constant stress testing. Challenge: transform information from an increased stress level such that it is representative of information about failure times under normal conditions.
The power-Weibull model: Weibull$(\alpha_i, \beta)$ for failure times at stress level $i = 0$ (normal), $1, 2, \ldots, k$, with
$$P(T > t) = \exp\left\{ -\left( \frac{t}{\alpha_i} \right)^{\beta} \right\}$$

Stress level $i$ is quantified by a positive measurement $V_i$, increasing in $i$, with
$$\alpha_i = \alpha_0 \left( \frac{V_0}{V_i} \right)^{p}$$
An observation $t_i$ at stress level $i$, so subject to stress $V_i$, can be interpreted as an observation
$$t_i \left( \frac{V_i}{V_0} \right)^{p}$$
at the normal stress level.

Our Approach

1. We estimate (e.g. by MLE) the parameters $\alpha$, $\beta$, $p$.
2. We use the estimate $\hat{p}$ (only!) to transform the data to the normal stress level.
3. We use all combined (transformed) data at the normal stress level for nonparametric predictive inference, providing lower and upper predictive survival functions for the next unit at the normal stress level.
4. To get further robustness with respect to model assumptions and estimation bias, we replace the single value of $p$ by an interval; each transformed observation then becomes an interval (see the sketch below).

Ad hoc?
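A minimal sketch of steps 2-4, assuming the power-law transformation above and the basic NPI bounds based on Hill's $A_{(n)}$ (each interval between ordered observations carries probability $1/(n+1)$ for the next observation); the function names, toy data and the way the interval for $p$ is propagated are my own illustrative choices, not the authors' code.

```python
import numpy as np

def transform(times, V_i, V_0, p):
    """Map failure times observed at stress V_i to the normal level V_0
    via t * (V_i / V_0)**p (power-Weibull link)."""
    return np.asarray(times, dtype=float) * (V_i / V_0) ** p

def npi_survival_bounds(t, observations):
    """Basic NPI lower/upper survival probability for the next unit at time t,
    based on Hill's A_(n): probability 1/(n+1) per inter-observation interval."""
    obs = np.asarray(observations, dtype=float)
    n = len(obs)
    exceed = int(np.sum(obs > t))
    return exceed / (n + 1), (exceed + 1) / (n + 1)

# Toy data at one raised stress level (hypothetical numbers, not the slides' data).
V0, V1 = 52.5, 55.0
times_level1 = [114, 132, 300, 745]

p_hat, p_low, p_high = 15.1, 14.5, 15.5
point_transformed = transform(times_level1, V1, V0, p_hat)   # step 2

# Step 4: an interval for p makes each transformed observation an interval;
# the smallest transformed times give the lower bound, the largest the upper.
lower_ends = transform(times_level1, V1, V0, p_low)
upper_ends = transform(times_level1, V1, V0, p_high)

t = 1000.0
S_lower, _ = npi_survival_bounds(t, lower_ends)
_, S_upper = npi_survival_bounds(t, upper_ends)
print(point_transformed.round(1), (S_lower, S_upper))
```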

Data

Voltage        Failure times
$V_0 = 52.5$   245, 246, 350, 550, 600, 740, 745, 1010, 1190, 1225, 1390, 1458, 1480, 1690, 1805, 2450, 3000, 4690, 6095, 6200
$V_1 = 55.0$   114, 132, 144, 162, 222, 258, 300, 312, 396, 444, 498, 520, 745, 772, 1240, 1266, 1464, 1740, 2440, 2600
$V_2 = 57.5$   168, 174, 234, 252, 288, 288, 294, 348, 390, 408, 444, 510, 528, 546, 558, 690, 696, 714, 900, 1000

Table: Failure times at three voltage levels.

Figure: Lower and upper survival functions (survival probability against $t$) with $\hat{p} = 15.104$.

Figure: Lower and upper survival functions with $[\underline{p}, \overline{p}] = [14.5, 15.5]$.

Figure: Lower and upper survival functions with $[\underline{p}, \overline{p}] = [13.0, 17.0]$.

Figure: Lower and upper survival functions with $[\underline{p}, \overline{p}] = [10.0, 20.0]$.

Collaborators

Tahani Coolen-Maturi (Durham)
Lev Utkin (St Petersburg)
Yi-Chao Yin (China; visiting PhD student at Durham, 2015-16)