Imprecise Probabilities in Stochastic Processes and Graphical Models: New Developments


Imprecise Probabilities in Stochastic Processes and Graphical Models: New Developments. Gert de Cooman, Ghent University, SYSTeMS research group. NLMUA 2011, 7 September 2011.

IMPRECISE PROBABILITY MODELS

Imprecise probability models: sets of probability models. An imprecise probability model is a set M of precise probability models P; more precisely, M is a closed convex set of probabilities. Conditioning: to condition M on an event B, we condition each precise model in M: M|B := {P(·|B) : P ∈ M}.

Imprecise probability models: lower and upper previsions. Consider, for a function f,

P̲(f) := min{P(f) : P ∈ M}   and   P̄(f) := max{P(f) : P ∈ M}.

The nonlinear functionals P̲ and P̄ determine M completely. Conditioning and the Generalised Bayes Rule: when P̲(B) > 0, P̲(f|B) is the unique solution of the following equation in x: P̲(I_B[f − x]) = 0.
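
To make these two formulas concrete, here is a minimal Python sketch (not from the slides). It assumes the credal set M is given by finitely many extreme points, so that lower and upper previsions are minima and maxima of expectations, and it uses the fact that, when the lower probability of B is positive, the Generalised Bayes Rule solution coincides with the minimum of the conditional expectations over M. All numbers are illustrative.

    import numpy as np

    # A toy credal set M on Omega = {0, 1, 2}, given by its extreme points
    # (each row is a probability mass function).  Illustrative numbers only.
    extreme_points = np.array([
        [0.5, 0.3, 0.2],
        [0.2, 0.5, 0.3],
        [0.3, 0.2, 0.5],
    ])

    def lower_prevision(f, M=extreme_points):
        """Lower prevision: minimum of E_P(f) over P in M."""
        return (M @ f).min()

    def upper_prevision(f, M=extreme_points):
        """Upper prevision: maximum of E_P(f) over P in M."""
        return (M @ f).max()

    def conditional_lower_prevision(f, B, M=extreme_points):
        """Conditional lower prevision of f given the event B.

        When the lower probability of B is positive, the solution of the
        Generalised Bayes Rule equation coincides with the minimum of the
        conditional expectations E_P(f|B) over P in M.
        """
        B = np.asarray(B, dtype=float)       # indicator gamble of the event B
        pB = M @ B
        assert pB.min() > 0, "GBR solution not unique: lower probability of B is 0"
        return ((M @ (f * B)) / pB).min()

    f = np.array([1.0, -2.0, 4.0])           # a gamble on Omega
    B = np.array([1.0, 1.0, 0.0])            # the event {0, 1}
    print(lower_prevision(f), upper_prevision(f))
    print(conditional_lower_prevision(f, B))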

@ARTICLE{walley2000, author = {Walley, Peter}, title = {Towards a unified theory of imprecise probability}, journal = {International Journal of Approximate Reasoning}, year = 2000, volume = 24, pages = {125--148} }

Imprecise probability models: accepting gambles. Consider an exhaustive set Ω of mutually exclusive alternatives ω, exactly one of which obtains. Subject is uncertain about which alternative obtains. A gamble f: Ω → R is interpreted as an uncertain reward: if the alternative that obtains is ω, then the reward for Subject is f(ω). Let G(Ω) be the set of all gambles on Ω.

Imprecise probability models: accepting gambles. Subject accepts a gamble f if he agrees to engage in the following transaction: 1. we determine which alternative ω obtains; 2. Subject receives f(ω). We try to model Subject's uncertainty by looking at which gambles in G(Ω) he accepts.

Imprecise probability models: coherent sets of desirable gambles. Subject specifies a set D of gambles he accepts, his set of desirable gambles. D is called coherent if it satisfies the following rationality requirements:
D1. if f ≤ 0 then f ∉ D [avoiding partial loss];
D2. if f > 0 then f ∈ D [accepting partial gain];
D3. if f1 ∈ D and f2 ∈ D then f1 + f2 ∈ D [combination];
D4. if f ∈ D then λf ∈ D for all positive real numbers λ [scaling].
Here f > 0 means f ≥ 0 and not f = 0. Walley has also argued that sets of desirable gambles should satisfy an additional axiom:
D5. D is ℬ-conglomerable for any partition ℬ of Ω: if I_B f ∈ D for all B ∈ ℬ, then also f ∈ D [full conglomerability].

Imprecise probability models: natural extension as a form of inference. Since D1–D4 are preserved under arbitrary non-empty intersections:
Theorem. Let A be any set of gambles. Then there is a coherent set of desirable gambles that includes A if and only if f ≤ 0 holds for no f ∈ posi(A) [A avoids partial loss]. In that case, the natural extension E(A) of A is the smallest such coherent set, and it is given by E(A) := posi(A ∪ G⁺(Ω)), where posi denotes the set of all positive linear combinations and G⁺(Ω) the set of all non-zero non-negative gambles.
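
For finitely many assessed gambles on a finite space, the lower prevision induced by natural extension can be computed with a small linear program. Below is a sketch using scipy, with illustrative assessments of my own: the LP maximises α subject to g − α dominating a non-negative combination of the assessed gambles, and it is unbounded exactly when the assessments incur sure loss.

    import numpy as np
    from scipy.optimize import linprog

    # Assessed desirable gambles on a 3-element possibility space (one per row).
    # Purely illustrative numbers.
    A = np.array([
        [-1.0,  2.0,  0.0],
        [ 0.5, -1.0,  1.0],
    ])

    def natural_extension(g, assessments=A):
        """Lower prevision of g induced by the assessments via natural extension:
        maximise alpha subject to
            g(w) - alpha >= sum_k lambda_k * f_k(w)  for all w,  lambda_k >= 0.
        Returns None if the assessments incur sure loss (the LP is unbounded)."""
        m, n = assessments.shape
        # Decision variables: x = (alpha, lambda_1, ..., lambda_m)
        c = np.zeros(m + 1)
        c[0] = -1.0                                  # maximise alpha
        # Constraints: alpha + sum_k lambda_k f_k(w) <= g(w), for every w
        A_ub = np.hstack([np.ones((n, 1)), assessments.T])
        b_ub = np.asarray(g, dtype=float)
        bounds = [(None, None)] + [(0, None)] * m    # alpha free, lambdas >= 0
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
        return -res.fun if res.status == 0 else None

    g = np.array([1.0, 0.0, 2.0])
    print(natural_extension(g))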

Imprecise probability models: lower and upper previsions. Given Subject's coherent set D, we can define his upper and lower previsions:

P̄(f) := inf{α : α − f ∈ D}   and   P̲(f) := sup{α : f − α ∈ D},

so P̄(f) = −P̲(−f). P̲(f) is the supremum price α for which Subject will buy the gamble f, i.e., accept the gamble f − α. The lower probability P̲(A) := P̲(I_A) is Subject's supremum rate for betting on the event A.

Imprecise probability models: conditional lower and upper previsions. We can also define Subject's conditional lower and upper previsions: for any gamble f and any non-empty subset B of Ω, with indicator I_B,

P̄(f|B) := inf{α : I_B(α − f) ∈ D}   and   P̲(f|B) := sup{α : I_B(f − α) ∈ D},

so P̄(f|B) = −P̲(−f|B) and P̲(f) = P̲(f|Ω). P̲(f|B) is the supremum price α for which Subject will buy the gamble f, i.e., accept the gamble f − α, contingent on the occurrence of B. For any partition ℬ of Ω, define the gamble P̲(f|ℬ) by P̲(f|ℬ)(ω) := P̲(f|B) for B ∈ ℬ and ω ∈ B.

Imprecise probability models: coherence of conditional lower and upper previsions. Suppose you have a number of functionals P̲(·|ℬ_1), ..., P̲(·|ℬ_n). These are called coherent if there is some coherent set of desirable gambles D that is ℬ_k-conglomerable for all k, such that

P̲(f|B_k) = sup{α ∈ R : I_{B_k}(f − α) ∈ D}   for all B_k ∈ ℬ_k, k = 1, ..., n.

Imprecise probability models: properties of conditional lower and upper previsions.
Theorem ([10]). Consider a coherent set of desirable gambles D, let B be any non-empty subset of Ω, and let f, f1 and f2 be gambles on Ω. Then:
1. inf_{ω∈B} f(ω) ≤ P̲(f|B) ≤ P̄(f|B) ≤ sup_{ω∈B} f(ω) [positivity];
2. P̲(f1 + f2|B) ≥ P̲(f1|B) + P̲(f2|B) [super-additivity];
3. P̲(λf|B) = λP̲(f|B) for all real λ ≥ 0 [non-negative homogeneity];
4. if ℬ is a partition of Ω that refines the partition {B, B^c} and D is ℬ-conglomerable, then P̲(f|B) ≥ P̲(P̲(f|ℬ)|B) [conglomerative property].

Imprecise probability models: conditional previsions. If P̲(f|B) = P̄(f|B) =: P(f|B), then P(f|B) is Subject's fair price or prevision for f, conditional on B. It is the fixed amount of utility that I am willing to exchange the uncertain reward f for, conditional on the occurrence of B. Related to de Finetti's fair prices [6], and to Huygens's [7] "dit is my so veel weerdt als" ("this is worth as much to me as").
Corollary.
1. inf_{ω∈B} f(ω) ≤ P(f|B);
2. P(λf + µg|B) = λP(f|B) + µP(g|B);
3. if ℬ is a partition of Ω that refines the partition {B, B^c} and if there is ℬ-conglomerability, then P(f|B) = P(P(f|ℬ)|B).

IMPRECISE STOCHASTIC PROCESSES

@ARTICLE{cooman2008, author = {{d}e Cooman, Gert and Hermans, Filip}, title = {Imprecise probability trees: Bridging two theories of imprecise probability}, journal = {Artificial Intelligence}, year = 2008, volume = 172, pages = {1400--1427}, doi = {10.1016/j.artint.2008.03.001} }

Reality's event tree and move spaces. Reality can make a number of moves, where the possible next moves may depend only on the previous moves he has made. We can represent Reality's moves by an event tree. In each non-terminal situation t, Reality has a set of possible next moves

W_t := {w : tw ∈ Ω◊},

called Reality's move space in situation t, where Ω◊ denotes the set of all situations. W_t may be infinite, but has at least two elements. We assume the event tree has a finite horizon.

An event tree and its situations. Situations are nodes in the event tree. [Figure: an event tree with its initial situation, a non-terminal situation t, and a terminal situation ω.]

An event tree and its situations. The sample space Ω is the set of all terminal situations.

An event tree and its situations. The partial order ⊑ on the set Ω◊ of all situations: s ⊑ t means that s precedes t, and s ⊏ t means that s strictly precedes t.

An event tree and its situations. An event A is a subset of the sample space Ω. For any situation s, E(s) := {ω ∈ Ω : s ⊑ ω} is the set of all terminal situations that go through s.

An event tree and its situations. Cuts of the initial situation. [Figure: a cut U = {u1, u2, u3} of the initial situation.]

Reality's move space W_t in a non-terminal situation t. [Figure: in situation s the move space is W_s = {w1, w2}, leading to sw1 and sw2; in a later situation t the move space is W_t = {w3, w4, w5}, leading to tw3, tw4 and tw5.]

Coherent immediate prediction: immediate prediction models. In each non-terminal situation t, Forecaster has beliefs about which move w_t ∈ W_t Reality will make immediately afterwards. Forecaster specifies those local predictive beliefs in the form of a coherent set of desirable gambles D_t ⊆ G(W_t). This leads to an immediate prediction model D_t, t ∈ Ω◊ \ Ω. Coherence here means D1–D4.

Coherent immediate prediction: immediate prediction models. [Figure: the local models attached to the non-terminal situations s and t: a set of desirable gambles in G({w1, w2}) for s and one in G({w3, w4, w5}) for t.]

Coherent immediate prediction: from a local to a global model. How do we combine the local pieces of information into a global model, i.e., which gambles f on the entire sample space Ω does Forecaster accept? For each non-terminal situation t and each h_t ∈ D_t, Forecaster accepts the gamble I_E(t) h_t on Ω, where

I_E(t) h_t(ω) := 0 if ω does not go through t, and I_E(t) h_t(ω) := h_t(w) if tw ⊑ ω with w ∈ W_t.

I_E(t) h_t represents the gamble on Ω that is called off unless Reality ends up in situation t, and then depends only on Reality's move immediately after t: it gives the same value h_t(w) to all paths ω that go through tw.
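
As a small illustration of this lifting (my own sketch, not from the slides), with situations represented as tuples of Reality's moves and paths ω as tuples of full length:

    def lift(t, h_t):
        """Return the global gamble I_E(t) h_t on the sample space:
        zero on paths that do not go through situation t, and h_t(w) on paths
        that continue from t with move w (the value only depends on that move)."""
        def gamble(omega):                  # omega: a terminal path, a tuple of moves
            if omega[:len(t)] != t:         # path does not pass through t
                return 0.0
            return h_t[omega[len(t)]]       # reward set by the move right after t
        return gamble

    # Example: t = (0,) and a local gamble h_t on W_t = {0, 1}
    g = lift((0,), {0: 2.0, 1: -1.0})
    print(g((0, 1)), g((1, 0)))             # -1.0 (goes through t), 0.0 (called off)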

Coherent immediate prediction: from a local to a global model. So Forecaster accepts all gambles in the set

D := {I_E(t) h_t : h_t ∈ D_t, t ∈ Ω◊ \ Ω}.

Find the natural extension E(D) of D: the smallest subset of G(Ω) that includes D and is coherent, i.e., satisfies D1–D4 and cut conglomerability.

Coherent immediate prediction: cut conglomerability. We want predictive models, so we will condition on the events E(t), i.e., on the event that we get to situation t. The E(t) are the only events that we can legitimately condition on. The events E(t) form a partition ℬ_U of the sample space Ω iff the situations t belong to a cut U.
Definition. A set of desirable gambles D on Ω is cut-conglomerable (D5') if it is ℬ_U-conglomerable for every cut U: (∀u ∈ U)(I_E(u) f ∈ D) ⇒ f ∈ D.

Coherent immediate prediction: selections and gamble processes. A t-selection S is a process, defined on all non-terminal situations s that follow t, such that S(s) ∈ D_s. It selects, in advance, a desirable gamble S(s) from the available desirable gambles in each non-terminal s ⊒ t. With a t-selection S, we can associate a real-valued t-gamble process I^S, which is a t-process such that I^S(t) = 0 and, for all s ⊒ t and w ∈ W_s,

I^S(sw) = I^S(s) + S(s)(w).

Coherent immediate prediction: selections and gamble processes. [Figure: worked example with W_t = {w1, w2} and the selected gamble S(t) given by S(t)(w1) = +3 and S(t)(w2) = −4. If I^S(t) = 2, then I^S(tw1) = 2 + 3 = 5 and I^S(tw2) = 2 − 4 = −2.]

Coherent immediate prediction: Marginal Extension Theorem.
Theorem (Marginal Extension Theorem). There is a smallest set of gambles that satisfies D1–D4 and D5' and includes D. This natural extension of D is given by

E(D) := {g ∈ G(Ω) : g ≥ I^S_Ω for some □-selection S},

where □ is the initial situation and I^S_Ω denotes the restriction of the gamble process I^S to the sample space Ω. Moreover, for any non-terminal situation t and any t-gamble g, it holds that I_E(t) g ∈ E(D) if and only if there is some t-selection S_t such that g ≥ I^{S_t}_Ω.

Coherent immediate prediction: predictive lower and upper previsions. Use the coherent set of desirable gambles E(D) to define special lower (and upper) previsions P̲(·|t) := P̲(·|E(t)), conditional on an event E(t). For any gamble f on Ω and for any non-terminal situation t,

P̲(f|t) := sup{α : I_E(t)(f − α) ∈ E(D)} = sup{α : f − α ≥ I^S_Ω for some t-selection S}.

We call such conditional lower previsions predictive lower previsions for Forecaster. For a cut U of t, define the U-measurable t-gamble P̲(f|U) by P̲(f|U)(ω) := P̲(f|u) for u ∈ U and u ⊑ ω.

Properties of predictive previsions: Concatenation Theorem.
Theorem (Concatenation Theorem). Consider any two cuts U and V of a situation t such that U precedes V. Then for all t-gambles f on Ω:
1. P̲(f|t) = P̲(P̲(f|U)|t);
2. P̲(f|U) = P̲(P̲(f|V)|U).
We can calculate P̲(f|t) by backwards recursion, starting with P̲(f|ω) = f(ω) and using only the local models P̲_s(g) := sup{α : g − α ∈ D_s}, where s ⊒ t is non-terminal and g is a gamble on W_s.

Properties of predictive previsions: Concatenation Theorem. [Figure: a tree with a cut U = {u1, u2} of the situation t; the recursion starts from P̲(f|ω_i) = f(ω_i) at the terminal situations, yields P̲(f|u1) and P̲(f|u2) at the cut, and finally P̲(f|t) = P̲(P̲(f|U)|t).]
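
The backwards recursion is easy to implement. Below is an illustrative Python sketch (not the slides' code) for a two-step binary event tree, assuming every local model is given by the extreme points of its set of mass functions, so that each local lower prevision is a minimum of expectations; all numbers are made up.

    import numpy as np

    # A two-step binary event tree: situations are tuples of moves in {0, 1}.
    # Each non-terminal situation carries a local credal set, given here by the
    # extreme points of its set of mass functions (illustrative numbers only).
    local_models = {
        ():   np.array([[0.4, 0.6], [0.6, 0.4]]),
        (0,): np.array([[0.3, 0.7], [0.5, 0.5]]),
        (1,): np.array([[0.2, 0.8], [0.4, 0.6]]),
    }
    HORIZON = 2

    def predictive_lower(f, t=()):
        """Backward recursion of the Concatenation Theorem: at a terminal
        situation return f, otherwise apply the local lower prevision to the
        gamble w -> predictive_lower(f, tw) on the local move space."""
        if len(t) == HORIZON:                       # terminal situation
            return f[t]
        local = np.array([predictive_lower(f, t + (w,)) for w in (0, 1)])
        return (local_models[t] @ local).min()      # local lower prevision at t

    # A gamble on the sample space Omega = {0,1}^2
    f = {(0, 0): 1.0, (0, 1): -1.0, (1, 0): 0.0, (1, 1): 2.0}
    print(predictive_lower(f))                      # lower prevision in the initial situation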

Properties of predictive previsions: envelope theorems. Consider in each non-terminal situation s a compatible precise model P_s on G(W_s): P_s ∈ M_s, i.e., (∀g ∈ G(W_s))(P_s(g) ≥ P̲_s(g)). This leads to a collection of compatible probability trees in the sense of Huygens (and Shafer). Use the Concatenation Theorem to find the corresponding precise predictive previsions P(f|t) for each compatible probability tree.
Theorem (Lower Envelope Theorem). For all situations t and t-gambles f, P̲(f|t) is the infimum (in fact the minimum) of the P(f|t) over all compatible probability trees.

Considering unbounded time. What if Reality's event tree no longer has a finite time horizon: how do we calculate the lower prices/previsions P̲(f|t)? The Shafer–Vovk–Ville approach:

sup{α : f − α ≥ limsup I^S for some t-selection S}.

Open question(s): what does natural extension yield in this case, must coherence be strengthened to yield the Shafer–Vovk–Ville approach, and if so, how?

@ARTICLE{cooman2009, author = {{d}e Cooman, Gert and Hermans, Filip and Quaeghebeur, Erik}, title = {Imprecise {M}arkov chains and their limit behaviour}, journal = {Probability in the Engineering and Informational Sciences}, year = 2009, volume = 23, pages = {597--635}, doi = {10.1017/S0269964809990039} }

Imprecise Markov chains: discrete-time, finite-state uncertain processes. Consider an uncertain process with variables X1, X2, ..., Xn, ... Each X_k assumes values in a finite set of states 𝒳. This leads to a standard event tree with situations s = (x1, x2, ..., xn), with x_k ∈ 𝒳 and n ≥ 0. In each situation s there is a local imprecise belief model M_s: a closed convex set of probability mass functions p on 𝒳. The associated local lower prevision P̲_s is

P̲_s(f) := min{E_p(f) : p ∈ M_s},   where E_p(f) := Σ_{x∈𝒳} f(x) p(x).

Imprecise Markov chains: example of a standard event tree. [Figure: the event tree for a process with states 𝒳 = {a, b}; the first level corresponds to X1, the second to X2, giving situations (a,a), (a,b), (b,a), (b,b), and so on for X3.]

Imprecise Markov chains: precise Markov chains. The uncertain process is a (stationary) precise Markov chain when all M_s are singletons (precise), the initial model is M_□ = {m1}, and the Markov condition holds: M_(x1,...,xn) = {q(·|xn)}.

Imprecise Markov chains: probability tree for a precise Markov chain. [Figure: the event tree of the previous example, with the initial mass function m1 and the transition mass functions q(·|a) and q(·|b) attached to the corresponding situations.]

Imprecise Markov chains: definition of an imprecise Markov chain. The uncertain process is a (stationary) imprecise Markov chain when the Markov condition is satisfied: M_(x1,...,xn) = Q(·|xn). An imprecise Markov chain can be seen as an infinity of probability trees.

Imprecise Markov chains: probability tree for an imprecise Markov chain. [Figure: the same event tree, now with the initial credal set M1 and the credal transition models Q(·|a) and Q(·|b) attached to the corresponding situations.]

Imprecise Markov chains: lower and upper transition operators. Define T̲: G(𝒳) → G(𝒳) and T̄: G(𝒳) → G(𝒳), where for any gamble f on 𝒳:

T̲f(x) := min{E_p(f) : p ∈ Q(·|x)}   and   T̄f(x) := max{E_p(f) : p ∈ Q(·|x)}.

Then the Concatenation Formula yields:

P̲_n(f) = P̲_1(T̲^{n−1} f)   and   P̄_n(f) = P̄_1(T̄^{n−1} f).

Complexity is linear in the number of time steps!
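
A minimal Python sketch of these operators (illustrative numbers, not the example on the next slide), with each credal set Q(·|x) described by its extreme points; the n-step lower prevision then follows by applying T̲ repeatedly and finishing with the initial lower prevision, which makes the cost linear in n.

    import numpy as np

    # An imprecise Markov chain on X = {0, 1}: for each current state x, the
    # set Q(.|x) of transition mass functions is given by its extreme points.
    Q = {
        0: np.array([[0.8, 0.2], [0.6, 0.4]]),
        1: np.array([[0.3, 0.7], [0.5, 0.5]]),
    }
    M1 = np.array([[0.5, 0.5], [0.7, 0.3]])     # extreme points of the initial model

    def lower_T(f):
        """Lower transition operator: (Tf)(x) = min over p in Q(.|x) of E_p(f)."""
        return np.array([(Q[x] @ f).min() for x in (0, 1)])

    def lower_prevision_at_time_n(f, n):
        """P_n(f) = P_1(T^(n-1) f): apply the lower transition operator n-1
        times, then take the lower expectation under the initial model."""
        for _ in range(n - 1):
            f = lower_T(f)
        return (M1 @ f).min()

    f = np.array([1.0, -1.0])                   # a gamble on the state at time n
    print(lower_prevision_at_time_n(f, n=5))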

Imprecise Markov chains: an example with lower and upper mass functions, for states 𝒳 = {a, b, c} (rows indexed by the current state, columns by the next state):

    [q̲(a|a) q̲(b|a) q̲(c|a)]             [  9    9  162]
    [q̲(a|b) q̲(b|b) q̲(c|b)]  =  1/200 ·  [144   18   18]
    [q̲(a|c) q̲(b|c) q̲(c|c)]             [  9  162    9]

    [q̄(a|a) q̄(b|a) q̄(c|a)]             [ 19   19  172]
    [q̄(a|b) q̄(b|b) q̄(c|b)]  =  1/200 ·  [154   28   28]
    [q̄(a|c) q̄(b|c) q̄(c|c)]             [ 19  172   19]

The columns of these matrices are the gambles T̲I_{a}, T̲I_{b}, T̲I_{c} and T̄I_{a}, T̄I_{b}, T̄I_{c}, respectively.

Imprecise Markov chains: an example with lower and upper mass functions. [Figure: evolution of the lower and upper mass functions for n = 1, 2, ..., 10, and for n = 22 and n = 1000, illustrating convergence.]

Imprecise Markov chains: a Perron–Frobenius theorem.
Theorem ([4]). Consider a stationary imprecise Markov chain with finite state set 𝒳 and an upper transition operator T̄. Suppose that T̄ is regular, meaning that there is some n > 0 such that min T̄^n I_{x} > 0 for all x ∈ 𝒳. Then for every initial upper prevision P̄_1, the upper prevision P̄_n = P̄_1 ∘ T̄^{n−1} for the state at time n converges point-wise to the same upper prevision P̄_∞:

lim_{n→∞} P̄_n(h) = lim_{n→∞} P̄_1(T̄^{n−1} h) =: P̄_∞(h)   for all h ∈ G(𝒳).

Moreover, the limit upper prevision P̄_∞ is the only T̄-invariant upper prevision on G(𝒳), meaning that P̄_∞ = P̄_∞ ∘ T̄.
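
A quick numerical check of regularity and of the convergence claimed by the theorem, again with a made-up two-state upper transition operator: under regularity the iterates T̄^n h flatten out to a constant gamble, so the limit value no longer depends on the initial model.

    import numpy as np

    # Upper transition operator of a 2-state imprecise Markov chain, with
    # credal sets given by extreme points (illustrative numbers).
    Q = {
        0: np.array([[0.8, 0.2], [0.6, 0.4]]),
        1: np.array([[0.3, 0.7], [0.5, 0.5]]),
    }

    def upper_T(f):
        """(Tf)(x) = max over p in Q(.|x) of E_p(f)."""
        return np.array([(Q[x] @ f).max() for x in (0, 1)])

    # Regularity check with n = 1: min of T applied to each indicator is > 0.
    for x in (0, 1):
        print("regular through state", x, ":", upper_T(np.eye(2)[x]).min() > 0)

    # Convergence: iterate T on a gamble; the result becomes (nearly) constant,
    # and that constant is the limit upper probability of state 0, whatever the
    # initial upper prevision.
    h = np.array([1.0, 0.0])        # indicator of state 0
    for _ in range(60):
        h = upper_T(h)
    print("iterated gamble after 60 steps:", h)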

CREDAL NETWORKS

@ARTICLE{cooman2010, author = {{d}e Cooman, Gert and Hermans, Filip and Antonucci, Alessandro and Zaffalon, Marco}, title = {Epistemic irrelevance in credal nets: the case of imprecise {M}arkov trees}, journal = {International Journal of Approximate Reasoning}, year = 2010, volume = 51, pages = {1029--1052}, doi = {10.1016/j.ijar.2010.08.011} }

Credal trees. [Figure: a tree-shaped network of variables X1, ..., X11.]

Credal trees: local uncertainty models. [Figure: a node X_i with mother variable X_{m(i)} and children X_j, ..., X_k.] The variable X_i may assume a value in the finite set 𝒳_i; for each possible value x_{m(i)} ∈ 𝒳_{m(i)} of the mother variable X_{m(i)}, we have a conditional lower prevision P̲_i(·|x_{m(i)}): G(𝒳_i) → R, with P̲_i(f|x_{m(i)}) the lower prevision of f(X_i), given that X_{m(i)} = x_{m(i)}. The local model P̲_i(·|X_{m(i)}) is a conditional lower prevision operator.

Credal trees under epistemic irrelevance: definition. Interpretation of the graphical structure: conditional on the mother variable, the non-parent non-descendants of each node variable are epistemically irrelevant to it and its descendants.

Credal networks under epistemic irrelevance: as an expert system. When the credal network is a (Markov) tree we can find the joint model from the local models recursively, from leaves to root. Exact message-passing algorithm: the credal tree is treated as an expert system; linear complexity in the number of nodes; Python code written by Filip Hermans; testing in cooperation with Marco Zaffalon and Alessandro Antonucci. Current (toy) applications in HMMs: character recognition [3], air traffic trajectory tracking and identification [1].

Example of application, HMMs: character recognition for Dante's Divina Commedia. [Figure: the characters of the original text (... V I T A ...) are the hidden states S1, S2, S3, ...; the characters in the OCR output are the corresponding observations O1, O2, O3, ...]

Example of application, HMMs: character recognition for Dante's Divina Commedia.

    Accuracy                                93.96% (7275/7743)
    Accuracy (if imprecise indeterminate)   64.97% (243/374)
    Determinacy                             95.17% (7369/7743)
    Set-accuracy                            93.58% (350/374)
    Single accuracy                         95.43% (7032/7369)
    Indeterminate output size               2.97 over 21

Table: Precise vs. imprecise HMMs. Test results obtained by twofold cross-validation on the first two chants of Dante's Divina Commedia, with n = 2. Quantification is achieved by the IDM with s = 2 and a modified Perks prior. The single character output by the precise model is then guaranteed to be included in the set of characters the imprecise HMM identifies.

@INPROCEEDINGS{cooman2011, author = {De Bock, Jasper and {d}e Cooman, Gert}, title = {State sequence prediction in imprecise hidden {M}arkov models}, booktitle = {ISIPTA '11 -- Proceedings of the Seventh International Symposium on Imprecise Probability: Theories and Applications}, year = 2011, editor = {Coolen, Frank P. A. and {d}e Cooman, Gert and Fetz, Thomas and Oberguggenberger, Michael}, address = {Innsbruck, Austria}, publisher = {SIPTA} }

Imprecise hidden Markov model (iHMM). [Figure: a chain of hidden states X1, X2, ..., Xk, ..., Xn, each emitting an output O1, O2, ..., Ok, ..., On.] Marginal model: Q1; transition models: Qk(·|X_{k−1}); emission models: Sk(·|Xk).

Imprecise hidden Markov model (iHMM): interpretation of the graphical structure, traditional credal networks. [Figure: the same state-output graph.] Conditional on its parent, the non-parent non-descendants of each node are strongly independent of the node and its descendants.

Imprecise hidden Markov model (iHMM): interpretation of the graphical structure, recent developments. [Figure: the same state-output graph.] Conditional on its parent, the non-parent non-descendants of each node are epistemically irrelevant to the node and its descendants.

The notion of optimality we use: maximality. We define a partial order ≻ on state sequences:

x̂_{1:n} ≻ x_{1:n} ⇔ P̲_1(I_{x̂_{1:n}} − I_{x_{1:n}} | o_{1:n}) > 0.

A state sequence is optimal if it is undominated, or maximal:

x̂_{1:n} ∈ opt(𝒳_{1:n} | o_{1:n})
⇔ (∀x_{1:n} ∈ 𝒳_{1:n}) x_{1:n} ⊁ x̂_{1:n}
⇔ (∀x_{1:n} ∈ 𝒳_{1:n}) P̲_1(I_{x_{1:n}} − I_{x̂_{1:n}} | o_{1:n}) ≤ 0
⇔ (∀x_{1:n} ∈ 𝒳_{1:n}) P̲_1(I_{o_{1:n}} [I_{x_{1:n}} − I_{x̂_{1:n}}]) ≤ 0,

and P̲_1(I_{o_{1:n}} [I_{x_{1:n}} − I_{x̂_{1:n}}]) can be easily determined recursively! Aim of our EstiHMM algorithm: determine opt(𝒳_{1:n} | o_{1:n}) exactly and efficiently.
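
EstiHMM itself exploits the recursive structure of the chain; purely as an illustration of the maximality criterion (not of EstiHMM), here is a brute-force Python filter that keeps the undominated sequences. It assumes some function lower_dominance(x, y) returning the lower prevision P̲_1(I_{o_{1:n}}[I_x − I_y]); the toy numbers are invented.

    from itertools import product

    def maximal_sequences(states, n, lower_dominance):
        """Brute-force illustration of maximality (not the EstiHMM algorithm).

        lower_dominance(x, y) should return the lower prevision used in the
        dominance test: x dominates y whenever this value is strictly positive.
        A sequence is maximal if no sequence dominates it."""
        candidates = list(product(states, repeat=n))
        return [y for y in candidates
                if not any(lower_dominance(x, y) > 0 for x in candidates)]

    # Toy dominance values for two sequences of length 1 (illustrative only).
    toy = {(('a',), ('b',)): -0.1, (('b',), ('a',)): 0.2}
    print(maximal_sequences(['a', 'b'], 1, lambda x, y: toy.get((x, y), 0.0)))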

Complexity of EstiHMM: linear in the length of the chain n, cubic in the number of states s, and linear in the number of maximal elements; comparable to its Viterbi counterpart for precise HMMs. Important conclusion: the complexity depends crucially on how the number of maximal elements increases with the length of the chain n.

Number of maximal elements. [Figures: number of maximal state sequences for the output sequence (0, 1, 1, 1, 0, 0) at imprecision levels 1%, 5%, 10%, 25% and 40%.]

Literature I
Enrique Miranda. A survey of the theory of coherent lower previsions. International Journal of Approximate Reasoning, 48:628–658, 2008.
Matthias Troffaes. Decision making under uncertainty using imprecise probabilities. International Journal of Approximate Reasoning, 45:17–29, 2007.
Alessandro Antonucci, Alessio Benavoli, Marco Zaffalon, Gert de Cooman, and Filip Hermans. Multiple model tracking by imprecise Markov trees. In Proceedings of the 12th International Conference on Information Fusion (Seattle, WA, USA, July 6–9, 2009), pages 1767–1774, 2009.
Gert de Cooman and Filip Hermans. Imprecise probability trees: Bridging two theories of imprecise probability. Artificial Intelligence, 172(11):1400–1427, 2008.

Literature II
Gert de Cooman, Filip Hermans, Alessandro Antonucci, and Marco Zaffalon. Epistemic irrelevance in credal networks: the case of imprecise Markov trees. International Journal of Approximate Reasoning, 51:1029–1052, 2010.
Gert de Cooman, Filip Hermans, and Erik Quaeghebeur. Imprecise Markov chains and their limit behaviour. Probability in the Engineering and Informational Sciences, 23(4):597–635, 2009. arXiv:0801.0980.
B. de Finetti. Teoria delle Probabilità. Einaudi, Turin, 1970.

Literature III
B. de Finetti. Theory of Probability: A Critical Introductory Treatment. John Wiley & Sons, Chichester, 1974–1975. English translation of [5], two volumes.
Christiaan Huygens. Van Rekeningh in Spelen van Geluck. 1656–1657. Reprinted in Volume XIV of [8].
Christiaan Huygens. Œuvres complètes de Christiaan Huygens. Martinus Nijhoff, Den Haag, 1888–1950. Twenty-two volumes. Available in digitised form from the Bibliothèque nationale de France (http://gallica.bnf.fr).
G. Shafer and V. Vovk. Probability and Finance: It's Only a Game! Wiley, New York, 2001.

Literature IV
Peter Walley. Statistical Reasoning with Imprecise Probabilities. Chapman and Hall, London, 1991.
Thomas Augustin, Frank Coolen, Gert de Cooman, and Matthias Troffaes (eds.). Introduction to Imprecise Probabilities. Wiley, Chichester, mid 2012.
Also look at www.sipta.org.