Synthetic Probability Theory

Synthetic Probability Theory
Alex Simpson, Faculty of Mathematics and Physics, University of Ljubljana, Slovenia
PSSL, Amsterdam, October 2018

Gian-Carlo Rota (1932-1999): "The beginning definitions in any field of mathematics are always misleading, and the basic definitions of probability are perhaps the most misleading of all." Twelve Problems in Probability Theory No One Likes to Bring Up, The Fubini Lectures, 1998 (published 2001)

Random variables An A-valued random variable is a function X : Ω → A, where: the value space A is a measurable space (a set with a σ-algebra of measurable subsets); the sample space Ω is a probability space (a measurable space with a probability measure P_Ω); and X is a measurable function.
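As a purely classical illustration of this definition (not part of the slides), here is a minimal Python sketch with a finite sample space, where measurability is vacuous and the law of X is obtained by pushing P_Ω forward along X; all names are mine.

```python
from fractions import Fraction

# A finite sample space with its probability measure P_Omega (two fair coin flips).
P_Omega = {
    "HH": Fraction(1, 4), "HT": Fraction(1, 4),
    "TH": Fraction(1, 4), "TT": Fraction(1, 4),
}

# An A-valued random variable is just a function X : Omega -> A
# (in the finite case every function is measurable).
def X(omega):
    """Number of heads."""
    return omega.count("H")

def distribution(X, P_Omega):
    """Pushforward of P_Omega along X, i.e. the law P_X."""
    P_X = {}
    for omega, p in P_Omega.items():
        a = X(omega)
        P_X[a] = P_X.get(a, 0) + p
    return P_X

# Heads-count law: P_X{0} = 1/4, P_X{1} = 1/2, P_X{2} = 1/4.
print(distribution(X, P_Omega))
```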

Features of random variables σ-algebras: used to avoid non-measurable sets; used to implement dependencies, conditioning, etc. The sample space Ω: the carrier of all probability. Neither σ-algebras nor sample spaces are inherent in the concept of probability. Technical manipulations involving σ-algebras and sample spaces seem a cumbersome distraction when carrying out probabilistic arguments.

Goal: Reformulate probability theory with random variable as a primitive notion, governed by axioms formulated in terms of concepts with direct probabilistic meaning.

David Mumford: "The basic object of study in probability is the random variable and I will argue that it should be treated as a basic construct, like spaces, groups and functions, and it is artificial and unnatural to define it in terms of measure theory." The Dawning of the Age of Stochasticity, 2000

Guiding model Let (K, J_ω) be the site with: objects, standard Borel probability spaces; morphisms, null-set-reflecting measurable functions; Grothendieck topology, the countable cover topology J_ω. Then Ran = Sh(K, J_ω) is a Boolean topos validating DC in which R^n carries a translation-invariant measure on all subsets. Internally in Ran, let (P, J_at) be the site with: objects, standard Borel probability spaces; morphisms, measure-preserving functions; Grothendieck topology, the atomic topology J_at. Then Sh_Ran(P, J_at) is again a Boolean topos validating DC in which R^n carries a translation-invariant measure on all subsets. It also models primitive random variables.

Axiomatic setting We assume a Boolean elementary topos E validating DC. For any object A there is an object M₁(A) of all probability measures defined on all subsets:
M₁(A) = {µ : P(A) → [0, 1] | µ is a probability measure}.
The operation A ↦ M₁(A) is functorial, with f : A → B lifting to the function that sends any probability measure µ ∈ M₁(A) to its pushforward f[µ] ∈ M₁(B):
f[µ](B′) = µ(f⁻¹ B′).
This functor further carries a (commutative) monad structure.
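For intuition, the finitely supported fragment of M₁ behaves like the familiar classical probability-distribution monad, which can be sketched in Python as below; this is a toy illustration of the functorial action and monad structure, not the internal construction in E, and all function names are mine.

```python
from fractions import Fraction

# A finitely supported probability measure: a dict from outcomes to weights summing to 1.

def unit(a):
    """Monad unit: the Dirac measure at a."""
    return {a: Fraction(1)}

def pushforward(f, mu):
    """Functorial action M1(f): send mu to f[mu], with f[mu](B') = mu(f^-1 B')."""
    nu = {}
    for a, p in mu.items():
        b = f(a)
        nu[b] = nu.get(b, Fraction(0)) + p
    return nu

def bind(mu, k):
    """Kleisli extension (monad multiplication): k maps each outcome a to a measure k(a)."""
    nu = {}
    for a, p in mu.items():
        for b, q in k(a).items():
            nu[b] = nu.get(b, Fraction(0)) + p * q
    return nu

coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}
# Pushforward along a + 1: outcomes 1 and 2, each with probability 1/2.
print(pushforward(lambda a: a + 1, coin))
# Bind with a kernel that keeps a 1 but re-flips a 0: outcome 0 with 1/4, outcome 1 with 3/4.
print(bind(coin, lambda a: unit(a) if a else coin))
```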

Primitive random variables Motivation: a random variable determines a probability distribution (but not vice versa). Axiom There is a functor RV : E → E together with a natural transformation P : RV ⇒ M₁. Interpretation: the object RV(A) is the set of A-valued random variables; P_A : RV(A) → M₁(A) maps X ∈ RV(A) to its probability distribution P_X : P(A) → [0, 1] (we omit A from the notation). Naturality says that, for X ∈ RV(A), f : A → B and B′ ⊆ B,
P_{f[X]}(B′) = P_X(f⁻¹ B′).
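Continuing the same finite toy model, the naturality equation P_{f[X]}(B′) = P_X(f⁻¹ B′) can be checked numerically for a concrete law; `pushforward` and `prob` are illustrative helpers, redefined here so the snippet stands alone.

```python
from fractions import Fraction

def pushforward(f, mu):
    """f[mu](b) accumulates the mass of all a with f(a) = b."""
    nu = {}
    for a, p in mu.items():
        nu[f(a)] = nu.get(f(a), Fraction(0)) + p
    return nu

def prob(mu, subset):
    """mu(subset) for a finitely supported law mu."""
    return sum((p for a, p in mu.items() if a in subset), Fraction(0))

P_X = {0: Fraction(1, 6), 1: Fraction(2, 6), 2: Fraction(3, 6)}  # a law on A = {0, 1, 2}
f = lambda a: a % 2                                              # f : A -> B = {0, 1}
B_prime = {1}

lhs = prob(pushforward(f, P_X), B_prime)                   # P_{f[X]}(B')
rhs = prob(P_X, {a for a in P_X if f(a) in B_prime})       # P_X(f^-1 B')
assert lhs == rhs == Fraction(1, 3)
```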

Probability for a single random variable The equidistribution relation for X, Y ∈ RV(A):
X ∼ Y ⟺ P_X = P_Y.
For X ∈ RV[0, ∞], define the expectation E(X) ∈ [0, ∞] by:
E(X) = ∫ x dP_X(x) = sup_{n ≥ 0, 0 < c_1 < ... < c_n < ∞} Σ_{i=1}^{n} c_i · P_X(c_i, c_{i+1}]   (with c_{n+1} = ∞).
Similarly, define variance, moments, etc.
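The sup-of-lower-sums formula can be sanity-checked against the ordinary weighted sum for a finitely supported law on [0, ∞); a classical Python sketch with made-up names, in which refining the grid of cut points drives the lower sums up towards E(X).

```python
import math

# A finitely supported law: P_X{1} = 1/2, P_X{3} = 1/4, P_X{7} = 1/4, so E(X) = 3.
P_X = {1.0: 0.5, 3.0: 0.25, 7.0: 0.25}

def measure_interval(P, lo, hi):
    """P(lo, hi] for a finitely supported law P."""
    return sum(p for x, p in P.items() if lo < x <= hi)

def lower_sum(P, cuts):
    """sum_i c_i * P(c_i, c_{i+1}], with c_{n+1} = infinity."""
    cuts = sorted(cuts)
    return sum(
        c * measure_interval(P, c, cuts[i + 1] if i + 1 < len(cuts) else math.inf)
        for i, c in enumerate(cuts)
    )

direct = sum(x * p for x, p in P_X.items())     # ordinary expectation: 3.0
for k in (1, 10, 100, 1000):
    grid = [j / k for j in range(1, 10 * k)]    # cut points 0 < c_1 < ... < c_n < 10
    print(k, lower_sum(P_X, grid), direct)      # lower sums increase towards 3.0
```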

Restriction of random variables Axiom If m : A → B is a monomorphism then the naturality square below is a pullback.

    RV(A) --P_A--> M₁(A)
      |              |
    RV(m)          M₁(m)
      ↓              ↓
    RV(B) --P_B--> M₁(B)

Equivalently: given Y ∈ RV(B) and A ⊆ B with P_Y(A) = 1, there exists a unique X ∈ RV(A) such that Y = i[X], where i : A ↪ B is the inclusion function.

Finite limits Proposition The functor RV : E → E preserves equalisers. Axiom The functor RV : E → E preserves finite products. I.e., every X ∈ RV(A) and Y ∈ RV(B) determines (X, Y) ∈ RV(A × B), and every Z ∈ RV(A × B) is of this form. Axiom Equality of measures (finite products):
P_{(X,Y)} = P_{(X′,Y′)} ⟺ ∀ A′ ⊆ A, B′ ⊆ B. P_{(X,Y)}(A′ × B′) = P_{(X′,Y′)}(A′ × B′).

Almost sure equality Proposition For X, Y ∈ RV(A):
X = Y ⟺ P_{(X,Y)}{(x, y) | x = y} = 1   (official notation)
      ⟺ P(X = Y) = 1   (informal notation).
That is, equality of random variables is given by almost sure equality. This is an extensionality principle for random variables.

Independence Independence between X ∈ RV(A) and Y ∈ RV(B):
X ⊥ Y ⟺ ∀ A′ ⊆ A, B′ ⊆ B. P_{(X,Y)}(A′ × B′) = P_X(A′) · P_Y(B′).
Mutual independence of X_1, ..., X_n:
X_1, ..., X_{n−1} mutually independent and (X_1, ..., X_{n−1}) ⊥ X_n.
Infinite mutual independence of (X_i)_{i≥1}:
∀ n ≥ 1. X_1, ..., X_n mutually independent.
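In the finite toy model the rectangle condition defining X ⊥ Y can be checked by brute force over all subsets; an illustrative classical sketch (the function names are mine), contrasting a genuinely independent pair with a perfectly correlated one.

```python
from fractions import Fraction
from itertools import chain, combinations

def subsets(s):
    """All subsets of a finite set, as tuples."""
    s = list(s)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

def independent(P_XY, A, B):
    """X _||_ Y iff P_(X,Y)(A' x B') = P_X(A') * P_Y(B') for all A' <= A, B' <= B."""
    P_X = {a: sum(P_XY.get((a, b), Fraction(0)) for b in B) for a in A}
    P_Y = {b: sum(P_XY.get((a, b), Fraction(0)) for a in A) for b in B}
    return all(
        sum(P_XY.get((a, b), Fraction(0)) for a in Ap for b in Bp)
        == sum(P_X[a] for a in Ap) * sum(P_Y[b] for b in Bp)
        for Ap in subsets(A) for Bp in subsets(B)
    )

A, B = {0, 1}, {0, 1}
product_law = {(a, b): Fraction(1, 4) for a in A for b in B}    # two independent fair bits
copied_law = {(0, 0): Fraction(1, 2), (1, 1): Fraction(1, 2)}   # Y = X, maximally dependent
print(independent(product_law, A, B), independent(copied_law, A, B))   # True False
```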

Nonstandard axioms for independence Axiom Preservation of independence:
(∃Z. Φ(Y, Z)) ⟹ ∀X. (X ⊥ Y ⟹ ∃Z. (X ⊥ Z ∧ Φ(Y, Z))).
Axiom Existence of independent random variables:
∃Z. X ⊥ Z ∧ Y ∼ Z.

Proposition For every random variable X ∈ RV(A) there exists an infinite sequence (X_i)_{i≥0} of mutually independent random variables with X_i ∼ X for every i.

Proof Let X_0 = X. Given X_0, ..., X_{i−1}, the axiom for the existence of independent random variables gives us X_i with X ∼ X_i such that (X_0, ..., X_{i−1}) ⊥ X_i. This defines a sequence (X_i)_{i≥0} by DC.

Infinite sequence of coin tosses Axiom There exists K ∈ RV{0, 1} with P_K{0} = 1/2 = P_K{1}. By the proposition there exists an infinite sequence (K_i)_{i≥0} of independent random variables identically distributed to K. We would like to be able to prove the strong law of large numbers:
P( lim_{n→∞} (1/n) Σ_{i=0}^{n−1} K_i = 1/2 ) = 1.

At the moment this cannot even be stated.
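Outside the synthetic framework, the intended content of the strong law can of course be illustrated by an ordinary Monte Carlo simulation; this classical snippet (with names of my own) just shows empirical averages of simulated fair coins drifting towards 1/2.

```python
import random

random.seed(0)

def running_average(n):
    """Average of n simulated fair coin tosses K_0, ..., K_{n-1}."""
    return sum(random.randint(0, 1) for _ in range(n)) / n

for n in (10, 1_000, 100_000):
    print(n, running_average(n))   # empirical averages approach 1/2 as n grows
```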

Countable products Axiom The functor RV : E → E preserves countable products. I.e., every (X_i)_i ∈ Π_{i≥0} RV(A_i) determines (X_i)_i ∈ RV(Π_{i≥0} A_i), and every Z ∈ RV(Π_{i≥0} A_i) is of this form. Axiom Equality of measures (countable products):
P_{(X_i)_i} = P_{(X′_i)_i} ⟺ ∀ n. ∀ A′_0 ⊆ A_0, ..., A′_{n−1} ⊆ A_{n−1}. P_{(X_0,...,X_{n−1})}(A′_0 × ... × A′_{n−1}) = P_{(X′_0,...,X′_{n−1})}(A′_0 × ... × A′_{n−1}).

The strong law of large numbers can now be stated.
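The finite-dimensional data that the equality-of-measures axiom quantifies over are, for independent coordinates, just products of marginal probabilities; a small classical helper (with illustrative names) makes this concrete for cylinder events of the iid coin sequence.

```python
from fractions import Fraction

coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}

def cylinder_prob(marginals, constraints):
    """Probability that an independent sequence (X_i)_i lies in a cylinder event.

    marginals[i] is the law of X_i; constraints maps finitely many indices i to
    subsets A'_i. Only finitely many coordinates are constrained, which is exactly
    the finite-dimensional rectangle data the countable-product axiom refers to.
    """
    p = Fraction(1)
    for i, Ai in constraints.items():
        p *= sum(marginals[i][a] for a in Ai)
    return p

marginals = {i: coin for i in range(100)}                      # a truncation of the iid sequence
print(cylinder_prob(marginals, {0: {1}, 3: {0}}))              # 1/4
print(cylinder_prob(marginals, {i: {1} for i in range(10)}))   # 1/1024
```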

Let (K_i)_i be an infinite sequence of independent coin tosses.

Proposition The measure P_{(K_i)_i} : P({0, 1}^ℕ) → [0, 1] is translation invariant. (I.e., for any A′ ⊆ {0, 1}^ℕ and α ∈ {0, 1}^ℕ, it holds that P_{(K_i)_i}(A′) = P_{(K_i)_i}(α ⊕ A′), where ⊕ is pointwise xor.)

Proof P_{(K_i)_i}(α ⊕ A′) = P_{(α_i ⊕ K_i)_i}(A′). But P_{(α_i ⊕ K_i)_i} = P_{(K_i)_i} by the criterion on the previous slide.
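Both steps of the proof can be checked mechanically on a finite truncation of the coin sequence, where the law of n iid fair bits is uniform and xor-ing by a fixed word α is a bijection; a classical sketch with illustrative names.

```python
from fractions import Fraction
from itertools import product

n = 3
fair = Fraction(1, 2) ** n
law = {w: fair for w in product((0, 1), repeat=n)}   # iid fair bits, truncated to length n

def xor(alpha, w):
    """Pointwise xor of two bit tuples."""
    return tuple(a ^ b for a, b in zip(alpha, w))

def prob(law, event):
    return sum((law[w] for w in event), Fraction(0))

alpha = (1, 0, 1)
event = {w for w in law if sum(w) >= 2}       # an arbitrary subset A'
shifted = {xor(alpha, w) for w in event}      # alpha xor A'
assert prob(law, shifted) == prob(law, event) # translation invariance on the truncation
```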

Convergence in probability Suppose (A, d) is a metric space. A sequence (X_i)_{i≥0} of A-valued random variables converges in probability to X ∈ RV(A) if, for any ɛ > 0, lim_{n→∞} P(d(X_n, X) > ɛ) = 0. Proposition The function (X, Y) ↦ E(min(d(X, Y), 1)) defines a metric on RV(A) whose convergence relation is convergence in probability. This metric is complete whenever (A, d) is itself a complete metric space.
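For concrete real-valued random variables the metric in the proposition can be estimated by Monte Carlo; an illustrative classical computation (the names and the particular sequence X_n are my own), showing the distances shrinking as X_n → X in probability.

```python
import random

random.seed(1)
N = 100_000   # Monte Carlo sample size

def metric(sample_pair, trials=N):
    """Estimate E(min(|X - Y|, 1)) from joint samples of (X, Y)."""
    return sum(min(abs(x - y), 1.0) for x, y in (sample_pair() for _ in range(trials))) / trials

def pair(n):
    """X uniform on [0, 1]; X_n = X + noise/n, so X_n -> X in probability as n grows."""
    def sample():
        x = random.random()
        return x + random.random() / n, x
    return sample

for n in (1, 10, 100):
    print(n, round(metric(pair(n)), 4))   # estimates shrink roughly like 1/(2n)
```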

Conditional expectation We say that Z ∈ RV(B) is functionally dependent on Y ∈ RV(A) if there exists f : A → B such that Z = f[Y]. Proposition For any X ∈ RV[0, ∞] and Y ∈ RV(A), there exists a unique random variable Z ∈ RV[0, ∞] satisfying: Z is functionally dependent on Y, and, for all A′ ⊆ A,
E(Z · 1_{A′}(Y)) = E(X · 1_{A′}(Y)).
The unique such Z defines the conditional expectation E(X | Y).
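In the finite toy model the unique Z can be computed explicitly as f[Y] with f(y) = E(X · 1_{Y=y}) / P(Y = y), and the defining equation verified directly; a classical sketch, with the joint law and all names chosen purely for illustration.

```python
from fractions import Fraction

# Joint law of (X, Y) on a finite set, with X taking values in [0, infinity).
P_XY = {
    (Fraction(1), "a"): Fraction(1, 4), (Fraction(3), "a"): Fraction(1, 4),
    (Fraction(2), "b"): Fraction(1, 2),
}

def cond_expectation(P_XY):
    """The function f with E(X | Y) = f[Y]: f(y) = E(X . 1_{Y=y}) / P(Y = y)."""
    P_Y, num = {}, {}
    for (x, y), p in P_XY.items():
        P_Y[y] = P_Y.get(y, Fraction(0)) + p
        num[y] = num.get(y, Fraction(0)) + x * p
    return {y: num[y] / P_Y[y] for y in P_Y}

f = cond_expectation(P_XY)
print(f)   # f('a') = f('b') = 2, so here E(X | Y) happens to be the constant 2

# Defining property: E(Z . 1_{A'}(Y)) = E(X . 1_{A'}(Y)), checked for A' = {'a'}.
A_prime = {"a"}
lhs = sum(f[y] * p for (x, y), p in P_XY.items() if y in A_prime)
rhs = sum(x * p for (x, y), p in P_XY.items() if y in A_prime)
assert lhs == rhs
```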

A meta-theoretic property Every sentence of the form
∀ X_1, Y_1 ∈ RV(A_1), ..., X_n, Y_n ∈ RV(A_n). Φ(X_1, ..., X_n) ∧ (X_1, ..., X_n) ∼ (Y_1, ..., Y_n) ⟹ Φ(Y_1, ..., Y_n)
is true.

Definable properties are equidistribution invariant.

To do Develop a good amount of probability theory (e.g., including continuous-time Markov processes and stochastic differential equations). Constructive and (hence) computable versions. Type-theoretic version.