The Locale of Random Sequences

1/31 The Locale of Random Sequences Alex Simpson LFCS, School of Informatics University of Edinburgh, UK

The Locale of Random Sequences 2/31 The Application Problem in probability What is the empirical meaning of event E having probability p? Standard explanation, cf. Kolmogorov (1933): Perform large number of repeated trials. Proportion of occurrences of event E is practically certain to be close to p. Practical certainty means with probability very close to 1. (Of course, can calculate exact probability that observed value lies within ɛ of p.) What is the empirical meaning of practical certainty??? A vicious circle!

The Locale of Random Sequences 3/31 Random sequences Explaining the empirical meaning of probability amounts to explaining the characteristic properties of sequences of stochastic events. So need a theory of random sequences. Intuitively, a sequence α 2 ω (where 2 := {0, 1}) is random if it could potentially arise by tossing a fair coin (we use 0 for heads and 1 for tails) ad infinitum. We seek a mathematical theory of random sequences R 2 ω. Task is also interesting for more general sequences of stochastic events. E.g., tossing a sequence of biased coins.

The Locale of Random Sequences 4/31 Questions such a theory should answer Is the sequence 0 ω random? Is the bit-sequence of Chaitin s halting probability Ω random? Under what conditions does a transducer on infinite sequences preserve randomness? Suppose we toss a sequence of biased coins, where the i-th coin has probability 1 2 + δ i of coming up 0 and 1 2 δ i of coming up 1. Under what conditions on (δ i ) are we guaranteed to obtain a fair random sequence? (van Lambalgen s brainteaser, 1987) Such questions seem meaningless in probability theory à la Kolmogorov

The Locale of Random Sequences 5/31 von Mises theory of probability Richard von Mises (1919) developed a mathematical theory of probability, based directly on random sequences. He postulated that random sequences should be characterised by principles encapsulating two contrasting aspects of a random α 2 ω : Local irregularity: e.g., cannot predict α n from α 0... α n 1 Global regularity: frequencies converge lim n 1 n n 1 i=0 α i = 1 2 (this embodies the law of large numbers) In von Mises theory probability is defined as the limit of observed frequencies.

The Locale of Random Sequences 6/31 Difficulties von Mises attempt at formalizing random sequences (whence probability) met with mathematical difficulties. Local irregularity is formalized as invariance under admissible place selections. Formulating this precisely is not straightforward. The approach is limited in power; e.g., cannot derive the full law of the iterated logarithm from von Mises principles. The theory is a mathematical outcast. Care is needed to avoid inconsistency. It does not sit easily within set theory. There are also philosophical issues. For example, the law of large numbers is an idealistic rather than empirical assumption, since we can never observe that it holds.

The Locale of Random Sequences 7/31 Observable properties We seek a theory of random sequences based only on our empirical experience of sequences of stochastic events. At any time, we only see a finite prefix of a random sequence (and this might be any finite sequence of digits). A subset U 2 ω is open (in the Cantor/product topology) if it satisfies: α U = n. β 2 ω. (β n = α n = β U) Open subsets correspond to observable properties, cf. Smyth, Abramsky. For example, {α α 0 ω } is observable. But {0 ω } is not observable.

The Locale of Random Sequences 8/31 Empirical properties of randomness An empirical fact: If α is a random sequence, then we will sooner or later observe that α 0 ω, i.e., that α {β 2 ω β 0 ω }. Note that, for any n, the probability that α n = 0 n is 2 n. Thus the probability that the prefix α n fails to show that α 0 ω is 2 n. We soon have to have been very unlucky not to have observed that α 0 ω. The empirical fact is explained by a general principle that the observer is not infinitely unlucky. We turn this principle into a (first) postulate of randomness.

The Locale of Random Sequences 9/31 Postulates of randomness We postulate properties of the desired set R 2 ω of random sequences. Let λ be the uniform/lebesgue measure on 2 ω. (R1) First postulate of randomness For every open U 2 ω, if λ(u) = 1 then R U. Informally: if α is a random sequence, and U is an almost sure observable property then we will (eventually) observe that α U. The first postulate addresses the Application Problem. The theory predicts that any almost sure observation will actually occur. Thus almost sure becomes sure for open sets.

The Locale of Random Sequences 10/31 The second postulate is a nontriviality condition. Its purpose is to ensure that there are enough random sequences. (R2) Second postulate of randomness For every open U 2 ω, if R U then λ(u) = 1. Informally: if U is an observable property with λ(u) < 1 then there has to be a random sequence outside U since there is some finite probability of randomly generating such a sequence. (There is some similarity with Myhill s set-theoretic axioms for randomness, cf. van Lambalgen (1992). Myhill s axioms, however, use measurable sets rather than open sets and invoke set-theoretic definability. I believe it is easier to justify the above postulates as embodying empirical properties of randomness.)

The Locale of Random Sequences 11/31 Inconsistency The two postulates of randomness are inconsistent. For any sequence β, the open set {α α β} has measure 1. Therefore, by the first postulate of randomness: R {α α β}, i.e., β R. Thus R =, but this contradicts the second postulate.

The Locale of Random Sequences 12/31 Conventional wisdom Ideally, a random sequence would be one satisfying all measure 1 properties (irrespective of whether open or not). There are no such sequences. It is therefore necessary to restrict to a a countable family F of (not necessarily open) measure 1 subsets of 2 ω. Define: R F := F, Because F is countable, we have λ(r F ) = 1, hence nontriviality. N.B., our first postulate of randomness is weakened, but the second holds as stated.

The Locale of Random Sequences 13/31 Algorithmic and logical notions of randomness Typically the family F is given using either recursion theoretic or logical notions of definability. Numerous such definitions have been proposed and argued for; for example, by Martin-Löf (1966), Schnorr (1971), Kurtz (1981),... Algorithmic randomness is an important area of recursion theory. Does it, however, provide the right framework for modelling the stochastic phenomenon of randomness? Two criticisms: No one canonical notion of randomness While one can make a (dubious!) case that recursion-theoretic restrictions reflect our observational limitations, there is no reason to believe any recursion-theoretic dependencies to be inherent in the stochastic phenomenon of randomness itself

The Locale of Random Sequences 14/31 Aim of talk The goal of the talk is to present a mathematically natural approach to modelling the stochastic phenomenon of randomness. A notable feature is that our two postulates of randomness will hold as originally formulated. In particular, there is no need to select a countable family F of measure 1 sets. Conventional wisdom is wrong

The Locale of Random Sequences 15/31 Some clues The two postulates of randomness are couched in terms of observable properties (open sets). Thus it is natural to define randomness in a setting in which observable properties play a fundamental role. Further, we never experience a completed infinite random sequence in its entirety only its finite prefixes. Thus it is natural to define randomness in a setting in which completed entities (points) are not the basic ingredient. All this suggests using locale theory (a.k.a. point-free topology)

The Locale of Random Sequences 16/31 Locales A locale X is given by a frame O(X), that is, O(X) is a partially-ordered set with: arbitrary joins (including the empty join ), (hence) finite meets (including the empty meet X), satisfying the distributive law: U ( i I V i ) = i I U V i. Motivating example: O(X) is the lattice of opens of a topological space, joins are unions and finite meets are intersections.

The Locale of Random Sequences 17/31 Locales as generalised spaces Locale theory abstracts away from topological spaces. The frame O(X) does not need to arise as a family of sets under union and intersection. Indeed there need not be any underlying set of points. The generality can be motivated by considering frames as theories in a natural logic of observable properties, see Vickers (1989). Many notions from topology transfer to locales, X, Y. A (continuous) map f : X Y is given by a function f 1 : O(Y ) O(X) that preserves arbitrary joins and finite meets. A map f : X Y is said to be an embedding if the function f 1 : O(Y ) O(X) is surjective. The embeddings determine the notion of sublocale.

The Locale of Random Sequences 18/31 The two postulates interpreted for locales Recall the two postulates. (R1) For every open U 2 ω, if λ(u) = 1 then R U. (R2) For every open U 2 ω, if R U then λ(u) = 1. 2 ω is a topological space, hence a locale. The open subsets U 2 ω are the elements of O(2 ω ), they determine the open sublocales of 2 ω. By interpreting as the sublocale relationship, the postulates make sense verbatim for locales, but the inconsistency vanishes. Theorem and (R2). There exists a unique sublocale R 2 ω satisfying (R1)

The Locale of Random Sequences 19/31 Concrete description of R For U, V 2 ω open, define: U V : λ(u) = λ(u V ) = λ(v ) If U V then intuitively they represent the same observation on random sequences. Define O(R) := O(2 ω )/. (Cf. the measure (σ-)algebra of 2 ω, equivalently of [0, 1].) Proposition The above defines R as a sublocale of 2 ω. This is the sublocale characterised by the previous theorem.

The Locale of Random Sequences 20/31 Outer measure We postulated that random sequences satisfy all measure 1 observable properties. What about other measure 1 properties? The outer measure of a sublocale Y 2 ω is defined by: λ (Y ) := inf{λ(u) U O(2 ω ) and Y U} (R2) asserts that R has outer measure 1. Proposition R is characterised as the smallest sublocale of 2 ω of outer measure 1. In particular, random sequences satisfy all measure 1 properties. In contrast, in algorithmic randomness, the assumption that a sequence satisfies all measure 1 effective open properties (i.e., that it is Kurtz random) does not even imply the law of large numbers.

The Locale of Random Sequences 21/31 What are the random sequences in R? The locale R has no points, that is there are no maps from the one point (i.e., terminal) locale to R. Thus, according to this theory, there is no such thing as a completed infinite random sequence. In particular, 0 ω is not random. Also, the bit-sequence of Chaitin s Ω is not random. Nonethless R is a nontrivial space of random sequences. (More generally, is the only compact sublocale of R.)

The Locale of Random Sequences 22/31 What are the maps from R to R? Intuitively, these correspond to continuous transducers on random sequences. Theorem The maps R R are in one-to-one correspondence with the continuous nonsingular maps from 2 ω to 2 ω modulo almost everywhere equivalence. A continuous nonsingular map from 2 ω to 2 ω is a continuous function from a measure 1 subspace D 2 ω to 2 ω, satisfying: for every null Z 2 ω, it holds that f 1 (Z) is null.

The Locale of Random Sequences 23/31 More generally If we generalise λ to other probability measures (technically, probability valuations), and 2 ω to other locales, we still obtain canonical random sublocales. A (continuous) probability valuation on a locale X is a (necessarily monotone) function µ: O(X) [0, 1] satisfying: µ( ) = 0 µ(u V ) = µ(u) + µ(v ) µ(u V ) µ(x) = 1 µ( U) = sup µ(u) (D O(X) directed) Given (X, µ) define: U D U D U µ V : µ(u V ) = µ(u V ) O(R(µ)) := O(X)/ µ

The Locale of Random Sequences 24/31 Proposition R(µ) is a sublocale of X. Define the outer value of a sublocale Y X by: µ (Y ) := inf{µ(u) U O(X) and Y U} Theorem If the locale X is regular then: 1. R(µ) is the meet of all open sublocales U X with µ(u) = 1 2. R(µ) is the meet of all sublocales Y X with µ (Y ) = 1 3. µ (R(µ)) = 1 Obviously, µ is also a probability valuation on R(µ). Then (R(µ), µ) is random in the sense that: [U] [V ] O(R(µ)) (proper inclusion) µ([u]) < µ([v ])

The Locale of Random Sequences 25/31 When does one measure subsume another? For 2 ω, valuations are in one-one correspondence with Borel measures. Proposition Given two probability measures µ, ν on 2 ω, the following are equivalent: R(µ) R(ν) (as sublocales of 2 ω ) µ ν (i.e., µ is absolutely continuous relative to ν) Recall, µ is absolutely continuous relative to ν if every ν-null Borel set is also µ-null.

The Locale of Random Sequences 26/31 van Lambalgen s brainteaser revisited Suppose we toss a sequence of biased coins, where the i-th coin has probability 1 2 + δ i of coming up 0 and 1 2 δ i of coming up 1. Under what conditions on (δ i ) are we guaranteed to obtain a fair random sequence? We write π for the product measure i ( 1 2 + δ i, 1 2 δ i) on 2 ω. The question asks: When does it hold that R(π) R? Solution: R(π) R π λ δ 2 i < i=0 (by previous proposition) (by Kakutani s Theorem)

The Locale of Random Sequences 27/31 Another characterisation of R The locale of random sequences, R, is: countably based zero dimensional and has no points Moreover, the valuation λ on R is random. Canonicity Theorem If X is a countably-based zero-dimensional locale with no points and X carries a random probability valuation µ then X and R are homeomorphic via valuation preserving maps. (Cf. the measure algebra isomorphism theorem of measure theory.)

The Locale of Random Sequences 28/31 Constructive version? Should be possible to develop the theory within Bishop Constructive Mathematics, and hence in any brand of constructivism. In contrast, the usual theories of algorithmic randomness are not compatible with (formal) Church s Thesis. In a constructive version, an existence proof for a map R R should give rise to an effective transducer of random sequences. Given a non-atomic effective probability valuation µ on 2 ω, a constructive proof of the canonicity theorem should give mutually inverse transducers R(µ) R and R R(µ); the former amounting to an optimal data compression, and the latter its decoding. Thus a constructive canonicity theorem should relate to Shannon s noiseless coding theorem from information theory.

The Locale of Random Sequences 29/31 Random space? If space S is continuous, how does this account for the existence of discrete observables S {0, 1}? If space is discrete, how does this account for its geometry, e.g., straight lines, geodesics, etc.? Perhaps space is random... A motivating idea for this is that any attempt to zoom in indefinitely on the location of a particle must eventually, at some level of granularity, settle down as a sequence of stochastic events.

The Locale of Random Sequences 30/31 Random space! Let R(E n ) be the smallest thick sublocale of E n (Euclidean n-space). Equivalently R(E n ) = n {}}{ R(E)#... #R(E) where # is the independent product of random locales. The space R(E n ) is zero dimensional, so have many nontrivial discrete observables R(E n ) {0, 1}. R(E n ) also possesses a nontrivial geometry. For example, it has random straight lines through independent points: (x, y, λ) λx + (1 λ)y : R(E n )#R(E n )#R(E) R(E n )

The Locale of Random Sequences 31/31 Other directions Ideally, the development presented here should (once constructivised) be part of a general constructive account of probability and measure theory over locales. Aspects of such a theory have been initiated by, amongst others, Coquand, Palmgren, Spitters and Vickers. Much remains to be done. In recent talks, Dana Scott has identified the frame O(R) as a subframe of the measure algebra, and used this to give a sheaf model of a (classical) modal set theory (MZF) in which every proposition has a probability. This opens the possibility of performing modal set-theoretic reasoning about randomness.