Paradigms of Probabilistic Modelling

Hermann G. Matthies
Brunswick, Germany
wire@tu-bs.de
http://www.wire.tu-bs.de
RV-measure.tex, revision 4.5, 2017/07/06

Overview

1. Motivation: challenges and historical review
2. Abstract algebra of random variables, states, and their laws
3. Relation to probability measures (and spectral theory)
4. Fuzzy random variables
5. Conditional expectation

Motivation

Traditionally, probability theory takes a measure space, a σ-algebra of sets, and a probability measure as its prime objects.
Another possibility is to take random variables (RVs) and expectation as the prime objects.
This gives a functional-analytic view and a generalisation of the subject. It leads to tensor products. Duality allows ideal objects.
On infinite-dimensional (function) spaces one may use generalised processes (weak distributions / linear processes): probability without measures.
Bayesian inversion / conditional expectation becomes a projection.
The approach is naturally suited to approximation by Galerkin methods and numerical (multi-)linear algebra (from the tensor structure).
Probability algebras imply the spectral theory of linear operators.

Random Variables as Algebra

Random variables (RVs) as used by the Bernoullis
a) can be added to each other,
b) can be multiplied by numbers,
c) can be multiplied with each other;
d) constants are RVs;
e) the expectation $E(\cdot)$ is positive and linear.

Jakob Bernoulli (1655-1705); Nikolaus (II) Bernoulli (1695-1726), nephew of Jakob; Daniel Bernoulli (1700-1782), brother of Nikolaus II.

In algebraic terms:
a) and b): RVs form a vector space;
c): RVs form an associative algebra;
d): existence of a unit;
e): expectation is a positive linear functional.

Definition: Associative Algebra over a Field (K-algebra)
Let $\mathcal{A}$ be a $K$-vector space ($K$ here typically $\mathbb{R}$ or $\mathbb{C}$) with a bilinear associative multiplication
$\mathcal{A} \times \mathcal{A} \ni (a, b) \mapsto a b \in \mathcal{A}$, i.e. $(a b) c = a (b c)$,
which is distributive: $(a + b) c = a c + b c$ and $c (a + b) = c a + c b$.
$\mathcal{A}$ is a unital algebra iff $\exists!\, e \in \mathcal{A}$ (identity element): $a e = e a = a$ for all $a \in \mathcal{A}$.

What is a Random Variable?

Classical RV (Kolmogorov): A random variable is a measurable function $q$ on a probability space $(\Omega, \mathfrak{A}, P)$ into another measurable space (typically $\mathbb{R}$ or $\mathbb{C}$). With the probability measure $P$ one may define the expectation
$E(\phi(q)) := \int_\Omega \phi(q(\omega))\, P(d\omega)$.

Andrej N. Kolmogorov (1903-1987)

Algebraic RV (algebraic definition): An RV $q \in \mathcal{A}$ is an element of an associative (usually complex) unital algebra $\mathcal{A}$ (with unit element $e$), together with a distinguished positive linear functional $\varphi$, called the state, such that $\varphi(e) = 1$.
The state plays the role of the expectation: $E_\varphi(q) := \varphi(q)$.
Events are defined by projections $p$: $P(p) := \varphi(p) = E_\varphi(p)$.

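Example Probability Algebras (PAs)

0) Every field (e.g. $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$) is a commutative / Abelian algebra.
1) $n \times n$ matrices with matrix multiplication: $M(n, \mathbb{C})$ with $\varphi(a) = \operatorname{tr}(a)/n$. Commutative sub-algebra: diagonal matrices.
2) A probability space $(\Omega, \mathfrak{A}, P)$ with simple (step) functions $\chi_E = 1_E$ for events $E \in \mathfrak{A}$ gives a unital commutative algebra:
$L_{0s}(\Omega, \mathfrak{A}) = \operatorname{span}_{\mathbb{C}} \{\chi_E : E \in \mathfrak{A}\}$ with pointwise multiplication.
The unit is $e = \chi_\Omega$, and with $\xi_k \in \mathbb{C}$, $r(\omega) = \sum_{k=1}^{K} \xi_k \chi_{E_k}(\omega)$ is an RV, with
$\varphi(r) := \sum_{k=1}^{K} \xi_k P(E_k) =: E(r) = \int_\Omega r(\omega)\, P(d\omega)$, and $E(e) = E(\chi_\Omega) = 1$.

Can one recover $P$ from $L_{0s}(\Omega, \mathfrak{A})$ and $E(\cdot)$? Yes: $\forall E \in \mathfrak{A}: P(E) = E(\chi_E)$.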
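Example 2) can be sketched on a small finite sample space. The following is only an illustration under stated assumptions: the four outcomes, their weights, the events, and all function names are hypothetical, and coefficients are restricted to rationals so that the arithmetic is exact.

```python
from fractions import Fraction

# Hypothetical finite sample space with a probability weight for each outcome.
OMEGA = ("w1", "w2", "w3", "w4")
P_WEIGHTS = {"w1": Fraction(1, 2), "w2": Fraction(1, 4),
             "w3": Fraction(1, 8), "w4": Fraction(1, 8)}


def chi(event):
    """Characteristic function chi_E, stored as a dict omega -> 0 or 1."""
    return {w: Fraction(int(w in event)) for w in OMEGA}


def add(r, s):
    """Pointwise sum of two step functions."""
    return {w: r[w] + s[w] for w in OMEGA}


def scale(xi, r):
    """Multiplication of a step function by a number xi."""
    return {w: xi * r[w] for w in OMEGA}


def mul(r, s):
    """Pointwise multiplication, the product of the algebra."""
    return {w: r[w] * s[w] for w in OMEGA}


def state(r):
    """State phi(r) = E(r): sum of r(omega) * P({omega}) over all outcomes."""
    return sum(r[w] * P_WEIGHTS[w] for w in OMEGA)


e = chi(set(OMEGA))                              # unit element chi_Omega
E1, E2 = {"w1", "w2"}, {"w2", "w3"}
r = add(scale(3, chi(E1)), scale(-2, chi(E2)))   # r = 3 chi_E1 - 2 chi_E2

print(state(e))                            # phi(e) = 1
print(state(chi(E1)))                      # recovers P(E1) = 3/4
print(state(r))                            # 3 P(E1) - 2 P(E2) = 3/2
print(mul(chi(E1), chi(E1)) == chi(E1))    # chi_E is a projection: True
```

The probability of an event is never looked up directly; it is obtained by applying the state to the corresponding characteristic function.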

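More on Algebras

Let $\mathcal{A}$ be an algebra, and $a, b \in \mathcal{A}$.
Powers $a^n$, $n \ge 0$, and the inverse, $a^{-1} a = e$, are clear. This defines polynomials $p(a)$.

The $\mathbb{C}$-algebra $\mathcal{A}$ is a *-algebra iff it carries an anti-linear involution $a \mapsto a^*$ with $(a^*)^* = a$:
$(\alpha a + \beta b)^* = \bar{\alpha} a^* + \bar{\beta} b^*$; $(a b)^* = b^* a^*$; $(a^*)^{-1} = (a^{-1})^* =: a^{-*}$.
In $L_{0s}$ the involution $r^* := \sum_k \bar{\alpha}_k \chi_{E_k}$ is complex conjugation; in $M(n, \mathbb{C})$ the involution is $A^* := A^H = \bar{A}^T$.

$\mathcal{A}$ is commutative or Abelian iff the commutator $[\cdot, \cdot]$ vanishes: $[a, b] = a b - b a = 0$ for all $a, b \in \mathcal{A}$.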

Examples of Abelian / commutative Algebras

Abbreviation: UA = unital Abelian.

UA-algebra: $C(X, \mathbb{C})$, $X$ compact; uniform multiplication algebra.
UA-algebra: bounded random variables (RVs) $L_\infty(\Omega, \mathbb{C})$; bounded multiplication algebra; sub-algebra $L_{0s}(\Omega, \mathbb{C})$ (step functions).
Abelian algebra: continuous functions $C_0(\mathbb{R}, \mathbb{C})$ which tend to $0$ as $x \to \pm\infty$.
UA-algebra: $\mathbb{C}^n$ with Hadamard product $a \odot b = [\dots, a_k b_k, \dots]^T$ and $\|a\|_\infty = \max_k |a_k|$ (same as diagonal matrices in $M(n, \mathbb{C})$).
UA-algebra: polynomials $\mathbb{C}[X_1, \dots, X_m]$ in commuting variables $X_k$.
UA-algebra: Wiener's polynomial chaos (PC); generalised PC (gPC).

John (Johann) von Neumann, János Lajos Neumann (1903-1957)

Examples of non-Abelian Algebras

unital algebra: the complex $n \times n$ matrices $M(n, \mathbb{C}) = \mathbb{C}^{n \times n}$ with matrix multiplication and involution $M(n, \mathbb{C}) \ni A \mapsto A^* := \bar{A}^T$.
unital algebra: $n \times n$ matrices with RVs from $L_\infty(X, \mathbb{C})$ as entries: $M(n, L_\infty) = L_\infty^{n \times n}(X, \mathbb{C})$.
unital algebra: bounded linear operators $\mathcal{L}(H)$ ($H$ Hilbert), composition $A B$ as multiplication, adjoint as involution, operator norm.
algebra: compact linear operators $\mathcal{L}_0(H)$, all else as before.
unital algebra: full tensor algebra $T(H)$ with tensor product $a \otimes b$.
unital algebra: exterior algebra $\Lambda(H)$ with exterior product $a \wedge b$.

Jacques Dixmier (*1924)

Special Classes of Elements and Relations

Elements $a \in \mathcal{A}$ such that $[a, a^*] = 0$ are called normal (a subspace).
Normal elements $u \in \mathcal{A}$ with $u^* = u^{-1}$ are called unitary (a subgroup).
Normal elements $a \in \mathcal{A}$ with $a = a^*$ are called self-adjoint / Hermitian; these are the observables, which in some sense have real values.
In $L_{0s}$ all real $r = \sum_k \alpha_k \chi_{E_k}$ ($\alpha_k \in \mathbb{R}$, $E_k \in \mathfrak{A}$) are self-adjoint.
Every $a \in \mathcal{A}$ can be written in terms of two self-adjoint elements: $a = a_r + i\, a_i$, where $a_r = \Re a = (a + a^*)/2$ and $a_i = \Im a = (a - a^*)/(2i)$; in $L_{0s}$ these are the real and imaginary part.
Self-adjoint elements $a \in \mathcal{A}$ such that $a = b^* b$ are called positive; they have a square root, $a = c\, c$ with $c = a^{1/2}$.
$0$ and $e$ are positive, and the positive elements form a convex cone $\mathcal{P}$.
Positive elements $p \in \mathcal{A}$ such that $p = p\, p$ are called projections; in $L_{0s}$ all characteristic functions $\chi_E$ ($E \in \mathfrak{A}$) are projections.
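The decomposition $a = a_r + i\, a_i$ and the projection property can be checked in $M(2, \mathbb{C})$. This is a minimal sketch with hypothetical example matrices and helper names; matrices are plain nested lists of Python complex numbers, and the entries are chosen so that the floating-point arithmetic is exact.

```python
def matmul(a, b):
    """Matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def adjoint(a):
    """Involution a -> a*, the conjugate transpose."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


def lincomb(alpha, a, beta, b):
    """Entrywise linear combination alpha * a + beta * b."""
    n = len(a)
    return [[alpha * a[i][j] + beta * b[i][j] for j in range(n)]
            for i in range(n)]


# A hypothetical element that is not self-adjoint.
a = [[1 + 2j, 3 + 0j], [0 + 1j, 4 - 1j]]
a_star = adjoint(a)

a_r = lincomb(0.5, a, 0.5, a_star)            # (a + a*) / 2
a_i = lincomb(1 / 2j, a, -1 / 2j, a_star)     # (a - a*) / (2i)

print(adjoint(a_r) == a_r)                    # a_r is self-adjoint
print(adjoint(a_i) == a_i)                    # a_i is self-adjoint
print(lincomb(1, a_r, 1j, a_i) == a)          # a = a_r + i a_i

# A hypothetical projection: self-adjoint and idempotent.
p = [[0.5 + 0j, 0.5 + 0j], [0.5 + 0j, 0.5 + 0j]]
print(adjoint(p) == p and matmul(p, p) == p)
```

All four checks print `True` for these matrices.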

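Functionals and States

Each linear functional $\beta \in \mathcal{A}^*$ (the dual space) defines a sesquilinear form $b$ on $\mathcal{A} \times \mathcal{A}$, and in turn a linear map $B : \mathcal{A} \to \mathcal{A}^*$:
$\beta(c^* a) =: b(a, c) =: \langle B a, c \rangle_{\mathcal{A}^*, \mathcal{A}}$; $a, c \in \mathcal{A}$.
In case $B$ is Hermitian (self-adjoint), skew-Hermitian, or positive, the same attribute is attached to $b$ and $\beta \in \mathcal{A}^*$.

A Hermitian, strictly positive definite $\varphi \in \mathcal{A}^*$ (called a faithful state) defines an inner product $\langle \cdot | \cdot \rangle_H$ on $\mathcal{A}$:
$\langle a | c \rangle_H := \varphi(c^* a) = \langle \Phi a, c \rangle_{\mathcal{A}^*, \mathcal{A}}$, with $\Phi \in L(\mathcal{A}, \mathcal{A}^*)$,
and the Hilbert space completion $H_\varphi = H := \operatorname{cl}_\varphi(\mathcal{A})$.

Representation: a *-algebra homomorphism $\mathcal{A} \to \mathcal{L}(K)$.
Regular (left) representation: $\mathcal{A} \ni a \mapsto L_a \in L(\mathcal{A})$, $L_a b := a b$.
$E_\varphi(a) = \varphi(a) = \varphi(e^* a e) = \langle L_a e | e \rangle_H = \langle a | e \rangle_H$ (vector state).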

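Probability Algebras (PAs) of Random Variables

Definition: Probability Algebra (PA): a unital *-algebra $\mathcal{A}$ with a faithful state $\varphi$.

Examples of PAs and states:
a) non-Abelian: $M(n, \mathbb{C})$ with $\varphi(a) := \operatorname{tr}(a)/n$; or more generally, for $R$ positive with $\operatorname{tr}(R) = 1$, $\varphi(a) := \operatorname{tr}(R a)$;
b) Abelian: diagonal matrices from a);
c) Abelian: $L_{0s}(\Omega)$ and $L_\infty(\Omega)$ with the integral $\varphi(f) := \int_\Omega f(\omega)\, P(d\omega)$;
d) Abelian: Wiener's PC with expected value $\varphi(\psi) := E(\psi)$ ($= a_0$);
e) non-Abelian: $\mathcal{L}(K)$ with trace: $\varphi(A) := \operatorname{tr}(\varrho A) = \sum_j \rho_j \langle A v_j | v_j \rangle_K$; the density matrix $\varrho = \sum_j \rho_j\, v_j \otimes v_j \in \mathcal{L}(K)$ is a nuclear operator with $\rho_j \ge 0$, $\|v_j\| = 1$, $\sum_j \rho_j = 1$.
Special case, vector state (pure): $\varphi_v(A) := \operatorname{tr}((v \otimes v) A) = \langle A v | v \rangle_K$ with $\|v\| = 1$.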
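The trace formula in example e) can be checked numerically in two dimensions. The sketch below assumes real vectors and matrices (so no complex conjugation is needed) and uses hypothetical unit vectors `v`, `w`, weights `3/4` and `1/4`, and a hypothetical self-adjoint matrix `A`; exact rational arithmetic avoids rounding.

```python
from fractions import Fraction as F


def matmul(a, b):
    """Matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def outer(v):
    """Rank-one operator v (x) v for a real vector v."""
    return [[vi * vj for vj in v] for vi in v]


def apply(a, v):
    """Matrix-vector product A v."""
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]


def dot(u, v):
    """Real inner product <u | v>."""
    return sum(x * y for x, y in zip(u, v))


v = [F(3, 5), F(4, 5)]            # unit vector
w = [F(-4, 5), F(3, 5)]           # unit vector orthogonal to v
A = [[F(1), F(2)], [F(2), F(-1)]]

# Pure (vector) state: tr((v (x) v) A) equals <A v | v>.
pure_trace = trace(matmul(outer(v), A))
pure_inner = dot(apply(A, v), v)

# Mixed state with weights rho_1 = 3/4, rho_2 = 1/4.
weights = [F(3, 4), F(1, 4)]
rho = [[weights[0] * x + weights[1] * y for x, y in zip(rv, rw)]
       for rv, rw in zip(outer(v), outer(w))]
mixed_trace = trace(matmul(rho, A))
mixed_sum = sum(r * dot(apply(A, u), u) for r, u in zip(weights, (v, w)))

print(trace(rho))                  # 1, so phi(e) = 1
print(pure_trace, pure_inner)      # both 41/25
print(mixed_trace, mixed_sum)      # both 41/50
```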

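GNS-construction

Izrail M. Gel'fand (1913-2009), Mark A. Najmark (1909-1978), Irving E. Segal (1918-1998)

Typically the operators $L_a \in L(\mathcal{A})$ of the regular representation are not all bounded, i.e. not all in $\mathcal{L}(H)$; but if all $a \in \mathcal{A}$ are bounded, then $\mathcal{A}$ is regularly represented as a closed *-sub-algebra of $\mathcal{L}(H)$.
An Abelian *-algebra with all $a \ne 0$ invertible is isomorphic to $\mathbb{C}$.
A maximal Abelian bounded PA is isomorphic to a bounded multiplication algebra $L_\infty(\Omega, P)$.
In special cases PAs are thus isomorphic to normal RVs, but otherwise they are much more general.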

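Spectrum and Spectral Values

For one element $a \in \mathcal{A}$, the spectrum (spectral values) of $a$ is
$\sigma(a) = \{\lambda \in \mathbb{C} : a - \lambda e \text{ not invertible in } \mathcal{A}\}$, compact in $\mathbb{C}$.
For $a$ self-adjoint $\sigma(a) \subseteq \mathbb{R}$, for $a$ positive $\sigma(a) \subseteq \mathbb{R}_+$, for $u$ unitary $\sigma(u) \subseteq \mathbb{T} = \{z : |z| = 1\}$, for $p$ a projection $\sigma(p) \subseteq \{0, 1\}$.

For a linear operator $A \in \mathcal{L}(H)$ the spectrum is defined similarly:
$\sigma(A) = \{\lambda \in \mathbb{C} : A - \lambda I \text{ not invertible in } \mathcal{L}(H)\}$, compact in $\mathbb{C}$.
Statements about linear operators can be translated to abstract algebras.

Spectral theory can actually be built on the theory of algebras (PAs), by the basic fact that the bounded multiplication algebra $L_\infty$ with, for $k \in L_\infty$, $M_k : L_2 \ni u \mapsto k\, u \in L_2$, is fully diagonalised under the *-algebra isomorphism $L_\infty \ni k \mapsto M_k \in \mathcal{L}(L_2)$.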

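Laws of algebraic Random Variables

Let $a \in \mathcal{A}$ (a probability algebra). The $k$-th moment is $m_k(a) = \varphi(a^k)$, and the law of $a$ is
$\tau_a : \mathbb{C}[X] \to \mathbb{C}$, $\tau_a(P) := \varphi(P(a))$ for all polynomials $P \in \mathbb{C}[X]$.
Note: $\tau_a = \tau_b$ does not imply $a = b$.

For a PA $\mathcal{A}$, there is for each self-adjoint $a \in \mathcal{A}$ a unique distribution (probability) measure $P_a$ (also called the law of $a$) on $\mathbb{R}$ with
$\forall P \in \mathbb{C}[X]: \int_{\mathbb{R}} P(t)\, P_a(dt) = \tau_a(P) = \varphi(P(a)) =: E_\varphi(P(a))$.
(Extension from polynomials to continuous functions yields a measure.)
Algebraic probability and integration recovers the measure-theoretic path.

Let $\bar{\mathcal{A}} = \operatorname{span}\{e\}^\perp$; two RVs $a_1, a_2 \in \bar{\mathcal{A}}$ are uncorrelated iff $\langle a_1 | a_2 \rangle_H = 0$.
Independence: *-sub-algebras $\mathcal{A}_1, \mathcal{A}_2 \subseteq \bar{\mathcal{A}}$ are independent iff $\forall a_1 \in \mathcal{A}_1, a_2 \in \mathcal{A}_2$: $[a_1, a_2] = 0$ and $\varphi(a_1 a_2) = 0$; i.e. $\mathcal{A}_1 \perp \mathcal{A}_2$.
For independence of an RV $a$, consider the *-sub-algebra $\mathbb{C}[a] \subseteq \bar{\mathcal{A}}$.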
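The moments and the law can be made concrete in $M(2, \mathbb{C})$ with the state $\varphi(a) = \operatorname{tr}(a)/2$. The sketch below uses two hypothetical self-adjoint matrices, `A` and `B`, which both have eigenvalues 1 and 3 (for `A`: trace 4 and determinant 3). Their common law is then the measure with weight 1/2 at each eigenvalue, and the code checks the moment identity for the first few powers.

```python
from fractions import Fraction


def matmul(a, b):
    """Matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def state(a):
    """Normalised trace phi(a) = tr(a) / n."""
    return Fraction(sum(a[i][i] for i in range(len(a))), len(a))


def moment(a, k):
    """k-th moment m_k(a) = phi(a^k)."""
    n = len(a)
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # a^0 = e
    for _ in range(k):
        power = matmul(power, a)
    return state(power)


A = [[2, 1], [1, 2]]          # self-adjoint, eigenvalues 1 and 3
B = [[1, 0], [0, 3]]          # a different element with the same eigenvalues

# Law P_a as a discrete measure: spectral value -> probability weight.
LAW = {1: Fraction(1, 2), 3: Fraction(1, 2)}


def law_moment(k):
    """Integral of t^k against the discrete law."""
    return sum(weight * Fraction(t) ** k for t, weight in LAW.items())


for k in range(5):
    print(k, moment(A, k), moment(B, k), law_moment(k))
```

In each row the three values agree, although `A` and `B` are different elements: equal laws do not force equal random variables.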

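Example: Gaussians

Observe that for a probability space $(\Omega, \mathfrak{A}, P)$ the vector space
$L_{\infty-}(\Omega, \mathfrak{A}, P) := \bigcap_{p \ge 1} L_p(\Omega, \mathfrak{A}, P)$
is a *-probability algebra with, for $\xi \in L_{\infty-}$, $E(\xi) = \int_\Omega \xi(\omega)\, P(d\omega)$.

Consider a family of Gaussian RVs $\{\theta_k\}_k$. They form a vector space $G = \operatorname{span}\{\theta_k\}_k$, which becomes a Gaussian Hilbert space with $\langle \theta | \vartheta \rangle_G := E(\bar{\vartheta}\, \theta)$.
As for all $\theta \in G$ and all $p \ge 1$: $E(|\theta|^p) < \infty$, one has $\theta \in L_{\infty-}$.
This means the Gaussians $G$ generate a sub-probability-algebra $\mathcal{A}$:
$G \subset \mathcal{A} = \mathbb{C}[\dots, \theta_k, \dots] \subset L_{\infty-}(\Omega, \mathfrak{A}, P)$.

Cameron-Martin theorem: the completion is $\operatorname{cl} \mathcal{A} = L_2(\Omega, \mathfrak{A}, P)$, and $G$ is a closed Gaussian subspace: $G \subset L_2(\Omega, \mathfrak{A}, P)$.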

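Observables

Only the self-adjoint $a \in \mathcal{A}$ with $\sigma(a) \subseteq \mathbb{R}$ can be observed.
For self-adjoint $a \in \mathcal{A}$ there exists a growing projector family $p_\lambda$ with
$p_{-\infty} = 0$, $p_{+\infty} = e$, $p_\lambda \le p_\eta$ if $\lambda \le \eta$, and $a = \int_{\mathbb{R}} \lambda\, dp_\lambda$.
More precisely, $p_\lambda$ is $p_{(-\infty, \lambda]}$.
Possible realisations of $a \in \mathcal{A}$ are the spectral values $\sigma(a) \subseteq \mathbb{R}$.

All of probability theory can be captured by an Abelian unital *-integration algebra with state $\varphi = E_\varphi$. No measure theory is necessary.
This allows the extension to the non-Abelian / non-commutative case.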

Use in Other Fields

Among others:

Geometry: The topology (connectedness etc.) and geometry of sets $X$ can be studied through Abelian algebras $C(X, F)$ of continuous functions, differential forms, and their duals (co-homology and homology). This is the realm of C*-algebras. Non-Abelian: non-commutative geometry.

Probability: Normal probability can be studied through Abelian algebras $L_\infty(X, F)$ of measurable functions. This is the realm of W*-algebras. Non-Abelian: non-commutative probability, needed in quantum theory, random matrices, free probability.

Spectral Theory: The algebraic integration theory is closely connected to spectral factorisations for linear operators; they are the original example of C*- and W*-algebras.

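Probability Algebra and σ-algebra

In the Abelian bounded multiplication algebra $\mathcal{A} = L_\infty(\Omega, \mathfrak{A})$ the projections (which commute with all $a \in L_\infty$) are, for $E \in \mathfrak{A}$, the $p_E \equiv \chi_E \in L_\infty(\Omega, \mathfrak{A})$.
The projections $p \in \mathcal{A}$ generate the whole algebra (seen from the spectral theorem).

The σ-algebra of sets $\mathfrak{A}$ shadows the algebra of projections in $\mathcal{A}$: let $E_j \in \mathfrak{A}$ and $E_j \leftrightarrow \chi_{E_j}$, then
$E_1 \cap E_2 \leftrightarrow \chi_{E_1} \wedge \chi_{E_2} := \chi_{E_1} \chi_{E_2}$;
$\Omega \setminus E = E^c \leftrightarrow \neg \chi_E := 1 - \chi_E$;
$E_1 \cup E_2 \leftrightarrow \chi_{E_1} \vee \chi_{E_2} := \chi_{E_1} + \chi_{E_2} - \chi_{E_1} \chi_{E_2}$.

Sub-σ-algebras $\mathfrak{B} \subseteq \mathfrak{A}$ correspond to unital sub-*-algebras $\mathcal{B} \subseteq \mathcal{A}$.

Basic probability measure $P_\varphi$ of $\varphi$ on $\mathfrak{A}$: let $\chi_E$ be the projection onto the event $E$, corresponding to the subspace $L_\infty(E) = \{k \in L_\infty : k = \chi_E\, k\} \subseteq \mathcal{A}$; define $P_\varphi(E) := E_\varphi(\chi_E)$.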
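The correspondence between set operations and projection operations can be verified on a finite sample space. The sketch assumes a hypothetical six-point space with uniform weights; the names `meet`, `join`, and `complement` are illustrative labels for the three operations on characteristic functions.

```python
from fractions import Fraction

OMEGA = tuple(range(6))               # hypothetical finite sample space
WEIGHT = Fraction(1, len(OMEGA))      # uniform probability of each outcome


def chi(event):
    """Characteristic function of an event as a tuple of 0/1 values."""
    return tuple(int(w in event) for w in OMEGA)


def meet(p, q):
    """Intersection: pointwise product of two projections."""
    return tuple(x * y for x, y in zip(p, q))


def complement(p):
    """Set complement: 1 - chi_E."""
    return tuple(1 - x for x in p)


def join(p, q):
    """Union: chi_E1 + chi_E2 - chi_E1 * chi_E2."""
    return tuple(x + y - x * y for x, y in zip(p, q))


def prob(p):
    """P_phi(E) := E_phi(chi_E), a weighted sum over the outcomes."""
    return sum(x * WEIGHT for x in p)


E1, E2 = {0, 1, 2}, {2, 3}
print(meet(chi(E1), chi(E2)) == chi(E1 & E2))          # True
print(complement(chi(E1)) == chi(set(OMEGA) - E1))     # True
print(join(chi(E1), chi(E2)) == chi(E1 | E2))          # True
print(prob(join(chi(E1), chi(E2))))                    # 4 of 6 outcomes: 2/3
```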

Example: Fuzzy probability

For a σ-algebra $\mathfrak{A}$, we continue with the identification of events $E \in \mathfrak{A}$ via $E \leftrightarrow \chi_E$ with the projection $\chi_E \in L_\infty(\Omega)$.
Observe that $\chi_E(\omega) \in \{0, 1\}$, i.e. $\sigma(\chi_E) \subseteq \{0, 1\}$. The events $\chi_E \in L_\infty(\Omega)$ are called crisp.

Define effects (fuzzy sets) as $\eta \in L_\infty(\Omega)$ with $\eta : \Omega \to [0, 1]$, a membership function. Therefore $\sigma(\eta) \subseteq [0, 1]$.
One may extend the operations $\wedge$, $\neg$, $\vee$ to effects for fuzzy set operations.
If an effect $\eta = \chi_E$ for some $E \in \mathfrak{A}$, the fuzzy set is crisp.

For the RV $\chi_E$ one has $P_\varphi(\chi_E) = E_\varphi(\chi_E)$, so one may define the probability of an effect: $P_\varphi(\eta) := E_\varphi(\eta)$.
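A small numerical sketch of the probability of an effect follows. It assumes a hypothetical four-point sample space with uniform weights, and it extends the operations to effects by the same product formulas that hold for crisp events; this product extension is one possible choice made here for illustration, not necessarily the one intended in the text.

```python
from fractions import Fraction as F

OMEGA = (0, 1, 2, 3)                      # hypothetical sample space
P = {w: F(1, 4) for w in OMEGA}           # uniform probability weights


def expectation(eta):
    """P_phi(eta) := E_phi(eta), the expected membership value."""
    return sum(eta[w] * P[w] for w in OMEGA)


def f_and(eta, zeta):
    """Assumed extension of the meet: pointwise product."""
    return {w: eta[w] * zeta[w] for w in OMEGA}


def f_not(eta):
    """Assumed extension of the complement: 1 - eta."""
    return {w: 1 - eta[w] for w in OMEGA}


def f_or(eta, zeta):
    """Assumed extension of the join: eta + zeta - eta * zeta."""
    return {w: eta[w] + zeta[w] - eta[w] * zeta[w] for w in OMEGA}


crisp = {0: F(0), 1: F(0), 2: F(1), 3: F(1)}         # chi_E for E = {2, 3}
eta = {0: F(0), 1: F(1, 4), 2: F(3, 4), 3: F(1)}     # a genuinely fuzzy effect

print(expectation(crisp))                        # 1/2
print(expectation(eta))                          # 1/2
print(expectation(f_or(crisp, f_not(crisp))))    # 1: crisp events are Boolean
print(expectation(f_or(eta, f_not(eta))))        # 29/32: less than 1
```

The last line shows that, under this extension, an effect joined with its complement need not have probability one, in contrast to a crisp event.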

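Conditional expectation in probability algebras

Conditioning with respect to a unital sub-*-algebra $\mathcal{B} \subseteq \mathcal{A}$: let $H = \operatorname{cl}_\varphi(\mathcal{A})$ and $K = \operatorname{cl}_\varphi(\mathcal{B})$.

The conditional expectation (CE) $E(a | \mathcal{B})$ of $a \in \mathcal{A}$ has to satisfy, $\forall a \in \mathcal{A}, b \in \mathcal{B}$:
$E(a | \mathcal{B}) \in K$;
$E(E(a | \mathcal{B}) | \mathcal{B}) = E(a | \mathcal{B})$, and $E(b | \mathcal{B}) = b$;
$E_\varphi(b\, a) = E_\varphi(b\, E(a | \mathcal{B}))$.

It follows that $E(\cdot | \mathcal{B}) = P_K$, the orthogonal projection $P_K : H \to K$.

Galerkin orthogonality: $\forall b \in \mathcal{B}$: $E_\varphi(b^* (a - E(a | \mathcal{B}))) = \langle a - E(a | \mathcal{B}) | b \rangle_H = 0$,
and minimisation property: $E(a | \mathcal{B}) = P_K a = \arg\min \{ \|a - b\|_H^2 : b \in K \}$.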
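On a finite sample space these properties can be checked directly. The sketch below assumes real-valued RVs, a hypothetical four-point space with the weights shown, and a sub-algebra consisting of the functions that are constant on the two blocks of a partition; under these assumptions the orthogonal projection is the block-wise weighted average.

```python
from fractions import Fraction as F

OMEGA = (0, 1, 2, 3)                            # hypothetical sample space
P = (F(1, 2), F(1, 4), F(1, 8), F(1, 8))        # probability weights
BLOCKS = ((0, 1), (2, 3))                       # partition generating B


def state(a):
    """phi(a) = E_phi(a)."""
    return sum(a[w] * P[w] for w in OMEGA)


def inner(a, c):
    """<a | c>_H = phi(c a) for real-valued a and c."""
    return state([a[w] * c[w] for w in OMEGA])


def cond_exp(a):
    """E(a | B): weighted average of a over each block of the partition."""
    out = [None] * len(OMEGA)
    for block in BLOCKS:
        mass = sum(P[w] for w in block)
        mean = sum(a[w] * P[w] for w in block) / mass
        for w in block:
            out[w] = mean
    return out


def indicator(block):
    return [F(int(w in block)) for w in OMEGA]


def dist2(a, b):
    """Squared H-norm of a - b."""
    diff = [a[w] - b[w] for w in OMEGA]
    return inner(diff, diff)


a = [F(1), F(4), F(2), F(6)]
ce = cond_exp(a)
residual = [a[w] - ce[w] for w in OMEGA]

print(ce == [2, 2, 4, 4])                            # block averages
print(cond_exp(ce) == ce)                            # idempotent
print([inner(residual, indicator(b)) for b in BLOCKS] == [0, 0])  # Galerkin
# Minimisation: no other element of K on a small grid is closer to a.
grid = [[F(c1), F(c1), F(c2), F(c2)] for c1 in range(6) for c2 in range(6)]
print(all(dist2(a, ce) <= dist2(a, b) for b in grid))
```

All four checks print `True`; the grid search only samples integer-valued elements of `K`, so it illustrates the minimisation property rather than proving it.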

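Conditional expectation due to an observation

For an observation $y \in \mathcal{A}$, look at the sub-*-algebra $\mathcal{B} := \mathbb{C}[y] \subseteq \mathcal{A}$ generated by $y$, and the Hilbert subspace $K = \operatorname{cl}_\varphi \mathcal{B} \subseteq H$.
Any $b \in \mathcal{B}$ is a polynomial in $y$, and any $h \in K$ is a function $h = \varpi(y)$ of $y$, approximated by polynomials in the norm induced by $\varphi$.
Hence for $a \in \mathcal{A}$ one has $E(a | y) := E(a | \mathcal{B}) = \varpi_a(y) \in K$, the pre-conditional expectation, an algebraic RV.

After observing $\hat{y} \in \sigma(y)$, one may compute the post-conditional expectation resp. mean
$\bar{a}_{\hat{y}} = E(a | \hat{y} e) = \varpi_a(\hat{y} e) \in \operatorname{span}\{e\}$.
Define a new state, $\forall a \in \mathcal{A}$: $\hat{\varphi}(a) := E(a | \hat{y} e) = \varpi_a(\hat{y} e)$, and a new, conditioned *-probability algebra $(\mathcal{A}, \hat{\varphi})$.
With additional information $\hat{y}$, the state (of knowledge) changes.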
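The passage from the pre-conditional expectation to the conditioned state can be sketched on a finite sample space. The example assumes real-valued RVs, strictly positive weights (a faithful state), and hypothetical data `y` and `a`; on a finite spectrum every function of `y` is a polynomial in `y`, so the map `varpi` below is stored simply as a table over the spectral values.

```python
from fractions import Fraction as F

OMEGA = (0, 1, 2, 3)                            # hypothetical sample space
P = (F(1, 2), F(1, 4), F(1, 8), F(1, 8))        # strictly positive weights


def varpi(a, y):
    """Table y_hat -> E(a | y = y_hat) over the spectral values of y."""
    table = {}
    for y_hat in set(y):
        level_set = [w for w in OMEGA if y[w] == y_hat]
        mass = sum(P[w] for w in level_set)
        table[y_hat] = sum(a[w] * P[w] for w in level_set) / mass
    return table


def pre_conditional(a, y):
    """E(a | y) = varpi_a(y), still an RV on OMEGA."""
    table = varpi(a, y)
    return [table[y[w]] for w in OMEGA]


def conditioned_state(y, y_hat):
    """New state phi_hat(a) := varpi_a(y_hat) after observing y_hat."""
    return lambda a: varpi(a, y)[y_hat]


y = [0, 0, 1, 1]                     # observation with spectrum {0, 1}
a = [F(1), F(4), F(2), F(6)]
e = [F(1)] * len(OMEGA)

print(pre_conditional(a, y) == [2, 2, 4, 4])   # the RV varpi_a(y)
phi_hat = conditioned_state(y, 1)              # observe y_hat = 1
print(phi_hat(a))                              # posterior mean of a: 4
print(phi_hat(e))                              # still a state: phi_hat(e) = 1
print(phi_hat(y))                              # the observed value: 1
```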

Conclusion

Probability algebras give probability without measures.
The expectation operator becomes the central object.
Algebras (and probability measures / integration) have a close connection with spectral decomposition.
Fuzzy probability appears natural with algebraic RVs.
Conditional expectation appears as projection onto sub-probability algebras.