ST213 Mathematics of Random Events

Wilfrid S. Kendall, version April

1. Introduction

The main purpose of the course ST213 Mathematics of Random Events (which we will abbreviate to MoRE) is to work over again the basics of the mathematics of uncertainty. You have already covered this in a rough-and-ready fashion in: (a) ST111 Probability; and possibly in (b) ST114 Games and Decisions. In this course we will cover these matters with more care.

It is important to do this because a proper appreciation of the fundamentals of the mathematics of random events (a) gives an essential basis for getting a good grip on the basic ideas of statistics; and (b) will be of increasing importance in the future, as it forms the basis of the hugely important field of mathematical finance.

It is appropriate at this level that we cover the material emphasizing concepts rather than proofs: by and large we will concentrate on what the results say, and so will on some occasions explain them rather than prove them. The third-year courses MA305 Measure Theory and ST318 Probability Theory go into the matter of proofs. For further discussion of how Warwick probability courses fit together, see our road-map to probability at Warwick at:

1.1 Books

[1] D. Williams (1991) Probability with Martingales. CUP.

1.2 Resources (including examination information)

The course is composed of 30 lectures, valued at 12 CATS credits. It has an assessed component (20%) as well as an examination in the summer term. The assessed component will be conducted as follows: an exercise sheet will be handed out approximately every fortnight, totalling 4 sheets. In the 10 minutes at the start of the next lecture you produce an answer to one question, under examination conditions, specified at the start of the lecture. Model answers will be distributed after the test, and an examples class will be held a week after the test.

The tests will be marked, and the assessed component will be based on the best 3 out of 4 of your answers. This method helps you learn during the lecture course, and so should: improve your exam marks; increase your enjoyment of the course; and cost less time than end-of-term assessment.

Further copies of exercise sheets (after they have been handed out in lectures!) can be obtained at the homepage for the ST213 course:

These notes will also be made available at the above URL, chapter by chapter as they are covered in lectures. Notice that they do not cover all the material of the lectures: their purpose is to provide a basic skeleton of summary material to supplement the notes you make during lectures. For example, no proofs are included. In particular you will not find it possible to cover the course by ignoring lectures and depending on these notes alone! Further related material (e.g. related courses, some pretty pictures of random processes, ...) can be obtained by following links from W.S. Kendall's homepage:

Finally, the Library Student Reserve Collection (SRC) will in the summer term hold copies of previous examination papers, and we will run two revision classes for this course at that time.

1.3 Motivating Examples

Here are some examples to help us see what the issues are.

(1) J. Bernoulli (circa 1692): Suppose that A_1, A_2, ... are mutually independent events, each of which has probability p. Define

    S_n = #{ events A_k which happen, for k ≤ n }.

Then the probability that S_n/n is close to p increases to 1 as n tends to infinity:

    P[ |S_n/n − p| ≤ ε ] → 1  as n → ∞,  for all ε > 0.
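As a quick illustration of Bernoulli's statement, here is a simulation sketch (the choices p = 0.3, ε = 0.05 and the sample sizes are arbitrary):

    import random

    def proportion_within(n, p, eps, trials=2000):
        """Estimate P[|S_n/n - p| <= eps] by simulating `trials` runs of n Bernoulli(p) events."""
        good = 0
        for _ in range(trials):
            s = sum(1 for _ in range(n) if random.random() < p)
            if abs(s / n - p) <= eps:
                good += 1
        return good / trials

    p, eps = 0.3, 0.05
    for n in (10, 100, 1000):
        print(n, proportion_within(n, p, eps))
    # The estimated probabilities increase towards 1 as n grows.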

(2) Suppose the random variable U is uniformly distributed over the continuous range [0, 1]. Why is it that for all x in [0, 1] we have P[U = x] = 0, and yet P[a ≤ U ≤ b] = b − a whenever 0 ≤ a ≤ b ≤ 1? Why can't we argue as follows?

    P[ a ≤ U ≤ b ] = P[ ⋃_{x ∈ [a,b]} {x} ] = Σ_{x ∈ [a,b]} P[ U = x ] = 0 ?

(3) The Banach-Tarski paradox. Consider a sphere S². In a certain qualified sense it is possible to do the following curious thing: we can find a subset F ⊆ S² and (for any k ≥ 3) rotations τ^k_1, τ^k_2, ..., τ^k_k such that

    S² = τ^k_1 F ∪ τ^k_2 F ∪ ... ∪ τ^k_k F.

What then should we suppose the surface area of F to be? Since S² = τ³_1 F ∪ τ³_2 F ∪ τ³_3 F we can argue for area(F) = 1/3. But since S² = τ⁴_1 F ∪ τ⁴_2 F ∪ τ⁴_3 F ∪ τ⁴_4 F we can equally argue for area(F) = 1/4. Or similarly for area(F) = 1/5. Or 1/6, or ...

(4) Reverting to Bernoulli's example (Example 1 above) we could ask: what is the probability that, when we look at the whole sequence S_1/1, S_2/2, S_3/3, ..., we see the sequence tends to p? Is this different from Bernoulli's statement?

(5) Here is a question which is apparently quite different, but which turns out to be strongly related to the above ideas! Can we generalize the idea of a Riemann integral in such a way as to make sense of rather discontinuous integrands, such as the case given below?

    ∫₀¹ f(x) dx   where   f(x) = 1 when x is a rational number, and 0 when x is an irrational number.

2. Probabilities, algebras, and σ-algebras

2.1 Motivation

Consider two coins A and B which are tossed in the air so as each to land with either heads or tails upwards. We do not assume the coin-tosses are independent!

It is often the case that one feels justified in assuming the coins individually are equally likely to come up heads or tails. Using the fact P[A = T] = 1 − P[A = H], etc., we find

    P[ A comes up heads ] = 1/2,    P[ B comes up heads ] = 1/2.

To find probabilities such as P[HH] = P[A = H, B = H] we need to say something about the relationship between the two coin-tosses. It is often the case that one feels justified in assuming the coin-tosses are independent, so

    P[ A = H, B = H ] = P[ A = H ] P[ B = H ].

However this assumption may be unwise when the person tossing the coin is not experienced! We may decide that some variant of the following is a better model: the event determining [B = H] is C if [A = H], and D if [A = T], where

    P[ C = H ] = 3/4,    P[ D = H ] = 1/4,

and A, C, D are independent.

There are two stages of specification at work here. Given a collection C of events, and specified probabilities P[C] for each C ∈ C, we can find P[C^c] = 1 − P[C], the probability of the complement C^c of C, but not necessarily P[C ∩ D] for C, D ∈ C.

2.2 Revision of sample space and events

Remember from ST111 that we can use notation from set theory to describe events. We can think of events as subsets of the sample space Ω. If A is an event, then the event that A does not happen is the complement or complementary event

    A^c = {ω ∈ Ω : ω ∉ A}.

If B is another event then the event that both A and B happen is the intersection

    A ∩ B = {ω ∈ Ω : ω ∈ A and ω ∈ B}.

The event that either A or B (or both!) happen is the union

    A ∪ B = {ω ∈ Ω : ω ∈ A or ω ∈ B}.

2.3 Algebras of sets

This leads us to identify classes of sets for which we want to find probabilities.

Definition 2.1 (Algebra of sets): An algebra (sometimes called a field) of subsets of Ω is a class C of subsets of a sample space Ω satisfying:

(1) closure under complements: if A ∈ C then A^c ∈ C;

(2) closure under intersections: if A, B ∈ C then A ∩ B ∈ C;

(3) closure under unions: if A, B ∈ C then A ∪ B ∈ C.

Definition 2.2 (Algebra generated by a collection): If C is a collection of subsets of Ω then A(C), the algebra generated by C, is the intersection of all algebras of subsets of Ω which contain C.

Here are some examples of algebras:

(i) the trivial algebra A = {Ω, ∅};

(ii) supposing Ω = {H, T}, another example is A = {Ω = {H, T}, {H}, {T}, ∅};

(iii) now consider the following class of subsets of the unit interval [0, 1]: A = { finite unions of subintervals }. This is an algebra. For example, if A = (a_0, a_1) ∪ (a_2, a_3) ∪ ... ∪ (a_{2n}, a_{2n+1}) is a non-overlapping union of intervals (and we can always re-arrange matters so that any union of intervals is non-overlapping!) then A^c = [0, a_0] ∪ [a_1, a_2] ∪ ... ∪ [a_{2n+1}, 1]. This checks point (1) of the definition of an algebra of sets. Point (2) is rather easy, and point (3) follows from points (1) and (2) (by De Morgan's laws);

(iv) consider A = {{1, 2, 3}, {1, 2}, {3}, ∅}. This is an algebra of subsets of Ω = {1, 2, 3}. Notice it does not include events such as {1}, {2, 3};

(v) just to give an example of a collection of sets which is not an algebra, consider {{1, 2, 3}, {1, 2}, {2, 3}, ∅};

(vi) algebras get very large. It is typically more convenient simply to give a collection C of sets generating the algebra. For example, if C = ∅ then A(C) = {∅, Ω} is the trivial algebra described above!

(vii) if Ω = {H, T} and C = {{H}} then A(C) = {{H, T}, {H}, {T}, ∅} as in example (ii) above;

(viii) if Ω = [0, 1] and C = { intervals in [0, 1] } then A(C) is the collection of finite unions of intervals as in example (iii) above;

(ix) finally, if Ω = [0, 1] and C is the collection of single-point sets {x} for x in [0, 1], then A(C) is the collection of (a) all finite sets in [0, 1] and (b) all complements of finite sets in [0, 1].
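For a finite Ω one can compute A(C) quite mechanically, by closing C under complements, unions and intersections until nothing new appears. A small sketch of this idea (illustrative only; the sets chosen reproduce example (iv)):

    from itertools import combinations

    def generated_algebra(omega, collection):
        """Close `collection` under complement, union and intersection within `omega`."""
        omega = frozenset(omega)
        algebra = {frozenset(), omega} | {frozenset(c) for c in collection}
        changed = True
        while changed:
            changed = False
            current = list(algebra)
            new_sets = {omega - a for a in current}
            new_sets |= {a | b for a, b in combinations(current, 2)}
            new_sets |= {a & b for a, b in combinations(current, 2)}
            if not new_sets <= algebra:
                algebra |= new_sets
                changed = True
        return algebra

    print(sorted(map(sorted, generated_algebra({1, 2, 3}, [{1, 2}]))))
    # prints the four sets ∅, {1,2}, {3}, {1,2,3}: the algebra of example (iv)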

In realistic examples algebras are rather "large": not surprising, since they correspond to the collection of all true-or-false statements you can make about a certain experiment! (If your experiment's results can be summarised as n different yes/no answers, such as "result is hot/cold", "result is coloured black/white", etc., then the relevant algebra is composed of 2^n different subsets!) Therefore it is of interest that the typical element of an algebra can be written down in a rather special form.

Theorem 2.3 (Representation of typical element of algebra): If C is a collection of subsets of Ω then the event A belongs to the algebra A(C) generated by C if and only if

    A = ⋃_{i=1}^{N} ⋂_{j=1}^{M_i} C_{i,j}

where for each i, j either C_{i,j} or its complement C_{i,j}^c belongs to C. Moreover we may write A in this form with the sets

    D_i = ⋂_{j=1}^{M_i} C_{i,j}

being disjoint.

* This result corresponds to a basic remark in logic: logical statements, however complicated, can be reduced to statements of the form (A_1 and A_2 and ... and A_m) or (B_1 and B_2 and ... and B_n) or ... or (C_1 and C_2 and ... and C_p), where the statements A_1 etc. are either basic statements or their negations, and no more than one of the (...) or ... or (...) can be true at once.

We are now in a position to produce our first stab at a set of axioms for probability. Given a sample space and an algebra A of subsets, probability P[·] assigns a number between 0 and 1 to each event in the algebra A, obeying the rules given below. There is a close analogy to the notion of "length" of subsets of [0, 1] (and also to notions of area, volume, ...): the table below makes this clear.

    Probability                                Length of subset of [0, 1]
    P[∅] = 0                                   Length(∅) = 0
    P[Ω] = 1                                   Length([0, 1]) = 1
    P[A ∪ B] = P[A] + P[B] if A ∩ B = ∅        Length([a, b] ∪ [c, d]) = Length([a, b]) + Length([c, d]) if a ≤ b < c ≤ d

There are some consequences of these axioms which are not completely trivial. For example, the law of negation

    P[ A^c ] = 1 − P[ A ];

the generalized law of addition, holding when A ∩ B is not necessarily empty,

    P[ A ∪ B ] = P[ A ] + P[ B ] − P[ A ∩ B ]

(think of "double-counting"); and finally the inclusion-exclusion law

    P[ A_1 ∪ A_2 ∪ ... ∪ A_n ] = Σ_i P[ A_i ] − Σ_{i<j} P[ A_i ∩ A_j ] + ... + (−1)^{n+1} P[ A_1 ∩ A_2 ∩ ... ∩ A_n ].

2.4 Limit Sets

Much of the first half of ST111 is concerned with calculations using these various rules of probabilistic calculation. Essentially the representation theorem above tells us we can compute the probability of any event in A(C) just so long as we know the probabilities of the various events in C and also of all their intersections, whether by knowing events are independent or whether by knowing various conditional probabilities.*

* We avoid discussing conditional probabilities here for reasons of shortage of time: they have been dealt with in ST111 and figure very largely in

However these calculations can become long-winded and ultimately either infeasible or unrevealing. It is better to know how to approximate probabilities and events, which leads us to the following kind of question. Suppose we have a sequence of events C_n which are decreasing (getting harder and harder to satisfy) and which converge to a limit C: C_n ↓ C. Can we say P[C_n] converges to P[C]?

Here is a specific example. Suppose we observe an infinite sequence of coin tosses, and think therefore of the collection C of events A_i that the i-th coin comes up heads. Consider the probabilities:

(a) P[ second toss gives heads ] = P[ A_2 ];

(b) P[ first n tosses all give heads ] = P[ ⋂_{i=1}^n A_i ];

(c) P[ the first toss which gives a head is even-numbered ].

There is a difference! The first two can be dealt with within the algebra. The third cannot: suppose C_n is the event "the first toss in numbers 1, ..., n which gives a head is even-numbered, or else all n of these tosses give tails". Then C_n lies in A(C), and converges down to the event C = "the first toss which gives a head is even-numbered", but C is not in A(C).

We now find a number of problems raise their heads.

Problems with "everywhere being impossible": Suppose we are running an experiment with an outcome uniformly distributed over [0, 1]. Then we have a problem as mentioned in the second of our motivating examples: under reasonable conditions we are working with the algebra of finite unions of sub-intervals of [0, 1], and the probability measure which gives P[[a, b]] = b − a, but this means P[{a}] = 0. Now we need to be careful, since if we rashly allow ourselves to work with uncountable unions we get

    P[ ⋃_{x ∈ [0,1]} {x} ] = Σ_{x ∈ [0,1]} 0 = 0.

But this contradicts P[[0, 1]] = 1 and so is obviously wrong.

Problems with specification: if we react to the above example by insisting we can only give probabilities to events in the original algebra, then we can fail to give probabilities to perfectly sensible events, such as (c) in the infinite sequence of coin-tosses above. On the other hand, if we rashly prescribe probabilities then how can we avoid getting into contradictions such as the above?

It seems sensible to suppose that at least when we have C_n ↓ C then we should be allowed to say P[C_n] → P[C], and this turns out to be the case as long as the set-up is sensible. Here is an example of a set-up which is not sensible: Ω = {1, 2, 3, ...}, C = {{1}, {2}, ...}, P[{n}] = 1/2^{n+1}. Then A(C) is the collection of finite and co-finite subsets of the positive integers (co-finite: the complement is finite), and

    P[ {1, 2, ..., n} ] = Σ_{m=1}^{n} 1/2^{m+1} = 1/2 − 1/2^{n+1} → 1/2 ≠ 1,

so although {1, 2, ..., n} increases up to Ω we do not have P[{1, 2, ..., n}] → P[Ω] = 1.

We must now investigate how we can deal with limit sets.
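To make example (c) concrete: if we additionally assume the tosses are fair and independent (an assumption made only for this illustration), P[C_n] can be computed exactly and seen to settle down, suggesting the value the limit event C ought to receive. A small sketch:

    def p_C_n(n):
        """P[C_n] for fair independent tosses: first head among tosses 1..n is
        even-numbered, or all n tosses are tails."""
        even_first_head = sum(0.5 ** k for k in range(2, n + 1, 2))
        all_tails = 0.5 ** n
        return even_first_head + all_tails

    for n in (1, 2, 5, 10, 20):
        print(n, p_C_n(n))
    # The values decrease towards 1/3, the natural candidate for P[C].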

2.5 σ-algebras

The first task is to establish a wide range of sensible limit sets. Boldly, we look at sets which can be obtained by any imaginable combination of countable set operations: the collection of all such sets is a σ-algebra (σ stands for "countable").

Definition 2.4 (σ-algebra): A σ-algebra of subsets of Ω is an algebra which is also closed under countable unions.

In fact σ-algebras are even larger than ordinary algebras; it is difficult to describe a typical member of a σ-algebra, and it pays to talk about σ-algebras generated by specified collections of sets.

Definition 2.5 (σ-algebra generated by a collection): For any collection of subsets C of Ω, we define σ(C) to be the intersection of all σ-algebras of subsets of Ω which contain C:

    σ(C) = ⋂ {S : S is a σ-algebra and C ⊆ S}.

Theorem 2.6 (Monotone limits): Note that σ(C) defined above is indeed a σ-algebra. Furthermore, it is the smallest σ-algebra containing C which is closed under monotone limits.

Examples of σ-algebras include: all algebras of subsets of finite sets (because then there will be no non-finite countable set operations); the Borel σ-algebra generated by the family of all intervals of the real line; the σ-algebra for the coin-tossing example generated by the infinite family of events A_i = [ i-th coin is heads ].

2.6 Countable additivity

Now we have established a context for limit sets (they are sets belonging to a σ-algebra), we can think about what sort of limiting operations we should allow for probability measures.

Definition 2.7 (Measures): A set-function µ : A → [0, ∞] is said to be a finitely-additive measure if it satisfies:

(FA) µ(A ∪ B) = µA + µB whenever A, B are disjoint.

It is said to be countably-additive (or σ-additive) if in addition

(CA) µ(⋃_{i=1}^∞ A_i) = Σ_{i=1}^∞ µA_i whenever the A_i are disjoint and their union ⋃_{i=1}^∞ A_i lies in A.

We abbreviate "finitely-additive" to (FA) and "countably-additive" to (CA). We often abbreviate "countably-additive measure" to "measure". Notice that if A were actually a σ-algebra then we wouldn't have to check the condition "⋃_{i=1}^∞ A_i lies in A" in condition (CA).

Definition 2.8 (Probability measures): A set-function P : A → [0, 1] is said to be a finitely-additive probability measure if it is a (FA) measure such that P[Ω] = 1. It is a (CA) probability measure (we often just say probability measure) if in addition it is (CA).

Notice various consequences for probability measures: µ(∅) = 0; finite additivity (FA) follows from countable additivity (CA); if (CA) holds, we always have µ(⋃_{i=1}^∞ A_i) ≤ Σ_{i=1}^∞ µ(A_i) even when the union is not disjoint; etc.

(CA) is a kind of continuity condition. A similar continuity condition is that of monotone limits.

Definition 2.9 (Monotone limits): A set-function µ : A → [0, 1] is said to obey the monotone limits property (ML) if it satisfies: µA_i ↑ µA whenever the A_i increase upwards to a limit set A which lies in A.

(ML) is simpler to check than (CA) but is equivalent for finitely-additive measures.

Theorem 2.10 (Equivalence for countable additivity): (FA) + (ML) ⟺ (CA).

Lemma 2.11 (Another equivalence): Suppose P is a finitely additive probability measure on (Ω, F), where F is an algebra of sets. Then P is countably additive if and only if lim_{n→∞} P[A_n] = 1 whenever the sequence of events A_n belongs to the algebra F and moreover A_n ↑ Ω.

2.7 Uniqueness of probability measures

To illustrate the next step, consider the notion of length/area. (To avoid awkward alternatives, we talk about the "measure" instead of length/area/volume/...) It is easy to define the area of very regular sets. But for a stranger, more fractal-like, set A we would need to define something like an outer-measure

    µ*(A) = inf { Σ_i µ(B_i) : the B_i cover A }

to get at least an upper bound for what it would be sensible to call the measure of A. Of course we must give equal priority to considering what is the measure of the complement A^c.

Suppose for definiteness that A is contained in a simple set Q of finite measure (a convenient interval for length, a square for area, a cube for volume, ...), so that A^c = Q \ A. Then consideration of µ*(A^c) leads us directly to consideration of inner-measure for A:

    µ_*(A) = µ(Q) − µ*(A^c).

Clearly µ_*(A) ≤ µ*(A); moreover we can only expect a truly sensible definition of measure on the set

    F = { A : µ_*(A) = µ*(A) }.

The fundamental theorem of measure theory states that this works out all right!

Theorem 2.12 (Extension theorem): If µ is a measure on an algebra A which is σ-additive on A then it can be extended uniquely to a countably additive measure on F defined as above: moreover σ(A) ⊆ F.

The proof of this remarkable theorem is too lengthy to go into here. Notice that it can be paraphrased very simply: if your notion of measure (probability, length, area, volume, ...) can be defined consistently on an algebra in such a way that it is σ-additive whenever the two sides of

    µ( ⋃_{i=1}^∞ A_i ) = Σ_{i=1}^∞ µA_i

make sense (whenever the disjoint union ⋃_{i=1}^∞ A_i actually belongs to the algebra), then it can be extended uniquely to the (typically much larger) σ-algebra generated by the original algebra, so as again to be a (σ-additive) measure.

There is an important special part of this theorem which is worth stating separately.

Definition 2.13 (Π-system): A Π-system of subsets of Ω is a collection of subsets including Ω itself and closed under finite intersections.

Theorem 2.14 (Uniqueness for probability measures): Two finite measures which agree on a Π-system Π also agree on the generated σ-algebra σ(Π).

2.8 Lebesgue measure and coin tossing

The extension theorem can be applied to the uniform probability space: Ω = [0, 1], A given by finite unions of intervals, P given by lengths of intervals. It turns out P is indeed σ-additive on A (showing this is non-trivial!) and so the extension theorem tells us there is a unique countably additive extension P on the σ-algebra B = σ(A) (the Borel σ-algebra restricted to [0, 1]). We call this Lebesgue measure.

There is a significant connection between infinite sequences of coin tosses and numbers in [0, 1]. Briefly, we can expand a number x ∈ [0, 1] in binary (as opposed to decimal!): we write x as .ω_1 ω_2 ω_3 ..., where ω_i equals 1 or 0 according as 2^i x (mod 2) is greater than or equal to 1 or not. The coin-tossing σ-algebra can be viewed as generated by the sequence {ω_1, ω_2, ω_3, ...}, with 0 standing for tails and 1 for heads. In effect we get a map from coin-tossing space 2^N to number space [0, 1], with the slight cautionary note that this map very occasionally maps two sequences onto one number (think of 0.0111... and 0.1000..., which both represent one half). In particular

    [ω_1 = a_1, ω_2 = a_2, ..., ω_d = a_d] = [x, x + 2^{−d})

where x is the number corresponding to (a_1, a_2, ..., a_d). Remarkably, we can now use the uniqueness theorem to show that the map T : (a_1, a_2, ..., a_d) ↦ x preserves probabilities, in the sense that Lebesgue measure is exactly the same as we get by finding the probability of the event T^{−1}(A) as a coin-tossing event, if the coins are independent and fair.
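A quick numerical illustration of this correspondence (a sketch only; 20 digits and 100,000 samples are arbitrary choices): mapping fair independent coin tosses to a number in [0, 1] produces something indistinguishable from a uniform random variable.

    import random

    def toss_to_number(d=20):
        """Map d fair coin tosses (1 = heads) to the corresponding dyadic x in [0, 1)."""
        return sum(random.randint(0, 1) * 2 ** -(i + 1) for i in range(d))

    samples = [toss_to_number() for _ in range(100_000)]
    a, b = 0.25, 0.7
    frequency = sum(1 for x in samples if a <= x <= b) / len(samples)
    print(frequency, b - a)  # the empirical frequency should be close to b - a = 0.45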

It is reasonable to ask whether there are any non-measurable sets, since σ-algebras are so big! It is indeed very hard to find any. Here is the basic example, which is due in essence to Vitali. Consider the following equivalence relation on (Ω, B, P) with Ω = [0, 1]: we say x ∼ y if x − y is a rational number. Now construct a set A by choosing exactly one member from each equivalence class. So for any x ∈ [0, 1] there is one and only one y ∈ A such that x − y is a rational number. If A were Lebesgue measurable then it would have a value P[A]. What would this value be?

Imagine [0, 1] folded round into a circle. It is the case that P[A] does not change when one turns this circle. In particular we can now consider A_q = {a + q : a ∈ A} (addition taken modulo 1, i.e. round the circle) for rational q. By construction A_q and A_r are disjoint for different rational q, r. Now we have

    ⋃_{q rational} A_q = [0, 1],

and since there are only countably many rational q, and P[A_q] doesn't depend on q, we determine

    P[ [0, 1] ] = Σ_{q rational} P[ A_q ] = Σ_{q rational} P[ A ].

But this cannot make sense if P[[0, 1]] = 1: the right-hand side is either 0 or ∞, according as P[A] = 0 or P[A] > 0. We are forced to conclude that A cannot be Lebesgue measurable. This example has a lot to do with the Banach-Tarski paradox described in one of our motivating examples above.

3. Independence and measurable functions

3.1 Independence

In ST111 we formalized the idea of independence of events. Essentially we require a multiplication law to hold.

Definition 3.15 (Independence of an infinite sequence of events): We say the events A_i (for i = 1, 2, ...) are independent if, for any finite subsequence i_1 < i_2 < ... < i_k, we have

    P[ A_{i_1} ∩ ... ∩ A_{i_k} ] = P[ A_{i_1} ] ... P[ A_{i_k} ].

Notice we require all possible multiplication laws to hold: it is possible to build interesting examples where events are independent pair-by-pair, but altogether give non-trivial information about each other.

We need to talk about infinite sequences of events (often independent). We often have in the back of our minds a sense that the sequence is revealed to us progressively over time (though this need not be so!), suggesting two natural questions. First, will we see events occur in the sequence right into the indefinite future? Second, will we after some point see all events occur?

Definition 3.16 ("Infinitely often" and "Eventually"): Given a sequence of events B_1, B_2, ..., we say B_i holds infinitely often ([B_i i.o.]) if there are infinitely many different i for which the statement B_i is true: in set-theoretic terms

    [B_i i.o.] = ⋂_{i=1}^∞ ⋃_{j=i}^∞ B_j.

We say B_i holds eventually ([B_i ev.]) if for all large enough i the statement B_i is true: in set-theoretic terms

    [B_i ev.] = ⋃_{i=1}^∞ ⋂_{j=i}^∞ B_j.

Notice these two concepts "ev." and "i.o." make sense even if the infinite sequence is just a sequence, with no notion of events occurring consecutively in time! Notice also (you should check this yourself!)

    [B_i i.o.] = ( [B_i^c ev.] )^c.

3.2 Borel-Cantelli lemmas

The multiplication laws appearing above in Section 2.1 force a kind of infinite multiplication law.

Lemma 3.17 (Probability of infinite intersection): If the events A_i (for i = 1, 2, ...) are independent then

    P[ ⋂_{i=1}^∞ A_i ] = ∏_{i=1}^∞ P[ A_i ].

We have to be careful what we mean by the infinite product ∏_{i=1}^∞ P[A_i]: we mean of course the limiting value lim_{n→∞} ∏_{i=1}^n P[A_i].

We can now prove a remarkable pair of facts about P[A_i i.o.] (and hence its twin P[A_i ev.]!). It turns out it is often easy to tell whether these events have probability 0 or 1.

Theorem 3.18 (Borel-Cantelli lemmas): Suppose the events A_i (for i = 1, 2, ...) form an infinite sequence. Then

(i) if Σ_{i=1}^∞ P[A_i] < ∞ then P[ A_i holds infinitely often ] = P[ A_i i.o. ] = 0;

(ii) if Σ_{i=1}^∞ P[A_i] = ∞ and the A_i are independent then P[ A_i holds infinitely often ] = P[ A_i i.o. ] = 1.

Note the two parts of the above result are not quite symmetrical: the second part also requires independence. It is a good exercise to work out a counterexample to part (ii) if independence fails.
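A simulation sketch contrasting the two cases (the choices P[A_i] = 1/i² versus 1/i, and the horizon of 10,000 events, are arbitrary): with summable probabilities only a few events occur, while in the non-summable independent case occurrences keep on appearing.

    import random

    def occurrences(prob, n=10_000):
        """Indices i <= n at which an independent event with P[A_i] = prob(i) occurs."""
        return [i for i in range(1, n + 1) if random.random() < prob(i)]

    summable = occurrences(lambda i: 1.0 / i ** 2)      # sum of probabilities is finite
    non_summable = occurrences(lambda i: 1.0 / i)       # sum of probabilities diverges

    print(len(summable), max(summable))          # few occurrences, typically all early on
    print(len(non_summable), max(non_summable))  # many occurrences, some near the horizon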

3.3 Law of large numbers for events

As a consequence of these ideas it can be shown that limiting frequencies exist for sequences of independent trials with the same success probability.

Theorem 3.19 (Law of large numbers for events): Suppose that we have a sequence of independent events A_i, each with the same probability p. Let S_n count the number of the events A_1, ..., A_n which occur. Then for all positive ε

    P[ |S_n/n − p| ≤ ε ev. ] = 1.

3.4 Independence and classes of events

The idea of independence stretches beyond mere sequences of events. For example, consider (a) a set of events concerning a football match between Coventry City and Aston Villa at home for Coventry, and (b) a set of events concerning a cricket test between England and Australia at Melbourne, both happening on the same day. At least as a first approximation, one might assume that any combination of events concerning (a) is independent of any combination concerning (b).

Definition 3.20 (Independence and classes of events): Suppose C_1, C_2 are two classes of events. We say they are independent if A and B are independent whenever A ∈ C_1, B ∈ C_2.

Here our notion of Π-systems becomes important.

Lemma 3.21 (Independence and Π-systems): If two Π-systems are independent, then so are the σ-algebras they generate.

Returning to sequences, the above is the reason why we can jump immediately from assumptions of independence of events to deducing that their complements are independent.

Corollary 3.22 (Independence and complements): If a sequence of events A_i is independent, then so is the sequence of complementary events A_i^c.

3.5 Measurable functions

Mathematical work often becomes easier if one moves from sets to functions. Probability theory is no different. Instead of events (subsets of sample space) we can often find it easier to work with random variables (real-valued functions defined on sample space). You should think of a random variable as involving lots of different events, namely those events defined in terms of the random variable taking on different sets of values. Accordingly we need to take care that the random variable doesn't produce events which fall outwith our chosen σ-algebra. To do this we need to develop the idea of a measurable function.

Definition 3.23 (Measurable space): (Ω, F) is a measurable space if F is a σ-algebra of subsets of Ω.

Definition 3.24 (Borel σ-algebra): The Borel σ-algebra B is the σ-algebra of subsets of R generated by the collection of intervals of R.

In fact we don't need all the intervals of R. It is enough to take the closed half-infinite intervals (−∞, x].

Definition 3.25 (Measurable function): Suppose that (Ω, F) and (Ω′, F′) are both measurable spaces. We say the function f : Ω → Ω′ is measurable if f^{−1}(A) = {ω : f(ω) ∈ A} belongs to F whenever A belongs to F′.

Definition 3.26 (Random variable): Suppose that X : Ω → R is measurable as a mapping from (Ω, F) to (R, B). Then we say X is a random variable.

As we have said, to each random variable there is a class of related events. This actually forms a σ-algebra.

Definition 3.27 (σ-algebra generated by a random variable): If X : Ω → R is a random variable then the σ-algebra generated by X is the family of events

    σ(X) = {X^{−1}(A) : A ∈ B}.

3.6 Independence of random variables

Random variables can be independent too! Essentially here independence means that an event generated by one of the random variables cannot be used to give useful predictions about an event generated by the other random variable.

Definition 3.28 (Independence of random variables): We say random variables X and Y are independent if their σ-algebras σ(X), σ(Y) are independent.

Theorem 3.29 (Criterion for independence of random variables): Let X and Y be random variables, and let P be the Π-system of R formed by all half-infinite closed intervals (−∞, x]. Then X and Y are independent if and only if the collections of events X^{−1}P, Y^{−1}P are independent.*

* Here we define X^{−1}P = {X^{−1}(A) : A ∈ P} = {X^{−1}((−∞, x]) : x ∈ R}.
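Theorem 3.29 says independence can be checked on half-lines, i.e. via P[X ≤ x, Y ≤ y] = P[X ≤ x] P[Y ≤ y]. Here is a Monte Carlo sanity check of this factorisation for two uniform random variables which are independent by construction (an illustrative sketch; the sample size and the test points are arbitrary):

    import random

    n = 100_000
    pairs = [(random.random(), random.random()) for _ in range(n)]  # independent by construction

    for x, y in [(0.3, 0.8), (0.5, 0.5)]:
        joint = sum(1 for u, v in pairs if u <= x and v <= y) / n
        product = (sum(1 for u, _ in pairs if u <= x) / n) * (sum(1 for _, v in pairs if v <= y) / n)
        print(joint, product)  # the two estimates should agree up to sampling error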

3.7 Distributions of random variables

We often need to talk about random variables on their own, without reference to other random variables or events. In such cases all we are interested in is the probabilities they have of taking values in various regions.

Definition 3.30 (Distribution of a random variable): Suppose that X is a random variable. Its distribution is the probability measure P_X on R given by

    P_X[B] = P[ X ∈ B ]   whenever B ∈ B.

4. Integration

One of the main things to do with functions is to integrate them (find the area under the curve). One of the main things to do with random variables is to take their expectations (find their average values). It turns out that these are really the same idea! We start with integration.

4.1 Simple functions and indicators

Begin by thinking of the simplest possible function to integrate. That is an indicator function, which only takes two possible values, 0 or 1.

Definition 4.31 (Indicator function): If A is a measurable set then its indicator function is defined by

    I[A](x) = 0 if x ∉ A;  1 if x ∈ A.

The next stage up is to consider a simple function taking only a finite number of values, since it can be regarded as a linear combination of indicator functions.

Definition 4.32 (Simple functions): A simple function h is a measurable function h : Ω → R which only takes finitely many values. Thus we can represent it as

    h(x) = c_1 I[A_1](x) + ... + c_n I[A_n](x)

for some finite collection A_1, ..., A_n of measurable sets and constants c_1, ..., c_n.

It is easy to integrate simple functions...

Definition 4.33 (Integration of simple functions): The integral of a simple function h with respect to a measure µ is given by

    ∫ h dµ = ∫ h(x) µ(dx) = Σ_{i=1}^n c_i µ(A_i)

where h(x) = c_1 I[A_1](x) + ... + c_n I[A_n](x) as above.
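A tiny sketch of Definition 4.33 in code, representing a simple function on [0, 1] by its constants c_i and (disjoint) intervals A_i, with the Lebesgue measure of an interval being just its length (the particular h is an arbitrary illustrative choice):

    def integral_simple(terms):
        """Integral of h = sum of c * I[(a, b)] against Lebesgue measure on [0, 1].

        `terms` is a list of (c, (a, b)) with the intervals assumed disjoint.
        """
        return sum(c * (b - a) for c, (a, b) in terms)

    # h = 2 on [0, 1/2), 5 on [1/2, 3/4), 0 elsewhere
    h = [(2, (0.0, 0.5)), (5, (0.5, 0.75))]
    print(integral_simple(h))  # 2 * 0.5 + 5 * 0.25 = 2.25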

Note that one really should prove that the definition of ∫ h dµ does not depend on exactly how one represents h as the sum of indicator functions.

Integration for such functions has a number of basic properties which one uses all the time, almost unconsciously, when trying to find integrals.

Theorem 4.34 (Properties of integration for simple functions):

(1) if µ(f ≠ g) = 0 then ∫ f dµ = ∫ g dµ;

(2) Linearity: ∫ (af + bg) dµ = a ∫ f dµ + b ∫ g dµ;

(3) Monotonicity: f ≤ g means ∫ f dµ ≤ ∫ g dµ;

(4) min{f, g} and max{f, g} are simple.

Simple functions are rather boring. For more general functions we use limiting arguments. We have to be a little careful here, since some functions will have integrals built up from +∞ where they are integrated over one part of the region, and −∞ over another part. Think for example of

    ∫_{−1}^{1} (1/x) dx = ∫_{−1}^{0} (1/x) dx + ∫_{0}^{1} (1/x) dx,

which asks what "−∞ + ∞" should equal. So we first consider just non-negative functions.

Definition 4.35 (Integration for non-negative measurable functions): If f ≥ 0 is measurable then we define

    ∫ f dµ = sup { ∫ g dµ : for simple g such that 0 ≤ g ≤ f }.
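For reference, one standard way (not spelled out in these notes) of producing simple functions increasing up to a given non-negative measurable f, as used in Corollary 4.38 below, is the dyadic staircase construction:

    f_n(x) = min{ n, 2^{−n} ⌊ 2^n f(x) ⌋ }.

Each f_n is simple (it takes only the finitely many values k/2^n for 0 ≤ k ≤ n 2^n), we have 0 ≤ f_n ≤ f, and f_n ↑ f pointwise as n → ∞.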

4.2 Integrable functions

For general functions we require that we don't get into this situation of "∞ − ∞".

Definition 4.36 (Integration for general measurable functions): If f is measurable and we can write f = g − h for two non-negative measurable functions g and h, both with finite integrals, then

    ∫ f dµ = ∫ g dµ − ∫ h dµ.

We then say f is integrable.

One really needs to prove that the integral ∫ f dµ does not depend on the choice of decomposition f = g − h. In fact if there is any choice which works then the easy choice

    g = max{f, 0},    h = max{−f, 0}

will work. One can show that the integral on integrable functions agrees with its definition on simple functions, and is linear.

What starts to make the theory very easy is that the integral thus defined behaves very well when studying limits.

Theorem 4.37 (Monotone convergence theorem (MON)): If f_n ↑ f (all being non-negative measurable functions) then

    ∫ f_n dµ ↑ ∫ f dµ.

Corollary 4.38 (Integrability and simple functions): if f is non-negative and measurable then for any sequence of non-negative simple functions f_n such that f_n ↑ f we have ∫ f_n dµ ↑ ∫ f dµ.

Definition 4.39 (Integration over a measurable set): if A is measurable and f is integrable then

    ∫_A f dµ = ∫ (I[A] f) dµ.

4.3 Expectation of random variables

The above notions apply directly to random variables, which may be thought of simply as measurable functions defined on the sample space!

Definition 4.40 (Expectation): if P is a probability measure then we define expectation (with respect to this probability measure) for all integrable random variables X by

    E[X] = ∫ X dP = ∫ X(ω) P(dω).

The notion of expectation is really only to do with the random variable considered on its own, without reference to any other random variables. Accordingly it can be expressed in terms of the distribution of the random variable.

Theorem 4.41 (Change of variables): Let X be a random variable and let g : R → R be a measurable function. Assuming that the random variable g(X) is integrable,

    E[ g(X) ] = ∫_R g(x) P_X(dx).

4.4 Examples

You need to work through examples such as the following to get a good idea of how the above really works out in practice. See the material covered in lectures for more on this; worked answers to several of these follow the list.

(a) Evaluate ∫_0^1 x Leb(dx).

(b) Consider Ω = {1, 2, 3, ...}, with P[{i}] = p_i where Σ_{i=1}^∞ p_i = 1. Show that ∫ f dP = Σ_{i=1}^∞ f(i) p_i.

(c) Evaluate ∫_0^y e^x Leb(dx).

(d) Evaluate ∫_0^n f(x) Leb(dx), where f(x) = 1 if 0 ≤ x < 1, 2 if 1 ≤ x < 2, ..., n if n − 1 ≤ x < n.

(e) Evaluate ∫ I_{[0,θ]}(x) sin(x) Leb(dx).
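For checking your work, the integrals in (a), (c), (d) and (e) reduce to familiar calculus (taking θ ≥ 0 in (e)):

    ∫_0^1 x Leb(dx) = 1/2,
    ∫_0^y e^x Leb(dx) = e^y − 1,
    ∫_0^n f(x) Leb(dx) = 1 + 2 + ... + n = n(n + 1)/2,
    ∫ I_{[0,θ]}(x) sin(x) Leb(dx) = ∫_0^θ sin(x) dx = 1 − cos θ.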

5. Convergence

Approximation is a fundamental key to making mathematics work in practice. Instead of being stuck, unable to do a hard problem, we find an easier problem which has almost the same answer, and do that instead! The notion of convergence (see first-year analysis) is the formal structure giving us the tools to do this. For random variables there are a number of different notions of convergence, depending on whether we need to approximate a whole sequence of actual random values, or just a particular random value, or even just probabilities.

5.1 Convergence of random variables

Definition 5.42 (Convergence in probability): The random variables X_n converge in probability to Y, written X_n → Y in prob., if for all positive ε we have

    P[ |X_n − Y| > ε ] → 0.

Definition 5.43 (Convergence almost surely / almost everywhere): The random variables X_n converge almost surely to Y, written X_n → Y a.s., if we have

    P[ X_n does not converge to Y ] = 0.

The (measurable) functions f_n converge almost everywhere to f if the set

    {x : f_n(x) → f(x) fails}

is of Lebesgue measure zero.

The difference is that convergence in probability deals with just a single random value X_n for large n. Convergence almost surely deals with the behaviour of the whole sequence. Here are some examples to think about.

- Consider random variables defined on ([0, 1], B, Leb) by X_n(ω) = I_{[0, 1/n]}(ω). Then X_n → 0 a.s.

- Consider the probability space above and the events A_1 = [0, 1], A_2 = [0, 1/2], A_3 = [1/2, 1], A_4 = [0, 1/4], ..., A_7 = [3/4, 1], ... Then X_n = I_{[A_n]} converges to zero in probability but not almost surely (this is simulated in the sketch after this list).

- Suppose in the above that X_n = Σ_{k=1}^n (k/n) I_{[(k−1)/n, k/n]}. Then X_n → X a.s., where X(ω) = ω for ω ∈ [0, 1].

- Suppose in the above that X_n ≤ a for all n. Let Y_n = max_{m ≤ n} X_m. Then Y_n → Y a.s. for some Y.

- Suppose in the above that the X_n are not bounded, but are independent, and furthermore lim_{a→∞} ∏_{n=1}^∞ P[X_n ≤ a] = 1. Then Y_n → Y a.s., where P[Y ≤ a] = ∏_{n=1}^∞ P[X_n ≤ a].
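The second example above (the "sliding blocks" of dyadic intervals) is worth simulating to see the distinction; here is a sketch (the point ω = 0.3 and the depth of 6 dyadic levels are arbitrary choices):

    def blocks(levels):
        """The intervals [0,1], [0,1/2], [1/2,1], [0,1/4], ..., level by level."""
        out = []
        for m in range(levels):
            width = 2 ** -m
            out.extend((k * width, (k + 1) * width) for k in range(2 ** m))
        return out

    omega = 0.3
    for n, (a, b) in enumerate(blocks(6), start=1):
        x_n = 1 if a <= omega <= b else 0
        print(n, "P[X_n != 0] =", b - a, " X_n(0.3) =", x_n)
    # P[X_n != 0] = length of A_n -> 0 (convergence in probability),
    # yet X_n(0.3) keeps returning to 1 at every level (no convergence at omega = 0.3).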

As one might expect, the notion of almost sure convergence implies that of convergence in probability.

Theorem 5.44 (Almost sure convergence implies convergence in probability): X_n → X a.s. implies X_n → X in prob.

Almost sure convergence allows for various theorems telling us when it is OK to exchange integrals and limits. Generally this doesn't work: consider the example

    1 = ∫_0^∞ λ exp(−λt) dt  for every λ > 0,   but   ∫_0^∞ lim_{λ→∞} λ exp(−λt) dt = ∫_0^∞ 0 dt = 0

(the pointwise limit being 0 for every t > 0). However we have already seen one case where it does work: when the limit is monotonic. In fact we only need this to hold almost everywhere (i.e. when the convergence is almost sure).

Theorem 5.45 (MON): if the functions f_n, f are non-negative and if f_n ↑ f µ-a.e. then

    ∫ f_n dµ ↑ ∫ f dµ.

It is often the case that the following simple inequalities are crucial to figuring out whether convergence holds.

Lemma 5.46 (Markov's inequality): if f : R → R is increasing and non-negative and X is a random variable then

    P[ X ≥ a ] ≤ E[ f(X) ] / f(a)

for all a such that f(a) > 0.

Corollary 5.47 (Chebyshev's inequality): if E[X²] < ∞ then

    P[ |X − E[X]| ≥ a ] ≤ Var(X)/a²

for all a > 0.

In particular we can get a lot of mileage by combining this with the fact that, while in general variance is not additive, it is additive in the case of independence.

Lemma 5.48 (Variance and independence): if a sequence of random variables X_i is independent then

    Var( Σ_{i=1}^n X_i ) = Σ_{i=1}^n Var(X_i).
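A quick Monte Carlo sanity check of Chebyshev's inequality and of the additivity of variances under independence (a sketch; uniform summands and the particular n and a are arbitrary choices):

    import random

    n, trials = 30, 20_000
    sums = [sum(random.random() for _ in range(n)) for _ in range(trials)]  # S = X_1 + ... + X_n, X_i uniform on [0,1]

    mean = sum(sums) / trials
    var = sum((s - mean) ** 2 for s in sums) / trials
    print(var, n / 12)  # empirical Var(S) versus the additive prediction n * Var(X_i) = n/12

    a = 3.0
    tail = sum(1 for s in sums if abs(s - mean) >= a) / trials
    print(tail, var / a ** 2)  # Chebyshev: the tail probability is bounded by Var(S)/a^2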

5.2 Laws of large numbers for random variables

An important application of these ideas is to show that the law of large numbers extends from events to random variables.

Theorem 5.49 (Weak law of large numbers): if a sequence of random variables X_i is independent, and if the random variables all have the same finite mean and variance, E[X_i] = µ and Var(X_i) = σ² < ∞, then

    S_n/n → µ in prob.,

where S_n = X_1 + ... + X_n is the partial sum of the sequence.

As you will see, the proof is really rather easy when we use Chebyshev's inequality above. Indeed it is also quite easy to generalize to the case when the random variables are correlated, as long as the covariances are small... However the corresponding result for almost sure convergence, rather than convergence in probability, is rather harder to prove.

Theorem 5.50 (Strong law of large numbers): if a sequence of random variables X_i is independent and identically distributed, and if E[X_i] = µ, then

    S_n/n → µ a.s.,

where S_n = X_1 + ... + X_n is the partial sum of the sequence.

5.3 Convergence of integrals and expectations

We already know a way to relate integrals to limits (MON). What about a general sequence of non-negative measurable functions?

Theorem 5.51 (Fatou's lemma (FATOU)): If the functions f_n : R → R are actually non-negative then

    ∫ lim inf f_n dµ ≤ lim inf ∫ f_n dµ.

We can also go "the other way":

Theorem 5.52 ("Reverse Fatou"): If the functions f_n : R → R are bounded above by g µ-a.e. and g is integrable then

    lim sup ∫ f_n dµ ≤ ∫ lim sup f_n dµ.

5.4 Dominated convergence theorem

Although in general one can't interchange limits and integrals, this can be done if all the functions (equivalently, random variables) involved are bounded in absolute value by a single non-negative function (random variable) which has finite integral.

Corollary 5.53 (Dominated convergence theorem (DOM)): If the functions f_n : R → R are bounded in absolute value by g µ-a.e. (so |f_n| ≤ g a.e.), g is integrable, and also f_n → f, then

    lim ∫ f_n dµ = ∫ f dµ.

This is a very powerful result.

Examples:

- If the X_n form a bounded sequence of random variables and they converge almost surely to X, then E[X_n] → E[X].

- Suppose that U is a random variable uniformly distributed over [0, 1] and

    X_n = Σ_{k=0}^{2^n − 1} k 2^{−n} I_{[k 2^{−n} ≤ U < (k+1) 2^{−n}]}.

Then E[log(1 − X_n)] → −1.

- Suppose that the X_n are independent and X_1 = 1, while for n ≥ 2

    P[X_n = n + 1] = P[X_n = 1/(n + 1)] = 1/n³,   P[X_n = 1] = 1 − 2/n³,

and Z_n = ∏_{i=1}^n X_i. Then the Z_n form an almost surely convergent sequence with limit Z, and E[Z_n] → E[Z].

6. Product measures

6.1 Product measure spaces

The idea here is: given two measure spaces (Ω, F, µ) and (Ω′, F′, ν), we build a measure space on Ω × Ω′ by using rectangle sets A × B with measures µ(A) ν(B). As you might guess from the product form µ(A) ν(B), in the context of probability this is related to independence.
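As a preview of why rectangle sets are so convenient (and of Fubini's theorem in the next section), here is a small grid-approximation sketch: iterating the integration in either order gives the same answer, and for the indicator of a rectangle A × B that answer is just µ(A) ν(B). All numerical choices here are arbitrary.

    def iterate(f, first='x', n=300):
        """Grid approximation of the iterated integral of f over [0,1] x [0,1],
        integrating the `first` variable on the inside."""
        h = 1.0 / n
        pts = [(k + 0.5) * h for k in range(n)]
        if first == 'x':
            return sum(sum(f(x, y) * h for x in pts) * h for y in pts)
        return sum(sum(f(x, y) * h for y in pts) * h for x in pts)

    rect = lambda x, y: 1.0 if (x <= 0.3 and y >= 0.5) else 0.0  # indicator of A x B, A = [0, 0.3], B = [0.5, 1]
    print(iterate(rect, 'x'), iterate(rect, 'y'), 0.3 * 0.5)
    # both iterated integrals agree with mu(A) * nu(B) = 0.15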

Definition 6.54 (Product measure space): define the product measure µ × ν on the Π-system R of rectangle sets A × B as above, by (µ × ν)(A × B) = µ(A) ν(B). Let A(R) be the algebra generated by R.

Lemma 6.55 (Representation of A(R)): every member of A(R) can be expressed as a finite disjoint union of rectangle sets.

It is now possible to apply the Extension Theorem (we need to check σ-additivity; this is non-trivial but works) to define the product measure µ × ν on the whole σ-algebra σ(R).

6.2 Fubini's theorem

There are three big results on integration. We have already met two: MON and DOM, which tell us cases when we can exchange integrals and limits. The other result arises in the situation where we have a product measure space. In such a case we can integrate any function in one of three possible ways: either using the product measure, or by first doing a partial integration holding one coordinate fixed, and then integrating with respect to that one. We call this alternative iterated integration, and obviously there are two ways to do it depending on which variable we fix first. The final big result is due to Fubini, and tells us that as long as the function is modestly well-behaved it doesn't matter which of the three ways we do the integration, we still get the same answer.

Theorem 6.56 (Fubini's theorem): Suppose f is a real-valued function defined on the product measure space above which is either (a) non-negative or (b) µ × ν-integrable. Then

    ∫ f d(µ × ν) = ∫_{Ω′} ( ∫_Ω f(ω, ω′) µ(dω) ) ν(dω′).

Notice the two alternative conditions. Non-negativity (sometimes described as Tonelli's condition) is easy to check but can be limited. Think carefully about Fubini's theorem and especially Tonelli's condition, and you will see that the only thing which can go wrong is when in the product form you have an "∞ − ∞" problem!

6.3 Relationship with independence

Suppose X and Y are independent random variables. Then the distribution of the pair (X, Y), a measure on R × R given by P_{(X,Y)}(A) = P[(X, Y) ∈ A], is exactly the product measure µ × ν, where µ is the distribution of X and ν is the distribution of Y.

End of outline notes


More information

CHAPTER 8: EXPLORING R

CHAPTER 8: EXPLORING R CHAPTER 8: EXPLORING R LECTURE NOTES FOR MATH 378 (CSUSM, SPRING 2009). WAYNE AITKEN In the previous chapter we discussed the need for a complete ordered field. The field Q is not complete, so we constructed

More information

2.23 Theorem. Let A and B be sets in a metric space. If A B, then L(A) L(B).

2.23 Theorem. Let A and B be sets in a metric space. If A B, then L(A) L(B). 2.23 Theorem. Let A and B be sets in a metric space. If A B, then L(A) L(B). 2.24 Theorem. Let A and B be sets in a metric space. Then L(A B) = L(A) L(B). It is worth noting that you can t replace union

More information

Tools from Lebesgue integration

Tools from Lebesgue integration Tools from Lebesgue integration E.P. van den Ban Fall 2005 Introduction In these notes we describe some of the basic tools from the theory of Lebesgue integration. Definitions and results will be given

More information

HILBERT SPACES AND THE RADON-NIKODYM THEOREM. where the bar in the first equation denotes complex conjugation. In either case, for any x V define

HILBERT SPACES AND THE RADON-NIKODYM THEOREM. where the bar in the first equation denotes complex conjugation. In either case, for any x V define HILBERT SPACES AND THE RADON-NIKODYM THEOREM STEVEN P. LALLEY 1. DEFINITIONS Definition 1. A real inner product space is a real vector space V together with a symmetric, bilinear, positive-definite mapping,

More information

Sample Spaces, Random Variables

Sample Spaces, Random Variables Sample Spaces, Random Variables Moulinath Banerjee University of Michigan August 3, 22 Probabilities In talking about probabilities, the fundamental object is Ω, the sample space. (elements) in Ω are denoted

More information

In N we can do addition, but in order to do subtraction we need to extend N to the integers

In N we can do addition, but in order to do subtraction we need to extend N to the integers Chapter The Real Numbers.. Some Preliminaries Discussion: The Irrationality of 2. We begin with the natural numbers N = {, 2, 3, }. In N we can do addition, but in order to do subtraction we need to extend

More information

University of Regina. Lecture Notes. Michael Kozdron

University of Regina. Lecture Notes. Michael Kozdron University of Regina Statistics 851 Probability Lecture Notes Winter 2008 Michael Kozdron kozdron@stat.math.uregina.ca http://stat.math.uregina.ca/ kozdron References [1] Jean Jacod and Philip Protter.

More information

Lecture 4: Constructing the Integers, Rationals and Reals

Lecture 4: Constructing the Integers, Rationals and Reals Math/CS 20: Intro. to Math Professor: Padraic Bartlett Lecture 4: Constructing the Integers, Rationals and Reals Week 5 UCSB 204 The Integers Normally, using the natural numbers, you can easily define

More information

02. Measure and integral. 1. Borel-measurable functions and pointwise limits

02. Measure and integral. 1. Borel-measurable functions and pointwise limits (October 3, 2017) 02. Measure and integral Paul Garrett garrett@math.umn.edu http://www.math.umn.edu/ garrett/ [This document is http://www.math.umn.edu/ garrett/m/real/notes 2017-18/02 measure and integral.pdf]

More information

Lectures on Elementary Probability. William G. Faris

Lectures on Elementary Probability. William G. Faris Lectures on Elementary Probability William G. Faris February 22, 2002 2 Contents 1 Combinatorics 5 1.1 Factorials and binomial coefficients................. 5 1.2 Sampling with replacement.....................

More information

Probability (Devore Chapter Two)

Probability (Devore Chapter Two) Probability (Devore Chapter Two) 1016-345-01: Probability and Statistics for Engineers Fall 2012 Contents 0 Administrata 2 0.1 Outline....................................... 3 1 Axiomatic Probability 3

More information

Lebesgue Integration: A non-rigorous introduction. What is wrong with Riemann integration?

Lebesgue Integration: A non-rigorous introduction. What is wrong with Riemann integration? Lebesgue Integration: A non-rigorous introduction What is wrong with Riemann integration? xample. Let f(x) = { 0 for x Q 1 for x / Q. The upper integral is 1, while the lower integral is 0. Yet, the function

More information

JUSTIN HARTMANN. F n Σ.

JUSTIN HARTMANN. F n Σ. BROWNIAN MOTION JUSTIN HARTMANN Abstract. This paper begins to explore a rigorous introduction to probability theory using ideas from algebra, measure theory, and other areas. We start with a basic explanation

More information

CONVERGENCE OF RANDOM SERIES AND MARTINGALES

CONVERGENCE OF RANDOM SERIES AND MARTINGALES CONVERGENCE OF RANDOM SERIES AND MARTINGALES WESLEY LEE Abstract. This paper is an introduction to probability from a measuretheoretic standpoint. After covering probability spaces, it delves into the

More information

Module 1. Probability

Module 1. Probability Module 1 Probability 1. Introduction In our daily life we come across many processes whose nature cannot be predicted in advance. Such processes are referred to as random processes. The only way to derive

More information

Notes on the Lebesgue Integral by Francis J. Narcowich Septemmber, 2014

Notes on the Lebesgue Integral by Francis J. Narcowich Septemmber, 2014 1 Introduction Notes on the Lebesgue Integral by Francis J. Narcowich Septemmber, 2014 In the definition of the Riemann integral of a function f(x), the x-axis is partitioned and the integral is defined

More information

REAL AND COMPLEX ANALYSIS

REAL AND COMPLEX ANALYSIS REAL AND COMPLE ANALYSIS Third Edition Walter Rudin Professor of Mathematics University of Wisconsin, Madison Version 1.1 No rights reserved. Any part of this work can be reproduced or transmitted in any

More information

MATH MEASURE THEORY AND FOURIER ANALYSIS. Contents

MATH MEASURE THEORY AND FOURIER ANALYSIS. Contents MATH 3969 - MEASURE THEORY AND FOURIER ANALYSIS ANDREW TULLOCH Contents 1. Measure Theory 2 1.1. Properties of Measures 3 1.2. Constructing σ-algebras and measures 3 1.3. Properties of the Lebesgue measure

More information

Theorem 2.1 (Caratheodory). A (countably additive) probability measure on a field has an extension. n=1

Theorem 2.1 (Caratheodory). A (countably additive) probability measure on a field has an extension. n=1 Chapter 2 Probability measures 1. Existence Theorem 2.1 (Caratheodory). A (countably additive) probability measure on a field has an extension to the generated σ-field Proof of Theorem 2.1. Let F 0 be

More information

3 Integration and Expectation

3 Integration and Expectation 3 Integration and Expectation 3.1 Construction of the Lebesgue Integral Let (, F, µ) be a measure space (not necessarily a probability space). Our objective will be to define the Lebesgue integral R fdµ

More information

ABSTRACT INTEGRATION CHAPTER ONE

ABSTRACT INTEGRATION CHAPTER ONE CHAPTER ONE ABSTRACT INTEGRATION Version 1.1 No rights reserved. Any part of this work can be reproduced or transmitted in any form or by any means. Suggestions and errors are invited and can be mailed

More information

Basic Probability. Introduction

Basic Probability. Introduction Basic Probability Introduction The world is an uncertain place. Making predictions about something as seemingly mundane as tomorrow s weather, for example, is actually quite a difficult task. Even with

More information

18.175: Lecture 3 Integration

18.175: Lecture 3 Integration 18.175: Lecture 3 Scott Sheffield MIT Outline Outline Recall definitions Probability space is triple (Ω, F, P) where Ω is sample space, F is set of events (the σ-algebra) and P : F [0, 1] is the probability

More information

Chapter 1: Probability Theory Lecture 1: Measure space, measurable function, and integration

Chapter 1: Probability Theory Lecture 1: Measure space, measurable function, and integration Chapter 1: Probability Theory Lecture 1: Measure space, measurable function, and integration Random experiment: uncertainty in outcomes Ω: sample space: a set containing all possible outcomes Definition

More information

Dynkin (λ-) and π-systems; monotone classes of sets, and of functions with some examples of application (mainly of a probabilistic flavor)

Dynkin (λ-) and π-systems; monotone classes of sets, and of functions with some examples of application (mainly of a probabilistic flavor) Dynkin (λ-) and π-systems; monotone classes of sets, and of functions with some examples of application (mainly of a probabilistic flavor) Matija Vidmar February 7, 2018 1 Dynkin and π-systems Some basic

More information

1.4 Techniques of Integration

1.4 Techniques of Integration .4 Techniques of Integration Recall the following strategy for evaluating definite integrals, which arose from the Fundamental Theorem of Calculus (see Section.3). To calculate b a f(x) dx. Find a function

More information

Metric spaces and metrizability

Metric spaces and metrizability 1 Motivation Metric spaces and metrizability By this point in the course, this section should not need much in the way of motivation. From the very beginning, we have talked about R n usual and how relatively

More information

Notes 1 Autumn Sample space, events. S is the number of elements in the set S.)

Notes 1 Autumn Sample space, events. S is the number of elements in the set S.) MAS 108 Probability I Notes 1 Autumn 2005 Sample space, events The general setting is: We perform an experiment which can have a number of different outcomes. The sample space is the set of all possible

More information

Probability Theory. Richard F. Bass

Probability Theory. Richard F. Bass Probability Theory Richard F. Bass ii c Copyright 2014 Richard F. Bass Contents 1 Basic notions 1 1.1 A few definitions from measure theory............. 1 1.2 Definitions............................. 2

More information

µ (X) := inf l(i k ) where X k=1 I k, I k an open interval Notice that is a map from subsets of R to non-negative number together with infinity

µ (X) := inf l(i k ) where X k=1 I k, I k an open interval Notice that is a map from subsets of R to non-negative number together with infinity A crash course in Lebesgue measure theory, Math 317, Intro to Analysis II These lecture notes are inspired by the third edition of Royden s Real analysis. The Jordan content is an attempt to extend the

More information

University of Sheffield. School of Mathematics & and Statistics. Measure and Probability MAS350/451/6352

University of Sheffield. School of Mathematics & and Statistics. Measure and Probability MAS350/451/6352 University of Sheffield School of Mathematics & and Statistics Measure and Probability MAS350/451/6352 Spring 2018 Chapter 1 Measure Spaces and Measure 1.1 What is Measure? Measure theory is the abstract

More information

Introduction to Proofs in Analysis. updated December 5, By Edoh Y. Amiran Following the outline of notes by Donald Chalice INTRODUCTION

Introduction to Proofs in Analysis. updated December 5, By Edoh Y. Amiran Following the outline of notes by Donald Chalice INTRODUCTION Introduction to Proofs in Analysis updated December 5, 2016 By Edoh Y. Amiran Following the outline of notes by Donald Chalice INTRODUCTION Purpose. These notes intend to introduce four main notions from

More information

The strictly 1/2-stable example

The strictly 1/2-stable example The strictly 1/2-stable example 1 Direct approach: building a Lévy pure jump process on R Bert Fristedt provided key mathematical facts for this example. A pure jump Lévy process X is a Lévy process such

More information

Building Infinite Processes from Finite-Dimensional Distributions

Building Infinite Processes from Finite-Dimensional Distributions Chapter 2 Building Infinite Processes from Finite-Dimensional Distributions Section 2.1 introduces the finite-dimensional distributions of a stochastic process, and shows how they determine its infinite-dimensional

More information

DR.RUPNATHJI( DR.RUPAK NATH )

DR.RUPNATHJI( DR.RUPAK NATH ) Contents 1 Sets 1 2 The Real Numbers 9 3 Sequences 29 4 Series 59 5 Functions 81 6 Power Series 105 7 The elementary functions 111 Chapter 1 Sets It is very convenient to introduce some notation and terminology

More information

MA554 Assessment 1 Cosets and Lagrange s theorem

MA554 Assessment 1 Cosets and Lagrange s theorem MA554 Assessment 1 Cosets and Lagrange s theorem These are notes on cosets and Lagrange s theorem; they go over some material from the lectures again, and they have some new material it is all examinable,

More information

MATH & MATH FUNCTIONS OF A REAL VARIABLE EXERCISES FALL 2015 & SPRING Scientia Imperii Decus et Tutamen 1

MATH & MATH FUNCTIONS OF A REAL VARIABLE EXERCISES FALL 2015 & SPRING Scientia Imperii Decus et Tutamen 1 MATH 5310.001 & MATH 5320.001 FUNCTIONS OF A REAL VARIABLE EXERCISES FALL 2015 & SPRING 2016 Scientia Imperii Decus et Tutamen 1 Robert R. Kallman University of North Texas Department of Mathematics 1155

More information

Reminder Notes for the Course on Measures on Topological Spaces

Reminder Notes for the Course on Measures on Topological Spaces Reminder Notes for the Course on Measures on Topological Spaces T. C. Dorlas Dublin Institute for Advanced Studies School of Theoretical Physics 10 Burlington Road, Dublin 4, Ireland. Email: dorlas@stp.dias.ie

More information

2. Two binary operations (addition, denoted + and multiplication, denoted

2. Two binary operations (addition, denoted + and multiplication, denoted Chapter 2 The Structure of R The purpose of this chapter is to explain to the reader why the set of real numbers is so special. By the end of this chapter, the reader should understand the difference between

More information

Review of Probability Theory

Review of Probability Theory Review of Probability Theory Arian Maleki and Tom Do Stanford University Probability theory is the study of uncertainty Through this class, we will be relying on concepts from probability theory for deriving

More information

REAL ANALYSIS LECTURE NOTES: 1.4 OUTER MEASURE

REAL ANALYSIS LECTURE NOTES: 1.4 OUTER MEASURE REAL ANALYSIS LECTURE NOTES: 1.4 OUTER MEASURE CHRISTOPHER HEIL 1.4.1 Introduction We will expand on Section 1.4 of Folland s text, which covers abstract outer measures also called exterior measures).

More information

Lebesgue Measure. Dung Le 1

Lebesgue Measure. Dung Le 1 Lebesgue Measure Dung Le 1 1 Introduction How do we measure the size of a set in IR? Let s start with the simplest ones: intervals. Obviously, the natural candidate for a measure of an interval is its

More information

STA 711: Probability & Measure Theory Robert L. Wolpert

STA 711: Probability & Measure Theory Robert L. Wolpert STA 711: Probability & Measure Theory Robert L. Wolpert 6 Independence 6.1 Independent Events A collection of events {A i } F in a probability space (Ω,F,P) is called independent if P[ i I A i ] = P[A

More information

The Lebesgue Integral

The Lebesgue Integral The Lebesgue Integral Brent Nelson In these notes we give an introduction to the Lebesgue integral, assuming only a knowledge of metric spaces and the iemann integral. For more details see [1, Chapters

More information

Measure Theory on Topological Spaces. Course: Prof. Tony Dorlas 2010 Typset: Cathal Ormond

Measure Theory on Topological Spaces. Course: Prof. Tony Dorlas 2010 Typset: Cathal Ormond Measure Theory on Topological Spaces Course: Prof. Tony Dorlas 2010 Typset: Cathal Ormond May 22, 2011 Contents 1 Introduction 2 1.1 The Riemann Integral........................................ 2 1.2 Measurable..............................................

More information

Mathematical Methods for Physics and Engineering

Mathematical Methods for Physics and Engineering Mathematical Methods for Physics and Engineering Lecture notes for PDEs Sergei V. Shabanov Department of Mathematics, University of Florida, Gainesville, FL 32611 USA CHAPTER 1 The integration theory

More information

CONSTRUCTION OF THE REAL NUMBERS.

CONSTRUCTION OF THE REAL NUMBERS. CONSTRUCTION OF THE REAL NUMBERS. IAN KIMING 1. Motivation. It will not come as a big surprise to anyone when I say that we need the real numbers in mathematics. More to the point, we need to be able to

More information

CS 246 Review of Proof Techniques and Probability 01/14/19

CS 246 Review of Proof Techniques and Probability 01/14/19 Note: This document has been adapted from a similar review session for CS224W (Autumn 2018). It was originally compiled by Jessica Su, with minor edits by Jayadev Bhaskaran. 1 Proof techniques Here we

More information

II - REAL ANALYSIS. This property gives us a way to extend the notion of content to finite unions of rectangles: we define

II - REAL ANALYSIS. This property gives us a way to extend the notion of content to finite unions of rectangles: we define 1 Measures 1.1 Jordan content in R N II - REAL ANALYSIS Let I be an interval in R. Then its 1-content is defined as c 1 (I) := b a if I is bounded with endpoints a, b. If I is unbounded, we define c 1

More information