An Introduction to Laws of Large Numbers

Size: px
Start display at page:

Download "An Introduction to Laws of Large Numbers"

Transcription

1 An to Laws of John CVGMI Group

2 Contents 1

3 Contents 1 2

4 Contents 1 2 3

5 Contents

6 Intuition We re working with random variables. What could we observe? {X n } n=1

7 Intuition We re working with random variables. What could we observe? {X n } n=1... More specifically Bernoulli sequence, looking at probability of average of the sum of N random events. P(x 1 < N i=1 X i < x 2 )

8 Intuition We re working with random variables. What could we observe? {X n } n=1... More specifically Bernoulli sequence, looking at probability of average of the sum of N random events. P(x 1 < N i=1 X i < x 2 ) Coin Flipping anybody?

9 Intuition We re working with random variables. What could we observe? {X n } n=1... More specifically Bernoulli sequence, looking at probability of average of the sum of N random events. P(x 1 < N i=1 X i < x 2 ) Coin Flipping anybody? as N increases we see that the probability of observing an equal amount of 0 or 1 is equal

10 Intuition We re working with random variables. What could we observe? {X n } n=1... More specifically Bernoulli sequence, looking at probability of average of the sum of N random events. P(x 1 < N i=1 X i < x 2 ) Coin Flipping anybody? as N increases we see that the probability of observing an equal amount of 0 or 1 is equal This is intuitive: As the number of samples increases the average observation should tend toward the theoretical mean

11 Weak Law Let s try to work out this most basic fundamental theorem of random variables arising from repeatedly observing a random event. How do we build such a theorem?

12 Weak Law Let s try to work out this most basic fundamental theorem of random variables arising from repeatedly observing a random event. How do we build such a theorem? In blackbox scenario we want to be able to use independence.

13 Weak Law Let s try to work out this most basic fundamental theorem of random variables arising from repeatedly observing a random event. How do we build such a theorem? In blackbox scenario we want to be able to use independence. We want to be able to use variance and expectation: so let s make X n L 2 for all n, and associate with each X n it s mean, we ll denote by µ n, and variance, by σ n.

14 Weak Law Let s try to work out this most basic fundamental theorem of random variables arising from repeatedly observing a random event. How do we build such a theorem? In blackbox scenario we want to be able to use independence. We want to be able to use variance and expectation: so let s make X n L 2 for all n, and associate with each X n it s mean, we ll denote by µ n, and variance, by σ n. We will create new random variables from the sequence and work with these; aim for results in terms of them.

15 Weak Law Let s try to work out this most basic fundamental theorem of random variables arising from repeatedly observing a random event. How do we build such a theorem? In blackbox scenario we want to be able to use independence. We want to be able to use variance and expectation: so let s make X n L 2 for all n, and associate with each X n it s mean, we ll denote by µ n, and variance, by σ n. We will create new random variables from the sequence and work with these; aim for results in terms of them. This seems like a start, let s try to prove a theorem...

16 Weak Law Theorem Let {X n } be a sequence of independent L 2 random variables with means µ n and variances σ n. Then n i=1 X i µ i 0. So a sequence of functions on Ω, the sample space, are going towards a function = zero...

17 Weak Law Theorem Let {X n } be a sequence of independent L 2 random variables with means µ n and variances σ n. Then n i=1 X i µ i 0. So a sequence of functions on Ω, the sample space, are going towards a function = zero... Can we prove this?

18 Weak Law Theorem Let {X n } be a sequence of independent L 2 random variables with means µ n and variances σ n. Then n i=1 X i µ i 0. So a sequence of functions on Ω, the sample space, are going towards a function = zero... Can we prove this? We haven t used (σ n ): we ll clearly need these since otherwise the X i are wildly unpredictable.

19 Weak Law Theorem Let {X n } be a sequence of independent L 2 random variables with means µ n and variances σ n. Then n i=1 X i µ i 0. So a sequence of functions on Ω, the sample space, are going towards a function = zero... Can we prove this? We haven t used (σ n ): we ll clearly need these since otherwise the X i are wildly unpredictable. IDEA: Constrain lim n σi 2 = 0 i=1

20 Counterexample

21 Weak Law: Pitfalls We haven t specified a mode of convergence: this is a technical point, but if you recall uniform, pointwise, a.e., normed convergence from analysis then the goal of the proof can change radically. IDEA: We want our rule to hold with high probability, that is it should hold for essentially all values of ω Ω. in the normed space is pretty ambitious at this point.

22 Weak Law: Pitfalls We haven t specified a mode of convergence: this is a technical point, but if you recall uniform, pointwise, a.e., normed convergence from analysis then the goal of the proof can change radically. IDEA: We want our rule to hold with high probability, that is it should hold for essentially all values of ω Ω. in the normed space is pretty ambitious at this point. We haven t specified a rate of convergence. The rate of convergence of the σ n should imply something about the rate of convergence of the X n µ n. The X n µ n should converge slower than the σ n.

23 Weak Law: Pitfalls We haven t specified a mode of convergence: this is a technical point, but if you recall uniform, pointwise, a.e., normed convergence from analysis then the goal of the proof can change radically. IDEA: We want our rule to hold with high probability, that is it should hold for essentially all values of ω Ω. in the normed space is pretty ambitious at this point. We haven t specified a rate of convergence. The rate of convergence of the σ n should imply something about the rate of convergence of the X n µ n. The X n µ n should converge slower than the σ n. IDEA: Weaken constraint to lim n 2 n i=1 σ 2 i = 0

24 Weak Law: Pitfalls We haven t specified a mode of convergence: this is a technical point, but if you recall uniform, pointwise, a.e., normed convergence from analysis then the goal of the proof can change radically. IDEA: We want our rule to hold with high probability, that is it should hold for essentially all values of ω Ω. in the normed space is pretty ambitious at this point. We haven t specified a rate of convergence. The rate of convergence of the σ n should imply something about the rate of convergence of the X n µ n. The X n µ n should converge slower than the σ n. IDEA: Weaken constraint to lim n 2 n i=1 The X n µ n should converge on average σ 2 i = 0

25 Law of 2.0 Theorem Let {X n } be a sequence of independent L 2 random variables n with means µ n and variances σn. 2 Then 1 n X i µ i 0 in measure if lim 1 n 2 n σi 2 = 0. i=1 i=1 Let Y n (ω) = 1/n n (X i (ω) µ i ). Then E(Y n ) = 0 and σ 2 (Y n ) = 1/n 2 n i=1 σi 2 i=1 by independence. Now we can use the limit of σ 2 (Y n ) from the hypothesis, and need to prove P({ω : Y n (ω) > ɛ}) σ2 (Y n ) ɛ 2 0 as n

26 Aside: L p bounds What does actually say? P({ω : Y n (ω) > ɛ}) σ2 (Y n ) ɛ 2 0 as n Part of Y n > ɛ is bounded by the integral of Y 2 n divided by epsilon squared... Thinking about this makes it obvious. Better bounds can be obtained: Chernoff bounds

27 Law of 2.1 Theorem Let {X n } be a sequence of independent L 2 random variables n with means µ n and variances σn. 2 Then 1 n X i µ i 0 in measure if lim n 2 n i=1 σ 2 i = 0. weakened; use P(max k k i=1 X i µ i ɛ) ɛ 2 n i=1 σ2 i Theorem Let {X n } be a sequence of independent L 2 random variables n with means µ n and variances σn. 2 Then 1 n X i µ i 0 a.e. if lim n 2 n i=1 σ 2 i <. i=1 i=1

28 Khinchine s Law Theorem Let {X n } be a sequence of independent L 1 random variables, n identically distributed, with means µ. Then 1 n X i µ i 0 a.e. as n. i=1

29 Sample Size in Coding Now we need to use the laws. Start with an example in source coding. C : X (Ω) Σ encoding events Interesting to know how complex C must be to encode (Y i ) N i=1 Entropy: H(x) = E[ lg(p(x))]; representation of how uncertain a r.v. is Problem: p( ) is unknown to the function and the distribution needs to be learned. WLLN can be used to answer: how uncertain is the distribution?

30 Asymptotic Equipartition Property Theorem i.i.d. Asymptotic Equipartition Property If X 1,..., X n p(x) then i.i.d. X i 1 n lg(p(x 1,..., x n )) H(x). p(x) so lgp(x i) are i.i.d. The weak law of large numbers says that P( 1 n (lg( (p(x i ))) E[ lg(p(x)))] > ɛ) 0 i i P( 1 lg((p(x i ))) E[ lg(p(x))] > ɛ) 0 n i so the sample entropy approaches the true entropy in probability as the sample size increases.

31 Want to minimize R(f ) = E(l(x, y, f (x))) eg l = (x, y, f ) 1/2 f (x) y Stuck minimizing over all f, under a distribution we don t know... hopeless... IDEA: Take Law of numbers and apply it to this framework, and hopefully R emp (f ) = 1/m i l(x i, y i, f (x i )) R(f ) Use L p bounds to prove convergence results on testing error.

32 Mini : Measure I won t go into too much detail in this regard. If we ve made it this far then great. (X, A) is a measurable space if A 2 X is closed under countable unions and complements. These correspond to measurable events. (X, A) with µ is a measure space if µ : A [0, ) is countably additive over disjoint sets: µ( i A i ) = i µ(a i) if A i A j = if i j. More great properties fall out of this quickly: measure of the emptyset is zero, measures are monotonic under the containment (partial) ordering, even dominated and monotone convergence (of sets) come out from this. s Inequality - Let f L p (R) and denote E α = {x : f > α} and now f p p E α f p α p µ(e α ) by monotonicity and positivity of measures.

33 L 2 : Definition We say that a function f : X R belongs to the function space L p (X ) if f p = ( f (x) p dx) 1/p < and say f has finite L p norm. Definition The fundamental object being considered in statistics is the random variable. Given a measure space (Ω, B, µ) X : Ω R is an L 2 random variable if µ(ω) = 1 and ( {X < r} B r R X (ω) 2 dµ(ω)) 1/2 <. We may also say that a random variable with finite 2nd 2

34 L 2 : Questions Why B? Physical paradoxes, eg. Banach-Tarski Want to talk about physical events measurable sets B is as descriptive as we d like most of the time What does µ do here? µ weights the events measurable sets Puts values to {ω Ω : X (ω) < r}; Ω is the sample space X maps random events (patterns occuring in data) to the reals which have a good standard notion of measure. So X induces the distribution P(B) = µ(x 1 (B)) which gives P({X (ω) B}) = µ(x 1 (B)) as expected. This is usually written without the sample variable. For more sophisticated questions/answers see MAA 6616 or STA 7466, but discussion is encouraged

35 Statistics: Functions of Measured Events Now we can start building Theorem If f : R R is measurable, then f (x) dp = f (X (ω)) dµ ie. the random variable pushes forward a measure onto R, and the integrals of measurable functions of random variables are therefore ultimately determined by real integration. This may be proved using Dirac measures and measure properties. Following these lines we may develop the basics of the theory of probability from measure theory. (ok, Homework)

36 Statistics: Moments Definition Define the first moment as the linear functional E(X ) = t dp. Then the p th central moment is is a functional given by m p (X ) = (t E(X )) p dp. Note that when X L 2 these are well defined by the theorem above. Pullbacks are omitted. Since we re talking about L 2 let s define the second central moment as σ 2 for convenience. If we need higher moments, sadly, mathematics says no.

37 Statistics: Now that we know what random variables are, let s try to define how a collection of them interact in the most basic way possible. Definition Let X = {X α } be a collection of random variables. Then X is independent if P( α X 1 α (B α )) = P((X 1,..., X n ) 1 (B 1... B n )) = (µ(x 1 1 )... µ(x 1 n ))(B 1... B n ). (Head Explodes) Really this just means that the X α are independent if their joint distributions are given by the product measure over the induced measures.

38 Limits: Repeating Independent Trials The significance of the belabored definition of independence is that when a joint distribution, just a distribution over several random variables, contains zero information about how the variables are related. If we have an infinite collection of random variables, independent of each other - in spite of independence - we would like to be able to infer information about the lower order moments from the higher order moments and vice versa. If we have ideal conditions on the the random variables then we should be able to deduce information about moments from a very large sequence. The type of convergence we get should be sensitive to the hypotheses.

39 We clearly need to specify how the means are converging, right?

40 We clearly need to specify how the means are converging, right? Why?

41 We clearly need to specify how the means are converging, right? Why? So how could N i=1 (X i µ i )/N 0?

42 We clearly need to specify how the means are converging, right? Why? So how could N i=1 (X i µ i )/N 0? In measure : lim N P({ N i=1 (X i µ i )/N ɛ}) 0 In norm: lim N ( ( N i=1 (X i µ i )/N) p dµ(ω)) 1/p 0 A.E : P({lim N N i=1 (X i µ i )/N > 0}) 0 f n f nk AE µ L p L p i f n f nk

43

Probability and Measure

Probability and Measure Probability and Measure Robert L. Wolpert Institute of Statistics and Decision Sciences Duke University, Durham, NC, USA Convergence of Random Variables 1. Convergence Concepts 1.1. Convergence of Real

More information

Lecture 1: Probability Fundamentals

Lecture 1: Probability Fundamentals Lecture 1: Probability Fundamentals IB Paper 7: Probability and Statistics Carl Edward Rasmussen Department of Engineering, University of Cambridge January 22nd, 2008 Rasmussen (CUED) Lecture 1: Probability

More information

Notes 1 : Measure-theoretic foundations I

Notes 1 : Measure-theoretic foundations I Notes 1 : Measure-theoretic foundations I Math 733-734: Theory of Probability Lecturer: Sebastien Roch References: [Wil91, Section 1.0-1.8, 2.1-2.3, 3.1-3.11], [Fel68, Sections 7.2, 8.1, 9.6], [Dur10,

More information

18.175: Lecture 3 Integration

18.175: Lecture 3 Integration 18.175: Lecture 3 Scott Sheffield MIT Outline Outline Recall definitions Probability space is triple (Ω, F, P) where Ω is sample space, F is set of events (the σ-algebra) and P : F [0, 1] is the probability

More information

2 (Bonus). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure?

2 (Bonus). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure? MA 645-4A (Real Analysis), Dr. Chernov Homework assignment 1 (Due 9/5). Prove that every countable set A is measurable and µ(a) = 0. 2 (Bonus). Let A consist of points (x, y) such that either x or y is

More information

Sample Spaces, Random Variables

Sample Spaces, Random Variables Sample Spaces, Random Variables Moulinath Banerjee University of Michigan August 3, 22 Probabilities In talking about probabilities, the fundamental object is Ω, the sample space. (elements) in Ω are denoted

More information

Recitation 2: Probability

Recitation 2: Probability Recitation 2: Probability Colin White, Kenny Marino January 23, 2018 Outline Facts about sets Definitions and facts about probability Random Variables and Joint Distributions Characteristics of distributions

More information

STA205 Probability: Week 8 R. Wolpert

STA205 Probability: Week 8 R. Wolpert INFINITE COIN-TOSS AND THE LAWS OF LARGE NUMBERS The traditional interpretation of the probability of an event E is its asymptotic frequency: the limit as n of the fraction of n repeated, similar, and

More information

Probability and Measure

Probability and Measure Part II Year 2018 2017 2016 2015 2014 2013 2012 2011 2010 2009 2008 2007 2006 2005 2018 84 Paper 4, Section II 26J Let (X, A) be a measurable space. Let T : X X be a measurable map, and µ a probability

More information

µ (X) := inf l(i k ) where X k=1 I k, I k an open interval Notice that is a map from subsets of R to non-negative number together with infinity

µ (X) := inf l(i k ) where X k=1 I k, I k an open interval Notice that is a map from subsets of R to non-negative number together with infinity A crash course in Lebesgue measure theory, Math 317, Intro to Analysis II These lecture notes are inspired by the third edition of Royden s Real analysis. The Jordan content is an attempt to extend the

More information

1 Review of The Learning Setting

1 Review of The Learning Setting COS 5: Theoretical Machine Learning Lecturer: Rob Schapire Lecture #8 Scribe: Changyan Wang February 28, 208 Review of The Learning Setting Last class, we moved beyond the PAC model: in the PAC model we

More information

Integration on Measure Spaces

Integration on Measure Spaces Chapter 3 Integration on Measure Spaces In this chapter we introduce the general notion of a measure on a space X, define the class of measurable functions, and define the integral, first on a class of

More information

3 (Due ). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure?

3 (Due ). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure? MA 645-4A (Real Analysis), Dr. Chernov Homework assignment 1 (Due ). Show that the open disk x 2 + y 2 < 1 is a countable union of planar elementary sets. Show that the closed disk x 2 + y 2 1 is a countable

More information

Lecture 11: Continuous-valued signals and differential entropy

Lecture 11: Continuous-valued signals and differential entropy Lecture 11: Continuous-valued signals and differential entropy Biology 429 Carl Bergstrom September 20, 2008 Sources: Parts of today s lecture follow Chapter 8 from Cover and Thomas (2007). Some components

More information

7 Random samples and sampling distributions

7 Random samples and sampling distributions 7 Random samples and sampling distributions 7.1 Introduction - random samples We will use the term experiment in a very general way to refer to some process, procedure or natural phenomena that produces

More information

ELEMENTS OF PROBABILITY THEORY

ELEMENTS OF PROBABILITY THEORY ELEMENTS OF PROBABILITY THEORY Elements of Probability Theory A collection of subsets of a set Ω is called a σ algebra if it contains Ω and is closed under the operations of taking complements and countable

More information

2 Measure Theory. 2.1 Measures

2 Measure Theory. 2.1 Measures 2 Measure Theory 2.1 Measures A lot of this exposition is motivated by Folland s wonderful text, Real Analysis: Modern Techniques and Their Applications. Perhaps the most ubiquitous measure in our lives

More information

1: PROBABILITY REVIEW

1: PROBABILITY REVIEW 1: PROBABILITY REVIEW Marek Rutkowski School of Mathematics and Statistics University of Sydney Semester 2, 2016 M. Rutkowski (USydney) Slides 1: Probability Review 1 / 56 Outline We will review the following

More information

Lecture 4: September Reminder: convergence of sequences

Lecture 4: September Reminder: convergence of sequences 36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 4: September 6 In this lecture we discuss the convergence of random variables. At a high-level, our first few lectures focused

More information

Some Concepts in Probability and Information Theory

Some Concepts in Probability and Information Theory PHYS 476Q: An Introduction to Entanglement Theory (Spring 2018) Eric Chitambar Some Concepts in Probability and Information Theory We begin this course with a condensed survey of basic concepts in probability

More information

Notions such as convergent sequence and Cauchy sequence make sense for any metric space. Convergent Sequences are Cauchy

Notions such as convergent sequence and Cauchy sequence make sense for any metric space. Convergent Sequences are Cauchy Banach Spaces These notes provide an introduction to Banach spaces, which are complete normed vector spaces. For the purposes of these notes, all vector spaces are assumed to be over the real numbers.

More information

Lecture 6 Basic Probability

Lecture 6 Basic Probability Lecture 6: Basic Probability 1 of 17 Course: Theory of Probability I Term: Fall 2013 Instructor: Gordan Zitkovic Lecture 6 Basic Probability Probability spaces A mathematical setup behind a probabilistic

More information

More on Distribution Function

More on Distribution Function More on Distribution Function The distribution of a random variable X can be determined directly from its cumulative distribution function F X. Theorem: Let X be any random variable, with cumulative distribution

More information

8 Laws of large numbers

8 Laws of large numbers 8 Laws of large numbers 8.1 Introduction We first start with the idea of standardizing a random variable. Let X be a random variable with mean µ and variance σ 2. Then Z = (X µ)/σ will be a random variable

More information

x 0 + f(x), exist as extended real numbers. Show that f is upper semicontinuous This shows ( ɛ, ɛ) B α. Thus

x 0 + f(x), exist as extended real numbers. Show that f is upper semicontinuous This shows ( ɛ, ɛ) B α. Thus Homework 3 Solutions, Real Analysis I, Fall, 2010. (9) Let f : (, ) [, ] be a function whose restriction to (, 0) (0, ) is continuous. Assume the one-sided limits p = lim x 0 f(x), q = lim x 0 + f(x) exist

More information

1.1. MEASURES AND INTEGRALS

1.1. MEASURES AND INTEGRALS CHAPTER 1: MEASURE THEORY In this chapter we define the notion of measure µ on a space, construct integrals on this space, and establish their basic properties under limits. The measure µ(e) will be defined

More information

Basics on Probability. Jingrui He 09/11/2007

Basics on Probability. Jingrui He 09/11/2007 Basics on Probability Jingrui He 09/11/2007 Coin Flips You flip a coin Head with probability 0.5 You flip 100 coins How many heads would you expect Coin Flips cont. You flip a coin Head with probability

More information

Measure and integration

Measure and integration Chapter 5 Measure and integration In calculus you have learned how to calculate the size of different kinds of sets: the length of a curve, the area of a region or a surface, the volume or mass of a solid.

More information

Homework 11. Solutions

Homework 11. Solutions Homework 11. Solutions Problem 2.3.2. Let f n : R R be 1/n times the characteristic function of the interval (0, n). Show that f n 0 uniformly and f n µ L = 1. Why isn t it a counterexample to the Lebesgue

More information

Why study probability? Set theory. ECE 6010 Lecture 1 Introduction; Review of Random Variables

Why study probability? Set theory. ECE 6010 Lecture 1 Introduction; Review of Random Variables ECE 6010 Lecture 1 Introduction; Review of Random Variables Readings from G&S: Chapter 1. Section 2.1, Section 2.3, Section 2.4, Section 3.1, Section 3.2, Section 3.5, Section 4.1, Section 4.2, Section

More information

RS Chapter 1 Random Variables 6/5/2017. Chapter 1. Probability Theory: Introduction

RS Chapter 1 Random Variables 6/5/2017. Chapter 1. Probability Theory: Introduction Chapter 1 Probability Theory: Introduction Basic Probability General In a probability space (Ω, Σ, P), the set Ω is the set of all possible outcomes of a probability experiment. Mathematically, Ω is just

More information

Lecture 5: Asymptotic Equipartition Property

Lecture 5: Asymptotic Equipartition Property Lecture 5: Asymptotic Equipartition Property Law of large number for product of random variables AEP and consequences Dr. Yao Xie, ECE587, Information Theory, Duke University Stock market Initial investment

More information

SDS 321: Introduction to Probability and Statistics

SDS 321: Introduction to Probability and Statistics SDS 321: Introduction to Probability and Statistics Lecture 14: Continuous random variables Purnamrita Sarkar Department of Statistics and Data Science The University of Texas at Austin www.cs.cmu.edu/

More information

Lebesgue measure and integration

Lebesgue measure and integration Chapter 4 Lebesgue measure and integration If you look back at what you have learned in your earlier mathematics courses, you will definitely recall a lot about area and volume from the simple formulas

More information

ABSTRACT INTEGRATION CHAPTER ONE

ABSTRACT INTEGRATION CHAPTER ONE CHAPTER ONE ABSTRACT INTEGRATION Version 1.1 No rights reserved. Any part of this work can be reproduced or transmitted in any form or by any means. Suggestions and errors are invited and can be mailed

More information

Notes on Measure, Probability and Stochastic Processes. João Lopes Dias

Notes on Measure, Probability and Stochastic Processes. João Lopes Dias Notes on Measure, Probability and Stochastic Processes João Lopes Dias Departamento de Matemática, ISEG, Universidade de Lisboa, Rua do Quelhas 6, 1200-781 Lisboa, Portugal E-mail address: jldias@iseg.ulisboa.pt

More information

Lecture Notes 5 Convergence and Limit Theorems. Convergence with Probability 1. Convergence in Mean Square. Convergence in Probability, WLLN

Lecture Notes 5 Convergence and Limit Theorems. Convergence with Probability 1. Convergence in Mean Square. Convergence in Probability, WLLN Lecture Notes 5 Convergence and Limit Theorems Motivation Convergence with Probability Convergence in Mean Square Convergence in Probability, WLLN Convergence in Distribution, CLT EE 278: Convergence and

More information

Confidence Intervals and Hypothesis Tests

Confidence Intervals and Hypothesis Tests Confidence Intervals and Hypothesis Tests STA 281 Fall 2011 1 Background The central limit theorem provides a very powerful tool for determining the distribution of sample means for large sample sizes.

More information

Basic Measure and Integration Theory. Michael L. Carroll

Basic Measure and Integration Theory. Michael L. Carroll Basic Measure and Integration Theory Michael L. Carroll Sep 22, 2002 Measure Theory: Introduction What is measure theory? Why bother to learn measure theory? 1 What is measure theory? Measure theory is

More information

Probability Theory for Machine Learning. Chris Cremer September 2015

Probability Theory for Machine Learning. Chris Cremer September 2015 Probability Theory for Machine Learning Chris Cremer September 2015 Outline Motivation Probability Definitions and Rules Probability Distributions MLE for Gaussian Parameter Estimation MLE and Least Squares

More information

3 Multiple Discrete Random Variables

3 Multiple Discrete Random Variables 3 Multiple Discrete Random Variables 3.1 Joint densities Suppose we have a probability space (Ω, F,P) and now we have two discrete random variables X and Y on it. They have probability mass functions f

More information

Joint Distribution of Two or More Random Variables

Joint Distribution of Two or More Random Variables Joint Distribution of Two or More Random Variables Sometimes more than one measurement in the form of random variable is taken on each member of the sample space. In cases like this there will be a few

More information

EE514A Information Theory I Fall 2013

EE514A Information Theory I Fall 2013 EE514A Information Theory I Fall 2013 K. Mohan, Prof. J. Bilmes University of Washington, Seattle Department of Electrical Engineering Fall Quarter, 2013 http://j.ee.washington.edu/~bilmes/classes/ee514a_fall_2013/

More information

Lecture 7. Sums of random variables

Lecture 7. Sums of random variables 18.175: Lecture 7 Sums of random variables Scott Sheffield MIT 18.175 Lecture 7 1 Outline Definitions Sums of random variables 18.175 Lecture 7 2 Outline Definitions Sums of random variables 18.175 Lecture

More information

1 Sequences of events and their limits

1 Sequences of events and their limits O.H. Probability II (MATH 2647 M15 1 Sequences of events and their limits 1.1 Monotone sequences of events Sequences of events arise naturally when a probabilistic experiment is repeated many times. For

More information

Probability. Lecture Notes. Adolfo J. Rumbos

Probability. Lecture Notes. Adolfo J. Rumbos Probability Lecture Notes Adolfo J. Rumbos October 20, 204 2 Contents Introduction 5. An example from statistical inference................ 5 2 Probability Spaces 9 2. Sample Spaces and σ fields.....................

More information

Almost Sure Convergence of a Sequence of Random Variables

Almost Sure Convergence of a Sequence of Random Variables Almost Sure Convergence of a Sequence of Random Variables (...for people who haven t had measure theory.) 1 Preliminaries 1.1 The Measure of a Set (Informal) Consider the set A IR 2 as depicted below.

More information

Lecture 10: Probability distributions TUESDAY, FEBRUARY 19, 2019

Lecture 10: Probability distributions TUESDAY, FEBRUARY 19, 2019 Lecture 10: Probability distributions DANIEL WELLER TUESDAY, FEBRUARY 19, 2019 Agenda What is probability? (again) Describing probabilities (distributions) Understanding probabilities (expectation) Partial

More information

Measures. Chapter Some prerequisites. 1.2 Introduction

Measures. Chapter Some prerequisites. 1.2 Introduction Lecture notes Course Analysis for PhD students Uppsala University, Spring 2018 Rostyslav Kozhan Chapter 1 Measures 1.1 Some prerequisites I will follow closely the textbook Real analysis: Modern Techniques

More information

A random variable is a quantity whose value is determined by the outcome of an experiment.

A random variable is a quantity whose value is determined by the outcome of an experiment. Random Variables A random variable is a quantity whose value is determined by the outcome of an experiment. Before the experiment is carried out, all we know is the range of possible values. Birthday example

More information

University of Sheffield. School of Mathematics & and Statistics. Measure and Probability MAS350/451/6352

University of Sheffield. School of Mathematics & and Statistics. Measure and Probability MAS350/451/6352 University of Sheffield School of Mathematics & and Statistics Measure and Probability MAS350/451/6352 Spring 2018 Chapter 1 Measure Spaces and Measure 1.1 What is Measure? Measure theory is the abstract

More information

Tips and Tricks in Real Analysis

Tips and Tricks in Real Analysis Tips and Tricks in Real Analysis Nate Eldredge August 3, 2008 This is a list of tricks and standard approaches that are often helpful when solving qual-type problems in real analysis. Approximate. There

More information

Measures and Measure Spaces

Measures and Measure Spaces Chapter 2 Measures and Measure Spaces In summarizing the flaws of the Riemann integral we can focus on two main points: 1) Many nice functions are not Riemann integrable. 2) The Riemann integral does not

More information

Chapter 4. Measure Theory. 1. Measure Spaces

Chapter 4. Measure Theory. 1. Measure Spaces Chapter 4. Measure Theory 1. Measure Spaces Let X be a nonempty set. A collection S of subsets of X is said to be an algebra on X if S has the following properties: 1. X S; 2. if A S, then A c S; 3. if

More information

Product measures, Tonelli s and Fubini s theorems For use in MAT4410, autumn 2017 Nadia S. Larsen. 17 November 2017.

Product measures, Tonelli s and Fubini s theorems For use in MAT4410, autumn 2017 Nadia S. Larsen. 17 November 2017. Product measures, Tonelli s and Fubini s theorems For use in MAT4410, autumn 017 Nadia S. Larsen 17 November 017. 1. Construction of the product measure The purpose of these notes is to prove the main

More information

Basic Probability. Introduction

Basic Probability. Introduction Basic Probability Introduction The world is an uncertain place. Making predictions about something as seemingly mundane as tomorrow s weather, for example, is actually quite a difficult task. Even with

More information

A D VA N C E D P R O B A B I L - I T Y

A D VA N C E D P R O B A B I L - I T Y A N D R E W T U L L O C H A D VA N C E D P R O B A B I L - I T Y T R I N I T Y C O L L E G E T H E U N I V E R S I T Y O F C A M B R I D G E Contents 1 Conditional Expectation 5 1.1 Discrete Case 6 1.2

More information

Discrete Mathematics and Probability Theory Fall 2014 Anant Sahai Note 15. Random Variables: Distributions, Independence, and Expectations

Discrete Mathematics and Probability Theory Fall 2014 Anant Sahai Note 15. Random Variables: Distributions, Independence, and Expectations EECS 70 Discrete Mathematics and Probability Theory Fall 204 Anant Sahai Note 5 Random Variables: Distributions, Independence, and Expectations In the last note, we saw how useful it is to have a way of

More information

Some Basic Concepts of Probability and Information Theory: Pt. 1

Some Basic Concepts of Probability and Information Theory: Pt. 1 Some Basic Concepts of Probability and Information Theory: Pt. 1 PHYS 476Q - Southern Illinois University January 18, 2018 PHYS 476Q - Southern Illinois University Some Basic Concepts of Probability and

More information

1 Measurable Functions

1 Measurable Functions 36-752 Advanced Probability Overview Spring 2018 2. Measurable Functions, Random Variables, and Integration Instructor: Alessandro Rinaldo Associated reading: Sec 1.5 of Ash and Doléans-Dade; Sec 1.3 and

More information

Discrete Mathematics and Probability Theory Fall 2013 Vazirani Note 12. Random Variables: Distribution and Expectation

Discrete Mathematics and Probability Theory Fall 2013 Vazirani Note 12. Random Variables: Distribution and Expectation CS 70 Discrete Mathematics and Probability Theory Fall 203 Vazirani Note 2 Random Variables: Distribution and Expectation We will now return once again to the question of how many heads in a typical sequence

More information

Measures. 1 Introduction. These preliminary lecture notes are partly based on textbooks by Athreya and Lahiri, Capinski and Kopp, and Folland.

Measures. 1 Introduction. These preliminary lecture notes are partly based on textbooks by Athreya and Lahiri, Capinski and Kopp, and Folland. Measures These preliminary lecture notes are partly based on textbooks by Athreya and Lahiri, Capinski and Kopp, and Folland. 1 Introduction Our motivation for studying measure theory is to lay a foundation

More information

ST213 Mathematics of Random Events

ST213 Mathematics of Random Events ST213 Mathematics of Random Events Wilfrid S. Kendall version 1.0 28 April 1999 1. Introduction The main purpose of the course ST213 Mathematics of Random Events (which we will abbreviate to MoRE) is to

More information

4 Expectation & the Lebesgue Theorems

4 Expectation & the Lebesgue Theorems STA 205: Probability & Measure Theory Robert L. Wolpert 4 Expectation & the Lebesgue Theorems Let X and {X n : n N} be random variables on a probability space (Ω,F,P). If X n (ω) X(ω) for each ω Ω, does

More information

Discrete Probability Refresher

Discrete Probability Refresher ECE 1502 Information Theory Discrete Probability Refresher F. R. Kschischang Dept. of Electrical and Computer Engineering University of Toronto January 13, 1999 revised January 11, 2006 Probability theory

More information

X n D X lim n F n (x) = F (x) for all x C F. lim n F n(u) = F (u) for all u C F. (2)

X n D X lim n F n (x) = F (x) for all x C F. lim n F n(u) = F (u) for all u C F. (2) 14:17 11/16/2 TOPIC. Convergence in distribution and related notions. This section studies the notion of the so-called convergence in distribution of real random variables. This is the kind of convergence

More information

Kousha Etessami. U. of Edinburgh, UK. Kousha Etessami (U. of Edinburgh, UK) Discrete Mathematics (Chapter 7) 1 / 13

Kousha Etessami. U. of Edinburgh, UK. Kousha Etessami (U. of Edinburgh, UK) Discrete Mathematics (Chapter 7) 1 / 13 Discrete Mathematics & Mathematical Reasoning Chapter 7 (continued): Markov and Chebyshev s Inequalities; and Examples in probability: the birthday problem Kousha Etessami U. of Edinburgh, UK Kousha Etessami

More information

Lecture Notes 3 Convergence (Chapter 5)

Lecture Notes 3 Convergence (Chapter 5) Lecture Notes 3 Convergence (Chapter 5) 1 Convergence of Random Variables Let X 1, X 2,... be a sequence of random variables and let X be another random variable. Let F n denote the cdf of X n and let

More information

Lecture 4: Probability and Discrete Random Variables

Lecture 4: Probability and Discrete Random Variables Error Correcting Codes: Combinatorics, Algorithms and Applications (Fall 2007) Lecture 4: Probability and Discrete Random Variables Wednesday, January 21, 2009 Lecturer: Atri Rudra Scribe: Anonymous 1

More information

Machine Learning. Instructor: Pranjal Awasthi

Machine Learning. Instructor: Pranjal Awasthi Machine Learning Instructor: Pranjal Awasthi Course Info Requested an SPN and emailed me Wait for Carol Difrancesco to give them out. Not registered and need SPN Email me after class No promises It s a

More information

Probability. Carlo Tomasi Duke University

Probability. Carlo Tomasi Duke University Probability Carlo Tomasi Due University Introductory concepts about probability are first explained for outcomes that tae values in discrete sets, and then extended to outcomes on the real line 1 Discrete

More information

MA 3280 Lecture 05 - Generalized Echelon Form and Free Variables. Friday, January 31, 2014.

MA 3280 Lecture 05 - Generalized Echelon Form and Free Variables. Friday, January 31, 2014. MA 3280 Lecture 05 - Generalized Echelon Form and Free Variables Friday, January 31, 2014. Objectives: Generalize echelon form, and introduce free variables. Material from Section 3.5 starting on page

More information

STAT2201. Analysis of Engineering & Scientific Data. Unit 3

STAT2201. Analysis of Engineering & Scientific Data. Unit 3 STAT2201 Analysis of Engineering & Scientific Data Unit 3 Slava Vaisman The University of Queensland School of Mathematics and Physics What we learned in Unit 2 (1) We defined a sample space of a random

More information

Review of Probabilities and Basic Statistics

Review of Probabilities and Basic Statistics Alex Smola Barnabas Poczos TA: Ina Fiterau 4 th year PhD student MLD Review of Probabilities and Basic Statistics 10-701 Recitations 1/25/2013 Recitation 1: Statistics Intro 1 Overview Introduction to

More information

Joint Probability Distributions and Random Samples (Devore Chapter Five)

Joint Probability Distributions and Random Samples (Devore Chapter Five) Joint Probability Distributions and Random Samples (Devore Chapter Five) 1016-345-01: Probability and Statistics for Engineers Spring 2013 Contents 1 Joint Probability Distributions 2 1.1 Two Discrete

More information

MATH41011/MATH61011: FOURIER SERIES AND LEBESGUE INTEGRATION. Extra Reading Material for Level 4 and Level 6

MATH41011/MATH61011: FOURIER SERIES AND LEBESGUE INTEGRATION. Extra Reading Material for Level 4 and Level 6 MATH41011/MATH61011: FOURIER SERIES AND LEBESGUE INTEGRATION Extra Reading Material for Level 4 and Level 6 Part A: Construction of Lebesgue Measure The first part the extra material consists of the construction

More information

Random variables. DS GA 1002 Probability and Statistics for Data Science.

Random variables. DS GA 1002 Probability and Statistics for Data Science. Random variables DS GA 1002 Probability and Statistics for Data Science http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall17 Carlos Fernandez-Granda Motivation Random variables model numerical quantities

More information

Quick Tour of Basic Probability Theory and Linear Algebra

Quick Tour of Basic Probability Theory and Linear Algebra Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra CS224w: Social and Information Network Analysis Fall 2011 Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra Outline Definitions

More information

Probability Theory. Probability and Statistics for Data Science CSE594 - Spring 2016

Probability Theory. Probability and Statistics for Data Science CSE594 - Spring 2016 Probability Theory Probability and Statistics for Data Science CSE594 - Spring 2016 What is Probability? 2 What is Probability? Examples outcome of flipping a coin (seminal example) amount of snowfall

More information

2.1 Lecture 5: Probability spaces, Interpretation of probabilities, Random variables

2.1 Lecture 5: Probability spaces, Interpretation of probabilities, Random variables Chapter 2 Kinetic Theory 2.1 Lecture 5: Probability spaces, Interpretation of probabilities, Random variables In the previous lectures the theory of thermodynamics was formulated as a purely phenomenological

More information

Then A n W n, A n is compact and W n is open. (We take W 0 =(

Then A n W n, A n is compact and W n is open. (We take W 0 =( 9. Mon, Nov. 4 Another related concept is that of paracompactness. This is especially important in the theory of manifolds and vector bundles. We make a couple of preliminary definitions first. Definition

More information

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015 Part IA Probability Definitions Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.

More information

1 Probability space and random variables

1 Probability space and random variables 1 Probability space and random variables As graduate level, we inevitably need to study probability based on measure theory. It obscures some intuitions in probability, but it also supplements our intuition,

More information

Building Infinite Processes from Finite-Dimensional Distributions

Building Infinite Processes from Finite-Dimensional Distributions Chapter 2 Building Infinite Processes from Finite-Dimensional Distributions Section 2.1 introduces the finite-dimensional distributions of a stochastic process, and shows how they determine its infinite-dimensional

More information

IEOR 3106: Introduction to Operations Research: Stochastic Models. Professor Whitt. SOLUTIONS to Homework Assignment 2

IEOR 3106: Introduction to Operations Research: Stochastic Models. Professor Whitt. SOLUTIONS to Homework Assignment 2 IEOR 316: Introduction to Operations Research: Stochastic Models Professor Whitt SOLUTIONS to Homework Assignment 2 More Probability Review: In the Ross textbook, Introduction to Probability Models, read

More information

Multivariate Distributions (Hogg Chapter Two)

Multivariate Distributions (Hogg Chapter Two) Multivariate Distributions (Hogg Chapter Two) STAT 45-1: Mathematical Statistics I Fall Semester 15 Contents 1 Multivariate Distributions 1 11 Random Vectors 111 Two Discrete Random Variables 11 Two Continuous

More information

1 Stat 605. Homework I. Due Feb. 1, 2011

1 Stat 605. Homework I. Due Feb. 1, 2011 The first part is homework which you need to turn in. The second part is exercises that will not be graded, but you need to turn it in together with the take-home final exam. 1 Stat 605. Homework I. Due

More information

Chapter 8. General Countably Additive Set Functions. 8.1 Hahn Decomposition Theorem

Chapter 8. General Countably Additive Set Functions. 8.1 Hahn Decomposition Theorem Chapter 8 General Countably dditive Set Functions In Theorem 5.2.2 the reader saw that if f : X R is integrable on the measure space (X,, µ) then we can define a countably additive set function ν on by

More information

A course in. Real Analysis. Taught by Prof. P. Kronheimer. Fall 2011

A course in. Real Analysis. Taught by Prof. P. Kronheimer. Fall 2011 A course in Real Analysis Taught by Prof. P. Kronheimer Fall 20 Contents. August 3 4 2. September 2 5 3. September 7 8 4. September 9 5. September 2 4 6. September 4 6 7. September 6 9 8. September 9 22

More information

Week 2: Review of probability and statistics

Week 2: Review of probability and statistics Week 2: Review of probability and statistics Marcelo Coca Perraillon University of Colorado Anschutz Medical Campus Health Services Research Methods I HSMP 7607 2017 c 2017 PERRAILLON ALL RIGHTS RESERVED

More information

Discrete Mathematics and Probability Theory Fall 2012 Vazirani Note 14. Random Variables: Distribution and Expectation

Discrete Mathematics and Probability Theory Fall 2012 Vazirani Note 14. Random Variables: Distribution and Expectation CS 70 Discrete Mathematics and Probability Theory Fall 202 Vazirani Note 4 Random Variables: Distribution and Expectation Random Variables Question: The homeworks of 20 students are collected in, randomly

More information

Refresher on Discrete Probability

Refresher on Discrete Probability Refresher on Discrete Probability STAT 27725/CMSC 25400: Machine Learning Shubhendu Trivedi University of Chicago October 2015 Background Things you should have seen before Events, Event Spaces Probability

More information

18.175: Lecture 2 Extension theorems, random variables, distributions

18.175: Lecture 2 Extension theorems, random variables, distributions 18.175: Lecture 2 Extension theorems, random variables, distributions Scott Sheffield MIT Outline Extension theorems Characterizing measures on R d Random variables Outline Extension theorems Characterizing

More information

We are going to discuss what it means for a sequence to converge in three stages: First, we define what it means for a sequence to converge to zero

We are going to discuss what it means for a sequence to converge in three stages: First, we define what it means for a sequence to converge to zero Chapter Limits of Sequences Calculus Student: lim s n = 0 means the s n are getting closer and closer to zero but never gets there. Instructor: ARGHHHHH! Exercise. Think of a better response for the instructor.

More information

3 Integration and Expectation

3 Integration and Expectation 3 Integration and Expectation 3.1 Construction of the Lebesgue Integral Let (, F, µ) be a measure space (not necessarily a probability space). Our objective will be to define the Lebesgue integral R fdµ

More information

Inference for Stochastic Processes

Inference for Stochastic Processes Inference for Stochastic Processes Robert L. Wolpert Revised: June 19, 005 Introduction A stochastic process is a family {X t } of real-valued random variables, all defined on the same probability space

More information

Real Analysis Notes. Thomas Goller

Real Analysis Notes. Thomas Goller Real Analysis Notes Thomas Goller September 4, 2011 Contents 1 Abstract Measure Spaces 2 1.1 Basic Definitions........................... 2 1.2 Measurable Functions........................ 2 1.3 Integration..............................

More information

2 n k In particular, using Stirling formula, we can calculate the asymptotic of obtaining heads exactly half of the time:

2 n k In particular, using Stirling formula, we can calculate the asymptotic of obtaining heads exactly half of the time: Chapter 1 Random Variables 1.1 Elementary Examples We will start with elementary and intuitive examples of probability. The most well-known example is that of a fair coin: if flipped, the probability of

More information

JUSTIN HARTMANN. F n Σ.

JUSTIN HARTMANN. F n Σ. BROWNIAN MOTION JUSTIN HARTMANN Abstract. This paper begins to explore a rigorous introduction to probability theory using ideas from algebra, measure theory, and other areas. We start with a basic explanation

More information

6.1 Moment Generating and Characteristic Functions

6.1 Moment Generating and Characteristic Functions Chapter 6 Limit Theorems The power statistics can mostly be seen when there is a large collection of data points and we are interested in understanding the macro state of the system, e.g., the average,

More information